Is boost regex slower than this combination of isupper and isdigit?

Last updated: 2023-10-16

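I need to check whether a string consists only of a specific set of characters (ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789<).

I could use either boost regex or a combination of isupper and isdigit. Which would be the better choice in terms of performance? The strings I am testing are about 100 characters long.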

// Option 1: boost regex
bool IsValid(string& rString)
{
  boost::regex regex("[0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ<]+");
  boost::cmatch match;
  return boost::regex_match(rString.c_str(), match, regex);
}

// Option 2: combination of isupper and isdigit
bool IsValid(string& rString)
{
    for (string::size_type i = 0; i < rString.size(); ++i)
        if (!isupper(rString[i]))
            if (!isdigit(rString[i]))
                if (rString[i] != '<')
                    return false;
    return true;
}

Is XXX slower than YYY?

Answer: time it.

In this case (with boost::regex replaced by std::regex):

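#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <ctime>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <regex>
#include <string>
#include <vector>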
bool IsValid1(std::string const& rString)
{
    static const std::regex regex("[0-9A-Z<]+");
    return std::regex_match(rString, regex);
}
bool IsValid2(std::string const& rString)
{
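    for (std::string::size_type i = 0; i < rString.size(); ++i)
    {
        // Cast to unsigned char: passing a negative char value to
        // std::isupper / std::isdigit is undefined behavior.
        unsigned char const c = static_cast<unsigned char>(rString[i]);
        if (!std::isupper(c) && !std::isdigit(c) && c != '<')
            return false;
    }
    return true;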
}
auto make_samples = []()
{
    std::vector<std::string> result;
    result.reserve(100000);
    std::generate_n(std::back_inserter(result), 100000, []
                    {
                        if (rand() < (RAND_MAX / 2))
                        {
                            return std::string("ABCDEF34<63DFGS");
                        }
                        else
                        {
                            return std::string("ABCDEF34<63DfGS");
                        }
                    });
    return result;
};
int main() {
    auto samples = make_samples();
    auto time = [](const char* message, auto&& func)
    {
        clock_t tStart = clock();
        auto result = func();
        clock_t tEnd = clock();
        std::cout << message << " yields " << result << " in " << std::fixed << std::setprecision(2) << (double(tEnd - tStart) / CLOCKS_PER_SEC) << '\n';
    };

    time("regex method: ", [&]()->std::size_t
         {
             return std::count_if(samples.begin(), samples.end(), IsValid1);
         });
    time("search method: ", [&]()->std::size_t
         {
             return std::count_if(samples.begin(), samples.end(), IsValid2);
         });
}
Sample results:

regex method:  yields 49816 in 1.29
search method:  yields 49816 in 0.04
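
For comparison, the same check can also be written with std::string::find_first_not_of, which is a standard library member function. The sketch below is not part of the original answer and was not timed here. The name IsValid3 is hypothetical, and rejecting the empty string is an assumption chosen to match the regex version (whose "+" requires at least one character); the loop version accepts an empty string.

```cpp
#include <string>

// Hypothetical helper, not from the original answer: returns true only for
// non-empty strings made up entirely of 'A'-'Z', '0'-'9' and '<'.
bool IsValid3(std::string const& rString)
{
    static const char kAllowed[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ<";
    // find_first_not_of returns npos when every character is in the allowed set.
    // Assumption: empty input is rejected, as the regex "[0-9A-Z<]+" would do.
    return !rString.empty()
        && rString.find_first_not_of(kAllowed) == std::string::npos;
}
```

To see how it compares, it can be passed to the same time(...) / std::count_if harness used above, in place of IsValid1 or IsValid2.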