Performance of `std::string` is really bad compared to a C string (`malloc` + `memcpy`)

Updated: 2023-10-16

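I stumbled upon some really bad performance from std::string. I would expect that creating a std::string from some external data (e.g. std::string(X.c_str())) is roughly equivalent to data = malloc(X.size() + 1) followed by strcpy(data, X.c_str()), with some small constant overhead.

Some sample performance code: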

#include <string>
#include <string.h>
#include <stdlib.h>  // malloc, free
#include <assert.h>
static const char SampleString[] =
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
    "Hello world. Hello world. Hello world. Hello world. Hello world. "
;
static size_t N = 1000000;
// mostly for avoiding compiler optimization
void readStr(const char* _s) {
    volatile const char* s = _s;
    while(*s) ++s;
}
void cppStringLoop1() {
    for(size_t i = 0; i < N; ++i) {
        std::string tmp(SampleString);
        readStr(&tmp[0]);
    }
}
void cppStringLoop2() {
    for(size_t i = 0; i < N; ++i) {
        // sizeof counts the terminating NUL; subtract 1 so only the characters are copied
        std::string tmp(SampleString, SampleString + sizeof(SampleString) - 1);
        readStr(&tmp[0]);
    }
}
void cStringLoop() {
    for(size_t i = 0; i < N; ++i) {
        char* tmp = (char*) malloc(sizeof(SampleString));
        memcpy(tmp, SampleString, sizeof(SampleString));
        readStr(tmp);
        free(tmp);
    }
}
int main(int argc, char** argv) {
    assert(argc >= 2);
    if(strcmp(argv[1], "-c") == 0) cStringLoop();
    else if(strcmp(argv[1], "-c++1") == 0) cppStringLoop1();
    else if(strcmp(argv[1], "-c++2") == 0) cppStringLoop2();
    else assert(false);
    return 0;
}

With MSVC in Release mode, my initial assumption seems to hold. (Release mode = MSVC release runtime library + optimizations.)

In Debug mode (MSVC debug runtime library + no optimizations), however, this assumption seems to be wrong. The overhead is not small (about 175%).

Maybe this is also specific to the MSVC 2012 std::string implementation. Here are some numbers:

$ time ./TestStringPerf.exe -c
real    0m6.879s
user    0m0.015s
sys     0m0.015s
$ time ./TestStringPerf.exe -c++1
real    0m10.524s
user    0m0.000s
sys     0m0.000s
$ time ./TestStringPerf.exe -c++2
real    0m10.106s
user    0m0.000s
sys     0m0.015s

Or maybe this is just the expected overhead.

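You keep using sizeof(SampleString). If SampleString is declared as a pointer (static const char*), sizeof gives the size of the pointer, not the length of the string, so your C code and the cppStringLoop2 function copy only about 4-8 characters.

You need to either:

  • change the uses of sizeof(SampleString) to std::strlen(SampleString);
  • or change static const char* SampleString to static const char SampleString[], and use sizeof(SampleString) in some places and sizeof(SampleString) - 1 in others (namely in the cppStringLoop2 function, where the terminating NUL must not be counted).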