Performance of `std::string` really bad compared to C-string (`malloc` + `memcpy`)
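I just stumbled upon really bad performance of `std::string`. I would expect constructing a `std::string` from some external data (e.g. `std::string(X.c_str())`) to be roughly equivalent to `data = malloc(X.size())` + `strcpy(data, X.c_str())`, with some small constant overhead.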
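To make the comparison concrete, here is a minimal sketch of the two operations being equated. The helper names `copyWithMalloc` and `copyWithString` are hypothetical and only serve the illustration. Note that a faithful C version has to allocate `X.size() + 1` bytes, because `strcpy` also writes the terminating `'\0'`.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: copy the contents of `src` the C way.
// The caller owns the returned buffer and must release it with free().
char* copyWithMalloc(const std::string& src) {
    // + 1 leaves room for the terminating '\0' that strcpy writes.
    char* data = static_cast<char*>(std::malloc(src.size() + 1));
    if (data != nullptr) std::strcpy(data, src.c_str());
    return data;
}

// Hypothetical helper: the same copy through the std::string
// constructor that takes a NUL-terminated C-string.
std::string copyWithString(const std::string& src) {
    return std::string(src.c_str());
}
```

Both helpers have to find the length of the source, allocate, and copy the bytes, which is why one would expect them to cost about the same apart from a small constant overhead.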
Here is the sample perf code I used:
#include <string>
#include <string.h>
#include <stdlib.h>  // malloc, free
#include <assert.h>
static const char SampleString[] =
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
"Hello world. Hello world. Hello world. Hello world. Hello world. "
;
static size_t N = 1000000;

// mostly for avoiding compiler optimization
void readStr(const char* _s) {
    volatile const char* s = _s;
    while(*s) ++s;
}

// construct from a NUL-terminated C-string
void cppStringLoop1() {
    for(size_t i = 0; i < N; ++i) {
        std::string tmp(SampleString);
        readStr(&tmp[0]);
    }
}

// construct from an iterator range
void cppStringLoop2() {
    for(size_t i = 0; i < N; ++i) {
        // - 1 keeps the terminating '\0' out of the copied range
        std::string tmp(SampleString, SampleString + sizeof(SampleString) - 1);
        readStr(&tmp[0]);
    }
}

// plain C: malloc + memcpy (sizeof includes the terminating '\0')
void cStringLoop() {
    for(size_t i = 0; i < N; ++i) {
        char* tmp = (char*) malloc(sizeof(SampleString));
        memcpy(tmp, SampleString, sizeof(SampleString));
        readStr(tmp);
        free(tmp);
    }
}

int main(int argc, char** argv) {
    assert(argc >= 2);
    if(strcmp(argv[1], "-c") == 0) cStringLoop();
    else if(strcmp(argv[1], "-c++1") == 0) cppStringLoop1();
    else if(strcmp(argv[1], "-c++2") == 0) cppStringLoop2();
    else assert(false);
    return 0;
}
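It seems that with MSVC in release mode, my initial assumption was correct. (Release mode = MSVC release runtime library + optimizations.)
However, in debug mode (MSVC debug runtime library + no optimizations), this assumption seems to be wrong. The overhead is not small (about 175%).
Maybe this is also a problem with the MSVC 2012 `std::string` implementation. Here are some numbers: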
$ time ./TestStringPerf.exe -c
real 0m6.879s
user 0m0.015s
sys 0m0.015s
$ time ./TestStringPerf.exe -c++1
real 0m10.524s
user 0m0.000s
sys 0m0.000s
$ time ./TestStringPerf.exe -c++2
real 0m10.106s
user 0m0.000s
sys 0m0.015s
Or maybe that's just the expected overhead.
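In the numbers above, the `user` and `sys` columns read close to zero, so only the wall-clock `real` value is informative. As a cross-check, one could time the loops inside the process. The following is a minimal sketch using `std::chrono::steady_clock`; the names `elapsedMs`, `timeStdString`, `timeMalloc`, and `g_sink` are hypothetical and not part of the original benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical sink: writing to a volatile discourages the optimizer
// from discarding the copies (it is not a hard guarantee).
static volatile char g_sink;

// Hypothetical helper: run `body` `iterations` times and return the
// elapsed wall-clock time in milliseconds.
template <typename F>
double elapsedMs(std::size_t iterations, F body) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) body();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Time construction of a std::string from a NUL-terminated C-string.
double timeStdString(const char* text, std::size_t iterations) {
    return elapsedMs(iterations, [text] {
        std::string tmp(text);
        g_sink = tmp[0];
    });
}

// Time the malloc + memcpy + free equivalent.
double timeMalloc(const char* text, std::size_t iterations) {
    const std::size_t bytes = std::strlen(text) + 1;  // include '\0'
    return elapsedMs(iterations, [text, bytes] {
        char* tmp = static_cast<char*>(std::malloc(bytes));
        if (tmp == nullptr) return;
        std::memcpy(tmp, text, bytes);
        g_sink = tmp[0];
        std::free(tmp);
    });
}
```

Calling `timeStdString(SampleString, N)` and `timeMalloc(SampleString, N)` in the same build configuration would give two directly comparable figures, without depending on how the shell's `time` accounts for the process.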
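You keep using `sizeof(SampleString)`. When `SampleString` is a pointer, `sizeof` yields the size of the pointer itself, so your C code and the `cppStringLoop2` function copy only about 4-8 characters.
You need to either:
- change the uses of `sizeof(SampleString)` to `std::strlen(SampleString)`; or
- change `static const char* SampleString` to `static const char SampleString[]`, and use `sizeof(SampleString)` in some places and `sizeof(SampleString) - 1` in others (i.e. in the `cppStringLoop2` function).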
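The difference between `sizeof` on an array and on a pointer can be checked with a short sketch. The names `kArray`, `kPointer`, `fromArrayRange`, and `fromPointer` are hypothetical and use a shorter string than the benchmark.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Array: the compiler knows the full size, including the '\0'.
static const char kArray[] = "Hello world.";
// Pointer: sizeof only sees the pointer, not the text it points to.
static const char* const kPointer = "Hello world.";

// 12 visible characters plus the terminating '\0'.
static_assert(sizeof(kArray) == 13, "12 characters plus the terminator");
// Typically 4 or 8, regardless of how long the string is.
static_assert(sizeof(kPointer) == sizeof(const char*), "size of a pointer");

// Range constructor over the array: subtract 1 so the '\0' is not
// copied into the string's contents.
std::string fromArrayRange() {
    return std::string(kArray, kArray + sizeof(kArray) - 1);
}

// With a pointer, the length has to be measured at run time.
std::string fromPointer() {
    return std::string(kPointer, std::strlen(kPointer));
}
```

Without the `- 1`, the range constructor would produce a string whose `size()` is 13, with an embedded `'\0'` as its last character.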