Pointer to char vs String

Keywords: pointer, string, char. Updated: 2023-10-16

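Consider these two pieces of code. They convert a base-10 number to base N, where N is the number of characters in a given alphabet. In effect, they generate permutations of the letters of the given alphabet. Assume that 1 corresponds to the first letter of the alphabet.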

#include <iostream>
#include <string>
typedef unsigned long long ull;
using namespace std;
void conv(ull num, const string alpha, string *word){
    int base=alpha.size();
    *word="";
    while (num) {
        *word+=alpha[(num-1)%base];
        num=(num-1)/base;
    }
}
int main(){
    ull nu;
    const string alpha="abcdef";
    string word;
    for (nu=1;nu<=10;++nu) {
        conv(nu,alpha,&word);
        cout << word << endl;
    }
    return 0;
}

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef unsigned long long ull;
void conv(ull num, const char* alpha, char *word){
    int base=strlen(alpha);
    while (num) {
        (*word++)=alpha[(num-1)%base];
        num=(num-1)/base;
    }
}
int main() {
    char *a=calloc(10,sizeof(char)); 
    const char *alpha="abcdef";
    ull h;
    for (h=1;h<=10;++h) {
        conv(h,alpha,a);
        printf("%s\n", a);
    }
}

The output is the same:

a
b
c
d
e
f
aa
ba
ca
da

No, I did not forget to reverse the string; the reversal was removed to keep the code clear.

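For some reason, speed is very important to me. I measured the executables compiled from the examples above and noticed that the one written in C++ using string is 10 times slower than the one written in C using char *.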

Each executable was compiled with GCC's -O2 flag. The tests I ran converted much larger numbers, such as 1e8 and above.

The question is: why is string slower than char * in this case?

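Your code snippets are not equivalent. *a='n' does not append to the char array; it changes the first char in the array to 'n'.

In C++, std::string should be preferred over char arrays because it is easier to use; for example, appending only takes the += operator.

In addition, a char array does not manage its memory for you. In other words, std::string is less error-prone than a manually managed char array.

Tracing through your code gives: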

*a='n';
// 'n\0\0\0\0'
//  ^
//  a
++a;
// 'n\0\0\0\0'
//   ^
//   a
*a='o';
// 'no\0\0\0'
//   ^
//   a

In the end, a points to its original address + 1, which holds 'o'. If you print a, you get "o".
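A common way to avoid losing the start of the text is to keep one pointer at the beginning of the buffer and advance a separate cursor. The sketch below assumes a 5-byte zero-initialized buffer like the one in the trace; `write_no` and `cursor` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>

// Writes "no" through a moving cursor, then reads the text back from the
// start of the buffer instead of from the cursor.
std::string write_no() {
    char buffer[5] = {};        // zero-initialized: '\0\0\0\0\0'
    char* cursor = buffer;      // the cursor moves; buffer keeps the start
    *cursor++ = 'n';
    *cursor++ = 'o';
    assert(cursor - buffer == 2);  // cursor now sits just past the 'o'
    return std::string(buffer);    // reading from the start yields "no"
}
```

Printing from `buffer` rather than from `cursor` is what makes the whole word visible.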

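Anyway, what if you need "nothing" instead of "no"? It does not fit in 5 chars, so you would have to reallocate memory and so on. That is what the string class does for you behind the scenes, and it does it fast enough that it is almost never a problem.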
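To make that bookkeeping concrete, here is a rough sketch of growing a 5-byte heap buffer by hand next to the std::string equivalent. The helpers `append_char`, `build_manually` and `build_with_string` are hypothetical, and the doubling policy is an assumption chosen for illustration, not a description of any particular library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Appends c to a NUL-terminated heap buffer, doubling the buffer when full.
// strlen rescans the text on every call; a real implementation would store
// the length instead.
bool append_char(char*& buf, std::size_t& cap, char c) {
    assert(buf != nullptr && cap >= 1);
    const std::size_t len = std::strlen(buf);
    if (len + 2 > cap) {                      // need room for c and '\0'
        const std::size_t new_cap = cap * 2;
        char* grown = static_cast<char*>(std::realloc(buf, new_cap));
        if (grown == nullptr) return false;   // old buffer is still valid
        buf = grown;
        cap = new_cap;
    }
    buf[len] = c;
    buf[len + 1] = '\0';
    return true;
}

// Manual version: allocate, grow on demand, copy out, free.
std::string build_manually(const char* text) {
    std::size_t cap = 5;
    char* buf = static_cast<char*>(std::calloc(cap, 1));
    if (buf == nullptr) return std::string();
    for (const char* p = text; *p != '\0'; ++p) {
        if (!append_char(buf, cap, *p)) break;
    }
    std::string result(buf);
    std::free(buf);
    return result;
}

// std::string version: the same growth happens inside operator+=.
std::string build_with_string(const char* text) {
    std::string s;
    for (const char* p = text; *p != '\0'; ++p) s += *p;
    return s;
}
```

Both functions produce the same text; the difference is who has to write and maintain the growth logic.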

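Both char* and string can be used to handle text in C++. It seems to me that string addition is much slower than pointer addition. Why does this happen?

This is because when you use a char array or work with a pointer to it (char*), the memory is allocated only once. What you describe as "addition" is just the pointer stepping through the array, so it is nothing more than a pointer move.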

// Both allocate memory one time:
char test[4];
char* ptrTest = new char[4];
// This will just set the values which already exist in the array and will
// not append anything.
*(ptrTest++) = 't';
*(ptrTest++) = 'e';
*(ptrTest++) = 's';
*(ptrTest) = 't';

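When you use a string, the += operator actually appends the character to the end of the string. To do that, every append has to check the remaining capacity, and whenever the buffer is full, new memory is allocated dynamically and the existing characters are copied over. This takes longer than merely advancing a pointer.

// Each += checks the capacity and may reallocate the buffer as the string grows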
std::string test;
test += 't';
test += 'e';
test += 's';
test += 't';
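How often += really regrows the buffer can be observed by watching capacity(). The sketch below uses a hypothetical helper, `count_regrowths`; the exact count depends on the standard library in use, so no specific number is claimed. In common implementations the capacity grows geometrically, so regrowths are rare compared with the number of appends.

```cpp
#include <cstddef>
#include <string>

// Appends n characters with += and counts how many times capacity() changed,
// that is, how often the string had to move to a larger buffer.
std::size_t count_regrowths(std::size_t n) {
    std::string s;
    std::size_t regrowths = 0;
    std::size_t last_capacity = s.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        s += 'x';
        if (s.capacity() != last_capacity) {
            ++regrowths;
            last_capacity = s.capacity();
        }
    }
    return regrowths;
}
```

Calling `count_regrowths(100000)` and printing the result gives a quick feel for how much of the cost is allocation and how much is the per-append capacity check.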

std::string a(2,' ');
a[0] = 'n';
a[1] = 'o';

Set the size of the string in the constructor, or use the reserve or resize methods; the choice is yours.
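Applied to the conv function from the question, that advice might look like the sketch below. `conv_reuse` is a hypothetical name; the assumptions are that the caller creates the output string once and reserves space up front, and that clear() keeps the existing capacity, which holds in common implementations but is not spelled out as a guarantee. Passing the alphabet by const reference also avoids copying it on every call.

```cpp
#include <cassert>
#include <string>

typedef unsigned long long ull;

// Variant of conv that reuses the caller's buffer between calls.
// As in the question, the digits come out least significant first.
void conv_reuse(ull num, const std::string& alpha, std::string& word) {
    assert(!alpha.empty());
    const ull base = alpha.size();
    word.clear();                          // drops the text, not the buffer
    while (num) {
        word += alpha[(num - 1) % base];   // fits in the reserved space
        num = (num - 1) / base;
    }
}
```

A caller would construct the output string once, call reserve(64) on it, and pass the same object on every iteration of the loop.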

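Your question mixes two different things. One is a raw representation of bytes that can be interpreted as a string, with no semantics and no checks. The other is a string abstraction that performs checks. Believe me, safety and avoiding segfaults matter more than 2 ms, because a segfault can lead to code injection and privilege escalation.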

As the std::string documentation shows,

basic_string& operator+=(charT c)

is equivalent to calling push_back(c) on that string, so

string a;
a+='n';
a+='o';

is equivalent to:

string a;
a.push_back('n');
a.push_back('o');

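push_back does do more work than a raw pointer operation, so it is slower. For example, it takes care of the string class's automatic memory management.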
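The extra work can be pictured with a toy class. `TinyString` below is purely hypothetical and is not how any real std::string is implemented (real implementations typically add a small-string optimization, allocators, and more); it only shows the steps that surround the single store a raw pointer would perform.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

// Toy illustration only: a minimal growable character buffer.
class TinyString {
public:
    void push_back(char c) {
        if (size_ + 1 >= cap_) grow();  // extra step: capacity check
        data_[size_++] = c;             // the store a raw pointer also does
        data_[size_] = '\0';            // extra step: keep the terminator
    }
    const char* c_str() const { return data_ ? data_.get() : ""; }
    std::size_t size() const { return size_; }

private:
    // Allocates a buffer twice as large and copies the old text over.
    void grow() {
        const std::size_t new_cap = cap_ ? cap_ * 2 : 16;
        std::unique_ptr<char[]> bigger(new char[new_cap]);
        if (size_ != 0) std::memcpy(bigger.get(), data_.get(), size_);
        bigger[size_] = '\0';
        data_ = std::move(bigger);
        cap_ = new_cap;
    }

    std::unique_ptr<char[]> data_;
    std::size_t size_ = 0;
    std::size_t cap_ = 0;
};
```

Most calls only pay for the capacity check and the terminator; the allocation and copy in grow() run only when the buffer is full.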