How to (un)escape strings in C/C++?

Keywords: unescape, escape, escape characters, string, C++      Updated: 2023-10-16

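Given a counted string (either an array of chars, or a wrapper like std::string), is there a "proper" way in C or C++ to escape and/or unescape it, so that "special" characters (such as the null character) become C-style escapes while "normal" characters stay as they are?

Or do I have to do it by hand?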

Here is a function that handles a single character:

/*
** Does not generate hex character constants.
** Always generates triple-digit octal constants.
** Always generates escapes in preference to octal.
** Escape question mark to ensure no trigraphs are generated by repetitive use.
** Handling of 0x80..0xFF is locale-dependent (might be octal, might be literal).
*/
#include <ctype.h>
#include <stdio.h>
#include <string.h>

void chr_cstrlit(unsigned char u, char *buffer, size_t buflen)
{
    if (buflen < 2)
        *buffer = '\0';
    else if (isprint(u) && u != '\'' && u != '"' && u != '\\' && u != '?')
        sprintf(buffer, "%c", u);
    else if (buflen < 3)
        *buffer = '\0';
    else
    {
        switch (u)
        {
        case '\a':  strcpy(buffer, "\\a"); break;
        case '\b':  strcpy(buffer, "\\b"); break;
        case '\f':  strcpy(buffer, "\\f"); break;
        case '\n':  strcpy(buffer, "\\n"); break;
        case '\r':  strcpy(buffer, "\\r"); break;
        case '\t':  strcpy(buffer, "\\t"); break;
        case '\v':  strcpy(buffer, "\\v"); break;
        case '\\':  strcpy(buffer, "\\\\"); break;
        case '\'':  strcpy(buffer, "\\'"); break;
        case '"':  strcpy(buffer, "\\\""); break;
        case '?':  strcpy(buffer, "\\?"); break;
        default:
            /* Octal escape needs a backslash, three digits and a terminator. */
            if (buflen < 5)
                *buffer = '\0';
            else
                sprintf(buffer, "\\%03o", u);
            break;
        }
    }
}

And here is code that handles a null-terminated string (using the function above):

void str_cstrlit(const char *str, char *buffer, size_t buflen)
{
    unsigned char u;
    size_t len;
    while ((u = (unsigned char)*str++) != '\0')
    {
        chr_cstrlit(u, buffer, buflen);
        /* An empty result means the buffer ran out of space. */
        if ((len = strlen(buffer)) == 0)
            return;
        buffer += len;
        buflen -= len;
    }
    *buffer = '\0';
}
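
Note that str_cstrlit stops at the first null byte, so it cannot escape the embedded nulls the question mentions. A minimal C++ sketch of a counted variant follows. It is not part of the answer above: the name escape_counted is hypothetical, and the sketch assumes the default "C" locale, where bytes 0x80..0xFF are not printable and therefore come out as octal escapes.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: escape `len` bytes starting at `data`.
// Embedded '\0' bytes are escaped like any other non-printable byte.
std::string escape_counted(const char *data, std::size_t len)
{
    std::string out;
    out.reserve(len);
    for (std::size_t i = 0; i < len; ++i)
    {
        const unsigned char u = static_cast<unsigned char>(data[i]);
        switch (u)
        {
        case '\a': out += "\\a"; break;
        case '\b': out += "\\b"; break;
        case '\f': out += "\\f"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\t': out += "\\t"; break;
        case '\v': out += "\\v"; break;
        case '\\': out += "\\\\"; break;
        case '\'': out += "\\'"; break;
        case '"':  out += "\\\""; break;
        case '?':  out += "\\?"; break;
        default:
            if (std::isprint(u))
                out += static_cast<char>(u);
            else
            {
                // Backslash, three octal digits, terminator.
                char octal[5];
                std::snprintf(octal, sizeof octal, "\\%03o",
                              static_cast<unsigned>(u));
                out += octal;
            }
            break;
        }
    }
    return out;
}
```

For example, escape_counted("a\0b\n", 4) returns the eight characters a\000b\n. Always emitting three octal digits keeps the result unambiguous when a digit follows the escaped byte.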

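Rather than allocating a new buffer to hold the escaped string, I prefer to escape the string while writing it to a stream.

The following struct keeps the code concise and readable.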

#include <iostream>

struct Escaped
{
    const char* str;
    friend inline std::ostream& operator<<(std::ostream& os, const Escaped& e)
    {
        for (const char* char_p = e.str; *char_p != '\0'; char_p++)
        {
            switch (*char_p)
            {
                case '\a':  os << "\\a"; break;
                case '\b':  os << "\\b"; break;
                case '\f':  os << "\\f"; break;
                case '\n':  os << "\\n"; break;
                case '\r':  os << "\\r"; break;
                case '\t':  os << "\\t"; break;
                case '\v':  os << "\\v"; break;
                case '\\':  os << "\\\\"; break;
                case '\'':  os << "\\'"; break;
                case '"':  os << "\\\""; break;
                case '?':  os << "\\?"; break;
                default: os << *char_p;
            }
        }
        return os;
    }
};

int main()
{
    std::cout << Escaped{ "foo\n\tbar" } << std::endl;
}

This prints:

foo\n\tbar

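/* Convert the two-character sequences \\ and \n into the characters they denote, in place. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define STRING "hello\\nworld\n"

int main(void)
{
    char *s = malloc(strlen(STRING) + 1);
    if (s == NULL)
        return 1;
    strcpy(s, STRING);

    char *dst = s;                      /* write position, never ahead of src */
    for (const char *src = s; *src != '\0'; ++src)
    {
        if (*src == '\\' && src[1] != '\0')
        {
            ++src;                      /* character after the backslash */
            switch (*src)
            {
            case '\\': *dst++ = '\\'; break;
            case 'n':  *dst++ = '\n'; break;
            default:                    /* unknown escape: keep it unchanged */
                *dst++ = '\\';
                *dst++ = *src;
                break;
            }
        }
        else
            *dst++ = *src;              /* ordinary character or lone trailing backslash */
    }
    *dst = '\0';

    printf("%s", s);
    free(s);
    return 0;
}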
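
The snippet above recognises only \\ and \n and works on a null-terminated buffer. As a rough sketch of how the same idea might extend to the full set of simple escapes plus octal escapes, producing a counted std::string that may contain null bytes, consider the following. The name unescape_counted is hypothetical, hex escapes (\x..) are deliberately left out, and keeping unknown escapes unchanged is an assumption rather than anything the thread prescribes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode C-style escapes in `in`.
// The result is a counted string and may contain '\0' bytes.
std::string unescape_counted(const std::string &in)
{
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        if (in[i] != '\\' || i + 1 == in.size())
        {
            out += in[i];   // ordinary character, or a lone trailing backslash
            continue;
        }
        const char e = in[++i];   // character after the backslash
        switch (e)
        {
        case 'a': out += '\a'; break;
        case 'b': out += '\b'; break;
        case 'f': out += '\f'; break;
        case 'n': out += '\n'; break;
        case 'r': out += '\r'; break;
        case 't': out += '\t'; break;
        case 'v': out += '\v'; break;
        case '\\': case '\'': case '"': case '?':
            out += e;
            break;
        default:
            if (e >= '0' && e <= '7')
            {
                // Octal escape: one to three digits in total.
                unsigned value = static_cast<unsigned>(e - '0');
                for (int n = 0; n < 2 && i + 1 < in.size()
                                && in[i + 1] >= '0' && in[i + 1] <= '7'; ++n)
                    value = value * 8 + static_cast<unsigned>(in[++i] - '0');
                // Values above 0377 are truncated to a single byte here.
                out += static_cast<char>(static_cast<unsigned char>(value));
            }
            else
            {
                out += '\\';   // unknown escape: keep both characters
                out += e;
            }
            break;
        }
    }
    return out;
}
```

Building a separate output string avoids the overlapping copies that in-place shifting requires, at the cost of one allocation.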