Count space required for RLE-compressed string

Updated: 2023-10-16

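I'm working through an interview workbook and am on this problem: implement a function that compresses a string using the counts of repeated characters (aabcccccaaa would become a2b1c5a3). If the "compressed" string is not smaller than the original, return the original string.

One solution I found first computes the size of the compressed string. I tried tracing through this helper function, but I don't understand the logic of this line. Any help deciphering it would be great:

size = size + 1 + std::to_string(count).length();

count is an int holding the number of repeated characters. In C++11, to_string lets you append an int to a string, so A + 1 would be A1.

The better solution: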

#include <string>

int countCompression(const std::string& str)
{
    if (str.length() == 0)
        return 0;
    char prev = str[0];
    int size = 0;
    int count = 1;
    for (std::string::size_type i = 1; i < str.size(); i++)
    {
        // Repeated char
        if (str[i] == prev)
            count++;
        // Unique char
        else
        {
            prev = str[i];
            size = size + 1 + std::to_string(count).length();
            count = 1;
        }
        //cout << "The size in iteration " << i << " is now " << size << endl;
    }
    size = size + 1 + std::to_string(count).length();
    //cout << "The final size is: " << size << endl;
    return size;
}

By the way, I wrote this solution myself, but it only checks the size afterwards. That wastes space, I think:

#include <iostream>
#include <string>
std::string compress(const std::string &str)
{
    std::string comp;
    char prev = str[0];
    int count = 1;
    for (int i = 1; i < str.size(); i++)
    {
        if (prev == str[i])
            count++;
        else
        {
            comp += prev;
            comp += std::to_string(count);
            prev = str[i];
            count = 1;
        }
    }
    comp += prev;
    comp += std::to_string(count);
    // Return the compressed form only if it is actually shorter.
    if (comp.size() < str.size())
        return comp;
    else
        return str;
}

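size = size + 1 + std::to_string(count).length();

This line adds to size 1 for the repeated character itself (a, b, c, etc.) plus the length() of the string representation of the number count, which says how many times that character repeats.

So:

For aa, 2 is added to size:

size = size + 1 + to_string(2).length()

For b, 2 is added to size:

size = size + 1 + to_string(1).length()

For ccccc, 2 is added to size:

size = size + 1 + to_string(5).length()

For aaa, 2 is added to size:

size = size + 1 + to_string(3).length()

So the final compressed size is 8.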
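The walk-through above can be reproduced with a small sketch. The helper names runSizes and totalSize are hypothetical (they do not appear in the question); runSizes returns, for each run, how many characters that run contributes to the compressed size.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: for each run of equal characters in str, record
// 1 (the character itself) + the number of digits in the run length.
std::vector<std::size_t> runSizes(const std::string& str)
{
    std::vector<std::size_t> sizes;
    if (str.empty())
        return sizes;

    char prev = str[0];
    int count = 1;
    for (std::size_t i = 1; i < str.size(); ++i)
    {
        if (str[i] == prev)
        {
            ++count;
        }
        else
        {
            sizes.push_back(1 + std::to_string(count).length());
            prev = str[i];
            count = 1;
        }
    }
    // The last run is still pending when the loop ends.
    sizes.push_back(1 + std::to_string(count).length());
    return sizes;
}

// Sum of all per-run contributions, i.e. the compressed size.
std::size_t totalSize(const std::string& str)
{
    std::size_t total = 0;
    for (std::size_t s : runSizes(str))
        total += s;
    return total;
}
```

For aabcccccaaa, runSizes should yield four entries of 2 each, and totalSize should return 8.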

That said, try this implementation instead:

#include <sstream>
#include <string>

std::string compress(const std::string& str)
{
    std::string::size_type len = str.length();
    if (len > 1) // compressed size is 2 chars minimum
    {
        std::ostringstream compressed;
        char ch, prev = str[0];
        int count = 1;
        for (std::string::size_type i = 1; i < len; ++i)
        {
            ch = str[i];
            if (ch == prev) {
                // Repeated char
                ++count;
            }
            else {
                // Unique char
                compressed << prev << count;
                prev = ch;
                count = 1;
            }
        }
        // output the final pending char count
        compressed << prev << count;
        std::string result = compressed.str();
        if (result.length() < len)
            return result;
    }
    return str;
}

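Alternatively, compute the compressed size first and build the string only when it is actually shorter: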

std::string::size_type compressSize(const std::string& str)
{
    std::string::size_type len = str.length();
    if (len > 1) // compressed size is 2 chars minimum
    {
        std::string::size_type size = 0;
        char ch, prev = str[0];
        int count = 1;
        for (std::string::size_type i = 1; i < len; ++i)
        {
            ch = str[i];
            if (ch == prev) {
                // Repeated char
                ++count;
            }
            else {
                // Unique char
                size += (1 + std::to_string(count).length());
                prev = ch;
                count = 1;
            }
        }
        // output the final pending char count
        size += (1 + std::to_string(count).length());
        if (size < len)
            return size;
    }
    return len;
}
std::string compress(const std::string& str)
{
    std::string::size_type len = str.length();
    if (compressSize(str) >= len)
        return str;
    std::ostringstream compressed;
    char ch, prev = str[0];
    int count = 1;
    for (std::string::size_type i = 1; i < len; ++i)
    {
        ch = str[i];
        if (ch == prev) {
            // Repeated char
            ++count;
        }
        else {
            // Unique char
            compressed << prev << count;
            prev = ch;
            count = 1;
        }
    }
    // output the final pending char count
    compressed << prev << count;
    return compressed.str();
}

Or:

bool canCompress(const std::string& str)
{
    std::string::size_type len = str.length();
    if (len <= 1) // compressed size is 2 chars minimum
        return false;
    std::string::size_type size = 0;
    char ch, prev = str[0];
    int count = 1;
    for (std::string::size_type i = 1; i < len; ++i)
    {
        ch = str[i];
        if (ch == prev) {
            // Repeated char
            ++count;
        }
        else {
            // Unique char
            size += (1 + std::to_string(count).length());
            if (size >= len) {
                return false;
            }
            prev = ch;
            count = 1;
        }
    }
    // output the final pending char count
    size += (1 + std::to_string(count).length());
    return (size < len);
}
std::string compress(const std::string& str)
{
    if (!canCompress(str))
        return str;
    std::ostringstream compressed;
    char ch, prev = str[0];
    int count = 1;
    std::string::size_type len = str.length();
    for (std::string::size_type i = 1; i < len; ++i)
    {
        ch = str[i];
        if (ch == prev) {
            // Repeated char
            ++count;
        }
        else {
            // Unique char
            compressed << prev << count;
            prev = ch;
            count = 1;
        }
    }
    // output the final pending char count
    compressed << prev << count;
    return compressed.str();
}

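This means assigning to size the result of adding up the following:

  • size
  • 1
  • the length of the string representation of count

That is, the length of the compressed representation so far (size) plus the length needed to represent the current run (1 for the character, plus the number of digits in the count).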
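As a side sketch, the "number of digits in the count" term can also be computed without building a temporary string. This swaps std::to_string for a plain division loop, which is not what the quoted solution does. The names digitCount and runLength are hypothetical, and the code assumes count is positive.

```cpp
#include <cstddef>

// Hypothetical helper: number of decimal digits in a positive count.
// Assumes count >= 1 (a run always has at least one character).
std::size_t digitCount(int count)
{
    std::size_t digits = 1;
    while (count >= 10)
    {
        count /= 10;
        ++digits;
    }
    return digits;
}

// Characters one run adds to the compressed size: the character itself
// plus the digits of its repeat count.
std::size_t runLength(int count)
{
    return 1 + digitCount(count);
}
```

With these, the line in question would read size = size + runLength(count); and should give the same result as the std::to_string version for any positive count.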

size = size + 1 + std::to_string(count).length();

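For each tuple like aaaa => a4, count holds 4 in this line of code.
In the target string you need to place the character a itself and the number, i.e. 1 plus the length of the written-out number count (1 for "4", 2 for "41", etc.).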
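To make the "character plus written-out number" idea concrete, here is a minimal sketch that encodes a single run. The function name encodeRun is hypothetical and is not part of the code in the question.

```cpp
#include <string>

// Hypothetical helper: encode one run as the character followed by the
// decimal representation of its repeat count.
std::string encodeRun(char ch, int count)
{
    return ch + std::to_string(count);
}
```

Under this sketch, encodeRun('a', 4) gives a4 (length 2) and encodeRun('a', 41) gives a41 (length 3), which matches 1 + std::to_string(count).length() in both cases.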

By the way, your original function has a problem with the empty string (char prev = str[0];).
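One possible way to handle that remark is an explicit guard before str[0] is read. The sketch below is a variant of the asker's function under a hypothetical name, compressSafe. The "strictly shorter" comparison at the end is an assumption taken from the problem statement, not from the asker's code.

```cpp
#include <cstddef>
#include <string>

// Sketch of the asker's compress() with an explicit empty-input guard.
std::string compressSafe(const std::string& str)
{
    if (str.empty())
        return str; // no first character to start a run with

    std::string comp;
    char prev = str[0];
    int count = 1;
    for (std::size_t i = 1; i < str.size(); ++i)
    {
        if (str[i] == prev)
        {
            ++count;
        }
        else
        {
            // Run ended: emit the character and its count.
            comp += prev;
            comp += std::to_string(count);
            prev = str[i];
            count = 1;
        }
    }
    // Emit the final pending run.
    comp += prev;
    comp += std::to_string(count);

    // Assumption: keep the compressed form only when strictly shorter.
    return comp.size() < str.size() ? comp : str;
}
```

With this guard, an empty input is returned unchanged, aabcccccaaa becomes a2b1c5a3, and abc stays abc because a1b1c1 would be longer.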