Ask for text to edit, text formatting

Keywords: text, formatting, text editing      Updated: 2023-10-16

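I want to write a program that asks for text (a paragraph containing several words) separated by commas, then transforms the text by wrapping each word in tags, for example to format the text as HTML.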

Example: word1, word2, word3 becomes <a> word1 </a>, <a> word2 </a>, <a> word3 </a>

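So I started with the code below, but I don't know how to continue. How do I test the text to find the start of each word? I was thinking of testing against ASCII values. Maybe with a table to test each case?

I'm not necessarily asking for a complete answer, but a direction to follow would help.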

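#include <cstdlib>  // For system()
#include <iostream>
#include <string>   // For getline()
using namespace std;

// Collects the user's text from standard input
class GetText
{
public:
    string text;
    string line; // Using this as a buffer

    // Reads lines until an empty line (or end of input) and appends them to text
    void userText()
    {
        cout << "Please type a message: ";
        while (getline(cin, line) && !line.empty())
        {
            if (!text.empty())
                text += ' '; // Keep words on adjacent lines separated
            text += line;
        }
    }

    // Prints the collected text
    void to_string()
    {
        cout << "\n" << "User's Text: " << "\n" << text << endl;
    }
};

int main() {
    GetText test;
    test.userText();
    test.to_string();
    system("pause"); // Windows-only: keeps the console window open
    return 0;
}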
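What you need to do next is split your input on a delimiter (in your case ',') into a vector, then put everything back together with prefixes and suffixes. C++ has no built-in split function, so you will have to get creative or search for an existing solution.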
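The split-then-wrap approach described above could be sketched roughly as follows. This is only one possible way to do it, using std::getline on a std::istringstream as the splitter. The helper names trim, split and wrap_items are made up for this illustration, and trimming the whitespace around each item is an assumption about the desired output.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: remove leading and trailing whitespace from one item.
std::string trim(const std::string &s) {
    const char *whitespace = " \t\r\n";
    std::size_t first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) return "";  // item was all whitespace
    std::size_t last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: split text on a single-character delimiter.
std::vector<std::string> split(const std::string &text, char delimiter) {
    std::vector<std::string> items;
    std::istringstream stream(text);
    std::string item;
    // std::getline with a third argument reads up to the next delimiter.
    while (std::getline(stream, item, delimiter)) {
        items.push_back(trim(item));
    }
    return items;
}

// Hypothetical helper: wrap every item in prefix/suffix and join with ", ".
std::string wrap_items(const std::vector<std::string> &items,
                       const std::string &prefix,
                       const std::string &suffix) {
    std::string result;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i > 0) result += ", ";
        result += prefix + items[i] + suffix;
    }
    return result;
}
```

With these helpers, wrap_items(split("word1, word2, word3", ','), "<a> ", " </a>") should produce the string from the question's example. Keeping the split step separate from the wrap step means the delimiter and the tags can be changed independently.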

If you want to keep it very simple, you can detect word boundaries by examining two characters at a time. Here is a working example.

#include <cctype>
#include <iostream>
#include <string>
using namespace std;

typedef enum boundary_type_e {
    E_BOUNDARY_TYPE_ERROR = -1,
    E_BOUNDARY_TYPE_NONE,
    E_BOUNDARY_TYPE_LEFT,
    E_BOUNDARY_TYPE_RIGHT,
} boundary_type_t;

typedef struct boundary_s {
    boundary_type_t type;
    int pos; // index at which a tag should be inserted
} boundary_t;

// a word character is any printable ASCII character except whitespace and ','
bool is_word_char(int c) {
    return ' ' <= c && c <= '~' && !isspace(c) && c != ',';
}

// reports whether a word starts or ends at pos, and where the tag belongs
boundary_t maybe_word_boundary(const string &str, int pos) {
    int len = static_cast<int>(str.length());
    if (pos < 0 || pos >= len) {
        return {E_BOUNDARY_TYPE_ERROR, pos};
    } else {
        if (pos == 0 && is_word_char(str[pos])) {
            // if the first character is word-y, we have a left boundary at the beginning
            return {E_BOUNDARY_TYPE_LEFT, pos};
        } else if (pos == len - 1 && is_word_char(str[pos])) {
            // if the last character is word-y, we have a right boundary left of the null terminator
            return {E_BOUNDARY_TYPE_RIGHT, pos + 1};
        } else if (!is_word_char(str[pos]) && is_word_char(str[pos + 1])) {
            // if we have a delimiter followed by a word char, we have a left boundary left of the word char
            return {E_BOUNDARY_TYPE_LEFT, pos + 1};
        } else if (is_word_char(str[pos]) && !is_word_char(str[pos + 1])) {
            // if we have a word char followed by a delimiter, we have a right boundary right of the word char
            return {E_BOUNDARY_TYPE_RIGHT, pos + 1};
        }
        return {E_BOUNDARY_TYPE_NONE, pos};
    }
}

int main() {
    string str;
    string ins_left("<tag>");
    string ins_right("</tag>");
    getline(cin, str);
    // the string grows as tags are inserted, so test for the terminator
    // instead of caching the length
    for (int i = 0; str[i] != '\0'; i++) {
        boundary_t boundary = maybe_word_boundary(str, i);
        if (boundary.type == E_BOUNDARY_TYPE_LEFT) {
            str.insert(boundary.pos, ins_left);
            // resume on the last character of the inserted tag, so the next
            // iteration examines the first character of the word
            i = boundary.pos + static_cast<int>(ins_left.length()) - 1;
        } else if (boundary.type == E_BOUNDARY_TYPE_RIGHT) {
            str.insert(boundary.pos, ins_right);
            i = boundary.pos + static_cast<int>(ins_right.length()) - 1;
        }
    }
    cout << str << endl;
    return 0;
}

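It would be better to use enum class, but I forgot the notation. You could also copy into a buffer instead of modifying the string in place; I just wanted to keep it simple. Feel free to extend this into class-based C++ style. To get the exact desired output, strip the whitespace first, then add spaces to ins_left and ins_right.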
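For reference, here is a rough sketch of the two variations mentioned above: enum class notation, and copying into a new string instead of inserting in place. The names Boundary, word_char, classify and tag_words are invented for this sketch and are not part of the code above. It also assumes that the spaces belong inside the tags and that every whitespace-separated word is tagged on its own.

```cpp
#include <string>

// Scoped enumeration: values are written Boundary::Left and so on,
// and they do not convert implicitly to int.
enum class Boundary { None, Left, Right };

// Printable, non-space ASCII other than the comma counts as a word character.
bool word_char(char c) {
    return '!' <= c && c <= '~' && c != ',';
}

// Classify the transition between two adjacent characters.
Boundary classify(char previous, char current) {
    if (!word_char(previous) && word_char(current)) return Boundary::Left;
    if (word_char(previous) && !word_char(current)) return Boundary::Right;
    return Boundary::None;
}

// Copy input into a fresh string, adding tags wherever a word starts or ends.
std::string tag_words(const std::string &input,
                      const std::string &open,
                      const std::string &close) {
    std::string output;
    char previous = ' ';  // treat the start of the input as a delimiter
    for (char current : input) {
        Boundary boundary = classify(previous, current);
        if (boundary == Boundary::Left) {
            output += open;
        } else if (boundary == Boundary::Right) {
            output += close;
        }
        output += current;
        previous = current;
    }
    if (word_char(previous)) {
        output += close;  // close a word that runs to the end of the input
    }
    return output;
}
```

Calling tag_words("word1, word2, word3", "<a> ", " </a>") should give the output from the question. Because the result is built in a separate string, no index bookkeeping is needed after each inserted tag.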