How do I get words from a file line by line, separated by a semicolon, in C++?

Updated: 2023-10-16

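I have the code below. I'm trying to build a German-to-English dictionary. All the words are in a text file (about 100 lines), separated by a semicolon: the first part of each line is the German word, and the part after the semicolon is the English translation (for example "Hund;dog"). How do I get the first word and store it in a variable, ignore the semicolon, and then store the second word in a separate variable?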

ifstream myfile("tiere_animals.txt");
if (myfile.is_open())
{
    Entry Animal[661];
    while (getline(myfile, line, ';'))
    {
        line2.push_back(';');
        line2.clear();
        line3.append();
    }
    myfile.close();
}

How do I get the first word and store it in a variable, ignore the semicolon, and then store the second word in a separate variable?

Just read up to the ';', then read up to the newline.

std::string english, german;
// The first getline stops at ';' and discards it; the second stops at the end of the line.
while (std::getline(myfile, german, ';') && std::getline(myfile, english, '\n')) {
    std::cout << german << " in english is: " << english << "\n";
}

You can use std::regex:


#include <fstream>
#include <iterator>
#include <regex>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using German = std::string;
using English = std::string;

std::vector<std::pair<German, English>> ParseFile(const std::string& filename)
{
    std::ifstream f{filename};
    if (!f.is_open())
        throw std::runtime_error("failed to open the file");
    // Since you said there are no more than about 100 lines, you can read the whole file at once.
    std::string fileContent{std::istreambuf_iterator<char>(f),
                            std::istreambuf_iterator<char>()};
    // Assumes each line looks like "German;English" (blanks after the ';' are allowed).
    // Note: with the default "C" locale, \w matches only ASCII letters, digits and '_'.
    std::regex pat{R"((\w+);[ \t]*(\w+))"};
    std::sregex_iterator start{fileContent.cbegin(), fileContent.cend(), pat}, end{};
    std::vector<std::pair<German, English>> out;
    while (start != end)
    {
        const std::smatch& sm = *start;
        // sm[0] is the whole match; sm[1] and sm[2] are the two captured words.
        out.emplace_back(sm[1].str(), sm[2].str());
        ++start;
    }
    return out;
}

But the file has to be formatted correctly.
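If the file cannot be trusted to be well formed, one option is to match each line separately and skip the ones that do not fit. The following is only a sketch: the `parseLines` name and the exact pattern (one word, a ';', one word, with optional surrounding whitespace) are assumptions about the layout, not something stated in the question.

```cpp
#include <istream>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Parses one line at a time; lines that do not look like "word;word" are skipped.
std::vector<std::pair<std::string, std::string>> parseLines(std::istream& in) {
    // Assumed layout: German word, ';', English word, optional whitespace around each.
    static const std::regex pattern(R"(\s*(\w+)\s*;\s*(\w+)\s*)");
    std::vector<std::pair<std::string, std::string>> out;
    std::string line;
    std::smatch m;
    while (std::getline(in, line)) {
        // regex_match requires the whole line to fit the pattern.
        if (std::regex_match(line, m, pattern)) {
            out.emplace_back(m[1].str(), m[2].str());
        }
    }
    return out;
}
```

Because `std::regex_match` has to consume the entire line, a malformed line cannot spill into the next one, which is the main difference from iterating over the whole file content at once.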

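For those who dislike the standard library's regex: in this particular case there are no explicit constraints on memory usage or run time. Besides, std::regex adds extensibility, because you only need to change the pattern rather than the code. That makes it easy to apply the same parsing algorithm to files with other layouts: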


// Can be used as a functor.
class Parser
{
    std::regex pattern_;
public:
    explicit Parser(std::regex pattern)
        : pattern_(std::move(pattern))
    {}

    // Same body as ParseFile above, but matching with pattern_.
    std::vector<std::pair<German, English>> Parse(const std::string& filepath) const;

    // A call operator (not a conversion operator) is what makes the class a functor.
    std::vector<std::pair<German, English>> operator()(const std::string& filepath) const
    {
        return Parse(filepath);
    }
};
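To illustrate the extensibility argument, here is a self-contained variant that parses text already held in memory instead of a file path. The `PairParser` name is hypothetical, and the sketch assumes that whatever pattern is supplied exposes the two words as capture groups 1 and 2.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical functor: the caller decides the layout through the pattern,
// as long as the two words are capture groups 1 and 2.
class PairParser {
    std::regex pattern_;
public:
    explicit PairParser(std::regex pattern) : pattern_(std::move(pattern)) {}

    std::vector<std::pair<std::string, std::string>>
    operator()(const std::string& text) const {
        std::vector<std::pair<std::string, std::string>> out;
        std::sregex_iterator it{text.cbegin(), text.cend(), pattern_};
        const std::sregex_iterator end{};
        for (; it != end; ++it) {
            out.emplace_back((*it)[1].str(), (*it)[2].str());
        }
        return out;
    }
};
```

Switching from a semicolon-separated to a tab-separated layout then only means constructing the parser with `std::regex(R"((\w+)\t(\w+))")` instead of `std::regex(R"((\w+);(\w+))")`; the parsing code stays untouched.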

You can read the whole line first and then split it into its words:

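// Needs <fstream>, <sstream> and <string>, plus using namespace std (or std:: prefixes).
ifstream myfile("tiere_animals.txt");
if (myfile.is_open()) {
    string line, german, english;
    while (getline(myfile, line)) {
        istringstream iss(line);
        getline(iss, german, ';');      // everything before the ';'
        getline(iss >> ws, english);    // skip leading whitespace, then take the rest of the line
        // ...
    }
    myfile.close();
}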
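Since the goal is a dictionary, the line-then-split approach can feed a lookup table directly. This is a sketch only: `loadDictionary` is an illustrative name, and using `std::map` is an assumption about how the entries will be looked up, not something the question specifies.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Builds a German -> English lookup table from "german;english" lines.
std::map<std::string, std::string> loadDictionary(std::istream& in) {
    std::map<std::string, std::string> dict;
    std::string line, german, english;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        // Both reads must succeed; lines without a ';' or without a
        // translation after it are skipped.
        if (std::getline(iss, german, ';') && std::getline(iss >> std::ws, english)) {
            dict[german] = english;
        }
    }
    return dict;
}
```

After `auto dict = loadDictionary(myfile);`, a translation can be fetched with `dict.find("Hund")`. Because each line is parsed on its own, one malformed line does not affect the lines after it.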