Building a C++ translator with a given dictionary?

Keywords: building, C++, translator, dictionary. Last updated: 2023-10-16

I am trying to build a simple translator that translates sentences according to a given dictionary. Suppose we have two arrays of words:

string ENG[] = {"black", "coffee", "want", "yesterday"};
string SPA[] = {"negro", "café", "quiero", "ayer"};

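If the user enters "I want black coffee", the result should be "I? quiero negro café". That is, any word that has no translation in the dictionary should be followed by a question mark.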

#include <iostream>
#include <string>
using namespace std;
int main(int argc, char *argv[]) {
    string input;
    string ENG[] = {"black", "coffee", "want", "yesterday"};
    string SPA[] = {"negro", "café", "quiero", "ayer"};
    cout << "Enter a word";
    cin >> input;
    for (int i = 0; i < 4; ++i) { // the arrays hold 4 entries each
        if (ENG[i] == input) {
            cout << "You entered " << SPA[i] << endl;
        }
    }
    return 0;
}

What I have written only works for single words. How can I change this code so that it handles whole sentences?

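As suggested in the comments, two separate arrays are cumbersome to use and hard to update. Imagine inserting a new pair of values in the middle and messing up the offsets...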

So the better solution here is to use std::map, especially since this is meant to be a simple 1:1 mapping.

You can define a std::map with std::string as its key (the original word) and std::string as its value (the translation).

With modern C++, the initialization could look like this:

std::map<std::string, std::string> translations {
    {"black", "negro"},
    {"coffee", "café"},
    // ...
};

Now, to read the input string word by word, the quickest built-in way is to use std::istringstream:

std::istringstream stream(myInputText);
std::string word;
while (stream >> word) {
    // do something with each word
}

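Looking up the actual translation becomes trivial as well. The search through the translations happens behind the scenes (inside the std::map class):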

const auto res = translations.find(word); // iterator to the entry, or end() if absent
if (res == translations.end()) // nothing found
    std::cout << "? ";
else
    std::cout << res->second << " "; // `res->second` is the value, `res->first` would be the key, i.e. `word`

Here is a small complete example:

#include <iostream>
#include <string>
#include <sstream>
#include <map>
int main(int argc, char **argv) {
    std::map<std::string, std::string> translations {
        {"black", "negro"},
        {"coffee", "café"}
    };
    std::string source("I'd like some black coffee");
    std::istringstream stream(source);
    std::string word;
    while (stream >> word) {
        const auto t = translations.find(word);
        if (t != translations.end()) // found
            std::cout << word << ": " << t->second << "\n";
        else
            std::cout << word << ": ???\n";
    }
    return 0;
}

This particular example produces the following output:

I'd: ???
like: ???
some: ???
black: negro
coffee: café
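
The example above prints one word per line. To get the format the question asks for (untranslated words kept and marked with a question mark), the same std::map lookup could be wrapped in a helper along these lines. This is only a sketch: translate_sentence is a hypothetical name that does not appear in the answer, and punctuation attached to a word is not handled.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the original answer): translates a
// sentence word by word. Words missing from the dictionary are kept as-is
// and followed by '?'.
std::string translate_sentence(const std::map<std::string, std::string>& dict,
                               const std::string& sentence) {
    std::istringstream stream(sentence);
    std::string word;
    std::string result;
    while (stream >> word) {            // operator>> skips runs of whitespace
        if (!result.empty())
            result += ' ';              // single space between output words
        const auto it = dict.find(word);
        if (it != dict.end())
            result += it->second;       // translation found
        else
            result += word + '?';       // no translation: mark the word
    }
    return result;
}
```

To translate a line typed by the user, one option is to read it with std::getline(std::cin, line) and pass it to translate_sentence, because cin >> input stops at the first whitespace and therefore reads only a single word.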

Here you go.

#include <iostream>
#include <string>
#include <vector>
using namespace std;
// Splits a sentence into words, treating runs of spaces as one separator.
vector<string> split_sentence(const string& arg)
{
    vector<string> ret;
    auto it = arg.begin();
    while (it != arg.end()) {
        string tmp;
        while (it != arg.end() && *it == ' ') ++it;
        while (it != arg.end() && *it != ' ')
            tmp += *it++;
        if (!tmp.empty())
            ret.push_back(tmp);
    }
    return ret;
}
int main(int argc, char *argv[])
{
    string input = "I want a black     coffee .";
    string ENG[4] = {"black", "coffee", "want", "yesterday"};
    string SPA[4] = {"negro", "café", "quiero", "ayer"};
    cout << "Enter sentence\n";
    /*
    cin >> input;
    */
    for (auto& str : split_sentence(input)) {
        bool found = false;
        for (int j = 0; j < 4 && !found; ++j) {
            if (ENG[j] == str) {
                cout << SPA[j] << " ";
                found = true;
            }
        }
        if (!found)
            cout << str << "? ";
    }
    cout << endl;
}

Output:

Enter sentence
I? quiero a? negro café .?

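Split the sentence on spaces, then look up each word in the dictionary. If your dictionary is big enough, you will need a tree-like data structure, or sorting or hashing, to speed up the lookups.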
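
As one possible illustration of the hashing option, the linear scan over the two arrays could be replaced by a std::unordered_map lookup, which takes constant time on average. This is a sketch under assumptions: Dictionary and lookup_or_mark are hypothetical names that do not appear in the answer.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical alias and helper, used only for illustration.
using Dictionary = std::unordered_map<std::string, std::string>;

// Returns the translation of `word`, or the word itself followed by '?'
// when the dictionary has no entry for it.
std::string lookup_or_mark(const Dictionary& dict, const std::string& word) {
    const auto it = dict.find(word);   // hash lookup, no scan over all entries
    return it != dict.end() ? it->second : word + "?";
}
```

In the program above, the inner loop over ENG and SPA would then reduce to a single call such as cout << lookup_or_mark(dict, str) << " ".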

Edit:

A trie will be faster for this. For each query you can get the appropriate word in O(m), where m is the length of the query (the English word).
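
A minimal sketch of such a trie might look like the following. The names Trie, insert, and find are illustrative, not from the answer. This version assumes C++14 (for std::make_unique) and stores each node's children in a std::map keyed by char, so each step also costs a small lookup bounded by the alphabet size.

```cpp
#include <map>
#include <memory>
#include <string>

// Illustrative trie mapping English words to their translations.
class Trie {
public:
    // Stores `translation` under `word`, creating nodes along the path as needed.
    void insert(const std::string& word, const std::string& translation) {
        Node* node = &root_;
        for (char c : word) {
            auto& child = node->children[c];
            if (!child)
                child = std::make_unique<Node>();
            node = child.get();
        }
        node->is_word = true;
        node->translation = translation;
    }

    // Returns a pointer to the stored translation, or nullptr if `word` is
    // not in the trie. Follows one edge per character of `word`.
    const std::string* find(const std::string& word) const {
        const Node* node = &root_;
        for (char c : word) {
            const auto it = node->children.find(c);
            if (it == node->children.end())
                return nullptr;         // path breaks off: word is absent
            node = it->second.get();
        }
        return node->is_word ? &node->translation : nullptr;
    }

private:
    struct Node {
        std::map<char, std::unique_ptr<Node>> children;
        bool is_word = false;           // true if a dictionary word ends here
        std::string translation;
    };
    Node root_;
};
```

A lookup that returns nullptr corresponds to the "no translation" case, where the word would be printed with a question mark.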