How to create an inverted index from a map to a map in C++?

Keywords: map, inverted index, C++      Updated: 2023-10-16

I am trying to create an inverted index in a map from an existing map. At the moment I have this code:

#include <fstream>
#include <iostream>
#include <map>
#include <sstream>
#include <string>

int main()
{
    typedef std::map<std::string, int> MapType;
    std::ifstream archiveInputStream("./hola");
    if (!archiveInputStream)
    {
        std::cerr << "Could not open ./hola\n";
        return 1;
    }
    // map words to their text-frequency
    MapType wordcounts;
    std::string line;
    // read the whole archive line by line; testing getline itself
    // (rather than eof()) avoids processing a failed read
    while (std::getline(archiveInputStream, line))
    {
        // split the line into whitespace-separated tokens
        std::istringstream tokens(line);
        std::string currentToken;
        while (tokens >> currentToken)
        {
            // operator[] value-initializes a missing count to 0,
            // so new and existing words are handled the same way
            ++wordcounts[currentToken];
        }
    }
    // display the content once the whole file has been read
    for (MapType::const_iterator it = wordcounts.begin();
            it != wordcounts.end(); ++it)
    {
        std::cout << "Who(key = first): " << it->first;
        std::cout << " Score(value = second): " << it->second << '\n';
    }
}

I do not know how to go about this, because I am a beginner with the map structure.

Thank you very much for your help.

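What I think might help is to create a second map that is keyed by word count and stores, for each count, the list of strings that have that count, like this (similar to a histogram):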

std::map<int, std::list<std::string> > inverted;
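To make the shape of this structure concrete, here is a minimal sketch of a lookup that answers "which words occur exactly N times?". The typedef `InvertedIndex` and the helper `wordsWithCount` are hypothetical names introduced for illustration; they are not part of the original answer.

```cpp
#include <list>
#include <map>
#include <string>

// Assumed alias for the map declared above.
typedef std::map<int, std::list<std::string> > InvertedIndex;

// Hypothetical helper: returns the words stored under `count`,
// or an empty list if no word has that count.
std::list<std::string> wordsWithCount(const InvertedIndex& inverted, int count)
{
    InvertedIndex::const_iterator it = inverted.find(count);
    if (it == inverted.end())
    {
        return std::list<std::string>();
    }
    return it->second;
}
```

Because std::map keeps its keys sorted, iterating over such a map visits the counts in ascending order, which is what makes it read like a histogram.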

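So once you have finished building the wordcounts map, you have to insert each string into the inverted index manually, like this (note that this code is untested!):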

// wordcounts to inverted index
for (std::map<std::string, int>::iterator it = wordcounts.begin();
        it != wordcounts.end(); ++it)
{
    int wordcountOfString = it->second;
    std::string currentString = it->first;
    std::map<int, std::list<std::string> >::iterator invertedIt =
            inverted.find(wordcountOfString);
    if (invertedIt == inverted.end())
    {
        // insert new list
        std::list<std::string> newList;
        newList.push_back(currentString);
        // let make_pair deduce its types; spelling out the template
        // arguments does not compile with lvalues since C++11
        inverted.insert(std::make_pair(wordcountOfString, newList));
    }
    else
    {
        // update existing list
        std::list<std::string>& existingList = invertedIt->second;
        existingList.push_back(currentString);
    }
}
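
The loop above can probably be condensed, because `operator[]` on a std::map default-constructs a missing value (here an empty list) before returning a reference to it. A minimal sketch under that assumption follows; the typedefs and the function name `buildInvertedIndex` are hypothetical and only serve the illustration.

```cpp
#include <list>
#include <map>
#include <string>

typedef std::map<std::string, int> WordCounts;
typedef std::map<int, std::list<std::string> > InvertedIndex;

// Hypothetical helper: groups the words of `wordcounts` by their count.
InvertedIndex buildInvertedIndex(const WordCounts& wordcounts)
{
    InvertedIndex inverted;
    for (WordCounts::const_iterator it = wordcounts.begin();
            it != wordcounts.end(); ++it)
    {
        // inverted[count] creates an empty list on first use,
        // so no explicit find/insert branch is needed
        inverted[it->second].push_back(it->first);
    }
    return inverted;
}
```

Since wordcounts is iterated in key order, the words inside each list end up sorted alphabetically. The explicit find/insert version does the same work and may be easier to follow for someone new to maps.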