Why does unordered_set mix up the values?

Updated: 2023-10-16

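I am trying to use an unordered_set to remove duplicates from a vector. However, my approach produces an unordered_set that does not keep the elements in order. In this example, "z" does not come last. What am I doing wrong? Thanks in advance.

Edit: Sorry if I was unclear about what I want. I want the output to be "e d a b c z": the original order kept, with the duplicates removed. I currently have this working with about three different loops and an extra copy of the initial vector. I am just looking for a cleaner STL facility, if one exists.

Output produced:

e d a b c a a a a b b b b c z
printing unordered set
e d a z b c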

#include <iostream>
#include <iterator>
#include <algorithm>
#include <string>
#include <vector>
#include <unordered_set>
using namespace std;
int main() {
    vector<string>terminals = { "e", "d", "a", "b", "c", "a", "a", "a", "a", "b","b", "b", "b", "c", "z" };
    for (vector<string>::iterator it = terminals.begin(); it != terminals.end(); it++) // print given vector
        cout << *it << " ";
    cout << endl;
    unordered_set<string> newSet;
    copy(terminals.begin(), terminals.end(), inserter(newSet, newSet.end()));
    cout << "printing unordered set" << endl;
    for (unordered_set<string>::iterator it = newSet.begin(); it != newSet.end(); it++)
        cout << *it << " ";
    cout << endl;
    //system("pause");
    return 0;
}

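From the documentation of std::unordered_set:

"Internally, the elements are not sorted in any particular order, but organized into buckets. Which bucket an element is placed into depends entirely on the hash of its value. This allows fast access to individual elements, since once a hash is computed, it refers to the exact bucket the element is placed into."

If you need the unique terminals in sorted order, use std::set: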
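
To make the bucket idea concrete, here is a minimal sketch that pairs each element with the bucket that currently holds it. The helper name bucket_layout is made up for this illustration; bucket() and bucket_count() are standard members of std::unordered_set.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper: pairs every element with the index of the bucket
// that currently holds it, in the container's own iteration order.
std::vector<std::pair<std::string, std::size_t>>
bucket_layout(const std::unordered_set<std::string>& values) {
    std::vector<std::pair<std::string, std::size_t>> layout;
    for (const std::string& value : values)
        layout.emplace_back(value, values.bucket(value));
    return layout;
}
```

Printing the result for the set from the question shows that the iteration order follows the hash table's layout, not the insertion order. The exact bucket indices and the resulting order depend on the standard library implementation, so they should not be relied upon.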

#include <iostream>
#include <vector>
#include <string>
#include <set>
int main() {
    std::vector<std::string>terminals = { "e", "d", "a", "b", "c", "a", "a", "a", "a", "b","b", "b", "b", "c", "z" };
    for(const std::string& terminal : terminals) // print given vector
        std::cout << terminal << " ";
    std::cout << "\n";
    // populate the set directly from the vector's iterators:
    std::set<std::string> newSet(terminals.begin(), terminals.end());
    std::cout << "printing the (ordered) set:" << "\n";
    for(const std::string& terminal : newSet)
        std::cout << terminal << " ";
    std::cout << "\n";
}

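If you want to maintain the original order, you cannot use either kind of set as your final storage. You can, however, use a std::unordered_set as a cache of the values you have already put into the final storage: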

#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <unordered_set>
int main() {
    std::vector<std::string>terminals = { "e", "d", "a", "b", "c", "a", "a", "a", "a", "b","b", "b", "b", "c", "z" };
    for(const std::string& terminal : terminals) // print given vector
        std::cout << terminal << " ";
    std::cout << "\n";
    std::vector<std::string> newSet; // not really a set anymore
    std::unordered_set<std::string> cache; // blacklist
    // try to insert all terminals and only when an insert is successful,
    // put the terminal in newSet
    std::for_each(terminals.begin(), terminals.end(),
        [&](const std::string& terminal) {
            // structured bindings require C++17
            auto [it, inserted] = cache.insert(terminal);
            if(inserted)
                newSet.push_back(terminal);
        }
    );
    std::cout << "printing the vector of unique terminals:" << "\n";
    for(const std::string& terminal : newSet)
        std::cout << terminal << " ";
    std::cout << "\n";
}

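If you want the original order and do not mind modifying the original terminals vector in place, you can combine std::remove_if with an unordered_set. This is nice because it does not require a new vector. Here is an annotated variant of @Marek R's answer:

Read first: the erase-remove idiom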

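#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <iterator>
#include <unordered_set>
int main() {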
    std::vector<std::string>terminals = { "e", "d", "a", "b", "c", "a", "a", "a", "a", "b","b", "b", "b", "c", "z" };
    for(const std::string& terminal : terminals) // print given vector
        std::cout << terminal << " ";
    std::cout << "\n";
    std::unordered_set<std::string> cache; // blacklist
    // remove_if() shifts every entry for which the UnaryPredicate(*) returns
    // false (the entries to keep) towards the front of the container,
    // preserving their relative order. It returns an iterator to the new
    // logical end: everything from there on is left in a valid but
    // unspecified state and is a suitable range for a subsequent erase().
    //
    // (*) UnaryPredicate: A callable that returns true or false given a single
    //                     value.
    // past_new_end has type std::vector<std::string>::iterator
    auto past_new_end = std::remove_if(terminals.begin(), terminals.end(),
        // this lambda is the UnaryPredicate
        [&](const std::string& terminal) {
            // insert returns a std::pair<Iterator, bool>
            // where the bool (.second in the pair) is false
            // if the value was not inserted (=it was already present)
            return cache.insert(terminal).second == false;
        }
    );
    std::cout << "display all the entries (now with unspecified values) "
                 "that will be erased:\n";
    std::copy(past_new_end, terminals.end(),
                            std::ostream_iterator<std::string>(std::cout, "<"));
    std::cout << "\n";
    // erase all the leftover entries
    terminals.erase(past_new_end, terminals.end());
    std::cout << "printing the unique terminals:" << "\n";
    for(const std::string& terminal : terminals)
        std::cout << terminal << " ";
    std::cout << "\n";
}
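
For readers who have not met the erase-remove idiom before, here is a minimal standalone sketch of it, using a plain value instead of a predicate. The helper name erase_all is made up for this illustration; std::remove and vector::erase are the standard calls.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: removes every element equal to `unwanted` in place.
void erase_all(std::vector<std::string>& items, const std::string& unwanted) {
    // std::remove shifts the kept elements to the front and returns the new
    // logical end; the vector's size is unchanged until erase() is called.
    auto new_end = std::remove(items.begin(), items.end(), unwanted);
    items.erase(new_end, items.end());
}
```

The two-step shape is the same as in the answer's code: the algorithm only rearranges elements, and the container's own erase() then shrinks it.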

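If you want to maintain the original order but enforce uniqueness, you probably want to:

  1. Read in an item.
  2. Try to insert it into a set.
  3. If that succeeds, it was not in the set before, so also copy it to the output.
  4. Repeat.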

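If you want the output sorted (in your example, the output would be "a b c d e z"), then you can either insert the items into a std::set, or use std::sort followed by std::unique to get exactly one of each unique element in the input.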
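
A minimal sketch of the std::sort plus std::unique route, for the case where sorted output is acceptable. The helper name sorted_unique is made up for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the distinct items in ascending order.
// The argument is taken by value so the caller's vector is left untouched.
std::vector<std::string> sorted_unique(std::vector<std::string> items) {
    std::sort(items.begin(), items.end());
    // std::unique only collapses adjacent duplicates, hence the sort first.
    items.erase(std::unique(items.begin(), items.end()), items.end());
    return items;
}
```

Note that this discards the original order, so it does not produce the "e d a b c z" output the question asks for.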

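It looks like you want to use an (ordered) set.

Edit: Actually, it does not look like it. A std::vector could work, but it may not be the cleanest solution.

You could also use an unordered map, with the items as the map's keys and their indices as the corresponding values.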
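
One possible reading of the unordered map suggestion, as a sketch: record the index of each item's first occurrence, then sort by that index to recover the input order. The function name unique_by_first_index is made up, and the answer does not spell out this exact procedure.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: keeps one copy of each item, ordered by the position
// of its first occurrence in the input.
std::vector<std::string> unique_by_first_index(const std::vector<std::string>& items) {
    std::unordered_map<std::string, std::size_t> first_index;
    for (std::size_t i = 0; i < items.size(); ++i)
        first_index.emplace(items[i], i);   // no effect if the key already exists

    // Collect (index, item) pairs and sort by index to restore input order.
    std::vector<std::pair<std::size_t, std::string>> ordered;
    ordered.reserve(first_index.size());
    for (const auto& entry : first_index)
        ordered.emplace_back(entry.second, entry.first);
    std::sort(ordered.begin(), ordered.end());

    std::vector<std::string> result;
    result.reserve(ordered.size());
    for (const auto& entry : ordered)
        result.push_back(entry.second);
    return result;
}
```

This does more work than using a set purely as a "seen" cache, since it needs an extra sort, which may be why the approach is offered only as an alternative.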

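"I am trying to use unordered_set to remove duplicates from a vector."

Why do you assume that unordered_set preserves any kind of order? The name clearly states that there is no particular order.

You should use std::unordered_set only to track which items have already been found in the sequence. Based on that, you can remove items from the sequence itself, so it should be done like this: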

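#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Assumed here: Data is not defined in the original snippet.
using Data = std::vector<std::string>;

void removeDuplicates(Data &data)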
{
    std::unordered_set<std::string> foundItems;
    auto newEnd = std::remove_if(data.begin(), data.end(), [&foundItems](const auto &s)
                                 {
                                     return !foundItems.insert(s).second;
                                 });
    data.erase(newEnd, data.end());
}

https://wandbox.org/permlink/t24ufilqep0xuqhq