Should I use a map or a set if my key is part of my value?

Updated: 2023-10-16

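In C++, I have a class that is ordered by its name, which is a std::string. I want only one instance per unique name in either a std::map or a std::set.

I could use a std::set, since operator< orders my instances by name, but I also need to look up an instance by its name. Using a map whose key is the name is straightforward. However, I could also use a set and construct a dummy instance of my class with the name I want to look up, in order to locate the actual instance with that name in the set.

I imagine I should just go with the map to keep the code straightforward, but I wonder whether there is a way to go with the set, since the key is effectively part of my object anyway, and so avoid some redundancy.

Is there a way to use the set and still locate objects by their key in a clean way, or should I just use a map and be done with it?

Here is the class to be inserted (in draft form). Each directory holds either a set or a map of Nodes, keyed by the Node's name: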

    #include <sstream>
    #include <string>

    // Draft: Directory and InvalidNodeNameError are defined elsewhere.
    class Node {
    public:
        Node(Directory &parent, const std::string &name)
            : _name(name),
              _parent(&parent),
              _isRoot(false) {
            if (name.empty()) {
                throw InvalidNodeNameError(name);
            }
        }

    protected:
        // This is only used for the root directory:
        Node()
            : _name(""),
              _parent(0),
              _isRoot(true) {
        }
        Node(const std::string &name)
            : _name(name),
              _parent(0),
              _isRoot(false) {
        }

    public:
        virtual ~Node() {
            if (parent()) {
                parent()->remove(*this);
            }
        }
        // Nodes are ordered by name only.
        bool operator<(const Node &rhs) const {
            return _name < rhs._name;
        }
        Directory *parent() const {
            return _parent;
        }
        void setParent(Directory *parent) {
            _parent = parent;
        }
        const std::string &name() const {
            return _name;
        }
        bool isRoot() const {
            return _isRoot;
        }
        std::string pathname() const {
            std::ostringstream path;
            if (parent()) {
                path << parent()->pathname() << '/';
            } else {
                path << '/';
            }
            path << name();
            return path.str();
        }

    private:
        // Not defined:
        Node(const Node &rhs);
        Node &operator=(const Node &rhs);

    private:
        std::string  _name;
        Directory   *_parent;
        const bool   _isRoot;
    };

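Actually, you can just use map<std::string&, Node>, at the cost of one extra pointer, but I think you probably knew that, and it takes some fiddling to get what you want. (A bare reference is not a dependable key type, so in practice this would likely be spelled with something like std::reference_wrapper<const std::string>.)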

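I've always thought it a real pain that std::set has no explicit KeyExtractor template parameter, particularly since every implementation I've seen uses one under the hood so as not to duplicate code between (multi)maps and (multi)sets. Here is a quick and dirty hack, nowhere near complete, which exposes some of the machinery of the GNU standard C++ library in order to create a "keyed_set" container: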

    #include <functional>  // std::less
    #include <map>         // on libstdc++ this pulls in the internal std::_Rb_tree
    #include <memory>      // std::allocator
    #include <utility>     // std::declval

    // Deriving from the tree is probably not a good idea, but it was easy.
    // std::_Rb_tree and the _M_* members are libstdc++ internals, not standard API.
    template<typename Key, typename Val, typename Extract,
             typename Compare = std::less<Key>, typename Alloc = std::allocator<Val>>
    class keyed_set : public std::_Rb_tree<Key, Val, Extract, Compare, Alloc> {
        using Base = std::_Rb_tree<Key, Val, Extract, Compare, Alloc>;
    public:
        template<typename ...Args>
        auto insert(Args... args)
            -> decltype(Base()._M_insert_unique(std::declval<Args>()...)) {
            return this->_M_insert_unique(args...);
        }
        // Hinted insert; note the trailing underscore in _M_insert_unique_.
        typename Base::iterator insert(typename Base::const_iterator i,
                                       const Val& val) {
            return this->_M_insert_unique_(i, val);
        }
        // Finds the element with this key, creating it from the key if absent.
        Val& operator[](const Key& key) {
            auto i = this->lower_bound(key);
            if (i == this->end() || this->key_comp()(key, Extract()(*i))) {
                i = this->_M_insert_unique_(i, Val(key));
            }
            return *i;
        }
    };

To make it work, you need to provide a key extractor, like this:

    template<class T>
    struct KeyExtractor;

    template<>
    struct KeyExtractor<Node> {
        // Returns the part of the value that serves as its key.
        const std::string& operator()(const Node& n) const { return n.name(); }
    };

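To make my version of operator[] work, the value type needs a constructor that takes its key type as an argument.

I left out lots of stuff (erase, for example), but it is good enough for a simple test.

It would probably have been better to default the key type from the return type of the KeyExtractor, but that would have meant putting the template parameters in a different order, and I had already wasted too much time not noticing that _M_insert_unique and _M_insert_unique_ are spelled differently (presumably to avoid template instantiation problems).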

Here is the example I used to check that it works. MyKeyedClass has a name, and each instance carries a vector of strings and a double. (There is no deeper purpose.)

    #include <algorithm>  // std::for_each
    #include <iostream>
    #include <stdexcept>  // std::invalid_argument, std::out_of_range
    #include <string>     // std::stod

    int main() {
        keyed_set<std::string, MyKeyedClass, KeyExtractor<MyKeyedClass>> repo;
        for (std::string name, val; std::cin >> name >> val; ) {
            try {
                std::size_t end;
                double d = std::stod(val, &end);
                if (end != val.size())
                    throw std::invalid_argument("trailing letters");
                repo[name].increment(d);
            } catch (const std::invalid_argument &) {
                // Not a number: keep the raw string instead.
                repo[name].push(val);
            } catch (const std::out_of_range &) {
                std::cerr << "You managed to type an out of range double" << std::endl;
            }
        }
        std::for_each(repo.begin(), repo.end(),
                      [](MyKeyedClass& c){ std::cout << c << std::endl; });
        return 0;
    }
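
For comparison, the same goal (finding a set element by its key without building a dummy element) can also be sketched portably. This is not the keyed_set above; it swaps in a different technique, the heterogeneous lookup that std::set gained in C++14, where a comparator that declares is_transparent lets find accept a bare key. Entry, EntryName, KeyCompare and findByName are hypothetical names chosen for illustration, and Entry is a simplified stand-in rather than the Node class from the question.

```cpp
#include <set>
#include <string>

// Hypothetical element type standing in for Node.
struct Entry {
    std::string name;
    int payload;
};

// Key extractor in the spirit of KeyExtractor<Node>.
struct EntryName {
    const std::string &operator()(const Entry &e) const { return e.name; }
};

// Orders elements by their extracted key and also accepts a bare key on
// either side. The is_transparent member enables the templated overloads
// of std::set::find (C++14 and later).
template <typename Val, typename Key, typename Extract>
struct KeyCompare {
    using is_transparent = void;
    bool operator()(const Val &a, const Val &b) const {
        return Extract()(a) < Extract()(b);
    }
    bool operator()(const Val &a, const Key &k) const {
        return Extract()(a) < k;
    }
    bool operator()(const Key &k, const Val &b) const {
        return k < Extract()(b);
    }
};

using EntrySet = std::set<Entry, KeyCompare<Entry, std::string, EntryName>>;

// Returns the stored element with that name, or nullptr if there is none.
const Entry *findByName(const EntrySet &entries, const std::string &name) {
    auto it = entries.find(name);  // no dummy Entry is constructed
    return it == entries.end() ? nullptr : &*it;
}
```

After inserting Entry{"a", 1} into an EntrySet, findByName(entries, "a") yields a pointer to the stored element and an unknown name yields nullptr. Set elements stay const, so anything that must change after insertion would have to live outside the element or be declared mutable.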

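I think that, because Node requires a reference to a Directory during construction, building a dummy node to search the set by name would clutter the Node class.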

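To use set, you would probably need to create a static Directory somewhere and use it as a dummy reference in a new dummy constructor Node(const std::string&). If you do not declare that constructor explicit, you can pass a string directly in your call to set::find.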
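
The implicit-conversion lookup described here might look roughly like the sketch below. Label and lookup are hypothetical names, and Label is a deliberately simplified, copyable type; the real Node is non-copyable and tied to a Directory, so it would need the extra plumbing mentioned above.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified element type (not the Node class itself).
class Label {
public:
    // Deliberately not `explicit`: a std::string converts implicitly to a
    // name-only Label that can serve as a search key.
    Label(std::string name, int payload = 0)
        : _name(std::move(name)), _payload(payload) {}

    // Ordering looks at the name only, so a name-only Label compares
    // equivalent to the stored element with the same name.
    bool operator<(const Label &rhs) const { return _name < rhs._name; }

    const std::string &name() const { return _name; }
    int payload() const { return _payload; }

private:
    std::string _name;
    int _payload;
};

// Returns the stored element with that name, or nullptr if there is none.
const Label *lookup(const std::set<Label> &labels, const std::string &name) {
    auto it = labels.find(name);  // builds a temporary Label from the string
    return it == labels.end() ? nullptr : &*it;
}
```

Each call to find constructs and destroys a temporary element, including a copy of the string, which is the hidden cost of this approach.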

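You could instead change the class to take a pointer, but that would change its semantics: a Directory& is always valid, whereas a Directory* need not be. Ask yourself whether you want to make the semantics less clear to the reader merely because you prefer the set container.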

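So my opinion is pretty clear in this case. You have a choice: either use map and keep your class clean, or use set and write a bit of supporting junk code that is used for nothing else. =)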
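
A minimal sketch of the map option, assuming the container owns its elements through std::unique_ptr. Item and Container are hypothetical stand-ins for Node and Directory, reduced to what the lookup needs.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal stand-in for Node: non-copyable, owns its name.
class Item {
public:
    explicit Item(std::string name) : _name(std::move(name)) {}
    Item(const Item &) = delete;
    Item &operator=(const Item &) = delete;
    const std::string &name() const { return _name; }

private:
    std::string _name;
};

// Hypothetical stand-in for Directory.
class Container {
public:
    // Returns false if the name is already taken; the rejected item is
    // destroyed when this function returns.
    bool add(std::unique_ptr<Item> item) {
        std::string key = item->name();  // the name is stored a second time
        return _items.emplace(std::move(key), std::move(item)).second;
    }

    // Returns the item with that name, or nullptr if there is none.
    Item *find(const std::string &name) const {
        auto it = _items.find(name);
        return it == _items.end() ? nullptr : it->second.get();
    }

private:
    std::map<std::string, std::unique_ptr<Item>> _items;
};
```

The element class needs no search-only constructor here; the price is the duplicated name string per entry.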

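I implement classes like this through a proxy for the key. For example, in the case of std::string I have a class called lightweight_string that implements operator< and internally points to a std::string. I then use a map, which gives me both the simplicity of a map and the performance of not keeping two copies of the key.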
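
The answer does not show lightweight_string, so the following is only a guess at its shape: a proxy that holds a pointer to a string owned elsewhere and compares through it. Record, RecordMap, addRecord and findRecord are hypothetical names, and the sketch assumes each value is heap-allocated so that the string the key points at never moves.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Assumed shape of the proxy: it only points at a std::string owned elsewhere.
class lightweight_string {
public:
    lightweight_string(const std::string &s) : _str(&s) {}
    bool operator<(const lightweight_string &rhs) const {
        return *_str < *rhs._str;
    }

private:
    const std::string *_str;
};

// Hypothetical value type that owns the only copy of its name.
struct Record {
    std::string name;
    double value;
};

using RecordMap = std::map<lightweight_string, std::unique_ptr<Record>>;

// Returns false if a record with that name is already present.
bool addRecord(RecordMap &records, std::string name, double value) {
    auto rec = std::make_unique<Record>(Record{std::move(name), value});
    // The Record lives on the heap, so this reference stays valid after
    // ownership of the pointer moves into the map.
    const std::string &key = rec->name;
    return records.emplace(key, std::move(rec)).second;
}

// Returns the record with that name, or nullptr if there is none.
const Record *findRecord(const RecordMap &records, const std::string &name) {
    auto it = records.find(name);  // temporary proxy pointing at `name`
    return it == records.end() ? nullptr : it->second.get();
}
```

The key costs one pointer per entry instead of a second string. A record's name must not be modified while it is in the map, since that would silently break the ordering.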

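For your particular case, check whether your compiler is old enough to still implement std::string with a COW (copy-on-write) strategy. That changed with C++11, but older compiler versions still use COW. The advantage is that a map with the string as the key and as part of the value costs you almost nothing, because both copies share one buffer. But be aware that this will change in the future (or already has).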