Generic hash for tuples in unordered_map / unordered_set

Keywords: unordered, generic, map, set, tuple. Updated: 2023-10-16

Why doesn't std::unordered_map<tuple<int, int>, string> just work out of the box? Having to define a hash function for tuple<int, int> is tedious, e.g.

template<> struct do_hash<tuple<int, int>>                               
{   size_t operator()(std::tuple<int, int> const& tt) const {...}  };

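"Building an unordered map with tuples as keys" (answered by Matthieu M.) shows how to automate this for boost::tuple. Is there any way of doing this for C++0x tuples without using variadic templates?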
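For comparison, this is roughly what the hand-written route looks like for one specific tuple type. It is only a sketch: the names `IntPairHash`, `IntPairMap` and `make_example_map` are invented for illustration, the mixing step is one possible choice, and the hasher is passed as the third template argument rather than specializing anything.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <tuple>
#include <unordered_map>

// Hypothetical hand-written hasher for exactly one tuple type.
struct IntPairHash {
    std::size_t operator()(std::tuple<int, int> const& tt) const {
        std::size_t h1 = std::hash<int>()(std::get<0>(tt));
        std::size_t h2 = std::hash<int>()(std::get<1>(tt));
        // Illustrative mixing step; any reasonable combiner would do here.
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

// The hasher is supplied explicitly as the third template argument.
using IntPairMap = std::unordered_map<std::tuple<int, int>, std::string, IntPairHash>;

// Builds a small map keyed by tuple<int, int>.
inline IntPairMap make_example_map() {
    IntPairMap m;
    m[std::make_tuple(1, 2)] = "one-two";
    m[std::make_tuple(2, 1)] = "two-one";
    return m;
}
```

Every additional tuple type would need another functor like this one, which is the tedium the question is about.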

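This works on gcc 4.5, allowing all C++0x tuples containing standard hashable types to be members of unordered_map and unordered_set without further ado. (I put the code in a header file and just include it.)

The function has to live in the std namespace so that it is picked up by argument-dependent name lookup (ADL).

Is there a simpler solution?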

#include <cstddef>
#include <functional>
#include <tuple>
namespace std{
    namespace
    {
        // Code from boost
        // Reciprocal of the golden ratio helps spread entropy
        //     and handles duplicates.
        // See Mike Seymour in magic-numbers-in-boosthash-combine:
        //     http://stackoverflow.com/questions/4948780
        template <class T>
        inline void hash_combine(std::size_t& seed, T const& v)
        {
            seed ^= std::hash<T>()(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
        }
        // Recursive template code derived from Matthieu M.
        template <class Tuple, size_t Index = std::tuple_size<Tuple>::value - 1>
        struct HashValueImpl
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            HashValueImpl<Tuple, Index-1>::apply(seed, tuple);
            hash_combine(seed, std::get<Index>(tuple));
          }
        };
        template <class Tuple>
        struct HashValueImpl<Tuple,0>
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            hash_combine(seed, std::get<0>(tuple));
          }
        };
    }
    template <typename ... TT>
    struct hash<std::tuple<TT...>> 
    {
        size_t
        operator()(std::tuple<TT...> const& tt) const
        {                                              
            size_t seed = 0;                             
            HashValueImpl<std::tuple<TT...> >::apply(seed, tt);    
            return seed;                                 
        }                                              
    };
}
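
The comment about duplicates is easier to see with numbers. The sketch below applies the same mixing step to already-computed element hashes and compares it with a plain XOR, which cancels equal elements and ignores their order. The helpers `combine_value`, `xor_value`, `combine_pair` and `xor_pair` are names made up for this illustration.

```cpp
#include <cstddef>

// Same mixing step as hash_combine above, but taking an element hash directly
// so the result does not depend on the library's std::hash. (Hypothetical helper.)
inline std::size_t combine_value(std::size_t seed, std::size_t h) {
    return seed ^ (h + 0x9e3779b9 + (seed << 6) + (seed >> 2));
}

// Naive alternative for comparison: plain XOR of the element hashes.
inline std::size_t xor_value(std::size_t seed, std::size_t h) {
    return seed ^ h;
}

// Fold two element hashes, starting from seed 0 as the tuple hash does.
inline std::size_t combine_pair(std::size_t a, std::size_t b) {
    return combine_value(combine_value(0, a), b);
}

inline std::size_t xor_pair(std::size_t a, std::size_t b) {
    return xor_value(xor_value(0, a), b);
}
```

With these definitions, `xor_pair(7, 7)` is 0 and `xor_pair(1, 2)` equals `xor_pair(2, 1)`, whereas `combine_pair` gives a non-zero result for (7, 7) and different results for (1, 2) and (2, 1).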

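Standard-conforming code

Yakk points out that specializing things in the std namespace is actually undefined behavior. If you want a standards-conforming solution, you need to move all of this code into your own namespace and give up any idea of ADL finding the right hash implementation automatically. Instead of: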

unordered_set<tuple<double, int> > test_set;

you need:

unordered_set<tuple<double, int>, hash_tuple::hash<tuple<double, int>>> test2;

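where hash_tuple is your own namespace rather than std::.

To do this, you first have to declare a hash implementation inside the hash_tuple namespace. This forwards all non-tuple types to std::hash: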

namespace hash_tuple{
template <typename TT>
struct hash
{
    size_t
    operator()(TT const& tt) const
    {                                              
        return std::hash<TT>()(tt);                                 
    }                                              
};
}

Make sure that hash_combine calls hash_tuple::hash and not std::hash:

namespace hash_tuple{
    namespace
    {
        template <class T>
        inline void hash_combine(std::size_t& seed, T const& v)
        {
            seed ^= hash_tuple::hash<T>()(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
        }
    }
}

Then include all the other code from before, but put it in namespace hash_tuple rather than std:::

namespace hash_tuple{
    namespace
    {
        // Recursive template code derived from Matthieu M.
        template <class Tuple, size_t Index = std::tuple_size<Tuple>::value - 1>
        struct HashValueImpl
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            HashValueImpl<Tuple, Index-1>::apply(seed, tuple);
            hash_combine(seed, std::get<Index>(tuple));
          }
        };
        template <class Tuple>
        struct HashValueImpl<Tuple,0>
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            hash_combine(seed, std::get<0>(tuple));
          }
        };
    }
    template <typename ... TT>
    struct hash<std::tuple<TT...>> 
    {
        size_t
        operator()(std::tuple<TT...> const& tt) const
        {                                              
            size_t seed = 0;                             
            HashValueImpl<std::tuple<TT...> >::apply(seed, tt);    
            return seed;                                 
        }                                              
    };
}
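
Putting the three pieces together, a single-header version might look like the sketch below. It assumes the code above is otherwise unchanged; the `detail` namespace (used in place of the unnamed namespace) and the aliases `tuple_set` and `tuple_map` are additions made here for convenience and are not part of the original answer.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <tuple>
#include <unordered_map>
#include <unordered_set>

namespace hash_tuple {

// Fallback: forward every non-tuple type to std::hash.
template <typename TT>
struct hash {
    std::size_t operator()(TT const& tt) const { return std::hash<TT>()(tt); }
};

namespace detail {  // hypothetical name; the original uses an unnamed namespace

// Mix one element into the running seed via hash_tuple::hash, not std::hash.
template <class T>
inline void hash_combine(std::size_t& seed, T const& v) {
    seed ^= hash_tuple::hash<T>()(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Recursively hash elements 0..Index of the tuple, in that order.
template <class Tuple, std::size_t Index = std::tuple_size<Tuple>::value - 1>
struct HashValueImpl {
    static void apply(std::size_t& seed, Tuple const& tuple) {
        HashValueImpl<Tuple, Index - 1>::apply(seed, tuple);
        hash_combine(seed, std::get<Index>(tuple));
    }
};

// Base case: only element 0 is left.
template <class Tuple>
struct HashValueImpl<Tuple, 0> {
    static void apply(std::size_t& seed, Tuple const& tuple) {
        hash_combine(seed, std::get<0>(tuple));
    }
};

}  // namespace detail

// Tuple specialization of hash_tuple::hash.
template <typename... TT>
struct hash<std::tuple<TT...>> {
    std::size_t operator()(std::tuple<TT...> const& tt) const {
        std::size_t seed = 0;
        detail::HashValueImpl<std::tuple<TT...>>::apply(seed, tt);
        return seed;
    }
};

}  // namespace hash_tuple

// Hypothetical aliases so the hasher need not be spelled out at every use.
template <typename... TT>
using tuple_set = std::unordered_set<std::tuple<TT...>, hash_tuple::hash<std::tuple<TT...>>>;

template <typename Value, typename... TT>
using tuple_map = std::unordered_map<std::tuple<TT...>, Value, hash_tuple::hash<std::tuple<TT...>>>;
```

With the aliases, `tuple_set<double, int> test2;` declares the same container as the longer unordered_set declaration shown earlier.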

#include <boost/functional/hash.hpp>
#include <tuple>
namespace std
{
template<typename... T>
struct hash<tuple<T...>>
{
    size_t operator()(tuple<T...> const& arg) const noexcept
    {
        return boost::hash_value(arg);
    }
};
}

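In my C++0x draft, 20.8.15 says hash is specialized for the built-in types (including pointers, though it does not seem to imply dereferencing them). It also appears to be specialized for error_code, bitset<N>, unique_ptr<T, D>, shared_ptr<T>, type_index, string, u16string, u32string, wstring, vector<bool, Allocator>, and thread::id. (Fascinating list!)

I have not used C++0x variadics, so my formatting is probably off, but something along these lines might work for all tuples.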
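
One way to check such a list against a given implementation is to ask whether `std::hash<T>` can be default-constructed. This sketch assumes a C++17 or later standard library, where `std::hash` for an unsupported type is a disabled specialization and therefore not default-constructible; the trait name `is_std_hashable` is invented here.

```cpp
#include <bitset>
#include <functional>
#include <memory>
#include <string>
#include <system_error>
#include <thread>
#include <tuple>
#include <type_traits>
#include <typeindex>
#include <vector>

// Hypothetical trait: true when std::hash<T> is an enabled specialization.
// Relies on the C++17 rule that disabled specializations are not default-constructible.
template <class T>
struct is_std_hashable : std::is_default_constructible<std::hash<T>> {};

static_assert(is_std_hashable<int>::value, "built-in types are hashable");
static_assert(is_std_hashable<int*>::value, "pointers are hashable");
static_assert(is_std_hashable<std::string>::value, "string is hashable");
static_assert(is_std_hashable<std::bitset<8>>::value, "bitset<N> is hashable");
static_assert(is_std_hashable<std::error_code>::value, "error_code is hashable");
static_assert(is_std_hashable<std::type_index>::value, "type_index is hashable");
static_assert(is_std_hashable<std::thread::id>::value, "thread::id is hashable");
static_assert(is_std_hashable<std::vector<bool>>::value, "vector<bool> is hashable");
static_assert(is_std_hashable<std::unique_ptr<int>>::value, "unique_ptr is hashable");
static_assert(is_std_hashable<std::shared_ptr<int>>::value, "shared_ptr is hashable");

// std::tuple is not on the list, which is why the question arises at all.
static_assert(!is_std_hashable<std::tuple<int, int>>::value, "no std::hash for tuple");
```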

#include <cstddef>
#include <functional>
#include <tuple>

inline size_t hash_combiner(size_t left, size_t right) // replaceable
{ return left + 0x9e3779b9 + (right<<6) + (right>>2);}
// Hashes elements index, index-1, ..., 0 and folds each into the running value a.
template<size_t index, class...types>
struct hash_impl {
    size_t operator()(size_t a, const std::tuple<types...>& t) const {
        typedef typename std::tuple_element<index, std::tuple<types...>>::type nexttype;
        hash_impl<index-1, types...> next;
        size_t b = std::hash<nexttype>()(std::get<index>(t));
        return next(hash_combiner(a, b), t); 
    }
};
// Base case: element 0 ends the recursion.
template<class...types>
struct hash_impl<0, types...> {
    size_t operator()(size_t a, const std::tuple<types...>& t) const {
        typedef typename std::tuple_element<0, std::tuple<types...>>::type nexttype;
        size_t b = std::hash<nexttype>()(std::get<0>(t));
        return hash_combiner(a, b); 
    }
};
// Primary template; only the std::tuple specialization below is defined.
template<class T>
struct tuple_hash;
template<class...types>
struct tuple_hash<std::tuple<types...>> {
    size_t operator()(const std::tuple<types...>& t) const {
        const size_t begin = std::tuple_size<std::tuple<types...>>::value-1;
        return hash_impl<begin, types...>()(0, t);
    }
};

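This version actually compiles and runs.

Yakk has observed that specializing std::hash directly is technically not allowed, since we would be specializing a standard library template with a declaration that does not depend on a user-defined type.

In C++20, fold expressions and generic lambdas can be used to compute the hash of a tuple without recursion. I prefer to rely on std::hash<uintmax_t> rather than combining hashes manually: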

#include <cinttypes>
#include <cstddef>
#include <functional>
#include <tuple>
class hash_tuple {
    template<class T>
    struct component {
        const T& value;
        component(const T& value) : value(value) {}
        uintmax_t operator,(uintmax_t n) const {
            n ^= std::hash<T>()(value);
            n ^= n << (sizeof(uintmax_t) * 4 - 1);
            return n ^ std::hash<uintmax_t>()(n);
        }
    };
public:
    template<class Tuple>
    size_t operator()(const Tuple& tuple) const {
        return std::hash<uintmax_t>()(
            std::apply([](const auto& ... xs) { return (component(xs), ..., 0); }, tuple));
    }
};

The - 1 in sizeof(uintmax_t) * 4 - 1 is optional, but it appears to improve the hash distribution slightly. This class can be used with both std::tuple and std::pair.
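
As a usage sketch, the class can be passed directly as the hasher argument of an unordered container. The class is repeated below so that the example compiles on its own; the aliases `point_names` and `edge_set` are hypothetical names chosen for illustration.

```cpp
#include <cinttypes>
#include <cstddef>
#include <functional>
#include <string>
#include <tuple>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// The hash_tuple class from above, repeated so this sketch is self-contained.
class hash_tuple {
    template<class T>
    struct component {
        const T& value;
        component(const T& value) : value(value) {}
        // Overloaded comma: folds this element's hash into the running value n.
        uintmax_t operator,(uintmax_t n) const {
            n ^= std::hash<T>()(value);
            n ^= n << (sizeof(uintmax_t) * 4 - 1);
            return n ^ std::hash<uintmax_t>()(n);
        }
    };
public:
    template<class Tuple>
    size_t operator()(const Tuple& tuple) const {
        return std::hash<uintmax_t>()(
            std::apply([](const auto& ... xs) { return (component(xs), ..., 0); }, tuple));
    }
};

// Hypothetical aliases: one container keyed by a tuple, one keyed by a pair.
using point_names = std::unordered_map<std::tuple<int, int, int>, std::string, hash_tuple>;
using edge_set = std::unordered_set<std::pair<int, int>, hash_tuple>;
```

Because the call operator is a template over the whole tuple-like type, the same `hash_tuple` object type serves both containers without any specialization.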

If you want to do this the simple way, just write:

std::unordered_map<std::tuple<int, int>, std::string, boost::hash<std::tuple<int, int>>> mp;