Efficient random number generation with C++11 <random>

Last updated: 2023-10-16

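I am trying to understand how the C++11 random number generation facilities are meant to be used. My concern is performance.

Suppose we need to generate a series of random integers in the range 0..k, where k changes at every step. What is the best way to proceed?

Example: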

for (int i=0; i < n; ++i) {
    int k = i; // of course this is more complicated in practice
    std::uniform_int_distribution<> dist(0, k);
    int random_number = dist(engine);
    // do something with random number
}

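The distributions that the <random> header provides are very convenient. But they are opaque to the user, so I cannot easily predict how they will perform. For example, it is not clear how much runtime overhead (if any) is caused by constructing dist on every iteration above.

Instead, I could have used something like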

std::uniform_real_distribution<> dist(0.0, 1.0);
for (int i=0; i < n; ++i) {
    int k = i; // of course this is more complicated in practice
    int random_number = std::floor( (k+1)*dist(engine) );
    // do something with random number
}

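This avoids constructing a new object in each iteration.

Random numbers are often used in numerical simulations where performance is important. What is the best way to use <random> in these situations?

Please do not answer "profile it". Profiling is part of effective optimization, but so is a good understanding of how a library is meant to be used and of that library's performance characteristics. If the answer is that it depends on the standard library implementation, or that the only way to know is to profile it, then I would rather not use the distributions from <random> at all. Instead, I can use my own implementation, which will be transparent to me and much easier to optimize if and when necessary.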

One thing you can do is keep a permanent distribution object, so that only a param_type object is created on each call, like this:

template<typename Integral>
Integral randint(Integral min, Integral max)
{
    using param_type =
        typename std::uniform_int_distribution<Integral>::param_type;
    // only create these once (per thread)
    thread_local static std::mt19937 eng {std::random_device{}()};
    thread_local static std::uniform_int_distribution<Integral> dist;
    // presumably a param_type is cheaper than a uniform_int_distribution
    return dist(eng, param_type{min, max});
}
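
For the loop from the question, the same idea might look like the sketch below. The function name draw_with_changing_bound and the choice to return a std::vector are only for illustration. Whether passing a param_type is actually cheaper than constructing a new distribution depends on the standard library implementation, so treat this as one option to try rather than a guaranteed win.

```cpp
#include <random>
#include <vector>

// Draw one value from [0, k] for k = 0, 1, ..., n-1, reusing a single
// distribution object and passing only the bounds on each call.
// (Hypothetical helper, named for illustration only.)
template <typename Engine>
std::vector<int> draw_with_changing_bound(Engine& engine, int n) {
    using dist_type = std::uniform_int_distribution<int>;
    dist_type dist;  // constructed once, outside the loop
    std::vector<int> out;
    out.reserve(n);
    for (int i = 0; i < n; ++i) {
        int k = i;  // stands in for whatever bound each step needs
        out.push_back(dist(engine, dist_type::param_type{0, k}));
    }
    return out;
}
```

Called as draw_with_changing_bound(eng, n) with, say, a std::mt19937 eng, every element i of the result lies in [0, i].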

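To maximize performance, first consider a different PRNG, such as xorshift128+. It has been reported to be more than twice as fast as mt19937 for 64-bit random numbers; see http://xorshift.di.unimi.it/. It can be implemented in a few lines of code.

Moreover, if you do not need a "perfectly balanced" uniform distribution and your k is much less than 2^64 (which it likely is), I would suggest simply writing something like: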
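
As a rough idea of what "a few lines" means, here is a minimal sketch of xorshift128+ wrapped so that it can be passed to the <random> distributions. The class name Xorshift128Plus, the splitmix64-based seeding, and the helper draw_upto are my own choices for illustration. The shift constants (23, 18, 5) are one published variant of the generator; check the reference page for the version you want before relying on it.

```cpp
#include <cstdint>
#include <limits>
#include <random>

// Sketch of xorshift128+ exposing the interface the standard
// distributions expect (result_type, min(), max(), operator()).
class Xorshift128Plus {
public:
    using result_type = std::uint64_t;

    explicit Xorshift128Plus(std::uint64_t seed) {
        // Expand one 64-bit seed into 128 bits of state with splitmix64.
        // The two outputs come from different counter values passed through
        // a bijective mixer, so the state cannot be all zero.
        s_[0] = splitmix64(seed);
        s_[1] = splitmix64(seed);
    }

    static constexpr result_type min() { return 0; }
    static constexpr result_type max() {
        return std::numeric_limits<result_type>::max();
    }

    result_type operator()() {
        std::uint64_t s1 = s_[0];
        const std::uint64_t s0 = s_[1];
        const std::uint64_t result = s0 + s1;
        s_[0] = s0;
        s1 ^= s1 << 23;
        s_[1] = s1 ^ s0 ^ (s1 >> 18) ^ (s0 >> 5);
        return result;
    }

private:
    // Advances x and returns a well-mixed 64-bit value.
    static std::uint64_t splitmix64(std::uint64_t& x) {
        std::uint64_t z = (x += 0x9e3779b97f4a7c15ULL);
        z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
        z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
        return z ^ (z >> 31);
    }

    std::uint64_t s_[2];
};

// Usage sketch: the engine plugs into a standard distribution.
inline int draw_upto(Xorshift128Plus& eng, int k) {
    return std::uniform_int_distribution<int>{0, k}(eng);
}
```

The wrapper only changes the engine; the question of how much the distribution itself costs is unaffected by it.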

uint64_t temp = engine_64(); // generates 0 <= temp < 2^64
int random_number = temp % (k + 1); // crop temp to 0,...,k

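But note that integer division/modulo operations are not cheap. For example, on an Intel Haswell processor they take 39-103 processor cycles for 64-bit numbers, which is probably much longer than a call to an MT19937 or xorshift+ engine.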
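
To make the "not perfectly balanced" caveat of the modulo approach concrete, here is a small sketch that counts how often each residue appears when every possible engine output is reduced with %. The function residue_counts is a hypothetical helper, and the 8-bit "engine" with 256 outputs is a toy stand-in chosen so the loop can enumerate everything.

```cpp
#include <cstdint>
#include <vector>

// Count how often each residue occurs when every value in [0, range)
// is reduced with `% modulus`. (Hypothetical helper for illustration.)
std::vector<std::uint64_t> residue_counts(std::uint64_t range,
                                          std::uint64_t modulus) {
    std::vector<std::uint64_t> counts(modulus, 0);
    for (std::uint64_t v = 0; v < range; ++v) {
        ++counts[v % modulus];
    }
    return counts;
}
```

With range = 256 and modulus = 6 (that is, k = 5), residues 0 to 3 each occur 43 times while 4 and 5 occur 42 times, because 256 = 42 * 6 + 4. With a 64-bit engine the counts still differ by at most one, so the relative imbalance is on the order of (k + 1) / 2^64, which is why it can be ignored when k is small.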