Heapify in logarithmic time using the C++ standard library

Keywords: time, C++, standard, use. Updated: 2023-10-16

I have a heap built with std::make_heap:

std::vector<int> v{1,2,3,5,9,20,3};
std::make_heap(v.begin(), v.end());

Now I update the heap by changing one random element:

v[3] = 35;

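Is there a way in the standard library to re-adjust the heap in O(log n), where n is the size of the container? Basically I am looking for a heapify function. I know which element was changed.

I know I could call std::make_heap again, but that rebuilds the whole heap and takes linear time. I have also gone through the duplicate question, but that one is about changing the maximum element, and an O(log n) solution for that case has already been given there.

I am trying to change an arbitrary element in the heap.

You can do it yourself: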

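#include <cstddef>
#include <vector>

// Set heap[index] to value in a max-heap and restore the heap property in O(log n).
void modify_heap_element(std::vector<int> &heap, std::size_t index, int value)
{
    // while value is too large for its position, bubble up
    while (index > 0 && heap[(index - 1) >> 1] < value)
    {
        std::size_t parent = (index - 1) >> 1;
        heap[index] = heap[parent]; // move the smaller parent down
        index = parent;
    }
    // while value is too small for its position, sift down
    for (;;)
    {
        std::size_t left = index * 2 + 1;
        std::size_t right = left + 1;
        if (left >= heap.size())
            break; // no children: index is a leaf
        std::size_t bigchild = (right >= heap.size() || heap[right] < heap[left]
                                    ? left : right);
        if (!(value < heap[bigchild]))
            break; // value is at least as large as both children
        heap[index] = heap[bigchild]; // move the larger child up
        index = bigchild;
    }
    heap[index] = value;
}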

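If we look closely at your statement:

Now I disturb the heap by changing one random element of the heap.

To heapify in O(log n), you can only directly "disturb" the back or the front of the vector (which roughly corresponds to inserting or deleting an element). In these cases, (re)heapification can be achieved with the std::push_heap and std::pop_heap algorithms, which take logarithmic running time.

That is, the back: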

v.back() = 35;
std::push_heap(v.begin(), v.end()); // heapify in O(log n)

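Or the front (35 is larger than every other element here, so the range is still a valid heap when std::pop_heap is called):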

v.front() = 35;
// places the front at the back
std::pop_heap(v.begin(), v.end()); // O(log n)
// v.back() is now 35, but it does not belong to the heap anymore
// make the back belong to the heap again
std::push_heap(v.begin(), v.end()); // O(log n)

Otherwise you need to re-heapify the whole vector with std::make_heap, which takes linear running time.
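For comparison, the linear-time fallback can be sketched as follows. The helper name set_and_rebuild is hypothetical and only serves to illustrate the point; it is not part of the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: overwrite any element, then rebuild the whole
// max-heap with std::make_heap, which is linear in the container size.
inline void set_and_rebuild(std::vector<int>& heap, std::size_t index, int value) {
    heap[index] = value;
    std::make_heap(heap.begin(), heap.end());
}

// Usage sketch with the vector from the question.
inline void demo_rebuild() {
    std::vector<int> v{1, 2, 3, 5, 9, 20, 3};
    std::make_heap(v.begin(), v.end());
    set_and_rebuild(v, 3, 35);
    assert(std::is_heap(v.begin(), v.end()));
    assert(v.front() == 35); // 35 is now the largest element
}
```

This always works regardless of which element changed, but it pays O(n) per update instead of O(log n).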

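Summary

Using only the standard library (i.e. the function templates std::push_heap and std::pop_heap), it is not possible to modify an arbitrary element of the heap and re-heapify in logarithmic running time. However, you can always implement the heap's swim and sink operations yourself in order to heapify in logarithmic running time.

I have also been facing the problem of wanting an "updatable heap". In the end, however, instead of writing a custom updatable heap or anything like that, I solved it in a different way.

To keep access to the best element without explicitly traversing the heap, you can use versioned wrappers of the elements you want to order. Each unique, true element has a version counter that is incremented every time the element changes. Each wrapper in the heap then carries one version of the element, namely the version at the time the wrapper was created: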
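A minimal sketch of what such swim and sink operations could look like for a max-heap stored in a std::vector<int>. The names swim, sink and update_at are hypothetical and chosen for illustration; they are not standard library functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: move heap[i] up while it is larger than its parent.
inline void swim(std::vector<int>& heap, std::size_t i) {
    while (i > 0) {
        std::size_t parent = (i - 1) / 2;
        if (!(heap[parent] < heap[i])) break; // parent is already >= child
        std::swap(heap[parent], heap[i]);
        i = parent;
    }
}

// Hypothetical helper: move heap[i] down while it is smaller than a child.
inline void sink(std::vector<int>& heap, std::size_t i) {
    const std::size_t n = heap.size();
    for (;;) {
        std::size_t largest = i;
        const std::size_t left = 2 * i + 1, right = 2 * i + 2;
        if (left < n && heap[largest] < heap[left]) largest = left;
        if (right < n && heap[largest] < heap[right]) largest = right;
        if (largest == i) break; // both children are <= heap[i]
        std::swap(heap[i], heap[largest]);
        i = largest;
    }
}

// Hypothetical helper: overwrite heap[i] and repair the heap in O(log n).
// At most one of the two calls actually moves anything.
inline void update_at(std::vector<int>& heap, std::size_t i, int value) {
    heap[i] = value;
    swim(heap, i);
    sink(heap, i);
}

// Usage sketch with the vector from the question.
inline void demo_update() {
    std::vector<int> v{1, 2, 3, 5, 9, 20, 3};
    std::make_heap(v.begin(), v.end());
    update_at(v, 3, 35); // increase: the element swims up to the root
    assert(std::is_heap(v.begin(), v.end()));
    update_at(v, 0, -1); // decrease: the root sinks down to a leaf
    assert(std::is_heap(v.begin(), v.end()));
}
```

Each loop walks a single root-to-leaf path, so the cost is bounded by the height of the heap.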

// HeapElem is the user's own element type; it is assumed to expose
// a currentVersion counter that is incremented on every change.
struct HeapElemWrapper
{
    HeapElem * e;
    size_t version;
    double priority;

    HeapElemWrapper(HeapElem * elem)
        : e(elem), version(elem->currentVersion), priority(0.0)
    {}

    bool upToDate() const
    {
        return version == e->currentVersion;
    }

    // operator for ordering with heap / priority queue:
    // smaller error -> higher priority
    bool operator<(const HeapElemWrapper & other) const
    {
        return this->priority > other.priority;
    }
};

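When popping the top-most element from the heap, you simply check this wrapper to see whether it is still up to date with the original element. If not, discard it and pop the next one. This approach works well, and I have seen it in other applications too. The only thing you need to take care of is to make a pass over the heap from time to time to purge outdated elements (say, every 1000 insertions or so).

Using only the function templates std::pop_heap() and std::push_heap() provided by the standard library, it is not possible to modify an arbitrary element of the heap in logarithmic running time without violating the heap property.

However, you can define your own STL-like function template, set_heap_element(), for that purpose: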
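The pop-and-check loop and the occasional purge described above might look roughly like this with std::priority_queue. This is only a sketch under assumptions: the HeapElem layout (an error value plus a currentVersion counter), the Entry type, and the helpers update, popBest and purge are all hypothetical names, not taken from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Assumed element type: a value to order by plus a version counter.
struct HeapElem {
    double error = 0.0;
    std::size_t currentVersion = 0;
};

// Snapshot of an element at the time it was pushed.
struct Entry {
    HeapElem* e;
    std::size_t version;
    double priority;
    bool upToDate() const { return version == e->currentVersion; }
    // Reversed comparison: the smallest priority value ends up on top.
    bool operator<(const Entry& other) const { return priority > other.priority; }
};

using Queue = std::priority_queue<Entry>;

// Change an element and push a fresh snapshot; older snapshots become stale.
inline void update(Queue& q, HeapElem& elem, double newError) {
    elem.error = newError;
    ++elem.currentVersion;
    q.push(Entry{&elem, elem.currentVersion, newError});
}

// Pop until an up-to-date snapshot is found; nullptr if none is left.
inline HeapElem* popBest(Queue& q) {
    while (!q.empty()) {
        Entry top = q.top();
        q.pop();
        if (top.upToDate()) return top.e;
    }
    return nullptr;
}

// Occasional cleanup: keep only up-to-date snapshots and rebuild the queue.
inline void purge(Queue& q) {
    std::vector<Entry> fresh;
    while (!q.empty()) {
        if (q.top().upToDate()) fresh.push_back(q.top());
        q.pop();
    }
    q = Queue(std::less<Entry>{}, std::move(fresh));
}

// Usage sketch: the first snapshot of a becomes stale after the second update.
inline void demo_versions() {
    HeapElem a, b;
    Queue q;
    update(q, a, 5.0);
    update(q, b, 3.0);
    update(q, a, 1.0);
    assert(popBest(q) == &a);      // snapshot with error 1.0
    assert(popBest(q) == &b);      // snapshot with error 3.0
    assert(popBest(q) == nullptr); // only the stale snapshot of a was left
}
```

The trade-off is memory: stale snapshots stay in the queue until they are popped or purged, which is why the periodic cleanup matters when elements change often.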

#include <utility> // std::move, std::swap

template<typename RandomIt, typename T, typename Cmp>
void set_heap_element(RandomIt first, RandomIt last, RandomIt pos, T value, Cmp cmp)
{
    const auto n = last - first;
    *pos = std::move(value); // replace previous value
    auto i = pos - first;
    using std::swap;

    // percolate up
    while (i > 0) { // non-root node
        auto parent_it = first + (i-1)/2;
        if (cmp(*pos, *parent_it))
            break; // parent node satisfies the heap-property
        swap(*pos, *parent_it); // swap with parent
        pos = parent_it;
        i = pos - first;
    }

    // percolate down
    while (2*i + 1 < n) { // non-leaf node, since it has a left child
        const auto lidx = 2*i + 1, ridx = 2*i + 2;
        auto lchild_it = first + lidx;
        auto rchild_it = ridx < n ? first + ridx : last;
        auto it = pos; // will point at the largest of node and children
        if (cmp(*it, *lchild_it))
            it = lchild_it;
        if (rchild_it != last && cmp(*it, *rchild_it))
            it = rchild_it;
        if (pos == it)
            break; // node satisfies the heap-property
        swap(*pos, *it); // swap with child
        pos = it;
        i = pos - first;
    }
}

You can then provide the following simplified overload of set_heap_element() for a max-heap:

#include <functional> // std::less

template<typename RandomIt, typename T>
void set_heap_element(RandomIt first, RandomIt last, RandomIt pos, T value) {
    return set_heap_element(first, last, pos, value, std::less<T>{});
}

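This overload uses a std::less<T> object as the comparison function object for the original function template.

Example

In your max-heap example, set_heap_element() can be used as follows: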

std::vector<int> v{1,2,3,5,9,20,3};
std::make_heap(v.begin(), v.end());
// set 4th element to 35 in O(log n)
set_heap_element(v.begin(), v.end(), v.begin() + 3, 35);

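You can use std::is_heap(), which takes linear time, whenever you want to check whether v still satisfies the max-heap property after setting an element with the set_heap_element() function template above: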

assert(std::is_heap(v.begin(), v.end()));

What about min-heaps?

You can achieve the same for a min-heap by passing a std::greater<int> object as the last argument to the calls to std::make_heap(), set_heap_element() and std::is_heap():

std::vector<int> v{1,2,3,5,9,20,3};
// create a min heap
std::make_heap(v.begin(), v.end(), std::greater<int>{});
// set 4th element to 35 in O(log n)
set_heap_element(v.begin(), v.end(), v.begin() + 3, 35, std::greater<int>{});
// is the min-heap property satisfied?
assert(std::is_heap(v.begin(), v.end(), std::greater<int>{}));