Combinations of N Boost interval_set

Updated: 2023-10-16

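I have a service that suffers outages in 4 different locations. I am modeling each location's outages as a Boost ICL interval_set. I want to know when at least N locations have an active outage.

So, following this answer, I implemented a combination algorithm so that I can build the combinations of elements via interval_set intersections.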
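
For illustration, the mask-based enumeration described above can be sketched with the standard library alone. The helper name `combinations_of` is an assumption introduced for this sketch and does not appear in the question; it only shows how a sorted `std::vector<bool>` together with `std::next_permutation` visits each combination exactly once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (name chosen for this sketch): returns every way to
// pick k indices out of 0..n-1, in lexicographic order.
std::vector<std::vector<std::size_t>> combinations_of(std::size_t n, std::size_t k) {
    assert(k <= n);
    // false marks a selected index. Starting with the k leading entries false
    // gives the smallest permutation, so next_permutation visits every mask.
    std::vector<bool> mask(n, false);
    std::fill(mask.begin() + static_cast<std::ptrdiff_t>(k), mask.end(), true);

    std::vector<std::vector<std::size_t>> result;
    do {
        std::vector<std::size_t> picked;
        for (std::size_t i = 0; i < mask.size(); ++i)
            if (!mask[i]) picked.push_back(i);
        result.push_back(picked);
    } while (std::next_permutation(mask.begin(), mask.end()));
    return result;
}
```

With 4 locations and N = 2 this yields 6 index pairs, from {0, 1} to {2, 3}.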

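When this process finishes, I should have a certain number of interval_sets, each defining the outages for N locations simultaneously, and the final step is to join them to get the full picture.

The problem is that I am currently debugging the code, and when the time comes to print each intersection, the output text goes crazy (even when I step through it with gdb), I cannot see the intersections, and CPU usage becomes very high.

I guess that somehow I am sending more memory to the output than I should, but I cannot see where the problem is.

Here is an SSCCE: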

#include <boost/icl/interval_set.hpp>
#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    // Initializing data for test
    std::vector<boost::icl::interval_set<unsigned int> > outagesPerLocation;
    for(unsigned int j=0; j<4; j++){
        boost::icl::interval_set<unsigned int> outages;
        for(unsigned int i=0; i<5; i++){
            outages += boost::icl::discrete_interval<unsigned int>::closed(
                (i*10), ((i*10) + 5 - j));
        }
        std::cout << "[Location " << (j+1) << "] " << outages << std::endl;
        outagesPerLocation.push_back(outages);
    }
    // So now we have a vector of interval_sets, one per location. We will combine
    // them so we get an interval_set defined for those periods where at least
    // 2 locations have an outage (N)
    unsigned int simultaneusOutagesRequired = 2;  // (N)
    // Create a bool vector in order to filter permutations, and only get
    // the sorted permutations (which equals the combinations)
    std::vector<bool> auxVector(outagesPerLocation.size());
    std::fill(auxVector.begin() + simultaneusOutagesRequired, auxVector.end(), true);
    // Create a vector where combinations will be stored
    std::vector<boost::icl::interval_set<unsigned int> > combinations;
    // Get all the combinations of N elements
    unsigned int numCombinations = 0;
    do{
        bool firstElementSet = false;
        for(unsigned int i=0; i<auxVector.size(); i++){
            if(!auxVector[i]){
                if(!firstElementSet){
                    // First location, insert to combinations vector
                    combinations.push_back(outagesPerLocation[i]);
                    firstElementSet = true;
                }
                else{
                    // Intersect with the other locations
                    combinations[numCombinations] &= outagesPerLocation[i];
                }
            }
        }
        numCombinations++;
        std::cout << "[-INTERSEC-] " << combinations[numCombinations] << std::endl;  // The problem appears here
    }
    while(std::next_permutation(auxVector.begin(), auxVector.end()));
    // Get the union of the intersections and see the results
    boost::icl::interval_set<unsigned int> finalOutages;
    for(std::vector<boost::icl::interval_set<unsigned int> >::iterator
        it = combinations.begin(); it != combinations.end(); it++){
        finalOutages += *it;
    }
    std::cout << finalOutages << std::endl;
    return 0;
}

Any help?

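As I surmised, there is a "high-level" approach here.

Boost ICL containers are more than just containers of "glorified pairs of interval start/end points". They are designed to implement exactly that business of combining and searching, in a generically optimized fashion.

So you don't have to.

If you let the library do what it is supposed to do: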

using TimePoint = unsigned;
using DownTimes = boost::icl::interval_set<TimePoint>;
using Interval  = DownTimes::interval_type;
using Records   = std::vector<DownTimes>;

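Using functional domain typedefs invites a higher-level approach. Now, let's ask the hypothetical "business question":

What do we actually want to do with our records of per-location downtimes?

Well, we essentially want to

  1. tally them for all discernible time slots
  2. filter those where the tally is at least 2
  3. finally, show the "merged" time slots that remain

OK, engineer: implement it!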


  1. Hmm. Tallying. How hard could it be?

    ❕ The key to an elegant solution is choosing the right data structure

    using Tally     = unsigned; // or: bit mask representing affected locations?
    using DownMap   = boost::icl::interval_map<TimePoint, Tally>;
    

    Now it is just bulk insertion:

    // We will do a tally of affected locations per time slot
    DownMap tallied;
    for (auto& location : records)
        for (auto& incident : location)
            tallied.add({incident, 1u});
    
  2. OK, let's filter. We just need a predicate that works on our DownMap, right?

    // define threshold where at least 2 locations have an outage
    auto exceeds_threshold = [](DownMap::value_type const& slot) {
        return slot.second >= 2;
    };
    
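  3. Merge the time slots!

    Actually, we are just creating another DownTimes set, right? Only this time it is not per location.

    The choice of data structure wins the day again: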

    // just printing the union of any criticals:
    DownTimes merged;
    for (auto&& slot : tallied | filtered(exceeds_threshold) | map_keys)
        merged.insert(slot);
    

Report!

std::cout << "Criticals: " << merged << "\n";

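Note that nowhere did we come close to manipulating array indices, overlapping or non-overlapping intervals, or closed or open boundaries. Or, [eeeeek!] brute-force permutations of collection elements.

We just stated our goals and let the library do the work.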

Full demo

#include <boost/icl/interval_set.hpp>
#include <boost/icl/interval_map.hpp>
#include <boost/range.hpp>
#include <boost/range/algorithm.hpp>
#include <boost/range/adaptors.hpp>
#include <boost/range/numeric.hpp>
#include <boost/range/irange.hpp>
#include <algorithm>
#include <iostream>
#include <vector>
using TimePoint = unsigned;
using DownTimes = boost::icl::interval_set<TimePoint>;
using Interval  = DownTimes::interval_type;
using Records   = std::vector<DownTimes>;
using Tally     = unsigned; // or: bit mask representing affected locations?
using DownMap   = boost::icl::interval_map<TimePoint, Tally>;
// Just for fun, removed the explicit loops from the generation too. Obviously,
// this is bit gratuitous :)
static DownTimes generate_downtime(int j) {
    return boost::accumulate(
            boost::irange(0, 5),
            DownTimes{},
            [j](DownTimes accum, int i) { return accum + Interval::closed((i*10), ((i*10) + 5 - j)); }
        );
}
int main() {
    // Initializing data for test
    using namespace boost::adaptors;
    auto const records = boost::copy_range<Records>(boost::irange(0,4) | transformed(generate_downtime));
    for (auto location : records | indexed()) {
        std::cout << "Location " << (location.index()+1) << " " << location.value() << std::endl;
    }
    // We will do a tally of affected locations per time slot
    DownMap tallied;
    for (auto& location : records)
        for (auto& incident : location)
            tallied.add({incident, 1u});
    // We will combine them so we get an interval_set defined for those periods
    // where at least 2 locations have an outage
    auto exceeds_threshold = [](DownMap::value_type const& slot) {
        return slot.second >= 2;
    };
    // just printing the union of any criticals:
    DownTimes merged;
    for (auto&& slot : tallied | filtered(exceeds_threshold) | map_keys)
        merged.insert(slot);
    std::cout << "Criticals: " << merged << "\n";
}

Which prints

Location 1 {[0,5][10,15][20,25][30,35][40,45]}
Location 2 {[0,4][10,14][20,24][30,34][40,44]}
Location 3 {[0,3][10,13][20,23][30,33][40,43]}
Location 4 {[0,2][10,12][20,22][30,32][40,42]}
Criticals: {[0,4][10,14][20,24][30,34][40,44]}
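
To build intuition for what the tally does, here is a library-free sketch of the same tally, filter, and merge idea using a sweep over a `std::map`. This is a swapped-in technique for illustration only, not how Boost ICL is implemented. The names `Closed` and `covered_by_at_least` are invented for this sketch. It assumes closed integer intervals, that no interval ends at the maximum `unsigned` value, and that intervals within one location do not overlap (so counting intervals equals counting locations).

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using TimePoint = unsigned;
using Closed    = std::pair<TimePoint, TimePoint>;  // closed interval [first, second]

// Hypothetical helper: merged closed intervals covered by at least n inputs.
std::vector<Closed> covered_by_at_least(std::vector<Closed> const& outages, int n) {
    assert(n >= 1);

    // Tally: +1 where an outage starts, -1 just past its (inclusive) end.
    std::map<TimePoint, int> delta;
    for (auto const& o : outages) {
        ++delta[o.first];
        --delta[o.second + 1];
    }

    // Filter and merge: sweep in time order, tracking how many are active.
    std::vector<Closed> merged;
    int active = 0;
    bool open = false;
    TimePoint start = 0;
    for (auto const& step : delta) {
        active += step.second;
        if (active >= n && !open) {
            start = step.first;
            open = true;
        } else if (active < n && open) {
            merged.push_back({start, step.first - 1});
            open = false;
        }
    }
    return merged;
}
```

Fed with all 20 sample intervals and `n = 2`, the sweep produces the five intervals listed under "Criticals" above.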

At the end of the permutation loop, you write:

numCombinations++;
std::cout << "[-INTERSEC-] " << combinations[numCombinations] << std::endl;  // The problem appears here

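My debugger tells me that on the first iteration numCombinations was 0 before the increment. But incrementing it puts it out of range for the combinations container (which holds only a single element at that point, so the only valid index is 0).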

Did you mean to increment it after use? Is there any particular reason not to use

std::cout << "[-INTERSEC-] " << combinations.back() << "\n";

or, for C++03

std::cout << "[-INTERSEC-] " << combinations[combinations.size()-1] << "\n";

or even just:

std::cout << "[-INTERSEC-] " << combinations.at(numCombinations) << "\n";

which would have thrown std::out_of_range?
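
The difference between incrementing before and after use can be reproduced in isolation. The sketch below is a hypothetical stand-in for one loop iteration (the function name `read_back_succeeds` is invented here); it uses `.at()` so that a bad index is reported as an exception instead of being undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for one iteration of the loop in the question:
// push one element, then try to read it back through the counter.
bool read_back_succeeds(bool incrementBeforeUse) {
    std::vector<int> combinations;
    std::size_t numCombinations = 0;

    combinations.push_back(42);
    assert(combinations.size() == 1);  // the only valid index is 0

    if (incrementBeforeUse)
        ++numCombinations;  // counter now points one past the last element

    try {
        (void)combinations.at(numCombinations);  // bounds-checked access
        return true;
    } catch (std::out_of_range const&) {
        return false;
    }
}
```

Reading first and incrementing afterwards, or using `combinations.back()`, avoids the off-by-one entirely.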


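On a side note, I think Boost ICL has vastly more efficient ways to get the answer you are after. Let me think about it for a moment. I will post another answer if I see it.

Update: posted another answer showcasing high-level coding with Boost ICL.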