Choosing between std::list and std::vector

Keywords: std, list, vector, choosing between. Last updated: 2023-10-16

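I wrote some code that looks for repeated values in a vector. When a value repeats, its position is appended to a list. For example, the sequence 100 110 90 100 140 90 100 becomes a two-dimensional vector. The first dimension holds the unique values (letters or numbers), and each one has a list of the positions where it occurs attached as the second dimension. So the result looks like this: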

100 -> [0 3 6]
110 -> [1]
90 -> [2 5]
140 -> [4]

The code is fairly simple:

typedef unsigned long long int ulong;
typedef std::vector<ulong> aVector;
struct entry {
  entry( ulong a, ulong t ) {
    addr = a;
    time.push_back(t);
  }
  ulong addr;
  aVector time;
};
// vec contains original data
// theVec is the output vector
void compress( aVector &vec, std::vector< entry > &theVec )
{
   aVector::iterator it = vec.begin();
   aVector::iterator it_end = vec.end();
   std::vector< entry >::iterator ait;
   for (ulong i = 0; it != it_end; ++it, ++i) {  // iterate over input vector
     ulong addr = *it;
     if (theVec.empty()) {  // insert the first item
       theVec.push_back( entry(addr, i) );
       continue;
     }
     ait = std::find_if( theVec.begin(), theVec.end(), equal_addr(addr));
     if (ait == theVec.end()) { // entry doesn't exist in compressed vector, so insert
       theVec.push_back( entry(addr, i) );
     } else { // write down the position of the repeated number (second dimension)
       ait->time.push_back(i);
     }
   }
}

The predicate passed to find_if looks like this:

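// Predicate for find_if: matches an entry whose addr equals the given address.
// std::unary_function is deprecated in C++11 and removed in C++17; find_if does not need it.
struct equal_addr
{
  explicit equal_addr(ulong anAddr) : theAddr(anAddr) {}
  bool operator()(const entry &arg) const { return arg.addr == theAddr; }
  ulong theAddr;  // stored by value so the predicate never holds a dangling reference
};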

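The problem is that for a moderately sized input (my test has 20M elements) the code is very slow: it can take a day to get out of the compress function. Is there any chance of a speedup from using std::list instead of std::vector, since list performs better for sequential things? I just want to know whether it would help, because if it does, I will have to change some other code as well.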

Answers

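Why don't you try it and measure the results for yourself?
And no, list does not perform better for "sequential things". It performs dramatically worse for everything.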
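
One way to run that measurement is sketched below. It keeps the question's algorithm (a linear find_if for every input element) but templates it on the output container, so the same loop can fill either a std::vector or a std::list. The names u64, Entry, compress_into, seconds_to_compress, make_input and report are made up for this sketch, the input generator is an arbitrary assumption, and the timings will depend on compiler, optimization flags, and data.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <list>
#include <vector>

using u64 = unsigned long long;  // hypothetical alias, stands in for the question's ulong

struct Entry {                   // hypothetical stand-in for the question's entry struct
    u64 addr;
    std::vector<u64> time;
};

// Same algorithm as the question's compress(), templated on the output container.
template <typename Container>
void compress_into(const std::vector<u64>& in, Container& out) {
    for (u64 i = 0; i < in.size(); ++i) {
        const u64 addr = in[i];
        auto it = std::find_if(out.begin(), out.end(),
                               [addr](const Entry& e) { return e.addr == addr; });
        if (it == out.end()) {
            out.push_back(Entry{addr, {i}});  // first occurrence of this value
        } else {
            it->time.push_back(i);            // repeated value: record its position
        }
    }
}

// Wall-clock seconds spent filling `out` from `in`.
template <typename Container>
double seconds_to_compress(const std::vector<u64>& in, Container& out) {
    const auto start = std::chrono::steady_clock::now();
    compress_into(in, out);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Deterministic test input: n values drawn from `distinct` possible values (assumed shape).
std::vector<u64> make_input(std::size_t n, u64 distinct) {
    std::vector<u64> in;
    in.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        in.push_back((i * 2654435761ULL) % distinct);
    }
    return in;
}

// Runs both variants on the same input and prints the two timings.
void report(std::size_t n, u64 distinct) {
    const std::vector<u64> in = make_input(n, distinct);
    std::vector<Entry> as_vector;
    std::list<Entry> as_list;
    const double tv = seconds_to_compress(in, as_vector);
    const double tl = seconds_to_compress(in, as_list);
    assert(as_vector.size() == as_list.size());  // both must find the same unique values
    std::printf("vector: %.3f s, list: %.3f s, unique values: %zu\n",
                tv, tl, as_vector.size());
}
```

Calling something like report(200000, 2000) from main prints one timing per container. Both variants perform the same number of comparisons, so any difference comes from how the containers lay out and traverse their elements.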

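The only real advantage it has is that elements in a list are stable: pointers and iterators to elements are not invalidated as the list is modified or grows.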
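
The stability point can be illustrated with a small sketch. The function names below are hypothetical. The vector check compares storage addresses as integers, so it never dereferences a pointer that a reallocation may have invalidated.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <vector>

// True if a pointer taken to the first list element still refers to that
// element, with its value intact, after `extra` more elements are appended.
bool list_pointer_survives_growth(std::size_t extra) {
    std::list<int> values{42};
    int* first = &values.front();
    for (std::size_t i = 0; i < extra; ++i) {
        values.push_back(static_cast<int>(i));
    }
    return first == &values.front() && *first == 42;
}

// True if the vector had to reallocate while growing, which moves its
// elements and invalidates earlier pointers and iterators into it.
bool vector_storage_moved(std::size_t extra) {
    std::vector<int> values{42};
    const std::size_t old_capacity = values.capacity();
    const auto before = reinterpret_cast<std::uintptr_t>(values.data());
    for (std::size_t i = 0; i < extra; ++i) {
        values.push_back(static_cast<int>(i));
    }
    const auto after = reinterpret_cast<std::uintptr_t>(values.data());
    return values.capacity() != old_capacity && before != after;
}
```

With a growth of, say, 1000 elements, the list pointer should remain usable while the vector's storage is expected to have moved. That matters only when other code holds on to pointers or iterators into the container, which the compress function in the question does not do across insertions.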