C++: saving and loading a huge vector<bool>

Keywords: vector<bool>, save, load, huge, C++. Last updated: 2023-10-16

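I have a huge vector<vector<bool>> (512 x 44,000,000 bits). It takes 4-5 hours of computation to create, so obviously I want to save the result and avoid repeating the process. When I run the program again, all I want to do is load the same vector (no other application will use this file).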

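I believe a text file is out of the question at this size. Is there a simple (quick and dirty) way to do this? I do not use Boost, and this is only a minor part of my scientific application, so it has to be something quick. I also thought of inverting it online and storing it in a Postgres database (44,000,000 records with 512 bits of data each), so that the database could handle it easily. I have seen answers that pack 8 bits into 1 byte and then save, but with my limited, newbie-level C++ experience they sound too complicated. Any ideas?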

You can pack 8 bits into a single byte:

unsigned char saver(bool bits[])
{
   unsigned char output = 0;
   for (int i = 0; i < 8; i++)
   {
      // Set bit i of output when bits[i] is true; probably faster than
      // if (bits[i]) { output |= (1 << i); }
      output = static_cast<unsigned char>(output | (bits[i] << i));
      // Example, starting from 00000000 (x = a bit set by an earlier iteration):
      // i = 0 gives 00000001 only if bits[0] is true
      // i = 1 gives 0000001x only if bits[1] is true
      // i = 2 gives 000001xx only if bits[2] is true
      // i = 3 gives 00000xxx if bits[3] is false
   }
   return output;
}

You can load 8 bits from a single byte:

void loader(unsigned char var, bool * bits)
{
   for(int i=0;i<8;i++)
   {
       bits[i] = var & (1 << i);
       // for example you loaded var as "200" which is 11001000 in binary
       // 11001000 --> zeroth iteration gets false
       // first gets false
       // second false
       // third gets true 
       //...
   }
}
1<<0 is 1  -----> 00000001
1<<1 is 2  -----> 00000010
1<<2 is 4  -----> 00000100
1<<3 is 8  -----> 00001000
1<<4 is 16  ----> 00010000
1<<5 is 32  ----> 00100000
1<<6 is 64  ----> 01000000
1<<7 is 128  ---> 10000000
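
To apply the same idea to a whole row instead of 8 bools at a time, something like the following could work. It is only a minimal sketch: pack_bits and unpack_bits are hypothetical helper names that do not appear in the answer, and the bit order (element 8*k+i goes to bit i of byte k) is chosen to match saver and loader.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: packs bools into bytes, lowest bit first, like saver().
std::vector<unsigned char> pack_bits(const std::vector<bool>& bits)
{
    std::vector<unsigned char> bytes((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i)
    {
        if (bits[i])
            bytes[i / 8] |= static_cast<unsigned char>(1u << (i % 8));
    }
    return bytes;
}

// Hypothetical helper: inverse of pack_bits(). The number of bits must be
// supplied by the caller because the last byte may be only partly used.
std::vector<bool> unpack_bits(const std::vector<unsigned char>& bytes, std::size_t count)
{
    assert(count <= bytes.size() * 8);
    std::vector<bool> bits(count);
    for (std::size_t i = 0; i < count; ++i)
        bits[i] = ((bytes[i / 8] >> (i % 8)) & 1u) != 0;
    return bits;
}
```

The byte vector returned by pack_bits can be written to a binary file in a single call, and unpack_bits restores the row once the bytes have been read back.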

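Edit: with GPGPU, an embarrassingly parallel algorithm that takes 4-5 hours on a CPU may be cut down to 0.04-0.05 hours on a GPU (or even less than a minute with multiple GPUs). For example, the saver/loader functions above are embarrassingly parallel.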

"I have seen answers that pack 8 bits into 1 byte and then save, but with my limited, newbie-level C++ experience they sound too complicated. Any ideas?"

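If you are going to read that file often, this would be a good time to learn bitwise operations. Using one bit per bool takes 1/8 of the size of one byte per bool, which saves a lot of memory and I/O.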
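
For the dimensions given in the question, the saving can be worked out directly. The snippet below is just that arithmetic written as compile-time checks; the constant names are made up for illustration, and "one byte per bool" is assumed to be the baseline being compared against.

```cpp
#include <cstdint>

// Dimensions taken from the question: 512 x 44,000,000 bools.
constexpr std::uint64_t kRows = 512;
constexpr std::uint64_t kCols = 44000000;

// File size if every bool occupies a whole byte.
constexpr std::uint64_t kBytesOnePerBool = kRows * kCols;

// File size if every bool occupies a single bit.
constexpr std::uint64_t kBytesPacked = kRows * kCols / 8;

static_assert(kBytesOnePerBool == 22528000000ULL, "roughly 22.5 GB");
static_assert(kBytesPacked == 2816000000ULL, "roughly 2.8 GB");
```

Both dimensions are multiples of 8, so each row (or column) of the packed data starts on a byte boundary.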

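So save it as one bit per bool, then break it into chunks and/or read it using mapped memory (e.g. mmap). You can put this behind a usable interface, so that you only need to implement it once and the serialized format is abstracted away whenever you need to read values.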
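
One possible shape for such an interface is sketched below. BitMatrix and its member names are hypothetical, not something the answer specifies. The sketch keeps the packed bytes in memory and reads or writes the file one row at a time; it does not use mmap, although get() could be pointed at a mapped region instead of a vector. The file is assumed to hold raw packed bytes with no header, so the dimensions must be known when loading.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical wrapper: a rows x cols matrix stored as one bit per bool.
class BitMatrix
{
public:
    BitMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), row_bytes_((cols + 7) / 8),
          data_(rows * row_bytes_, 0) {}

    bool get(std::size_t r, std::size_t c) const
    {
        assert(r < rows_ && c < cols_);
        return ((data_[r * row_bytes_ + c / 8] >> (c % 8)) & 1u) != 0;
    }

    void set(std::size_t r, std::size_t c, bool value)
    {
        assert(r < rows_ && c < cols_);
        unsigned char& byte = data_[r * row_bytes_ + c / 8];
        const unsigned char mask = static_cast<unsigned char>(1u << (c % 8));
        byte = value ? static_cast<unsigned char>(byte | mask)
                     : static_cast<unsigned char>(byte & ~mask);
    }

    // Writes the packed bytes one row (chunk) at a time. Returns false on I/O failure.
    bool save(const std::string& path) const
    {
        std::ofstream out(path, std::ios::binary);
        for (std::size_t r = 0; out && r < rows_; ++r)
            out.write(reinterpret_cast<const char*>(data_.data() + r * row_bytes_),
                      static_cast<std::streamsize>(row_bytes_));
        return static_cast<bool>(out);
    }

    // Reads bytes written by save(). The matrix must already have the same dimensions.
    bool load(const std::string& path)
    {
        std::ifstream in(path, std::ios::binary);
        for (std::size_t r = 0; in && r < rows_; ++r)
            in.read(reinterpret_cast<char*>(data_.data() + r * row_bytes_),
                    static_cast<std::streamsize>(row_bytes_));
        return static_cast<bool>(in);
    }

private:
    std::size_t rows_, cols_, row_bytes_;
    std::vector<unsigned char> data_;
};
```

With this layout the rest of the program only calls get and set, so the on-disk format can change later without touching the calling code.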

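As mentioned above, here vec is a vector of vectors of bool. We pack the bits of each sub-vector into bytes, 8 bits at a time, and push those bytes into a vector.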

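// Fragment: requires #include <cstdio> and #include <vector>;
// vec is the std::vector<std::vector<bool>> to save.
std::vector<unsigned char> buf;
int cmp = 0;               // number of bits already placed in output
unsigned char output = 0;  // byte currently being filled, lowest bit first
FILE* of = fopen("out.bin", "wb");
if (of != nullptr)
{
    for (const auto& subvec : vec)
    {
        for (bool b : subvec)
        {
            output = output | ((b ? 1 : 0) << cmp);
            cmp++;
            if (cmp == 8)
            {
                buf.push_back(output);
                cmp = 0;
                output = 0;
            }
        }
        // Write the bytes completed for this sub-vector, then reuse the buffer.
        fwrite(buf.data(), 1, buf.size(), of);
        buf.clear();
    }
    if (cmp != 0)
        fwrite(&output, 1, 1, of); // flush leftover bits when the total is not a multiple of 8
    fclose(of);
}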
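
The answer shows only the writing half. A matching reader might look like the sketch below. load_bits is a hypothetical name, and the sketch rests on a few assumptions: the file contains packed bits, lowest bit first, with rows stored back to back; the caller knows the row and column counts, since the file has no header; and the column count is a multiple of 8, so every row starts on a byte boundary (true for both 512 and 44000000).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <utility>
#include <vector>

// Hypothetical reader for a file of packed bits.
// Returns an empty vector if the file cannot be opened or is too short.
std::vector<std::vector<bool>> load_bits(const char* path, std::size_t rows, std::size_t cols)
{
    assert(cols > 0 && cols % 8 == 0); // assumption: rows are byte aligned
    std::vector<std::vector<bool>> vec;
    FILE* f = std::fopen(path, "rb");
    if (f == nullptr)
        return vec;

    std::vector<unsigned char> buf(cols / 8); // one packed row at a time
    for (std::size_t r = 0; r < rows; ++r)
    {
        if (std::fread(buf.data(), 1, buf.size(), f) != buf.size())
        {
            vec.clear(); // truncated file: report failure as an empty result
            break;
        }
        std::vector<bool> row(cols);
        for (std::size_t c = 0; c < cols; ++c)
            row[c] = ((buf[c / 8] >> (c % 8)) & 1u) != 0;
        vec.push_back(std::move(row));
    }
    std::fclose(f);
    return vec;
}
```

Reading one row per fread call keeps the temporary buffer small while still avoiding a system call for every byte.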