How to space-efficiently store and retrieve std::vector<int> values in a file

Updated: 2023-10-16

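I have a std::vector<int> consisting of the values -1, 0 and 1. After some initial manipulation of these values, I end up with a vector from which the -1 values can be omitted. How can I store the remaining 0 and 1 values in a file efficiently, in terms of both space (more important) and time?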

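There seem to be three recommended options: std::vector<bool>, std::bitset and boost::dynamic_bitset. Which is best in this case? I could iterate over the vector, add every value that is not -1 to a vector<bool>, and then store that, but is this the best approach? The vector has around 1 million elements (after the manipulation).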

#include <vector>

// Initialize temp_array of size n (obtained at runtime) with the value -1
std::vector<int> temp_array(n, -1);
// Do some manipulation on temp_array
// Now temp_array contains the values -1, 0 and 1; every -1 can be
// removed without worrying about the index
std::vector<bool> final_array;
for (const auto &i : temp_array)
{
    if (i != -1)
    {
        final_array.push_back(i == 1); // 1 -> true, 0 -> false
    }
}
// How to store and retrieve this in the most space-efficient way?

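Edit: some more background on the problem. Space efficiency is a must because I am storing a compressed form of an adjacency matrix (using some custom compression). Each node can have up to a million edges (sometimes more), and there are about 10 million such nodes (I am processing large graphs). The goal is to load the compressed form of this graph entirely into memory and to support basic queries without decompressing it, as well as streaming edges (for example, the LiveJournal graph has 4,847,571 nodes).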

If space efficiency is a major concern and you only have 0s and 1s, you could consider storing a run-length encoding of the binary string.

See https://en.wikipedia.org/wiki/Run-length_encoding

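The worst case is when you have alternating 0s and 1s: every run then has length 1, so the encoding stores one count per element.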
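To get a feel for when run-length encoding pays off, the rough sketch below counts the runs in a vector and compares two size estimates. The function names are hypothetical. The 32-bit width for each run count is an assumption made for illustration, not something the question prescribes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of maximal runs of equal values in `bits` (0 for an empty vector).
std::size_t count_runs(const std::vector<bool>& bits) {
    if (bits.empty()) return 0;
    std::size_t runs = 1;
    for (std::size_t i = 1; i < bits.size(); ++i) {
        if (bits[i] != bits[i - 1]) ++runs;  // a value change starts a new run
    }
    return runs;
}

// Bytes needed if every run length is stored as a 32-bit count (assumed layout).
std::size_t rle_bytes(const std::vector<bool>& bits) {
    return count_runs(bits) * sizeof(std::uint32_t);
}

// Bytes needed if the values are simply packed 8 per byte.
std::size_t packed_bytes(const std::vector<bool>& bits) {
    return (bits.size() + 7) / 8;
}
```

Under these assumptions, a fully alternating vector of n elements costs 4n bytes run-length encoded but only about n/8 bytes bit-packed, so it may be worth measuring the run count on real data before committing to either format.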

The code should be relatively simple, involving a single pass over the vector.