Handling combinatorial explosion when converting AoS to SoA

Keywords: combination, SoA, AoS, conversion, handling      Updated: 2023-10-16

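Suppose I want to convert an array of structures (AoS) into a structure of arrays (SoA), with runtime parameters indicating which members of the source structure should be converted. For example: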

struct SourceElement {
    string member1;
    float member2;
    int member3;
    //More members...
};

auto source_elements = ...; //A forward-iterable range of SourceElement objects

vector<string> members1;
vector<float> members2;
vector<int> members3;

for(auto& source_element : source_elements) {
    if(member1_required) {
        members1.push_back(source_element.member1);
    }
    if(member2_required) {
        members2.push_back(source_element.member2);
    }
    if(member3_required) {
        members3.push_back(source_element.member3);
    }
    //...and so on...
}
//Some of the vectors might be empty, which I am fine with

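I would like to get rid of the conditions inside the loop, in the hope that unconditional code runs a bit faster. The typical approach I know is to simply hoist the condition out of the loop. That works well for a single condition, but with several conditions it leads to a combinatorial explosion: for N members I have to write 2^N different loop bodies. Adding a new member also requires writing a lot of code. Here is an example of what it looks like: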

if(member1_required && !member2_required && !member3_required) {
    for(auto& source_element : source_elements) {
        members1.push_back(source_element.member1);
    }
} else if(member1_required && member2_required && !member3_required) {
    for(auto& source_element : source_elements) {
        members1.push_back(source_element.member1);
        members2.push_back(source_element.member2);
    }
}
//... and so on

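What is a good way to handle this kind of problem? The ideal solution should have the following properties:

  • The generated code should be as close as possible to the hand-rolled solution (one for loop per combination)
  • Adding a new member should require very little effort
  • Destructuring the source element should allow for data conversion (e.g. members1.push_back(my_conversion(source_element.member1))). A simple case: SourceElement has a double member, but I only want to store float data
  • The source data may come from a forward iterator, so it cannot be assumed that all data is stored contiguously in memory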

You can use templates. Example (untested):

#include <array>
#include <cstdint>
#include <string>
#include <vector>

// SourceElement is the struct from the question.
struct DestElements
{
    std::vector<std::string> members1;
    std::vector<float> members2;
    std::vector<int> members3;
};

// One instantiation per combination of members; if constexpr requires C++17.
template<uint32_t bitMask>
DestElements copy( const SourceElement* begin, const SourceElement* end )
{
    DestElements dest;
    // != instead of < so the loop also works once the pointers become forward iterators
    for( ; begin != end; begin++ )
    {
        // The explicit != 0 avoids a narrowing conversion to bool, which some compilers reject here
        if constexpr( ( bitMask & 1 ) != 0 )
            dest.members1.push_back( begin->member1 );
        if constexpr( ( bitMask & 2 ) != 0 )
            dest.members2.push_back( begin->member2 );
        if constexpr( ( bitMask & 4 ) != 0 )
            dest.members3.push_back( begin->member3 );
    }
    return dest; // a plain return allows copy elision; std::move( dest ) would prevent it
}

using pfnCopy = DestElements( * )( const SourceElement* begin, const SourceElement* end );

static const std::array<pfnCopy, 8> dispatch =
{
    // You can do crazy C++ metaprogramming here, std::apply, std::make_index_sequence, etc.
    // When there are too many of them, I write ~2 lines of C# in a T4 template instead.
    &copy<0>, &copy<1>, &copy<2>, &copy<3>, &copy<4>, &copy<5>, &copy<6>, &copy<7>,
};

// Usage
uint32_t mask = 0;
if( member1_required ) mask |= 1;
if( member2_required ) mask |= 2;
if( member3_required ) mask |= 4;
DestElements dest = dispatch[ mask ]( source_elements.data(),
    source_elements.data() + source_elements.size() );

Obviously, you will need to replace the const pointers with the type of your forward iterator.
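
The two suggestions above (generating the dispatch table instead of typing it out, and accepting any forward iterator) can be combined. The following is a minimal sketch under stated assumptions, not the answer's own code: it assumes C++17, the names copy_members, make_dispatch, convert_members, kMemberCount and kComboCount are made up for illustration, and member2 is declared as double here only to show the double-to-float conversion asked for in the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical source type for this sketch; member2 is a double on purpose.
struct SourceElement {
    std::string member1;
    double member2;
    int member3;
};

struct DestElements {
    std::vector<std::string> members1;
    std::vector<float> members2;  // stored as float, converted while copying
    std::vector<int> members3;
};

constexpr std::size_t kMemberCount = 3;                  // number of optional members
constexpr std::size_t kComboCount = 1u << kMemberCount;  // 2^N loop variants

// One instantiation per bit mask; branches that are not selected are discarded at compile time.
template <std::uint32_t bitMask, typename ForwardIt>
DestElements copy_members(ForwardIt first, ForwardIt last) {
    DestElements dest;
    for (; first != last; ++first) {
        if constexpr ((bitMask & 1u) != 0)
            dest.members1.push_back(first->member1);
        if constexpr ((bitMask & 2u) != 0)
            dest.members2.push_back(static_cast<float>(first->member2));
        if constexpr ((bitMask & 4u) != 0)
            dest.members3.push_back(first->member3);
    }
    return dest;
}

template <typename ForwardIt>
using CopyFn = DestElements (*)(ForwardIt, ForwardIt);

// Expands the index pack 0 .. 2^N - 1 into a table of function pointers.
template <typename ForwardIt, std::size_t... Masks>
constexpr std::array<CopyFn<ForwardIt>, sizeof...(Masks)>
make_dispatch(std::index_sequence<Masks...>) {
    return {{&copy_members<static_cast<std::uint32_t>(Masks), ForwardIt>...}};
}

// Selects the loop variant once, outside the loop, from the runtime flags.
template <typename ForwardIt>
DestElements convert_members(ForwardIt first, ForwardIt last,
                             bool member1_required, bool member2_required,
                             bool member3_required) {
    static constexpr auto table =
        make_dispatch<ForwardIt>(std::make_index_sequence<kComboCount>{});
    const std::uint32_t mask = (member1_required ? 1u : 0u) |
                               (member2_required ? 2u : 0u) |
                               (member3_required ? 4u : 0u);
    assert(mask < table.size());
    return table[mask](first, last);
}
```

A call such as convert_members(source_elements.begin(), source_elements.end(), false, true, true) would then fill only members2 and members3. In this sketch, adding a member means one more field, one more if constexpr line, one more flag, and bumping kMemberCount; the table grows on its own, although the number of instantiated loops still doubles with each member.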

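However, I'm not sure this will have a measurable effect on performance. All modern CPUs do branch prediction. These conditions do not change while iterating. After the first loop iteration, all of these branches will be predicted with 100% accuracy.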
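
Whether the branches matter can only be settled by measuring. The sketch below is a hypothetical timing harness, not a result: make_input, copy_branchy, copy_members_2_3 and time_ms are names invented for this illustration, and the types are stand-ins for the ones used above.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Stand-ins for the types used above (assumed for this sketch).
struct SourceElement {
    std::string member1;
    float member2;
    int member3;
};

struct DestElements {
    std::vector<std::string> members1;
    std::vector<float> members2;
    std::vector<int> members3;
};

// Hypothetical helper: builds n synthetic elements to copy from.
std::vector<SourceElement> make_input(std::size_t n) {
    std::vector<SourceElement> src;
    src.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        src.push_back({"item", static_cast<float>(i) * 0.5f, static_cast<int>(i)});
    return src;
}

// Variant A: loop-invariant runtime flags are tested on every iteration.
DestElements copy_branchy(const std::vector<SourceElement>& src,
                          bool m1, bool m2, bool m3) {
    DestElements dest;
    for (const auto& e : src) {
        if (m1) dest.members1.push_back(e.member1);
        if (m2) dest.members2.push_back(e.member2);
        if (m3) dest.members3.push_back(e.member3);
    }
    return dest;
}

// Variant B: hand-hoisted loop for one combination (member2 and member3 only).
DestElements copy_members_2_3(const std::vector<SourceElement>& src) {
    DestElements dest;
    for (const auto& e : src) {
        dest.members2.push_back(e.member2);
        dest.members3.push_back(e.member3);
    }
    return dest;
}

// Runs fn once and returns the elapsed wall-clock time in milliseconds.
template <typename Fn>
double time_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

To compare the two variants, build with optimizations enabled, run each one several times on the same large input through time_ms, and store the returned DestElements in a variable that is used afterwards so the work is not optimized away. The outcome depends on the compiler, the hardware, and the element types, so no numbers are given here.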

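You can select the functions to run inside the loop from a map of lambdas before the loop starts. As others here have said, this is unlikely to improve performance. The following code is a working example: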

#include <functional>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

struct component
{
    float member1;
    std::string member2;
    int member3;
};

struct vector_builder
{
    // Functions selected for this run, called in order for every element
    std::vector<std::function<void(component&)>> commands;
    // Maps a member number to the lambda that appends that member.
    // The lambdas capture this, so a vector_builder must not be copied or moved.
    std::unordered_map<int, std::function<void(component&)>> function_map{
        {1, [&](component& comp){members1.push_back(comp.member1);}},
        {2, [&](component& comp){members2.push_back(comp.member2);}},
        {3, [&](component& comp){members3.push_back(comp.member3);}}
    };

    vector_builder(const std::vector<int>& must_contain)
    {
        // at() throws std::out_of_range for an unknown member number
        // instead of silently storing an empty std::function
        for(auto i : must_contain) {commands.push_back(function_map.at(i));}
    }

    std::vector<float> members1;
    std::vector<std::string> members2;
    std::vector<int> members3;

    // auto& avoids copying each std::function on every call
    void Push(component& c) {for(auto& func : commands) func(c);}
};

int main()
{
    // Create an iterable collection of component objects we want to transform
    std::vector<component> components{{2.1f, "hello", 5},
                                      {3.4f, "world", 6},
                                      {0.5f, "great", 10}};
    // Let's say we want only members 2 and 3 to be made into vectors:
    vector_builder builder({2, 3});
    // Now the loop comprises only the two push_back functions we wanted
    for (auto& comp : components) builder.Push(comp);

    // Print the results
    std::cout << "members1: "; // This should be empty.
    for (auto& i : builder.members1) std::cout << i << " ";
    std::cout << "\nmembers2: ";
    for (auto& i : builder.members2) std::cout << i << " ";
    std::cout << "\nmembers3: ";
    for (auto& i : builder.members3) std::cout << i << " ";
    std::cout << std::endl;
    return 0;
}