Creating a Dynamic Kernel Dispatcher in C++

Updated: 2024-09-21

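I have several implementations of the same function: a SIMD-accelerated one, a CUDA kernel, and another in SYCL, alongside the plain (vanilla) version. The user selects the kernel with an int parameter: 0 for Vanilla, 1 for SIMD, 2 for the CUDA kernel, and 3 for SYCL. The pseudocode is as follows (I am using C++):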

    return_type function(param1, param2, int device) {
        switch (device) {
            case Vanilla:
                // normal code, written right here
                break;
            case SIMD:
                // calls the SIMD kernel
                break;
            case Nvidia:
                // calls the CUDA kernel
                break;
            case SYCL:
                // calls the SYCL kernel
                break;
        }
    }
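The pseudocode above could be made concrete roughly as follows. This is only a sketch: the `Device` enum, the `scale` function, and the `scale_*` stand-ins are hypothetical names, and the SIMD, CUDA, and SYCL stand-ins simply reuse the scalar loop where the real kernels would be called.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical device selector matching the 0..3 convention from the question.
enum Device : int { Vanilla = 0, SIMD = 1, Nvidia = 2, SYCL = 3 };

// Stand-ins for the real kernels (assumptions for illustration only).
inline void scale_scalar(std::vector<float>& v, float k) {
    for (std::size_t i = 0; i < v.size(); ++i) v[i] *= k;
}
inline void scale_simd(std::vector<float>& v, float k) { scale_scalar(v, k); }
inline void scale_cuda(std::vector<float>& v, float k) { scale_scalar(v, k); }
inline void scale_sycl(std::vector<float>& v, float k) { scale_scalar(v, k); }

// Dispatches on the int parameter, as in the pseudocode.
void scale(std::vector<float>& v, float k, int device) {
    switch (device) {
        case Vanilla: scale_scalar(v, k); break;
        case SIMD:    scale_simd(v, k);   break;
        case Nvidia:  scale_cuda(v, k);   break;
        case SYCL:    scale_sycl(v, k);   break;
        default: throw std::invalid_argument("unknown device id");
    }
}
```

Calling `scale(v, 2.0f, SIMD)` doubles every element of `v`, and an id outside 0..3 throws instead of silently doing nothing. Note that this version still needs every backend to be present at compile time, which is exactly the limitation the question goes on to describe.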

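All kernels live in a separate folder named kernels, with the subfolders SIMD, Cuda, and SYCL. One cannot expect every user to have CUDA; a user might instead be able to run SYCL (for example, on an AMD GPU), and so on. So these folders will be compiled conditionally, based on CMake options that the user specifies. The end goal is a library that users can install.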

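For that reason, I don't want to include the files containing these kernels directly; I would rather put a dynamic dispatcher in between. How can I start designing something like this?
TIA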
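One possible starting point, sketched here under stated assumptions, is a self-registration pattern: each backend source file adds its kernel to a shared registry during static initialization, and the dispatcher only consults the registry, so it never includes a kernel header. All names below (`KernelFn`, `registry`, `Registrar`, `add`) are hypothetical, and the two "files" are collapsed into one block so the example stays self-contained.

```cpp
#include <map>
#include <stdexcept>

// Hypothetical common signature for every backend's kernel.
using KernelFn = int (*)(int, int);

// Function-local static avoids static-initialization-order problems.
inline std::map<int, KernelFn>& registry() {
    static std::map<int, KernelFn> table;
    return table;
}

// Constructing a Registrar adds one kernel to the registry.
struct Registrar {
    Registrar(int device, KernelFn fn) { registry()[device] = fn; }
};

// --- would live in kernels/vanilla.cpp (always built) ---
static int add_vanilla(int a, int b) { return a + b; }
static Registrar reg_vanilla(0, add_vanilla);

// --- would live in kernels/SIMD/add.cpp (built only if enabled) ---
static int add_simd(int a, int b) { return a + b; }  // stand-in body
static Registrar reg_simd(1, add_simd);

// The dispatcher knows only the registry, not the kernel headers.
int add(int a, int b, int device) {
    auto it = registry().find(device);
    if (it == registry().end())
        throw std::runtime_error("backend not available in this build");
    return it->second(a, b);
}
```

Here `add(2, 3, 1)` reaches the SIMD stand-in, while `add(2, 3, 2)` throws because no CUDA kernel was registered. One caveat: when the kernels are linked from a static library, the linker may discard an object file that nothing references, in which case its registrar never runs.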

This may not be the best solution, but I think that during installation I can write out a file named installation.h, like this:

#define __SIMD_x86_64__ 1
#define __CUDA__ 1
#define __SYCL__ 1
#define __SIMD_ARM_NEON__ 0

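The dispatcher can then conditionally include the headers from the various folders that hold the kernels for each device backend. After that, a branch table, looked up with a dispatch key for each function, can serve as the dispatcher.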

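    #include "installation.h"
    #include <string>
    #include <unordered_map>

    // function_pointer stands for the signature shared by all kernels
    std::unordered_map<std::string, function_pointer> branch_table;

    // installation.h defines every macro as 0 or 1, so test the value;
    // defined(...) would also be true for a backend set to 0
    #if __SIMD_x86_64__
    // #include the SIMD kernel headers
    // add the SIMD kernels to the map
    #endif
    #if __CUDA__
    // #include the CUDA kernel headers
    // add the CUDA kernels to the map
    #endif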

And so on for the other backends. I haven't tested this yet, and it may not be the best solution, but it looks workable.
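The branch-table idea could be fleshed out along these lines. This is a minimal sketch, not a tested design: the macros are defined inline where installation.h would normally supply them, the `MYLIB_HAVE_*` names are hypothetical replacements (identifiers containing a double underscore are reserved for the implementation in C++), and the kernel bodies are stand-ins.

```cpp
#include <string>
#include <unordered_map>

// These would come from the generated installation.h; the MYLIB_ names are
// hypothetical replacements for the double-underscore macros.
#define MYLIB_HAVE_SIMD_X86_64 1
#define MYLIB_HAVE_CUDA 0

// Hypothetical signature shared by all kernels of this function.
using KernelFn = float (*)(float, float);

static float mul_vanilla(float a, float b) { return a * b; }
#if MYLIB_HAVE_SIMD_X86_64
static float mul_simd(float a, float b) { return a * b; }  // stand-in body
#endif
#if MYLIB_HAVE_CUDA
static float mul_cuda(float a, float b) { return a * b; }  // stand-in body
#endif

// Builds the branch table once; only enabled backends get an entry.
inline const std::unordered_map<std::string, KernelFn>& branch_table() {
    static const std::unordered_map<std::string, KernelFn> table = [] {
        std::unordered_map<std::string, KernelFn> t;
        t["mul/vanilla"] = mul_vanilla;
#if MYLIB_HAVE_SIMD_X86_64
        t["mul/simd"] = mul_simd;
#endif
#if MYLIB_HAVE_CUDA
        t["mul/cuda"] = mul_cuda;
#endif
        return t;
    }();
    return table;
}

// Looks up the requested backend and falls back to the vanilla kernel.
float mul(float a, float b, const std::string& backend) {
    const auto& t = branch_table();
    auto it = t.find("mul/" + backend);
    if (it == t.end()) it = t.find("mul/vanilla");
    return it->second(a, b);
}
```

With the values above, `mul(2.0f, 3.0f, "cuda")` finds no CUDA entry and quietly falls back to the vanilla kernel. Whether a silent fallback or an error is the better behavior for a missing backend is a design choice the library would have to make.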

Many thanks to @rodburns for the guidance.