Selectively compiling headers and class functions in CUDA

Keywords: header files, class functions, compilation, CUDA, selective. Updated: 2023-10-16

I am trying to use my C++ classes in CUDA.

I have a class like this:

#include <string>
#include <stdlib.h>
class exampleClass{
int i;
__host__ __device__ exampleClass(int _i):i(_i){}
__host__ __device__ void increment(){i++;}
// uses std::string, which is not available in device code
__host__ __device__ std::string outputMessage(){ return std::to_string(i); }
};

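I have put this in a .cu file and set it to compile as CUDA C/C++.

Compiling with nvcc fails because CUDA device code has no string type.

What I would like to do is keep the CUDA functionality by doing something like this: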

#ifndef __CUDA_ARCH__
  #include <string>
#endif
    #include <stdlib.h>
    class exampleClass{
    int i;
    __host__ __device__ exampleClass(int _i):i(_i){}
    __host__ __device__ void increment(){i++;}
#ifndef __CUDA_ARCH__
    // intended to be compiled for the host only
    std::string outputMessage(){ return std::to_string(i); }
#endif
    };

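But I know this does not work... at least, it does not work for me. nvcc does not like the string include and, obviously, does not like the function that needs the string type either.

Apologies if the example is not first-rate. In short, I want the core class members to be executable on CUDA while keeping the ability to do fancy host-side operations for analysis and output.

UPDATE: My end goal is a base class that holds several pointers to multiple polymorphic classes. The base class itself can be derived from. I believe this is possible in CUDA 5.0. Am I wrong?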

The following code builds, although I have not run it:

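#include <iostream>
#include <string>

// IMUL is not defined in the original snippet; plain integer multiplication is assumed here.
#define IMUL(a, b) ((a) * (b))

class exampleClass{
    int i;
public:
    __host__ __device__ exampleClass(int _i):i(_i){}
    __host__ __device__ void increment(){i++;}
    // Host-only member: std::string cannot be used in device code.
    __host__ std::string outputMessage(){ return "asdf"; }
};

// Each thread increments one element of the IW x IH array.
__global__ void testkernel(exampleClass *a, int IH, int IW)
{
    const int i = IMUL(blockIdx.x, blockDim.x) + threadIdx.x;
    const int j = IMUL(blockIdx.y, blockDim.y) + threadIdx.y;

    if (i < IW && j < IH)
    {
        const int i_idx = i + IMUL(j, IW);
        exampleClass* ptr = a + i_idx;
        ptr->increment();
    }
}

// Runs on the host, so a must point to host memory here.
__host__ void test_function(exampleClass *a, int IH, int IW)
{
    for (int i = 0; i < IW; i++)
        for (int j = 0; j < IH; j++)
        {
            const int i_idx = i + j * IW;
            exampleClass* ptr = a + i_idx;
            std::cout << ptr->outputMessage();
        }
}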
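
The code above relies on nvcc understanding `__host__` and `__device__`. If the same header also has to build with an ordinary host compiler, one common pattern is to hide the qualifiers behind macros keyed on `__CUDACC__`, which nvcc defines when it compiles CUDA sources. The following is only a minimal sketch under that assumption: `HOST_DEVICE` and `HOST_ONLY` are hypothetical names, not part of CUDA, and `std::to_string` stands in for the placeholder string purely for illustration.

```cpp
#include <string>

// __CUDACC__ is defined when nvcc compiles the file. Otherwise the CUDA
// qualifiers expand to nothing so a plain C++ compiler accepts the class.
// HOST_DEVICE and HOST_ONLY are hypothetical names chosen for this sketch.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#define HOST_ONLY __host__
#else
#define HOST_DEVICE
#define HOST_ONLY
#endif

class exampleClass {
    int i;
public:
    // Core members: callable from both host and device code under nvcc.
    HOST_DEVICE explicit exampleClass(int _i) : i(_i) {}
    HOST_DEVICE void increment() { i++; }

    // Host-only member: std::string is not usable in device code.
    HOST_ONLY std::string outputMessage() const { return std::to_string(i); }
};
```

On the host, `exampleClass e(5); e.increment();` followed by `e.outputMessage()` gives the string "6". The class definition stays the same in every compilation pass; only the qualifiers change.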

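Note that you have to move the class objects from device memory to host memory for this to "work" properly. If you try to do anything unusual with these classes (polymorphism, for example), this may fail.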
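
One way to catch the problematic cases early is a compile-time check that the class can be moved as raw bytes. The sketch below is an assumption-laden illustration, not part of the original answer: it builds as plain C++, `copyBack` is a hypothetical helper, and `std::memcpy` merely stands in for the `cudaMemcpy` call with `cudaMemcpyDeviceToHost` that real CUDA code would use. The CUDA qualifiers are left out so the snippet compiles without nvcc.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

class exampleClass {
    int i;
public:
    exampleClass(int _i) : i(_i) {}
    void increment() { i++; }
    std::string outputMessage() const { return std::to_string(i); }
};

// Raw byte copies are only safe for trivially copyable types. A class with
// virtual functions is not trivially copyable, so adding polymorphism to
// exampleClass would make this assertion fail at compile time.
static_assert(std::is_trivially_copyable<exampleClass>::value,
              "exampleClass must be trivially copyable to be copied as bytes");

// Hypothetical helper: copies n objects into host-owned storage.
// std::memcpy is a stand-in for cudaMemcpy(..., cudaMemcpyDeviceToHost).
std::vector<exampleClass> copyBack(const exampleClass* src, std::size_t n) {
    std::vector<exampleClass> host(n, exampleClass(0));
    std::memcpy(static_cast<void*>(host.data()), src, n * sizeof(exampleClass));
    return host;
}
```

After the copy, host-only members such as `outputMessage()` can be called on the elements of the returned vector.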