What is the optimal way to use additional data fields in functors in Thrust?

Updated: 2023-10-16

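What is the appropriate (or optimal) way to use some constant data in functors that are passed to Thrust algorithms such as thrust::transform? The naive approach I have been using is to allocate the required arrays inside the functor's operator() method, like this: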

struct my_functor {
    __host__ __device__
    float operator()(thrust::tuple<float, float> args) {
        float A[2][10] = {
            { 4.0, 1.0, 8.0, 6.0, 3.0, 2.0, 5.0, 8.0, 6.0, 7.0 },
            { 4.0, 1.0, 8.0, 6.0, 7.0, 9.0, 5.0, 1.0, 2.0, 3.6 }};
        float x1 = thrust::get<0>(args);
        float x2 = thrust::get<1>(args);
        float result = 0.0;
        for (int i = 0; i < 10; ++i)
            result += x1 * A[0][i] + x2 * A[1][i];
        return result;
    }
};

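But this does not seem very elegant or efficient. I now have to develop a relatively complicated functor that contains several matrices (constant, as in the example above) and additional methods called from the functor's operator() method. What is the best way to approach this? Thanks.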

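From your last comment, it is clear that what you are really asking about here is functor parameter initialization. CUDA uses the C++ object model, so structures have class semantics and behavior. So your example functor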

struct my_functor {
    __host__ __device__
    float operator()(thrust::tuple<float, float> args) const {
        float A[2] = {50., 55.6};
        float x1 = thrust::get<0>(args);
        float x2 = thrust::get<1>(args);
        return x1 * A[0]+ x2 * A[1];
    }
};

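can be rewritten with a constructor that has an empty body and a member initializer list, which turns the constants hard-coded inside the functor into values that can be assigned at run time: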

struct my_functor {
    float A0, A1;
    __host__ __device__
    my_functor(float _a0, float _a1) : A0(_a0), A1(_a1) { }
    __host__ __device__
    float operator()(thrust::tuple<float, float> args) const {
        float x1 = thrust::get<0>(args);
        float x2 = thrust::get<1>(args);
        return x1 * A0 + x2 * A1;
    }
};

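You can instantiate as many versions of the functor as you like, each with different constant values, to do whatever you need when using functors with the Thrust library.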
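The same idea should carry over to the 2x10 matrix from the question. The following is only a minimal sketch under a few assumptions: the names matrix_functor, apply_functor, HOST_DEVICE and kCols are hypothetical, the functor takes two float arguments instead of a thrust::tuple, and std::transform stands in for thrust::transform so that the example also builds with an ordinary host compiler.

```cpp
#include <algorithm>
#include <vector>

// Expands to the CUDA qualifiers under nvcc and to nothing otherwise,
// so this sketch also compiles with a plain host C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

constexpr int kCols = 10;  // number of columns in the question's matrix

// Hypothetical functor: the matrix is a data member that the constructor
// fills in once, rather than a local array rebuilt on every call.
struct matrix_functor {
    float A[2][kCols];

    HOST_DEVICE
    explicit matrix_functor(const float (&a)[2][kCols]) {
        for (int r = 0; r < 2; ++r)
            for (int c = 0; c < kCols; ++c)
                A[r][c] = a[r][c];
    }

    // Same arithmetic as the question's operator(), written as a binary
    // functor over two floats (an assumption made for this sketch).
    HOST_DEVICE
    float operator()(float x1, float x2) const {
        float result = 0.0f;
        for (int i = 0; i < kCols; ++i)
            result += x1 * A[0][i] + x2 * A[1][i];
        return result;
    }
};

// Host-only stand-in for the Thrust call: applies f element-wise to the
// pairs (x1[i], x2[i]). x2 must hold at least x1.size() elements.
std::vector<float> apply_functor(const matrix_functor& f,
                                 const std::vector<float>& x1,
                                 const std::vector<float>& x2) {
    std::vector<float> out(x1.size());
    std::transform(x1.begin(), x1.end(), x2.begin(), out.begin(), f);
    return out;
}
```

When compiled with nvcc, an object like this could presumably be passed to the binary overload of thrust::transform in the same way, since the functor is handed to the algorithm by value and its members travel with it. Two objects built from different matrices can be used side by side, as described above.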