CUDA Thrust - How can I write a function using multiple device vectors with different sizes?

Keywords: function, Thrust, how to use, CUDA. Updated: 2023-10-16

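I have been trying to work out how to perform a simple entropy calculation using four Thrust device vectors.

I have four device vectors representing two key-value pairs. The first pair of vectors contains the keys and the number of times each key appears. The second pair contains the keys paired with the bins used to calculate the entropy. In the second vector pair a key appears multiple times, with each instance representing a different bin.

It looks something like this: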

Device Vector Pair 1

KeyVal 6 8 9

Counts 1 3 2

Device Vector Pair 2

KeyVal 6 8 8 9 9

BinVal 1 1 2 1 1

Result Vector (contains the calculated entropy results)

KeyVal 8

Entropy 0.602

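What I plan to do is use the first vector pair to check whether a key appears often enough for its entropy to be calculated. If the count is large enough, the second vector pair is used to calculate the entropy from the bin values for that key. I need to use all of the bin values for that particular key. For example, if I wanted to calculate the entropy for keys that appear at least 3 times, I would find in the first vector pair that KeyVal 8 qualifies. I would then search the second pair for all instances of KeyVal 8 and calculate the entropy from their corresponding BinVals. The entropy calculation is simple: sum BinVal*log(BinVal) over the relevant values. In my example, it would be entropy = 1*log(1) + 2*log(2).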
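Written out for the example data, the sum looks as follows. This assumes the base-10 logarithm, which is what reproduces the 0.602 in the result vector above; the symbols H_k and b_i are notation introduced here for illustration only.

```latex
H_k = \sum_{i \,:\, \mathrm{KeyVal}_i = k} b_i \log_{10} b_i,
\qquad
H_8 = 1 \cdot \log_{10} 1 + 2 \cdot \log_{10} 2 \approx 0.602
```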

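However, I don't know how to make this part work. I tried using thrust::for_each to find all keys that appear often enough to be tested, but I don't think it is possible to search the second vector pair for those keys and perform the calculation inside the for_each functor.

Does anyone have suggestions for another way to achieve this?

Thanks for your help.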

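The two ideas I considered were:

Idea A:

1. Compute all entropies
2. Select the ones that meet the criteria

Idea B:

1. Select the incoming data that meets the criteria
2. Compute the entropies

Idea A seems to do unnecessary work: it computes entropies that may or may not be needed. However, when I worked through the process for Idea B, I ended up adding so many steps (such as computing prefix sums) to accomplish step 1 of Idea B that it didn't seem like it would be any better. So for now I'll present Idea A. Perhaps m.s. or someone else will come along and post something better.

Step 1 of Idea A is handled by thrust::reduce_by_key, together with an appropriate functor that computes the specific entropy function.

Step 2 of Idea A is handled by thrust::copy_if.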

$ cat t827.cu
#include <iostream>
#include <iterator>  // std::ostream_iterator
#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <thrust/reduce.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/iterator/discard_iterator.h>
#include <math.h>
// THRESH determines the minimum Counts value required for a KeyVal Entropy calculation to occur
#define THRESH 2
using namespace thrust::placeholders;

struct my_entropy : public thrust::unary_function<float, float>
{
  __host__ __device__
  float operator()(float val){
    return val*log10f(val);}  // if you want napierian log, change this to logf
};
int main(){
  int KeyVal1[]={6, 8, 9};
  int Counts[] ={1, 3, 2};
  int KeyVal2[]={6, 8, 8, 9, 9};
  float BinVal[] ={1, 1, 2, 1, 1};
  int dsize1 = sizeof(KeyVal1)/sizeof(int);
  int dsize2 = sizeof(KeyVal2)/sizeof(int);
  thrust::device_vector<int> d_KeyVal1(KeyVal1, KeyVal1+dsize1);
  thrust::device_vector<int> d_Counts(Counts, Counts+dsize1);
  thrust::device_vector<int> d_KeyVal2(KeyVal2, KeyVal2+dsize2);
  thrust::device_vector<float> d_BinVal(BinVal, BinVal+dsize2);

  // method 1 - just compute all entropies, then select the desired ones
  // (relies on both key vectors being sorted and listing the same keys in the same order)
  thrust::device_vector<float> entropies(dsize2);
  // sum BinVal*log10(BinVal) over each run of equal keys; the output keys are discarded
  thrust::reduce_by_key(d_KeyVal2.begin(), d_KeyVal2.end(),
                        thrust::make_transform_iterator(d_BinVal.begin(), my_entropy()),
                        thrust::make_discard_iterator(), entropies.begin());
  thrust::device_vector<int> res_keys(dsize1);
  thrust::device_vector<float> res_ent(dsize1);
  // copy the (key, entropy) pairs whose Counts value (the stencil) is >= THRESH
  int res_size = thrust::copy_if(
      thrust::make_zip_iterator(thrust::make_tuple(d_KeyVal1.begin(), entropies.begin())),
      thrust::make_zip_iterator(thrust::make_tuple(d_KeyVal1.end(), entropies.begin() + dsize1)),
      d_Counts.begin(),
      thrust::make_zip_iterator(thrust::make_tuple(res_keys.begin(), res_ent.begin())),
      _1 >= THRESH)
    - thrust::make_zip_iterator(thrust::make_tuple(res_keys.begin(), res_ent.begin()));
  std::cout << "Counts threshold: " << THRESH << std::endl <<  "selected keys: " << std::endl;
  thrust::copy_n(res_keys.begin(), res_size, std::ostream_iterator<int>(std::cout, ","));
  std::cout << std::endl << "calculated entropies: " << std::endl;
  thrust::copy_n(res_ent.begin(), res_size, std::ostream_iterator<float>(std::cout, ","));
  std::cout << std::endl;
  return 0;
}
$ nvcc -o t827 t827.cu
$ ./t827
Counts threshold: 2
selected keys:
8,9,
calculated entropies:
0.60206,0,
$
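
To cross-check the GPU output, the same two steps can be sketched sequentially on the host. This is a minimal reference in standard C++, not Thrust and not part of the original answer. The function name entropy_reference and its parameters are hypothetical. Like the program above, it assumes that both key arrays are sorted and list the same keys in the same order.

```cpp
// Host-only reference (standard C++, no Thrust) for checking the GPU result.
// Assumption: keys1 and keys2 are sorted and contain the same keys in the same order.
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns (key, entropy) pairs for keys whose count >= thresh.
std::vector<std::pair<int, float> > entropy_reference(
    const std::vector<int>& keys1, const std::vector<int>& counts,
    const std::vector<int>& keys2, const std::vector<float>& bins, int thresh)
{
  // Step 1: sum b*log10(b) over each run of equal keys
  // (the host analogue of the reduce_by_key call).
  std::vector<float> entropies;
  for (std::size_t i = 0; i < keys2.size(); ++i) {
    float term = bins[i] * std::log10(bins[i]);
    if (i > 0 && keys2[i] == keys2[i - 1])
      entropies.back() += term;   // same key: extend the current run
    else
      entropies.push_back(term);  // new key: start a new run
  }

  // Step 2: keep the entries whose count passes the threshold
  // (the host analogue of the stencil-based copy_if call).
  std::vector<std::pair<int, float> > result;
  for (std::size_t i = 0; i < keys1.size() && i < entropies.size(); ++i)
    if (counts[i] >= thresh)
      result.push_back(std::make_pair(keys1[i], entropies[i]));
  return result;
}
```

Called with the sample arrays and a threshold of 2, this returns the pairs (8, approximately 0.602) and (9, 0), which agrees with the program output above.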