Linking error in CUDA

Keywords: error, CUDA, linking. Last updated: 2023-10-16

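I'm running into a problem building some basic CUDA/Thrust code while trying to get more familiar with GPU programming. I'm probably not compiling it correctly, so what am I doing wrong?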

I'm building with the following commands:

nvcc -c gpu_functions.cu
nvcc gpu_functions.o gpu_test.cu -o gpu_test

However, I get a linker error:

jim@pezbox:~/dev/analytics/src$ nvcc gpu_functions.o gpu_test.cu -o gpu_test
/tmp/tmpxft_00002383_00000000-14_gpu_test.o: In function `main':
tmpxft_00002383_00000000-3_gpu_test.cudafe1.cpp:(.text+0x6e): undefined reference to `void add<thrust::device_vector<int, thrust::device_malloc_allocator<int> > >(thrust::device_vector<int, thrust::device_malloc_allocator<int> > const&, thrust::device_vector<int, thrust::device_malloc_allocator<int> > const&, thrust::device_vector<int, thrust::device_malloc_allocator<int> >&)'
collect2: ld returned 1 exit status

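I have three files:

gpu_functions.h (header declaring the GPU functions)
gpu_functions.cu (implementation of the GPU functions)
gpu_test.cu (main routine that calls the GPU functions I defined)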

gpu_functions.h

template<typename Vector>
void add(const Vector& in1, const Vector& in2, Vector& out);

gpu_functions.cu

#include "gpu_functions.h"
#include <thrust/sequence.h>
#include <thrust/transform.h>
#include <thrust/copy.h>
#include <thrust/fill.h>
#include <thrust/replace.h>
#include <thrust/functional.h>
using namespace thrust;
template<typename Vector>
void add(const Vector& in1, const Vector& in2, Vector& out) {
    transform(in1.begin(), in1.end(), in2.begin(), out.begin(),
              plus<typename Vector::value_type>());
}

gpu_test.cu

#include "gpu_functions.h"
#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <iostream>
#include <iterator>
#include <stdio.h>
using namespace thrust;
int main(void) {
    const int n = 100000000;
    // allocate three device_vectors with n elements each
    device_vector<int> in1(n, 1);
    device_vector<int> in2(n, 2);
    device_vector<int> out(n, 0);
    add(in1, in2, out);
    // print the first 10 results, one per line
    thrust::copy(out.begin(), out.begin()+10, std::ostream_iterator<int>(std::cout,"\n"));
    return 0;
}

I'm probably doing something silly, or missing something obvious.

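Once declared, a function template has to be instantiated, either explicitly or implicitly; that is, a concrete function (an instance) has to be generated for a particular combination of template arguments.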

Your gpu_functions.cu compilation unit contains neither. In other words, the compiler never generates an instance of add, so the linker finds nothing to link against.
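For reference, here is what the explicit route could look like. This is only a minimal sketch, assuming thrust::device_vector<int> is the one type you need: the definition stays in gpu_functions.cu and an explicit instantiation is added at the bottom. The thrust:: qualification and the extra device_vector include are my additions; the declaration in gpu_functions.h would stay as it is.

```cuda
// gpu_functions.cu (sketch): template definition plus one explicit instantiation
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/functional.h>

template<typename Vector>
void add(const Vector& in1, const Vector& in2, Vector& out) {
    // element-wise sum of in1 and in2, written into out
    thrust::transform(in1.begin(), in1.end(), in2.begin(), out.begin(),
                      thrust::plus<typename Vector::value_type>());
}

// Explicit instantiation: asks the compiler to emit add for
// thrust::device_vector<int> into gpu_functions.o, so the call made
// from gpu_test.cu has a symbol to link against.
template void add<thrust::device_vector<int> >(
    const thrust::device_vector<int>&,
    const thrust::device_vector<int>&,
    thrust::device_vector<int>&);
```

The downside is that every additional Vector type you call add with needs its own instantiation line in this file.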

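You should fix this by putting the function template's definition in the compilation unit where the implicit instantiation happens, that is, the one containing main.

In other words, the following code compiles and links correctly: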

#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/functional.h>
#include <thrust/copy.h>
#include <iostream>
#include <iterator>
#include <stdio.h>
using namespace thrust;
// the definition is visible here, so the call in main instantiates it implicitly
template<typename Vector>
void add(const Vector& in1, const Vector& in2, Vector& out) {
    transform(in1.begin(), in1.end(), in2.begin(), out.begin(),
              plus<typename Vector::value_type>());
}
int main(void) {
    const int n = 100000000;
    device_vector<int> in1(n, 1);
    device_vector<int> in2(n, 2);
    device_vector<int> out(n, 0);
    add(in1, in2, out);
    // print the first 10 results, one per line
    thrust::copy(out.begin(), out.begin()+10, std::ostream_iterator<int>(std::cout,"\n"));
    return 0;
}

Of course, you can move the function template's definition into a separate .cuh file and pull it in with an #include directive.
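A header along those lines might look like the sketch below. The file name gpu_functions.cuh and the include guard name are just placeholders I picked; the body of add is the same as in your code, only with thrust:: spelled out so the header does not need a using directive.

```cuda
// gpu_functions.cuh (hypothetical file name)
#ifndef GPU_FUNCTIONS_CUH
#define GPU_FUNCTIONS_CUH

#include <thrust/transform.h>
#include <thrust/functional.h>

// Full definition in the header: every .cu file that includes it and calls
// add will instantiate the template implicitly for the types it uses.
template<typename Vector>
void add(const Vector& in1, const Vector& in2, Vector& out) {
    thrust::transform(in1.begin(), in1.end(), in2.begin(), out.begin(),
                      thrust::plus<typename Vector::value_type>());
}

#endif // GPU_FUNCTIONS_CUH
```

With that in place, gpu_test.cu would include "gpu_functions.cuh" and a single nvcc gpu_test.cu -o gpu_test should be enough, since nothing is left to compile separately.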

Finally, remember to add CUDA error checking.
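One possible shape for that, as a rough sketch rather than the canonical way to do it: CUDA_CHECK and run_with_error_checks are names I made up for this example, while cudaGetLastError, cudaDeviceSynchronize, cudaGetErrorString and thrust::system_error are the actual runtime and Thrust facilities. Thrust reports failures by throwing, so the calls go inside a try block, and the runtime status is checked afterwards.

```cuda
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <new>
#include <cuda_runtime.h>
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/functional.h>
#include <thrust/system_error.h>

// Hypothetical helper macro (not part of CUDA): abort with a message
// if a runtime call returns anything other than cudaSuccess.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s at %s:%d\n",         \
                         cudaGetErrorString(err_), __FILE__, __LINE__); \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical demo function: returns 0 on success, 1 if Thrust threw.
int run_with_error_checks() {
    try {
        const int n = 1000;
        thrust::device_vector<int> in1(n, 1);
        thrust::device_vector<int> in2(n, 2);
        thrust::device_vector<int> out(n, 0);
        thrust::transform(in1.begin(), in1.end(), in2.begin(), out.begin(),
                          thrust::plus<int>());
        // pick up any error left behind by earlier launches
        CUDA_CHECK(cudaGetLastError());
        CUDA_CHECK(cudaDeviceSynchronize());
    } catch (const thrust::system_error& e) {
        std::cerr << "Thrust error: " << e.what() << std::endl;
        return 1;
    } catch (const std::bad_alloc& e) {
        std::cerr << "Device allocation failed: " << e.what() << std::endl;
        return 1;
    }
    return 0;
}
```

You would call run_with_error_checks() from main and return its result. With n as large as 100000000 in your test, the bad_alloc branch is the one most likely to fire on a small GPU.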