Can cublasDdot() be used for BLAS operations on non-GPU (host) memory?

Updated: 2023-10-16

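I have code that performs matrix multiplication. When I build it with the nvcc compiler and link against -lcublas, it returns only zeros. When I build it with g++ and link against -lblas, changing nothing but the function names, it works fine.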

Can the -lcublas library perform matrix multiplication on data in host memory rather than on the GPU?

Here is the code that returns 0:

extern "C" //external reference to function so the code compiles
{
    double cublasDdot(int *n, double *A, int *incA, double *B, int *incB);
}
//stuff happens
    cout << "Calculating/printing the contents of Matrix C for ddot...\n";
            C[i][t]=cublasDdot(&n, partA, &incA, partB, &incB); //This thing isn't working for some reason (although it compiles just fine)

I compile it with the following command: nvcc program -lcublas

The following version, by contrast, works:

extern "C" //external reference to function so the code compiles
{
    double ddot_(int *n, double *A, int *incA, double *B, int *incB);
}
//stuff happens
C[i][t]=ddot_(&n, partA, &incA, partB, &incB);

It is compiled with g++ program -lblas

cuBLAS requires a functioning CUDA GPU.

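You are probably not doing any error checking. Read the cuBLAS manual for the details of how to check errors, and look at some sample code that does it.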

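cuBLAS routines operate on vectors and matrices that live in GPU (device) memory, so in general, using cuBLAS requires transferring the data to the GPU and transferring the results back.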