Arithmetic CUDA program compilation error

Keywords: error, compilation, program, CUDA, arithmetic      Updated: 2023-10-16

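I am working on a CUDA program, and I am new to CUDA. I ran into the errors below, tried to fix them, and failed. Could someone take a look and tell me what I might be missing? Any help would be greatly appreciated.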

Error   5   error : too few arguments in function call   
Error   6   error : argument of type "int *" is incompatible with parameter of type "size_t"    
Error   7   error : argument of type "unsigned int" is incompatible with parameter of type "cudaMemcpyKind" 
Error   8   error : too many arguments in function call 2010Projectslablabkernel.cu 54  1   lab
Error   9   error MSB3721: The command ""C:Program FilesNVIDIA GPU 

Here is my code:

#include <stdio.h>
#define SIZE 500
#include <cuda.h>
__global__ void InitialAdd(int *a, int *b, int *c, int *z, int n, float aspa, float bspb, float apa, float bpb)
{
    int i = blockIdx.x + blockIdx.x * threadIdx.x;
    aspa = (-*a);
    bspb = (-*b);
    aspa = (10,*a);
    bspb = (10,*b);
    *z = (a,2) + (b,2) + aspa + bspb + apa + bpb;
    if(i < n)
        c[i] = a[i] * b[i];
}
int main(void)
{
    int *a, *b, *c, *z;
    int *d_a, *d_b, *d_c, *d_z;

    a = (int *)malloc(SIZE*sizeof(int));
    b = (int *)malloc(SIZE*sizeof(int));
    c = (int *)malloc(SIZE*sizeof(int));
    z = (int *)malloc(SIZE*sizeof(int));
    cudaMalloc( &d_a, SIZE*sizeof(int));
    cudaMalloc( &d_b, SIZE*sizeof(int));
    cudaMalloc( &d_c, SIZE*sizeof(int));
    cudaMalloc( &d_z, SIZE*sizeof(int));
    for( int i = 0; i < SIZE; i++ )
    {
        a[i] =i;
        b[i] =i;
        c[i] =0;
        z[i] =i;
    }
    cudaMemcpy( d_a, a, SIZE*sizeof(int), cudaMemcpyHostToDevice );
    cudaMemcpy( d_b, b, SIZE*sizeof(int), cudaMemcpyHostToDevice );
    cudaMemcpy( d_c, c, SIZE*sizeof(int), cudaMemcpyHostToDevice );
    cudaMemcpy( d_z, z, SIZE*sizeof(int), cudaMemcpyHostToDevice );

    InitialAdd<<< 4 , SIZE >>>( d_a, d_b, d_c, d_z, SIZE);

    cudaMemcpy( c, d_z, d_c, SIZE*sizeof(int), cudaMemcpyDeviceToHost );
    for( int i = 0; i < 1000; i++)
        printf("c[%d] = %d\n", i, c[i], *z);
    free(a);
    free(b);
    free(c);
    free(z);
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    cudaFree(d_z);
    return 0;
}

I can see one obvious problem on this line:

cudaMemcpy( c, d_z, d_c, SIZE*sizeof(int), cudaMemcpyDeviceToHost );

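You are passing 5 arguments, while cudaMemcpy takes only 4 (destination, source, byte count, copy kind). I assume you are trying to copy from d_z to c, so it should be: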

cudaMemcpy( c, d_z, SIZE*sizeof(int), cudaMemcpyDeviceToHost );

That is, remove d_c.
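
For reference, here is a minimal sketch of the whole round trip with a four-argument cudaMemcpy in each direction. It is not the original program. The kernel is a simplified, hypothetical `Multiply` that keeps only the element-wise product, and the `check` helper and the 256-thread block size are my own assumptions. The kernel's parameter list also matches the launch arguments one for one. In the posted code the launch passes 5 arguments to a kernel declared with 9 parameters, which is most likely the source of the remaining "too few arguments in function call" error.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define SIZE 500

// Hypothetical helper (not in the original post): stop with a message on any CUDA error.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Simplified kernel: element-wise product only. It declares exactly the
// four parameters that the launch in main() supplies.
__global__ void Multiply(const int *a, const int *b, int *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)
        c[i] = a[i] * b[i];
}

int main(void)
{
    const size_t bytes = SIZE * sizeof(int);

    int *a = (int *)malloc(bytes);
    int *b = (int *)malloc(bytes);
    int *c = (int *)malloc(bytes);
    if (!a || !b || !c) {
        fprintf(stderr, "malloc failed\n");
        return EXIT_FAILURE;
    }
    for (int i = 0; i < SIZE; i++) {
        a[i] = i;
        b[i] = i;
        c[i] = 0;
    }

    int *d_a, *d_b, *d_c;
    check(cudaMalloc(&d_a, bytes), "cudaMalloc d_a");
    check(cudaMalloc(&d_b, bytes), "cudaMalloc d_b");
    check(cudaMalloc(&d_c, bytes), "cudaMalloc d_c");

    // cudaMemcpy(dst, src, byteCount, kind): four arguments, in this order.
    check(cudaMemcpy(d_a, a, bytes, cudaMemcpyHostToDevice), "copy a to device");
    check(cudaMemcpy(d_b, b, bytes, cudaMemcpyHostToDevice), "copy b to device");

    const int threads = 256;                             // assumed block size
    const int blocks = (SIZE + threads - 1) / threads;   // enough blocks to cover SIZE
    Multiply<<<blocks, threads>>>(d_a, d_b, d_c, SIZE);
    check(cudaGetLastError(), "kernel launch");

    // One source buffer per call: copy the result array d_c back into c.
    check(cudaMemcpy(c, d_c, bytes, cudaMemcpyDeviceToHost), "copy c to host");

    // Loop bound is SIZE, the number of elements actually allocated.
    for (int i = 0; i < SIZE; i++)
        printf("c[%d] = %d\n", i, c[i]);

    free(a);
    free(b);
    free(c);
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return 0;
}
```

If you need both d_c and d_z on the host, issue two separate cudaMemcpy calls, one per buffer. Also note that the print loop in the question runs to 1000 while the arrays hold only SIZE (500) elements, so it reads past the end of c.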