<<< >>> cuda in vscode

Updated: 2023-10-16

Is there any way to suppress the "<<< >>>" error with vscode-cpptools?

I associated "*.cu" with "cpp" in settings.json:

// use normal c++ syntax highlighting for CUDA files
"files.associations": {"*.cu": "cpp"},

and it works fine, except for one problem: the kernel execution configuration parameters surrounded by <<< and >>> are mistaken for an error: expected an expression

dim3 dimGrid(2, 2, 1);
dim3 dimBlock(width / 2, width / 2, 1);
MatrixMulKernel<<<dimGrid, dimBlock>>>(d_M, d_N, d_P, width);

Any suggestions?

After googling for a few hours I found no perfect solution, but there are some workarounds.

I'll summarize them here:

  • Use normal C++ syntax highlighting for CUDA files by editing settings.json
  • Include the necessary CUDA headers in the program
  • Include a dummy header to work around the IntelliSense problem

Below is a concrete example:

  • settings.json
"files.associations": {
"*.cu": "cpp",
"*.cuh": "cpp"
}
  • cudaDmy.cuh
#pragma once
#ifdef __INTELLISENSE__
void __syncthreads();  // workaround __syncthreads warning
#define KERNEL_ARG2(grid, block)
#define KERNEL_ARG3(grid, block, sh_mem)
#define KERNEL_ARG4(grid, block, sh_mem, stream)
#else
#define KERNEL_ARG2(grid, block) <<< grid, block >>>
#define KERNEL_ARG3(grid, block, sh_mem) <<< grid, block, sh_mem >>>
#define KERNEL_ARG4(grid, block, sh_mem, stream) <<< grid, block, sh_mem, stream >>>
#endif
  • matrixMul.cu
#include <stdio.h>
#include <math.h>
#include <time.h>
#include <cuda.h>
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <device_functions.h>
#include <cuda_runtime_api.h>
#include "cudaDmy.cuh"
__global__ void MatrixMulKernel(float *M, float *N, float *P, int width)
{
    int Row = blockIdx.y * blockDim.y + threadIdx.y;
    int Col = blockIdx.x * blockDim.x + threadIdx.x;
    if (Row < width && Col < width)
    {
        float Pvalue = 0;
        for (int i = 0; i < width; ++i)
        {
            Pvalue += M[Row * width + i] * N[width * i + Col];
        }
        P[Row * width + Col] = Pvalue;
    }
}
void MatMul(float *M, float *N, float *P, int width)
{
    float *d_M;
    float *d_N;
    float *d_P;
    int size = width * width * sizeof(float);
    cudaMalloc((void **)&d_M, size);
    cudaMemcpy(d_M, M, size, cudaMemcpyHostToDevice);
    cudaMalloc((void **)&d_N, size);
    cudaMemcpy(d_N, N, size, cudaMemcpyHostToDevice);
    cudaMalloc((void **)&d_P, size);
    dim3 dimGrid(2, 2, 1);
    dim3 dimBlock(width / 2, width / 2, 1);
    // KERNEL_ARG2 expands to <<<dimGrid, dimBlock>>> when compiling with nvcc
    MatrixMulKernel KERNEL_ARG2(dimGrid, dimBlock) (d_M, d_N, d_P, width);
    cudaMemcpy(P, d_P, size, cudaMemcpyDeviceToHost);
    cudaFree(d_M);
    cudaFree(d_N);
    cudaFree(d_P);
}
int main()
{
    int elem = 100;
    float *M = new float[elem];
    float *N = new float[elem];
    float *P = new float[elem];
    for (int i = 0; i < elem; ++i)
        M[i] = i;
    for (int i = 0; i < elem; ++i)
        N[i] = i + elem;
    time_t t1 = time(NULL);
    MatMul(M, N, P, sqrt(elem));
    time_t t2 = time(NULL);
    double seconds = difftime(t2, t1);
    printf("%.3f seconds total time\n", seconds);
    for (int i = 0; i < elem / 1000000; ++i)
        printf("%.1f\t", P[i]);
    printf("\n");
    delete[] M;
    delete[] N;
    delete[] P;
    return 0;
}

Let's compile it with NVCC:

nvcc matrixMul.cu -Xcudafe "--diag_suppress=unrecognized_pragma" -o runcuda

Useful links:

  • https://devtalk.nvidia.com/default/topic/513485/cuda-programming-and-performance/__syncthreads-is-undefined-need-a-help/post/5189004/#5189004
  • https://stackoverflow.com/a/6182137/8037585
  • https://stackoverflow.com/a/27992604/8037585
  • https://gist.github.com/ruofeidu/df95ba27dfc6b77121b27fd4a6483426

You can download the vscode-cudacpp extension and then enable this option in your workspace (<>.workspace) or user settings (.vscode/settings.json):

"settings": {
"files.associations": {
"*.cu": "cuda",
"*.cuh": "cuda"
}
}

As sonulohani pointed out, there is the cuda-cpp extension. It is good, and it is the only extension available for CUDA. If you want autocompletion, try the CUDA-C++ package in the Sublime Text editor, which provides excellent autocompletion.

NVIDIA has an official extension called Nsight Visual Studio Code Edition.

You can try installing it in vscode.