How to find where a program crashed when a CUDA API error is detected: cudaMemcpy returned (0xb)

Updated: 2023-10-16

I am debugging a CUDA program and get the following warnings:

warning: Cuda API error detected: cudaMemcpy returned (0xb)
warning: Cuda API error detected: cudaMemcpy returned (0xb)
warning: Cuda API error detected: cudaGetLastError returned (0xb)
Error in kernel
GPUassert: invalid argument

When I type "where" in cuda-gdb, it says "No stack.":

(cuda-gdb) where
No stack.

How can I find where my program crashed?

I found the answer here: http://on-demand.gputechconf.com/gtc/2012/presentations/S0027A-Monday-Debugging-Experience-CUDA.pdf (page 27).

You first need to tell cuda-gdb to stop on CUDA API failures:

(cuda-gdb) set cuda api_failures stop

Then, when the error occurs, execution stops at the failing call and "where" prints a backtrace:

Cuda API error detected: cudaMemcpy returned (0xb)
(cuda-gdb) where
#0  0x00007fffea6a06d0 in cudbgReportDriverApiError () from /usr/lib64/nvidia/libcuda.so.1
#1  0x00007fffea6a2c36 in cudbgReportDriverInternalError () from /usr/lib64/nvidia/libcuda.so.1
#2  0x00007fffea6eed93 in cudbgGetAPIVersion () from /usr/lib64/nvidia/libcuda.so.1
...
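
As a complement to the debugger setting, the "GPUassert: invalid argument" line in the question suggests the program already wraps its CUDA calls in an error-checking helper. The sketch below is an assumption about what such a helper could look like, not the asker's actual code: the names gpuCheck and GPU_CHECK are hypothetical, and the buffer size is arbitrary. It passes __FILE__ and __LINE__ to the helper so the failing call is identified in the error message even when no debugger is attached.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original post): reports which source
// line produced a CUDA runtime error. Returns true on success.
inline bool gpuCheck(cudaError_t code, const char *file, int line)
{
    if (code == cudaSuccess) {
        return true;
    }
    fprintf(stderr, "GPUassert: %s at %s:%d\n",
            cudaGetErrorString(code), file, line);
    return false;
}

// Wrap every runtime API call so the call site is recorded automatically.
#define GPU_CHECK(call) gpuCheck((call), __FILE__, __LINE__)

int main()
{
    const size_t count = 256;                 // arbitrary example size
    const size_t bytes = count * sizeof(float);
    float host[count] = {0};
    float *dev = nullptr;

    if (!GPU_CHECK(cudaMalloc(&dev, bytes))) {
        return EXIT_FAILURE;
    }

    // A bad pointer, size, or copy direction here would be reported
    // together with this file name and line number.
    if (!GPU_CHECK(cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice))) {
        cudaFree(dev);
        return EXIT_FAILURE;
    }

    if (!GPU_CHECK(cudaFree(dev))) {
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
```

Compiling with nvcc -g -G keeps host and device debug information, so the frames that cuda-gdb prints below the libcuda.so entries can be resolved to source lines in your own code.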