Multiple GPUs in Cuda - Working code before, but not any more

Updated: 2023-10-16

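I have recently run into a problem running multiple NVidia GPUs in a Cuda application. The attached code reproduces the problem consistently on my system in both Visual Studio 2013 and 2015 (Windows 7, Cuda 9.2, Nvidia driver 398.26, 1x GTX 1080 and 1x GTX 960). I am building for the correct compute capabilities for my cards (5.2 and 6.1).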

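Specifically, once the first GPU has been initialized, I cannot get any function call on the second GPU to work. The error code is always "cudaErrorMemoryAllocation". It fails under the Nvidia profiler as well as in both debug and release builds. I can initialize the GPUs in either order and still reproduce the problem.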

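The problem came up while I was trying to scale out my current application, which is a large pipeline of image processing algorithms. There can be several independent instances of this pipeline, and because of memory constraints they will need multiple cards. The main reason I am puzzled is that I had this working before: a few years ago I ran a Visual Profiler session that showed these same cards behaving as expected. The only difference I know of is that it was on Cuda 8.0.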

Any ideas?

#include "cuda_runtime.h"
#include "cuda.h"
#include <thread>
#include <conio.h>
#include <iostream>

// Function for each thread to run
void gpuThread(int gpuIdx, bool* result)
{
    cudaSetDevice(gpuIdx); // Set gpu index

    // Create an int array on CPU
    int* hostMemory = new int[1000000];
    for (int i = 0; i < 1000000; i++)
        hostMemory[i] = i;

    // Allocate and copy to GPU
    int* gpuMemory;
    cudaMalloc(&gpuMemory, 1000000 * sizeof(int));
    cudaMemcpy(gpuMemory, hostMemory, 1000000 * sizeof(int), cudaMemcpyHostToDevice);

    // Synchronize and check errors
    cudaDeviceSynchronize();
    cudaError_t error = cudaGetLastError();
    if (error != CUDA_SUCCESS)
    {
        result[0] = false;
        return;
    }
    result[0] = true;
}

int main()
{
    bool result1 = false;
    bool result2 = false;

    std::thread t1(gpuThread, 0, &result1);
    std::thread t2(gpuThread, 1, &result2);

    t1.join();  // Wait for both threads to complete
    t2.join();

    if (!result1 || !result2) // Verify our threads returned success
        std::cout << "Failed\n";
    else
        std::cout << "Passed\n";

    std::cout << "Press a key to exit!\n";
    _getch();
    return 0;
}

After a day of uninstalling and reinstalling things, this appears to be a problem with the 398.26 driver. The newer 399.07 version works as expected.