Crash in Thrust sorting example

Keywords: crash, sort. Updated: 2023-10-16

I am trying the first example from the official site, https://developer.nvidia.com/thrust, with the vector size changed to 32 << 23. The code is as follows:

#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/generate.h>
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
using namespace std;
int main(void){
  // generate random numbers serially
  thrust::host_vector<int> h_vec(32 << 23);
  std::generate(h_vec.begin(), h_vec.end(), rand);
  std::cout << "1." << time(NULL) << endl;
  // transfer data to the device
  thrust::device_vector<int> d_vec = h_vec;
  cout << "2." << time(NULL) << endl;
  // sort data on the device (846M keys per second on GeForce GTX 480)
  thrust::sort(d_vec.begin(), d_vec.end());
  // transfer data back to host
  thrust::copy(d_vec.begin(), d_vec.end(), h_vec.begin());
  std::cout << "3." << time(NULL) << endl;
  return 0;
}

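But the program crashes when it reaches the thrust::sort line. I tried using std::vector and std::sort instead, and that works fine.

Is this a bug in Thrust? I am using Thrust 1.7 + CUDA 6.5 + Visual Studio 2013 Update 2.

I am using a GeForce GT 740M with 2048 MB of total memory.

I monitored the process with Process Explorer and saw it allocate 1.0 GB of memory. But I have 2 GB of GPU memory and 16 GB of main CPU memory.

The error message is "A problem caused the program to stop working correctly. Windows will close the program and notify you if a solution is available. [Debug] [Close program]". After clicking [Debug], I can see the call stack. The problem comes from this line: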

thrust::device_vector<int> d_vec = h_vec;

The last CUDA source frame in the call stack is:

testcuda.exe!thrust::system::cuda::detail::malloc<thrust::system::cuda::detail::tag>(thrust::system::cuda::detail::execution_policy<thrust::system::cuda::detail::tag> & __formal, unsigned __int64 n) Line 48  C++

This looks like a memory allocation problem. But I have 2 GB of GPU memory and 16 GB of main CPU memory. Why?

Robert:

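The original example runs fine, even for 32 << 21 and 32 << 22. Is there a virtual memory management system for GPU memory? Does "contiguous" here mean physically contiguous or virtually contiguous? Is an exception thrown in this case that I could catch?

My test code is here: https://github.com/henrywoo/wufuheng/blob/master/testcuda.cu

In my test there is no exception, only a runtime error.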

sizeof(int) * (32 << 23) = 4 * 2^28 = 2^30 bytes

That is, you are allocating about 1 GB of GPU RAM. Most likely, your card cannot handle that many elements. This may be because:

there is not enough GPU RAM overall
there is not enough contiguous free GPU RAM (this is required because the vector must fit in contiguous memory)
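
To double-check the size arithmetic, here is a minimal host-only sketch. It assumes a 4-byte int, and the helper names buffer_bytes and to_mib are made up for illustration; they are not part of Thrust or CUDA. It only confirms that the vector by itself takes 1 GiB, which is half of the 2048 MB the card reports. It does not say how much of that memory is actually free or contiguous at run time.

```cpp
#include <cstdint>

// Bytes needed for `count` elements of `elem_size` bytes each.
constexpr std::uint64_t buffer_bytes(std::uint64_t count, std::uint64_t elem_size) {
    return count * elem_size;
}

// Convert a byte count to whole MiB (rounds down).
constexpr std::uint64_t to_mib(std::uint64_t bytes) {
    return bytes >> 20;
}

// Element count from the question: 32 << 23 = 2^28.
constexpr std::uint64_t kCount = std::uint64_t{32} << 23;
// Assumption: sizeof(int) == 4, as on the platform in the question.
constexpr std::uint64_t kVectorBytes = buffer_bytes(kCount, 4);
// Total GPU memory reported in the question, in MiB.
constexpr std::uint64_t kCardMiB = 2048;

static_assert(kCount == 268435456ULL, "32 << 23 is 2^28 elements");
static_assert(to_mib(kVectorBytes) == 1024, "the vector alone is 1 GiB");
static_assert(2 * to_mib(kVectorBytes) == kCardMiB,
              "two buffers of this size equal the card's total memory");
```

The checks run at compile time, so a successful build is enough to confirm the numbers. Since a single 1 GiB block must come out of whatever is still free on a 2 GiB card, anything else resident on the GPU reduces the chance that the allocation succeeds.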