Cuda blockDim.y always ==1
I always get blockDim.y == 1. No matter what values I set in numBlocks, I always get the same value.
__global__ void CalcVideo(unsigned char *original, unsigned char *candidate, int *answer)
{
    printf("block id.x = %d blockid.y=%d blockdim.x = %d blockdim.y = %d Thread id= %d \n",
        blockIdx.x, blockIdx.y, blockDim.x, blockDim.y, threadIdx.x );
}
int ORIGINAL_FRAMES = 3;
int CANDIDATE_FRAMES = 2;
int FRAME_LENGHT = 3;
dim3 numBlocks(ORIGINAL_FRAMES, CANDIDATE_FRAMES);
dim3 threadsPerBlock(3); // 3 threads per block (3 x 1 x 1)
CalcVideo<<<numBlocks, threadsPerBlock>>>(original_device, candidate_device, answer_device);
The number of blocks in y is executed correctly, but why does the program give me the wrong blockDim.y size?
block id.x = 1 blockid.y=0 blockdim.x = 3 blockdim.y = 1 Thread id= 0
block id.x = 1 blockid.y=0 blockdim.x = 3 blockdim.y = 1 Thread id= 1
block id.x = 1 blockid.y=0 blockdim.x = 3 blockdim.y = 1 Thread id= 2
block id.x = 1 blockid.y=1 blockdim.x = 3 blockdim.y = 1 Thread id= 0
block id.x = 1 blockid.y=1 blockdim.x = 3 blockdim.y = 1 Thread id= 1
block id.x = 1 blockid.y=1 blockdim.x = 3 blockdim.y = 1 Thread id= 2
block id.x = 0 blockid.y=1 blockdim.x = 3 blockdim.y = 1 Thread id= 0
block id.x = 0 blockid.y=1 blockdim.x = 3 blockdim.y = 1 Thread id= 1
block id.x = 0 blockid.y=1 blockdim.x = 3 blockdim.y = 1 Thread id= 2
block id.x = 0 blockid.y=0 blockdim.x = 3 blockdim.y = 1 Thread id= 0
block id.x = 0 blockid.y=0 blockdim.x = 3 blockdim.y = 1 Thread id= 1
block id.x = 0 blockid.y=0 blockdim.x = 3 blockdim.y = 1 Thread id= 2
block id.x = 2 blockid.y=1 blockdim.x = 3 blockdim.y = 1 Thread id= 0
block id.x = 2 blockid.y=1 blockdim.x = 3 blockdim.y = 1 Thread id= 1
block id.x = 2 blockid.y=1 blockdim.x = 3 blockdim.y = 1 Thread id= 2
block id.x = 2 blockid.y=0 blockdim.x = 3 blockdim.y = 1 Thread id= 0
block id.x = 2 blockid.y=0 blockdim.x = 3 blockdim.y = 1 Thread id= 1
block id.x = 2 blockid.y=0 blockdim.x = 3 blockdim.y = 1 Thread id= 2
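blockDim stores the dimensions of a single block. In your case you pass threadsPerBlock as the block dimensions, which makes it 3 x 1 x 1. The first argument of the kernel launch, numBlocks, controls the dimensions of the grid of blocks; inside the kernel you can access it as gridDim.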
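To illustrate the difference, here is a minimal sketch, not the asker's actual program. The kernel name ShowDims and the 3 x 2 block shape are assumptions made for the example. It prints gridDim and blockDim once per block, so you can see that the first launch argument ends up in gridDim and the second in blockDim.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not from the question): reports the launch geometry.
__global__ void ShowDims()
{
    // Only one thread per block prints, to keep the output short.
    if (threadIdx.x == 0 && threadIdx.y == 0) {
        printf("block (%u,%u): gridDim = %u x %u, blockDim = %u x %u\n",
               blockIdx.x, blockIdx.y,
               gridDim.x, gridDim.y,
               blockDim.x, blockDim.y);
    }
}

int main()
{
    dim3 numBlocks(3, 2);        // grid of 3 x 2 blocks, visible as gridDim
    dim3 threadsPerBlock(3, 2);  // assumed 3 x 2 threads per block, so blockDim.y == 2

    ShowDims<<<numBlocks, threadsPerBlock>>>();

    // Wait for the kernel to finish so the device-side printf output is flushed,
    // and report any launch or execution error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

With dim3 threadsPerBlock(3) as in the question, the unspecified y and z components default to 1, which is why blockDim.y is always reported as 1 there.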
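As a side note: I assume the extremely small number and size of blocks in the question are for testing purposes only, since in practice they would leave any GPU underutilized.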