OpenCL sample program executes 10 times faster on CPU than on GPU

Updated: 2023-10-16

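I am completely new to OpenCL programming, and I decided to start by running some of the samples from the AMD SDK I downloaded. My first pick was the Reduction sample. Every time I execute the program on the CPU, it runs about 10 times faster than on the GPU. Shouldn't a GPU be better at computation than a CPU?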

My hardware:

CPU i5-2430M 2.40 GHz
GPU AMD Radeon 6630M

Executing on platform 0 (GPU):

$ Reduction.exe -x 33554432 -i 5 -q -t -p 0
Platform 0 : Advanced Micro Devices, Inc.
Platform 1 : Intel(R) Corporation
Selected Platform Vendor : Advanced Micro Devices, Inc.
Device 0 :        Intel(R) Core(TM) i5-2430M CPU @ 2.40GHz Device ID is 009E83A0
Executing kernel for 5 iterations
-------------------------------------------
Exec: 1.64225
| Elements | Time(sec) | (DataTransfer + Kernel)Time(sec) |
|----------|-----------|----------------------------------|
| 33554432 | 1.83705   | 1.64225                          |

Executing on platform 1 (CPU):

$ Reduction.exe -x 33554432 -i 5 -q -t -p 1
Platform 0 : Advanced Micro Devices, Inc.
Platform 1 : Intel(R) Corporation
GPU not found. Falling back to CPU device
Selected Platform Vendor : Intel(R) Corporation
Device 0 :        Intel(R) Core(TM) i5-2430M CPU @ 2.40GHz Device ID is 040BEF1C
Executing kernel for 5 iterations
-------------------------------------------
Exec: 0.198049
| Elements | Time(sec) | (DataTransfer + Kernel)Time(sec) |
|----------|-----------|----------------------------------|
| 33554432 | 0.542269  | 0.198049                         |

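Judging from your output, both runs appear to be executing on your CPU.

The first run uses the AMD platform and the second uses the Intel platform, but your CPU is listed as device 0 on both. Try the flag -d 1 (to use device 1) or --device gpu.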

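Edit: Looking at the system requirements page and the list of OpenCL-conformant products on AMD's website, your GPU does not appear to be supported.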

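According to this spec page (gpuzoo.com), the 6630M should support OpenCL 1.2. Double-check your driver version and make sure it is supported. If you are still having trouble, you could also try an older driver (driverscollection.com).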

Try running the CLInfo program on your system to see whether everything is detected properly. It will give you full details of every supported device.

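The AMD SDK sample seems to be selecting the Intel CPU as its compute device. That is not surprising, since the choice is probably based on device priority.

How to fix it: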

1) Retrieve the device IDs of all GPUs/CPUs with OpenCL support on the particular platform (in your case, one AMD GPU and one Intel CPU).
2) Once you have all the device IDs, use the device info query to pick out the required device (you can use the CL_DEVICE_VENDOR flag to identify the required device ID).
3) Then use this device ID wherever a device ID is referenced afterwards.