OpenCL -- Different kernel "printf()" results on different devices?

Keywords: printf, results, OpenCL, kernel      Updated: 2023-10-16

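I am getting a peculiar result from a hello_world kernel that does nothing but print the buffer passed to it through the command queue. Two devices on the same platform give two different results. See the bottom of the console output below: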

Here is my kernel code:

// Prints the buffer one character at a time from a single work-item.
__kernel void hello_world(__global char* message, int messageSize) {
    for (int i = 0; i < messageSize; i++) {
        printf("%c", message[i]);
    }
}

Here is my function call:

  std::string message = "Hello World!";
  int messageSize = static_cast<int>(message.length());  // 12; excludes the terminating '\0'
  std::cout << "          ---> Creating Buffer... ";
  // Copy the host string into a read-only device buffer.
  cl::Buffer buffer(CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(char) * messageSize, (char*)message.c_str());
  kernel.setArg(0, buffer);
  kernel.setArg(1, sizeof(int), &messageSize);
  std::cout << "Done!" << std::endl;
  // Run the same kernel once on every device of the current platform.
  for (cl_uint i = 0; i < m_deviceCount[m_currentPlatform]; i++) {
      std::cout << "          ---> Queuing Kernel Task on Device #" << m_currentPlatform << "." << i << "... ";
      m_commandQueues[i].enqueueTask(kernel);
      std::cout << "Done!" << std::endl;
      std::cout << "          ---> Executing... Output:\n\n";
      // Wait for the kernel to finish; pending printf output is flushed here.
      m_commandQueues[i].finish();
      std::cout << "\n\n          ---> Done!" << std::endl;
  }

My console output:

Found 1 Platforms
Platform #0:
  Name:  AMD Accelerated Parallel Processing
  Found  2 Devices
      Device #0.0:
          --> Name:               Juniper
          --> Vendor:             Advanced Micro Devices, Inc.
          --> Max Compute Units:  10
          --> Max Clock Freq:     850
          --> Global Mem Size:    512 MBs
          --> Local Mem Size:     32 KBs
          --> Hardware Version:   OpenCL 1.2 AMD-APP (1800.11)
          --> Software Version:   1800.11
          --> Open CL Version:    OpenCL C 1.2 
          --> Images Supported:   YES
      Device #0.1:
          --> Name:               Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
          --> Vendor:             GenuineIntel
          --> Max Compute Units:  8
          --> Max Clock Freq:     3796
          --> Global Mem Size:    15905 MBs
          --> Local Mem Size:     32 KBs
          --> Hardware Version:   OpenCL 1.2 AMD-APP (1800.11)
          --> Software Version:   1800.11 (sse2,avx)
          --> Open CL Version:    OpenCL C 1.2 
          --> Images Supported:   YES
    Using Platform With Most Available Devices: Platform #0
          ---> Creating Context.... Done!
          ---> Creating Command Queue for Device #0.0.... Done!
          ---> Creating Command Queue for Device #0.1.... Done!
          ---> Loading Program: hello_world.cl...
                  > Compiling... Done!
          ---> Creating Buffer... Done!
          ---> Queuing Kernel Task on Device #0.0... Done!
          ---> Executing... Output:
H(null)e(null)l(null)l(null)o(null) (null)W(null)o(null)r(null)l(null)d(null)!(null)
          ---> Done!
          ---> Queuing Kernel Task on Device #0.1... Done!
          ---> Executing... Output:
Hello World!
          ---> Done!

Does anyone know why the AMD GPU inserts "(null)" between the characters while the Intel CPU does not? Is this normal for AMD's implementation of OpenCL?
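One way to narrow this down is to check the GPU output line on the host: strip every "(null)" and see whether the original message is left, and count how many were inserted. The following is only a minimal sketch for that check. `remove_all` and `count_occurrences` are hypothetical helper names that are not part of the program above, and the output line is assumed to have been captured or pasted in as a std::string.

```cpp
#include <cstddef>
#include <string>

// Remove each occurrence of `token` found in one left-to-right pass over `text`.
std::string remove_all(std::string text, const std::string& token) {
    if (token.empty()) return text;  // nothing to remove; also avoids an endless loop
    std::string::size_type pos = 0;
    while ((pos = text.find(token, pos)) != std::string::npos) {
        text.erase(pos, token.length());
    }
    return text;
}

// Count non-overlapping occurrences of `token` in `text`.
std::size_t count_occurrences(const std::string& text, const std::string& token) {
    if (token.empty()) return 0;
    std::size_t count = 0;
    std::string::size_type pos = 0;
    while ((pos = text.find(token, pos)) != std::string::npos) {
        ++count;
        pos += token.length();  // skip past this match
    }
    return count;
}
```

Applied to the Device #0.0 output line above, remove_all(output, "(null)") gives back Hello World! and count_occurrences(output, "(null)") is 12, exactly one per printf call. So the buffer contents appear to reach the GPU intact, and the extra text seems to come from the printf path itself. This narrows the problem down but does not explain its cause.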

I was also trying to get printf working inside a kernel. You can refer to my program:

https://github.com/pradyotsn/opencl_printf

Thanks