About cufft R2C and C2R
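I have been using cuFFT in my research, but I have run into some problems with it. My steps are as follows:
- Run a forward FFT on the image with R2C
- Multiply the complex result by the kernel coefficients
- Run an inverse FFT on the product with C2R
However, when I multiply the complex result by the kernel, a serious problem appears: the cuFFT complex result does not match the FFTW result, and it contains many zeros. I know the R2C result has size N1(N2/2+1), but I want the full complex result. How can I solve this? That is, how do I recover the full R2C result, and how do I feed the product into C2R and get the correct answer?
My implementation code is as follows: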
__global__ void MultiplyKernel(cufftComplex *data, float *data1, cufftComplex *data2, unsigned vectorSize) {
    unsigned idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < vectorSize) {
        data[idx].x = data2[idx].x * data1[idx];
        data[idx].y = data2[idx].y * data1[idx];
    }
}
__global__ void Scale(cufftReal *data, unsigned vectorSize) {
    unsigned idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < vectorSize) {
        data[idx] = data[idx] / vectorSize;
    }
}
void ApplyKernel1(cufftReal *data2, float *ImageBuffer, float *KernelBuffer, unsigned int NX, unsigned int NY, unsigned int NZ)
{
    float *Akernel;
    cufftComplex *data_dev1, *data_dev2;
    cufftReal *data_dev3, *data_dev;
    cudaMalloc((void **)&Akernel, NX * NY * NZ * sizeof(float));
    cudaMalloc((void **)&data_dev3, NX * NY * NZ * sizeof(cufftReal));
    cudaMalloc((void **)&data_dev, NX * NY * NZ * sizeof(cufftComplex));
    cudaMalloc((void **)&data_dev1, NX * NY * NZ * sizeof(cufftComplex));
    cudaMalloc((void **)&data_dev2, NX * NY * NZ * sizeof(cufftComplex));
    cudaMemset(data_dev, 0, NX * NY * NZ * sizeof(cufftReal));
    cudaMemset(data_dev1, 0, NX * NY * NZ * sizeof(cufftComplex));
    cudaMemset(data_dev2, 0, NX * NY * NZ * sizeof(cufftComplex));
    //cufftComplex *resultFFT = (cufftComplex*)malloc(NX * NY * NZ * sizeof(cufftComplex));
    //cufftReal *resultIFFT = (cufftReal*)malloc(NX * NY * NZ * sizeof(cufftReal));
    cudaMemcpy(data_dev, ImageBuffer, NX * NY * NZ * sizeof(cufftReal), cudaMemcpyHostToDevice);
    cufftHandle plan;
    cufftPlan3d(&plan, NZ, NY, NX, CUFFT_R2C);
    cufftExecR2C(plan, data_dev, data_dev1);
    //Multiply kernel
    cudaMemcpy(Akernel, KernelBuffer, NX * NY * NZ * sizeof(float), cudaMemcpyHostToDevice);
    static const int BLOCK_SIZE = 1000;
    const int blockCount = (NX * NY * NZ + BLOCK_SIZE - 1) / BLOCK_SIZE;
    MultiplyKernel <<<blockCount, BLOCK_SIZE>>> (data_dev2, Akernel, data_dev1, NX * NY * NZ);
    cufftDestroy(plan);
    //cufftPlan3d(&plan, NZ, NY, NX, CUFFT_C2R);
    cufftPlan3d(&plan, NZ, NY, NX, CUFFT_C2R);
    cufftExecC2R(plan, data_dev2, data_dev3);
    Scale <<<blockCount, BLOCK_SIZE>>> (data_dev3, NX * NY * NZ);
    cudaMemcpy(data2, data_dev3, NZ * NY * NX * sizeof(cufftReal), cudaMemcpyDeviceToHost);
    cufftDestroy(plan);
    cudaFree(data_dev);
    cudaFree(data_dev1);
    cudaFree(data_dev2);
    cudaFree(data_dev3);
    cudaFree(Akernel);
}
When you multiply the result of an R2C FFT by complex numbers, the result no longer corresponds to a symmetric array.
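To make the symmetry point concrete, here is a small host-side C++ sketch. It is an illustration under stated assumptions, not the asker's code. It uses a naive 2D DFT in place of cuFFT, and all function names (naiveDft2, expandHalf, filterAt, halfSpectrumDemoError) are hypothetical. It shows two things: the half spectrum of size N1(N2/2+1) can be expanded to the full spectrum with X[k1][k2] = conj(X[(N1-k1)%N1][(N2-k2)%N2]), and a real filter stored in the full N1 x N2 layout has to be indexed with the full row stride N2 while the half spectrum uses the stride N2/2+1. The example filter is deliberately symmetric, which is what keeps the product Hermitian and therefore a valid C2R input.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <vector>

using cplx = std::complex<double>;

// Naive 2D DFT of a real N1 x N2 row-major image, returning the full
// N1 x N2 spectrum. A slow stand-in for an FFT, meant for tiny sizes only.
std::vector<cplx> naiveDft2(const std::vector<double>& img, int N1, int N2) {
    const double pi = std::acos(-1.0);
    std::vector<cplx> out(N1 * N2);
    for (int k1 = 0; k1 < N1; ++k1)
        for (int k2 = 0; k2 < N2; ++k2) {
            cplx sum(0.0, 0.0);
            for (int n1 = 0; n1 < N1; ++n1)
                for (int n2 = 0; n2 < N2; ++n2) {
                    double ang = -2.0 * pi * (double(k1 * n1) / N1 + double(k2 * n2) / N2);
                    sum += img[n1 * N2 + n2] * cplx(std::cos(ang), std::sin(ang));
                }
            out[k1 * N2 + k2] = sum;
        }
    return out;
}

// Rebuild the full N1 x N2 spectrum from an N1 x (N2/2+1) half spectrum
// using X[k1][k2] = conj(X[(N1-k1)%N1][(N2-k2)%N2]).
std::vector<cplx> expandHalf(const std::vector<cplx>& half, int N1, int N2) {
    const int H = N2 / 2 + 1;  // row stride of the half spectrum
    std::vector<cplx> full(N1 * N2);
    for (int k1 = 0; k1 < N1; ++k1)
        for (int k2 = 0; k2 < N2; ++k2)
            full[k1 * N2 + k2] = (k2 < H)
                ? half[k1 * H + k2]
                : std::conj(half[((N1 - k1) % N1) * H + (N2 - k2)]);
    return full;
}

// Hypothetical real filter defined on the full layout. It is symmetric:
// filterAt(k1, k2) == filterAt((N1-k1)%N1, (N2-k2)%N2).
double filterAt(int k1, int k2, int N1, int N2) {
    const int d1 = std::min(k1, N1 - k1);
    const int d2 = std::min(k2, N2 - k2);
    return 1.0 / (1.0 + d1 * d1 + d2 * d2);
}

// Filters only the half spectrum, expands it, and returns the largest
// absolute difference from filtering the full spectrum directly.
double halfSpectrumDemoError(int N1, int N2) {
    const int H = N2 / 2 + 1;
    std::vector<double> img(N1 * N2);
    for (int i = 0; i < N1 * N2; ++i) img[i] = std::sin(0.7 * i) + 0.1 * i;

    const std::vector<cplx> full = naiveDft2(img, N1, N2);

    // Half index is k1*H + k2; the matching full-layout index is k1*N2 + k2.
    std::vector<cplx> half(N1 * H);
    for (int k1 = 0; k1 < N1; ++k1)
        for (int k2 = 0; k2 < H; ++k2)
            half[k1 * H + k2] = full[k1 * N2 + k2] * filterAt(k1, k2, N1, N2);

    const std::vector<cplx> rebuilt = expandHalf(half, N1, N2);
    double err = 0.0;
    for (int k1 = 0; k1 < N1; ++k1)
        for (int k2 = 0; k2 < N2; ++k2)
            err = std::max(err, std::abs(rebuilt[k1 * N2 + k2] -
                                         full[k1 * N2 + k2] * filterAt(k1, k2, N1, N2)));
    return err;
}
```

Calling halfSpectrumDemoError(4, 6) or halfSpectrumDemoError(3, 5) from main should return a value at rounding-error level. For the code in the question, the same reasoning suggests that expanding the spectrum is not needed at all: with cufftPlan3d(&plan, NZ, NY, NX, ...), the R2C output and the C2R input both hold NZ*NY*(NX/2+1) complex elements, so the multiply kernel would loop over that many elements and look up each kernel coefficient through the full-layout index rather than reusing the half-spectrum index.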