Trying to understand OpenMP floating-point numerical errors

Keywords: error, OpenMP. Last updated: 2023-10-16

I recently started using OpenMP to multithread (MT) my image processing project.

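I have had no problems with any function except one. It is not computationally heavy, but it does more floating-point work than the other functions, which mostly operate on ints.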

To set up the notation: call the image produced by the single-threaded (ST) run X, and the image produced by the MT run Y.

When a small window is used for the averaging, X == Y, but when the window gets larger (5x5), X != Y.
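To make the "X == Y" check concrete, a comparison along the following lines could be used. This is only a sketch: `DiffReport` and `compareImages` are hypothetical names, and it assumes both results are available as flat 8-bit buffers of the same size, which is not how the image class in the question exposes its data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical summary of how two 8-bit images differ.
struct DiffReport {
    std::size_t count = 0;       // number of differing pixels
    int maxAbsDiff = 0;          // largest absolute difference found
    std::size_t firstIndex = 0;  // linear index of the first mismatch (valid if count > 0)
};

// Compare the ST result (x) with the MT result (y), pixel by pixel.
DiffReport compareImages(const std::vector<unsigned char>& x,
                         const std::vector<unsigned char>& y) {
    assert(x.size() == y.size());  // assumption: same dimensions
    DiffReport r;
    for (std::size_t k = 0; k < x.size(); ++k) {
        const int d = std::abs(int(x[k]) - int(y[k]));
        if (d == 0) continue;
        if (r.count == 0) r.firstIndex = k;
        ++r.count;
        if (d > r.maxAbsDiff) r.maxAbsDiff = d;
    }
    return r;
}
```

Reporting the first mismatching index and the largest difference, rather than a bare yes/no, shows whether the two runs are off by one grey level in a few pixels or diverge more broadly.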

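So I added some print statements to look at the values for specific pixels, and with the prints in place, boom: X == Y again. This is what I would like to understand. Why does the result go back to X when I print from inside that code?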

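Note that I tried changing the floating-point model (Intel compiler) to precise and to extended. With either model the ST and MT results are equal, but the new ST result is Z != X, and the code takes longer to run than with the default floating-point model.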
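One general effect that is consistent with these observations, though nothing in the post confirms it is the cause here, is that the same float expression can round differently depending on the precision at which the compiler keeps intermediate values. The sketch below accumulates the same squared values in two ways; `sumSquaresFloat` and `sumSquaresWide` are hypothetical names used only for illustration.

```cpp
#include <vector>

// Accumulate squares with every intermediate rounded to float.
float sumSquaresFloat(const std::vector<float>& r) {
    float s = 0.0f;
    for (float v : r) s += v * v;
    return s;
}

// Same sum, but intermediates are kept in double and rounded to float once.
float sumSquaresWide(const std::vector<float>& r) {
    double s = 0.0;
    for (float v : r) s += double(v) * double(v);
    return float(s);
}
```

For the input `{4096.0f, 1.0f, 1.0f}` the float accumulator cannot represent 2^24 + 1, so each added 1 is rounded away, while the wide accumulator keeps both and the final rounding preserves them. A relaxed floating-point model permits the compiler to choose between such evaluation strategies per code path, which is one way an ST build, an MT build, and a build with extra print calls could end up with slightly different values.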

Edit: current code:

const int tileOffset = 1;
unsigned char** texturePtr = (unsigned char**)texture->getRowPtr();
short** wrkSrcPtr = (short**)wrkSrc->getRowPtr();
short** imFitAPtr = (short**)imFitA->getRowPtr();
short** imFitBPtr = (short**)imFitB->getRowPtr();
short** imFitCPtr = (short**)imFitC->getRowPtr();
// now, compute raw texture value for each pixel using the above plane equations
#pragma omp parallel num_threads(g_options->ompNumberThreads) if(g_options->ompThreaded)
{
    #pragma omp for
    for ( int i = 0; i < src->getHeight(); i = i + tileOffset ) {
        for ( int j = 0; j < src->getWidth(); j = j + tileOffset ) {
            // debug only: trace the intermediate values of two selected pixels
            bool printPoint = false;
            int jVal = 333;
            int iVal = 99;
            if ( j == jVal && i == src->getHeight() - iVal - 1 ) {
                printPoint = true;
                printf("\n\nAt (%d, %d) with Thread %d \n", jVal, iVal, omp_get_thread_num());
            }
            jVal = 343;
            iVal = 204;
            if ( j == jVal && i == src->getHeight() - iVal - 1 ) {
                printPoint = true;
                printf("\n\nAt (%d, %d) with Thread %d \n", jVal, iVal, omp_get_thread_num());
            }
            const int ti = i * tileOffset;
            const int tj = j * tileOffset;
            // plane coefficients, rescaled from the 16-bit fixed range to 0..255
            const float planeA = imFitAPtr[i][j] / 32000.0f*255.0f;
            const float planeB = imFitBPtr[i][j] / 32000.0f*255.0f;
            const float planeC = imFitCPtr[i][j] / 32000.0f*255.0f;
            float sum2 = 0.0f;  // sum of squared residuals
            float sum = 0.0f;   // sum of residuals
            int nbSum = 0;      // number of pixels accumulated
            if ( printPoint ) {
                printf("Fit (A,B,C) = (%d, %d, %d) and In float (%f, %f, %f) \n",
                       imFitAPtr[i][j], imFitBPtr[i][j], imFitCPtr[i][j],
                       planeA, planeB, planeC);
            }
            for ( int ri = i - halfROI; ri <= i + halfROI; ri++ ) {
                for ( int rj = j - halfROI; rj <= j + halfROI; rj++ ) {
                    // sanity checks (image boundaries)
                    if ( ri < 0 || ri >= src->getHeight() || rj < 0 || rj >= src->getWidth() ) continue;
                    // eval the local plane at that pixel and compute the residual
                    const float localPlaneValue = planeA * ( rj - j ) + planeB * ( ri - i ) + planeC;
                    const float residual = wrkSrcPtr[ri][rj] / 32000.0f*255.0f - localPlaneValue;
                    const float rr = residual*residual;
                    if ( printPoint )
                        printf("Local: %f, residual: %f, resSQ: %f, sum2: %f and sum: %f \n", localPlaneValue, residual, rr, sum2, sum);
                    sum2 += rr;
                    sum += residual;
                    nbSum++;

                    if ( printPoint )
                        printf("Add sum2: %f, add sum: %f and nb: %d \n", sum2, sum, nbSum);

                }
            }
            if ( printPoint )
                printf("\n");
            // the texture for that pixel is the stdev
            float texVal = 0.0f;
            if ( nbSum > 1 ) {
                texVal = sqrtf(max(( sum2 - sum * sum / nbSum ) / ( nbSum - 1 ), 0.0f)) * scaling;
                if ( texVal > 255.0f ) texVal = 255.0f;
            }
            texturePtr[ti][tj] = (unsigned char)texVal;
            if ( printPoint )
                printf("Final value : %d (In float: %f) \n\n", texturePtr[ti][tj], texVal);
        }
    }
} // End OMP

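By printing from outside the loop, I noticed that the squared residual (rr) and the sum of squares (sum2) are the values that are unstable between ST and MT.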
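The final step in the posted code computes the standard deviation as sqrt((sum2 - sum*sum/n) / (n - 1)), which subtracts two quantities that can be close to each other, so a small difference in sum2 can in principle be magnified in the result. The sketch below puts that one-pass formula next to a two-pass variant. The two-pass version is a swapped-in alternative for comparison, not something from the question, and `stdevOnePass` and `stdevTwoPass` are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// One-pass formula, mirroring the expression used in the posted code.
float stdevOnePass(const std::vector<float>& v) {
    float sum = 0.0f, sum2 = 0.0f;
    int nb = 0;
    for (float x : v) { sum2 += x * x; sum += x; ++nb; }
    if (nb < 2) return 0.0f;
    return std::sqrt(std::max((sum2 - sum * sum / nb) / (nb - 1), 0.0f));
}

// Two-pass alternative: compute the mean first, then sum squared deviations.
float stdevTwoPass(const std::vector<float>& v) {
    const int nb = static_cast<int>(v.size());
    if (nb < 2) return 0.0f;
    float mean = 0.0f;
    for (float x : v) mean += x;
    mean /= nb;
    float ss = 0.0f;
    for (float x : v) { const float d = x - mean; ss += d * d; }
    return std::sqrt(ss / (nb - 1));
}
```

With values that have a large mean and a small spread, such as `{10000, 10001, 10002, 10003}`, the one-pass version loses most of its significant digits in float while the two-pass version does not. The residuals in the question are deviations from a fitted plane and are probably centred near zero, so the effect there is likely much milder; this only illustrates why small differences in rr and sum2 might be able to change the final 8-bit value.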

The problem seems to be related to the compiler on Windows.

This code was compiled with Intel Composer XE 2015. When I tried Visual Studio v140 instead, the code appears to behave the same with and without OpenMP.

I have not tried a newer Intel compiler (e.g. 2017). The problem does not occur with Intel Composer XE 2015 on Linux.