OpenMP: Cannot calculate the status of a job inside a parallel for loop properly

Updated: 2023-10-16

I am trying to implement job status reporting inside a parallel loop. The for loop is parallelized with OpenMP.

I would like the status report to look like this:

Work done 70%; estimated time left 3:30:05 hour.

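Of course, I can compute the "estimated time left" from the difference between the start time and the current time. However, I cannot seem to compute the "work done" accurately inside the for loop, even when the counter is declared "static".

Any guidance would be appreciated.

Output of my code: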
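For concreteness, the time estimate described above could be sketched as follows. The helper names `estimate_seconds_left` and `format_hms` are hypothetical, and the estimate is a plain linear extrapolation that assumes every iteration takes roughly the same time.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: linear extrapolation of the remaining time.
// elapsed_seconds is the time since the loop started; done and total
// are the number of finished iterations and the total iteration count.
double estimate_seconds_left(double elapsed_seconds, int done, int total) {
    assert(done > 0 && done <= total);
    return elapsed_seconds * (total - done) / done;
}

// Hypothetical helper: format whole seconds as h:mm:ss, e.g. "3:30:05".
std::string format_hms(long seconds) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%ld:%02ld:%02ld",
                  seconds / 3600, (seconds / 60) % 60, seconds % 60);
    return buf;
}
```

The elapsed time itself could come from `omp_get_wtime()` or `std::chrono::steady_clock`; the hard part, as the rest of the question shows, is getting a reliable value for `done`.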

Values of cores : 8
Outer loop =================================
Thread 0  iCount0   
 % of work done 10
Outer loop ================================= 
Thread 0  iCount1
Outer loop ================================= 
Thread 2  iCount2
Outer loop ================================= 
Thread 7  iCount3
 % of work done 40
Outer loop =================================
Thread 5  iCount4
 % of work done 50
Outer loop =================================
Thread 3  iCount5
 % of work done 60
Outer loop =================================
Thread 4  iCount6 
 % of work done 70
Outer loop =================================
Thread 1  iCount7
 % of work done 20
 % of work done 80
Outer loop ================================= 
Thread 6  iCount8 
 % of work done 90
Outer loop ================================= 
Thread 1  iCount9  
 % of work done 100
 % of work done 30

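As the last two lines of the output show, I cannot calculate the job status correctly.

Here is my code:

Note: I deliberately use "std::endl" instead of "\n", because flushing the output buffer somehow messes up my work % calculation. I am sure a similar situation can arise when I do actual computation in parallel.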

#include "stdafx.h"
#include <iostream>     // std::cout, std::endl
#include <iomanip>      // std::setfill, std::setw
#include <math.h>       /* pow */
#include <omp.h>
int main(int argc, char** argv)
  {
    // Get the number of processors in this system
    int iCPU = omp_get_num_procs();
    // Now set the number of threads
    omp_set_num_threads(iCPU);
    std::cout << "Values of cores : " << iCPU <<" \n";
    int x = 0; 
    int iTotalOuter = 10;
    static int iCount = 0;
    #pragma omp parallel for private(x) 
    for(int y = 0; y < iTotalOuter; y++) 
    { 
        std::cout << "Outer loop =================================\n" ;
        std::cout <<"\nThread "<<omp_get_thread_num()<<"  iCount" << iCount<<std::endl;
        for(x = 0; x< 5; x++) 
        { 
            //std::cout << "Inner loop \n" ;
        } 
        iCount = iCount + 1;
        std::cout <<"\n % of work done " << (double)100*((double)iCount/(double)iTotalOuter)<<std::endl;
    }
  std::cin.ignore(); //Wait for user to hit enter
  return 0;
  }

Update: Based on Avi Ginsburg's answer, I tried the following:

#include "stdafx.h"
#include <iostream>     // std::cout, std::endl
#include <iomanip>      // std::setfill, std::setw
#include <math.h>       /* pow */
#include <omp.h>
void ReportJobStatus(int , int );
int main(int argc, char** argv)
  {   
    // Get the number of processors in this system
    int iCPU = omp_get_num_procs();
    // Now set the number of threads
    omp_set_num_threads(iCPU);
    std::cout << "Values of cores : " << iCPU <<" \n";
    int x = 0; 
    int iTotalOuter = 100;
    static int iCount = 0;
    #pragma omp parallel for private(x) 
    for(int y = 0; y < iTotalOuter; y++) 
    { 
        std::cout << "Outer loop =================================\n" ;
        for(x = 0; x< 5; x++) 
        { 
            //std::cout << "Inner loop \n" ;
        } 
        #pragma omp atomic
        iCount++;
        std::cout<< " omp_get_thread_num(): " << omp_get_thread_num() <<"\n";
        if (omp_get_thread_num() == 0){
            ReportJobStatus(iCount, iTotalOuter);
        }
    }
  std::cin.ignore(); //Wait for user to hit enter
  return 0;
  }

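Question (updated): The problem is that the same thread is used for concurrent execution, so the "work done" report is severely limited. How can I distribute the jobs to different cores based on the data?

Here is the current output of my code: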

Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 1
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 2
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 3
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 4
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 5
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 6
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 7
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 8
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 9
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 10
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 11
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 12
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 13
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 14
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 15
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 16
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 17
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 18
Outer loop =================================
 omp_get_thread_num(): 0
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 1
 % of work done 19
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 54
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 55
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 56
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 57
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 58
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 59
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 60
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 61
Outer loop =================================
 omp_get_thread_num(): 0
 % of work done 62
Outer loop =================================
 omp_get_thread_num(): 6
Outer loop =================================
 omp_get_thread_num(): 6
Outer loop =================================
 omp_get_thread_num(): 6
Outer loop =================================
 omp_get_thread_num(): 6
Outer loop =================================
 omp_get_thread_num(): 6
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 3
Outer loop =================================
 omp_get_thread_num(): 1
Outer loop =================================
 omp_get_thread_num(): 5
Outer loop =================================
 omp_get_thread_num(): 5
Outer loop =================================
 omp_get_thread_num(): 5
Outer loop =================================
 omp_get_thread_num(): 5
Outer loop =================================
 omp_get_thread_num(): 5
Outer loop =================================
 omp_get_thread_num(): 5
Outer loop =================================
 omp_get_thread_num(): 5
Outer loop =================================
 omp_get_thread_num(): 5
Outer loop =================================
Outer loop =================================
 omp_get_thread_num(): 4
Outer loop =================================
 omp_get_thread_num(): 4
Outer loop =================================
 omp_get_thread_num(): 4
Outer loop =================================
 omp_get_thread_num(): 4
Outer loop =================================
 omp_get_thread_num(): 4
Outer loop =================================
 omp_get_thread_num(): 4
Outer loop =================================
 omp_get_thread_num(): 4
Outer loop =================================
 omp_get_thread_num(): 4
Outer loop =================================
 omp_get_thread_num(): 4
Outer loop =================================
 omp_get_thread_num(): 4
 omp_get_thread_num(): 7
Outer loop =================================
 omp_get_thread_num(): 7
Outer loop =================================
 omp_get_thread_num(): 7
Outer loop =================================
 omp_get_thread_num(): 7
Outer loop =================================
 omp_get_thread_num(): 7
Outer loop =================================
 omp_get_thread_num(): 7
Outer loop =================================
 omp_get_thread_num(): 2
Outer loop =================================
 omp_get_thread_num(): 2
Outer loop =================================
 omp_get_thread_num(): 2
Outer loop =================================
 omp_get_thread_num(): 2
Outer loop =================================
 omp_get_thread_num(): 2
Outer loop =================================
 omp_get_thread_num(): 2
Outer loop =================================
 omp_get_thread_num(): 2 

Use critical or atomic in the loop:

#pragma omp critical
    {
        (++prog);
    }

Or better:

#pragma omp atomic
(++prog);
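One caveat: an atomic increment only protects the update itself. If the counter is read again in a later statement, other threads may already have incremented it, so two threads can print the same value or print values out of order. A minimal sketch of one way around this, assuming a compiler that supports OpenMP 3.1 or later (`atomic capture` is not part of OpenMP 2.0), is to increment and read back in a single step. The function name `count_iterations` and the `seen` vector are hypothetical and exist only to make the effect observable.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: runs `total` iterations and records, for each one,
// the counter value that this iteration itself produced.
std::vector<int> count_iterations(int total) {
    assert(total >= 0);
    std::vector<int> seen(total, 0);
    int prog = 0;  // shared progress counter

    #pragma omp parallel for
    for (int y = 0; y < total; y++) {
        int mine;  // private, because it is declared inside the loop body

        // Increment and read back as one indivisible operation.
        #pragma omp atomic capture
        mine = ++prog;

        seen[y] = mine;  // each iteration writes only its own slot
    }
    return seen;
}
```

Every value from 1 to `total` ends up in `seen` exactly once, whichever thread ran which iteration. With an OpenMP 2.0 compiler, a `critical` section around both the increment and the read is one alternative.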

And consider letting only the master thread print the progress.

if(omp_get_thread_num() == 0)
{
  cout << "Progress: " << float(prog)/totalNumber;
}