OpenMP: Cannot calculate the status of a job inside a parallel for loop properly
I am trying to implement job status reporting inside a parallel for loop. The loop is parallelized with OpenMP.
I would like the status report to look like this:
Work done 70%; estimated time left 3:30:05 hour.
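Of course, I can compute the "estimated time left" from the difference between the start time and the current time. However, even with a "static" declaration, I cannot seem to compute the "work done" accurately inside the for loop.
Some guidance would be much appreciated.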
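As an aside, the time estimate described above could be sketched roughly as follows. This assumes every remaining item takes the average time of the items finished so far. The helper names estimate_remaining_seconds and format_hms are hypothetical and do not appear in the code of this question.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: estimate the seconds still needed, assuming every
// remaining item takes the average time of the items finished so far.
double estimate_remaining_seconds(double elapsed_seconds, int done, int total)
{
    if (done <= 0) return -1.0;  // nothing finished yet, so no estimate
    return elapsed_seconds * (total - done) / done;
}

// Hypothetical helper: format a whole number of seconds as h:mm:ss.
std::string format_hms(long seconds)
{
    char buf[32];
    std::snprintf(buf, sizeof buf, "%ld:%02ld:%02ld",
                  seconds / 3600, (seconds / 60) % 60, seconds % 60);
    return buf;
}
```

The elapsed time could come from omp_get_wtime() or std::chrono. The estimate is only as good as the "work done" count fed into it, which is the part that goes wrong here.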
Output of my code:
Values of cores : 8
Outer loop =================================
Thread 0 iCount0
% of work done 10
Outer loop =================================
Thread 0 iCount1
Outer loop =================================
Thread 2 iCount2
Outer loop =================================
Thread 7 iCount3
% of work done 40
Outer loop =================================
Thread 5 iCount4
% of work done 50
Outer loop =================================
Thread 3 iCount5
% of work done 60
Outer loop =================================
Thread 4 iCount6
% of work done 70
Outer loop =================================
Thread 1 iCount7
% of work done 20
% of work done 80
Outer loop =================================
Thread 6 iCount8
% of work done 90
Outer loop =================================
Thread 1 iCount9
% of work done 100
% of work done 30
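As the last two lines of the output show, I am unable to compute the job status correctly.
Here is my code:
Note: I am deliberately using "std::endl" instead of "\n", because flushing the output buffer somehow messes up my % work calculation. I am sure something similar would happen if I did the actual computation in parallel.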
#include "stdafx.h"
#include <iostream> // std::cout, std::endl
#include <iomanip>  // std::setfill, std::setw
#include <math.h>   /* pow */
#include <omp.h>
int main(int argc, char** argv)
{
    // Get the number of processors in this system
    int iCPU = omp_get_num_procs();
    // Now set the number of threads
    omp_set_num_threads(iCPU);
    std::cout << "Values of cores : " << iCPU << " \n";
    int x = 0;
    int iTotalOuter = 10;
    static int iCount = 0;
    #pragma omp parallel for private(x)
    for(int y = 0; y < iTotalOuter; y++)
    {
        std::cout << "Outer loop =================================\n";
        std::cout << "\nThread " << omp_get_thread_num() << " iCount" << iCount << std::endl;
        for(x = 0; x < 5; x++)
        {
            //std::cout << "Inner loop \n";
        }
        iCount = iCount + 1;
        std::cout << "\n % of work done " << (double)100*((double)iCount/(double)iTotalOuter) << std::endl;
    }
    std::cin.ignore(); //Wait for user to hit enter
    return 0;
}
Update: Based on the answer by "Avi Ginsburg", I tried the following:
#include "stdafx.h"
#include <iostream> // std::cout, std::endl
#include <iomanip>  // std::setfill, std::setw
#include <math.h>   /* pow */
#include <omp.h>
void ReportJobStatus(int , int ); // definition not shown in this post
int main(int argc, char** argv)
{
    // Get the number of processors in this system
    int iCPU = omp_get_num_procs();
    // Now set the number of threads
    omp_set_num_threads(iCPU);
    std::cout << "Values of cores : " << iCPU << " \n";
    int x = 0;
    int iTotalOuter = 100;
    static int iCount = 0;
    #pragma omp parallel for private(x)
    for(int y = 0; y < iTotalOuter; y++)
    {
        std::cout << "Outer loop =================================\n";
        for(x = 0; x < 5; x++)
        {
            //std::cout << "Inner loop \n";
        }
        #pragma omp atomic
        iCount++;
        std::cout << " omp_get_thread_num(): " << omp_get_thread_num() << "\n";
        if (omp_get_thread_num() == 0){
            ReportJobStatus(iCount, iTotalOuter);
        }
    }
    std::cin.ignore(); //Wait for user to hit enter
    return 0;
}
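Issue (updated): The problem is that the same thread is used for the concurrent executions, so the "work done" reporting is very limited. How can I distribute the work to different cores based on the data?
Here is the current output of my code: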
Outer loop =================================
omp_get_thread_num(): 0
% of work done 1
Outer loop =================================
omp_get_thread_num(): 0
% of work done 2
Outer loop =================================
omp_get_thread_num(): 0
% of work done 3
Outer loop =================================
omp_get_thread_num(): 0
% of work done 4
Outer loop =================================
omp_get_thread_num(): 0
% of work done 5
Outer loop =================================
omp_get_thread_num(): 0
% of work done 6
Outer loop =================================
omp_get_thread_num(): 0
% of work done 7
Outer loop =================================
omp_get_thread_num(): 0
% of work done 8
Outer loop =================================
omp_get_thread_num(): 0
% of work done 9
Outer loop =================================
omp_get_thread_num(): 0
% of work done 10
Outer loop =================================
omp_get_thread_num(): 0
% of work done 11
Outer loop =================================
omp_get_thread_num(): 0
% of work done 12
Outer loop =================================
omp_get_thread_num(): 0
% of work done 13
Outer loop =================================
omp_get_thread_num(): 0
% of work done 14
Outer loop =================================
omp_get_thread_num(): 0
% of work done 15
Outer loop =================================
omp_get_thread_num(): 0
% of work done 16
Outer loop =================================
omp_get_thread_num(): 0
% of work done 17
Outer loop =================================
omp_get_thread_num(): 0
% of work done 18
Outer loop =================================
omp_get_thread_num(): 0
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 1
% of work done 19
Outer loop =================================
omp_get_thread_num(): 0
% of work done 54
Outer loop =================================
omp_get_thread_num(): 0
% of work done 55
Outer loop =================================
omp_get_thread_num(): 0
% of work done 56
Outer loop =================================
omp_get_thread_num(): 0
% of work done 57
Outer loop =================================
omp_get_thread_num(): 0
% of work done 58
Outer loop =================================
omp_get_thread_num(): 0
% of work done 59
Outer loop =================================
omp_get_thread_num(): 0
% of work done 60
Outer loop =================================
omp_get_thread_num(): 0
% of work done 61
Outer loop =================================
omp_get_thread_num(): 0
% of work done 62
Outer loop =================================
omp_get_thread_num(): 6
Outer loop =================================
omp_get_thread_num(): 6
Outer loop =================================
omp_get_thread_num(): 6
Outer loop =================================
omp_get_thread_num(): 6
Outer loop =================================
omp_get_thread_num(): 6
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 3
Outer loop =================================
omp_get_thread_num(): 1
Outer loop =================================
omp_get_thread_num(): 5
Outer loop =================================
omp_get_thread_num(): 5
Outer loop =================================
omp_get_thread_num(): 5
Outer loop =================================
omp_get_thread_num(): 5
Outer loop =================================
omp_get_thread_num(): 5
Outer loop =================================
omp_get_thread_num(): 5
Outer loop =================================
omp_get_thread_num(): 5
Outer loop =================================
omp_get_thread_num(): 5
Outer loop =================================
Outer loop =================================
omp_get_thread_num(): 4
Outer loop =================================
omp_get_thread_num(): 4
Outer loop =================================
omp_get_thread_num(): 4
Outer loop =================================
omp_get_thread_num(): 4
Outer loop =================================
omp_get_thread_num(): 4
Outer loop =================================
omp_get_thread_num(): 4
Outer loop =================================
omp_get_thread_num(): 4
Outer loop =================================
omp_get_thread_num(): 4
Outer loop =================================
omp_get_thread_num(): 4
Outer loop =================================
omp_get_thread_num(): 4
omp_get_thread_num(): 7
Outer loop =================================
omp_get_thread_num(): 7
Outer loop =================================
omp_get_thread_num(): 7
Outer loop =================================
omp_get_thread_num(): 7
Outer loop =================================
omp_get_thread_num(): 7
Outer loop =================================
omp_get_thread_num(): 7
Outer loop =================================
omp_get_thread_num(): 2
Outer loop =================================
omp_get_thread_num(): 2
Outer loop =================================
omp_get_thread_num(): 2
Outer loop =================================
omp_get_thread_num(): 2
Outer loop =================================
omp_get_thread_num(): 2
Outer loop =================================
omp_get_thread_num(): 2
Outer loop =================================
omp_get_thread_num(): 2
Use critical or atomic in the loop:
#pragma omp critical
{
(++prog);
}
Or better:
#pragma omp atomic
(++prog);
And consider letting only the master thread print the progress:
if(omp_get_thread_num() == 0)
{
cout << "Progress: " << float(prog)/totalNumber;
}