Selectively enable an OpenMP for loop within a parallel region
Is it possible to selectively enable an OpenMP directive using a template parameter or a run-time variable?
For example, this (all threads work on the same for loop):
#pragma omp parallel
{
    #pragma omp for
    for (int i = 0; i < 10; ++i) { /*...*/ }
}
versus this (each thread works on its own for loop):
#pragma omp parallel
{
    for (int i = 0; i < 10; ++i) { /*...*/ }
}
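To make the difference between the two variants concrete, here is a minimal sketch that counts how many loop bodies are executed in total. The helper names count_shared and count_replicated are hypothetical and not part of the question.

```cpp
// Hypothetical helper: the loop is under `omp for`, so the n iterations
// are divided among the threads of the team.
int count_shared(int n) {
    int executed = 0;
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; ++i) {
            #pragma omp atomic
            ++executed;
        }
    }
    return executed;
}

// Hypothetical helper: no worksharing directive, so every thread of the
// team runs all n iterations on its own.
int count_replicated(int n) {
    int executed = 0;
    #pragma omp parallel
    {
        for (int i = 0; i < n; ++i) {
            #pragma omp atomic
            ++executed;
        }
    }
    return executed;
}
```

With OpenMP enabled, count_shared(10) should return 10, while count_replicated(10) should return 10 times the number of threads in the team.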
Update (testing the if clause)
test.cpp:
#include <iostream>
#include <omp.h>

int main() {
    bool var = true;
    #pragma omp parallel
    {
        #pragma omp for if (var)
        for (int i = 0; i < 4; ++i) {
            std::cout << omp_get_thread_num() << "\n";
        }
    }
}
Error message (g++ 6, compiled with g++ test.cpp -fopenmp):
test.cpp: In function ‘int main()’:
test.cpp:8:25: error: ‘if’ is not valid for ‘#pragma omp for’
         #pragma omp for if (var)
                         ^~
This sort of works. I don't know whether the conditional for obtaining the thread ID can be eliminated.
#include <iostream>
#include <omp.h>
#include <sstream>
#include <vector>

int main() {
    constexpr bool var = true;
    int n_threads = omp_get_num_procs();
    std::cout << "n_threads: " << n_threads << "\n";
    std::vector<std::stringstream> s(omp_get_num_procs());
    // The outer region is parallel only when var is true.
    #pragma omp parallel if (var)
    {
        const int thread_id0 = omp_get_thread_num();
        #pragma omp parallel
        {
            // Pick the thread ID of whichever region is actually parallel.
            int thread_id1;
            if (var) {
                thread_id1 = thread_id0;
            } else {
                thread_id1 = omp_get_thread_num();
            }
            #pragma omp for
            for (int i = 0; i < 8; ++i) {
                s[thread_id1] << i << ", ";
            }
        }
    }
    for (int i = 0; i < s.size(); ++i) {
        std::cout << "thread " << i << ": "
                  << s[i].str() << "\n";
    }
}
Output (when var == true):
n_threads: 8
thread 0: 0, 1, 2, 3, 4, 5, 6, 7,
thread 1: 0, 1, 2, 3, 4, 5, 6, 7,
thread 2: 0, 1, 2, 3, 4, 5, 6, 7,
thread 3: 0, 1, 2, 3, 4, 5, 6, 7,
thread 4: 0, 1, 2, 3, 4, 5, 6, 7,
thread 5: 0, 1, 2, 3, 4, 5, 6, 7,
thread 6: 0, 1, 2, 3, 4, 5, 6, 7,
thread 7: 0, 1, 2, 3, 4, 5, 6, 7,
Output (when var == false):
n_threads: 8
thread 0: 0,
thread 1: 1,
thread 2: 2,
thread 3: 3,
thread 4: 4,
thread 5: 5,
thread 6: 6,
thread 7: 7,
I think the idiomatic C++ solution is to hide the different OpenMP pragmas behind algorithm overloads.
#include <iostream>
#include <sstream>
#include <vector>
#include <omp.h>
#include <type_traits>

template <bool ALL_PARALLEL>
struct impl;

// ALL_PARALLEL == true: every thread runs the whole loop.
template<>
struct impl<true>
{
    template<typename ITER, typename CALLABLE>
    void operator()(ITER begin, ITER end, const CALLABLE& func) {
        #pragma omp parallel
        {
            for (ITER i = begin; i != end; ++i) {
                func(i);
            }
        }
    }
};

// ALL_PARALLEL == false: the iterations are shared among the threads.
template<>
struct impl<false>
{
    template<typename ITER, typename CALLABLE>
    void operator()(ITER begin, ITER end, const CALLABLE& func) {
        #pragma omp parallel for
        for (ITER i = begin; i < end; ++i) {
            func(i);
        }
    }
};

// This is just so we don't have to write parallel_foreach()(...)
template <bool ALL_PARALLEL, typename ITER, typename CALLABLE>
void parallel_foreach(ITER begin, ITER end, const CALLABLE& func)
{
    impl<ALL_PARALLEL>()(begin, end, func);
}

int main()
{
    constexpr bool var = false;
    int n_threads = omp_get_num_procs();
    std::cout << "n_threads: " << n_threads << "\n";
    std::vector<std::stringstream> s(omp_get_num_procs());
    parallel_foreach<var>(0, 8, [&s](auto i) {
        s[omp_get_thread_num()] << i << ", ";
    });
    for (int i = 0; i < s.size(); ++i) {
        std::cout << "thread " << i << ": "
                  << s[i].str() << "\n";
    }
}
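If you work with some specific types, you can overload by type instead of using a bool template parameter, and iterate over the elements rather than looping over a numeric index. Note that you can use C++ random-access iterators in OpenMP worksharing loops! Depending on your types, you may well be able to implement an iterator that hides everything about the internal data access.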
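The remark about random-access iterators can be illustrated with a small sketch. scale_in_place is a hypothetical helper that is not part of the answer, and the sketch assumes a compiler that implements OpenMP 3.0 or later, where random-access iterators are allowed as loop variables in worksharing loops.

```cpp
#include <vector>

// Hypothetical helper: multiplies every element of v by factor.
// The loop variable is a random-access iterator instead of an integer index.
void scale_in_place(std::vector<double>& v, double factor) {
    // Hoist the end iterator so the loop bound is clearly loop-invariant.
    const std::vector<double>::iterator last = v.end();
    #pragma omp parallel for
    for (std::vector<double>::iterator it = v.begin(); it < last; ++it) {
        *it *= factor;
    }
}
```

Each element is written by exactly one iteration, so no synchronization is needed inside the loop body.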
#include <omp.h>
#include <sstream>
#include <vector>
#include <iostream>

int main() {
    constexpr bool var = false;
    int n_threads = omp_get_num_procs();
    std::cout << "n_threads: " << n_threads << "\n";
    std::vector<std::stringstream> s(omp_get_num_procs());
    #pragma omp parallel
    {
        const int thread_id = omp_get_thread_num();
        if (var) {
            // Iterations are shared among the threads.
            #pragma omp for
            for (int i = 0; i < 8; ++i) {
                s[thread_id] << i << ", ";
            }
        } else {
            // Every thread runs the whole loop.
            for (int i = 0; i < 8; ++i) {
                s[thread_id] << i << ", ";
            } // code duplication
        }
    }
    for (int i = 0; i < s.size(); ++i) {
        std::cout << "thread " << i << ": "
                  << s[i].str() << "\n";
    }
}