pointers with OpenMP
I am trying to use OpenMP in my program (I am new to OpenMP), and the program returns errors in two places.
Here is a sample code:
#include <iostream>
#include <cstdint>
#include <vector>
#include <boost/multi_array.hpp>
#include <omp.h>

class CNachbarn {
public:
    CNachbarn () { a = 0; }
    uint32_t Get_Next_Neighbor() { return a++; }
private:
    uint32_t a;
};

class CNetwork {
public:
    CNetwork ( uint32_t num_elements_ );
    ~CNetwork();
    void Validity();
    void Clean();
private:
    uint32_t num_elements;
    uint32_t nachbar;
    std::vector<uint32_t> remove_node_v;
    CNachbarn *Nachbar;
};

CNetwork::CNetwork( uint32_t num_elements_ ) {
    num_elements = num_elements_;
    Nachbar = new CNachbarn();
    remove_node_v.reserve( num_elements );
}

CNetwork::~CNetwork() {
    delete Nachbar;
}

inline void CNetwork::Validity() {
    #pragma omp parallel for
    for ( uint32_t i = 0 ; i < num_elements ; i++ ) {
        #pragma omp critical
        remove_node_v.push_back(i);
    }
}

void CNetwork::Clean () {
    #pragma omp parallel for
    for ( uint8_t j = 0 ; j < 2 ; j++ ) {
        nachbar = Nachbar->Get_Next_Neighbor();
        // The loop variable is j (there is no i in this scope); cast so it prints as a number.
        std::cout << "j: " << static_cast<unsigned>(j) << ", neighbor: " << nachbar << std::endl;
    }
    remove_node_v.clear();
}

int main() {
    uint32_t num_elements = 1u << 3;
    uint32_t i = 0;
    CNetwork Network( num_elements );
    do {
        Network.Validity();
        Network.Clean();
    } while (++i < 2);
    return 0;
}
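I would like to understand:
Is #pragma omp critical a good solution for push_back()? (Does it solve the problem at all?) Would it be better to give each thread its own vector and then combine them (using insert())? Or some kind of CCD_ 2?
In my original code I get a runtime error at
    nachbar = Nachbar->Get_Next_Neighbor( &remove_node_v[i] );
but not in this example. Nevertheless, I would like OpenMP to use as many instances of the CNachbarn class as there are cores, because CNachbarn is a recursive computation and should not be influenced by other threads. The question is how to do this smartly. (I don't think it is wise to construct a CNachbarn every time the for loop starts, because I call this function more than a million times in my simulation and time matters.)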
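Regarding your first question: your function Validity is a perfect way to get below-serial performance out of a parallel loop, because the critical section serializes every push_back() and the threads only add synchronization overhead. However, you have already given the right answer yourself: fill an independent vector in each thread and merge them afterwards.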
inline void CNetwork::Validity() {
    #pragma omp parallel for
    for ( uint32_t i = 0 ; i < num_elements ; i++ ) {
        #pragma omp critical
        remove_node_v.push_back(i);
    }
}
EDIT: A possible remedy could look like this (if you need the elements in serial order, you have to change the loop a bit):
inline void CNetwork::Validity() {
    remove_node_v.reserve(num_elements);
    #pragma omp parallel
    {
        // Each thread fills its own private vector, so no locking is needed here.
        std::vector<uint32_t> remove_node_v_thread_local;
        uint32_t thread_id=omp_get_thread_num();
        uint32_t n_threads=omp_get_num_threads();
        // Cyclic distribution: thread t handles i = t, t + n_threads, ...
        for ( uint32_t i = thread_id ; i < num_elements ; i+=n_threads )
            remove_node_v_thread_local.push_back(i);
        // Only the merge is serialized: one critical section per thread.
        #pragma omp critical
        remove_node_v.insert(remove_node_v.end(), remove_node_v_thread_local.begin(), remove_node_v_thread_local.end());
    }
}
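If the merged vector has to come out in serial order, one way to "change the loop a bit" is sketched below. This is only an illustration under assumptions, not the code from the question: collect_in_order, thread_id, team_size and max_threads are hypothetical names, and the function is a free-standing stand-in for CNetwork::Validity(). Each thread takes one contiguous block of indices, stores its result in a slot indexed by its thread number, and the slots are concatenated serially afterwards, so no critical section is needed.

```cpp
#include <cstdint>
#include <utility>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical wrappers so the sketch also builds (serially) without OpenMP.
inline int thread_id() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}
inline int team_size() {
#ifdef _OPENMP
    return omp_get_num_threads();
#else
    return 1;
#endif
}
inline int max_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Hypothetical stand-in for CNetwork::Validity(): returns 0..num_elements-1 in order.
std::vector<std::uint32_t> collect_in_order(std::uint32_t num_elements) {
    // One slot per possible thread; every thread writes only to its own slot.
    std::vector<std::vector<std::uint32_t>> buckets(max_threads());

    #pragma omp parallel
    {
        const std::uint64_t tid = static_cast<std::uint64_t>(thread_id());
        const std::uint64_t n_threads = static_cast<std::uint64_t>(team_size());
        // Contiguous block [begin, end) for this thread, computed by hand
        // so the ordering does not depend on the OpenMP schedule.
        const std::uint64_t begin = tid * num_elements / n_threads;
        const std::uint64_t end = (tid + 1) * num_elements / n_threads;

        std::vector<std::uint32_t> local;
        local.reserve(end - begin);
        for (std::uint64_t i = begin; i < end; ++i)
            local.push_back(static_cast<std::uint32_t>(i));

        buckets[tid] = std::move(local);
    }

    // Serial merge in thread order keeps the overall index order.
    std::vector<std::uint32_t> result;
    result.reserve(num_elements);
    for (const auto& bucket : buckets)
        result.insert(result.end(), bucket.begin(), bucket.end());
    return result;
}
```

Compared with the cyclic loop above, this trades the critical section for a short serial merge; whether that pays off depends on how expensive the per-element work is.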
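Your second question can be solved by defining an array of CNachbarn whose size is the maximum number of OMP threads, and accessing a different element of the array from each thread, like: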
    CNachbarn* meine_nachbarn = &alle_meine_nachbarn[omp_get_thread_num()];
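A minimal sketch of that idea follows, assuming the instances live in a std::vector<CNachbarn> that is created once and reused across calls. run_clean, thread_id and max_threads are hypothetical names introduced for illustration, and CNachbarn is the toy class from the question. Note that nachbar is a local variable here rather than a shared class member, so threads do not overwrite each other's value.

```cpp
#include <cstdint>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical wrappers so the sketch also builds (serially) without OpenMP.
inline int thread_id() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}
inline int max_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Toy class from the question.
class CNachbarn {
public:
    CNachbarn() : a(0) {}
    std::uint32_t Get_Next_Neighbor() { return a++; }
private:
    std::uint32_t a;
};

// Hypothetical helper: runs `rounds` iterations in parallel and returns the
// per-thread instances so their state can be inspected afterwards.
std::vector<CNachbarn> run_clean(int rounds) {
    // One instance per possible thread, constructed once up front.
    std::vector<CNachbarn> alle_meine_nachbarn(max_threads());

    #pragma omp parallel for
    for (int j = 0; j < rounds; ++j) {
        // Each thread only ever touches the instance at its own index.
        CNachbarn* meine_nachbarn = &alle_meine_nachbarn[thread_id()];
        const std::uint32_t nachbar = meine_nachbarn->Get_Next_Neighbor();
        (void)nachbar;  // placeholder for the real per-neighbor work
    }
    return alle_meine_nachbarn;
}
```

In a real class the vector would be a member built in the constructor, so nothing is constructed inside the hot loop. Small objects stored side by side may share a cache line, so padding each instance could be worth measuring if the per-call work is very short.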