Deadlock in Master-Slave model with MPI

Keywords: deadlock, model, MPI      Updated: 2023-10-16

I'm trying to implement a master/slave model with MPI, but I've run into a small problem.

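What I want is this: the slaves should wait for an order from the master and do no work until the master issues one. The master should send the order to all slaves at once, wait for all of them to finish it, and then send the next order to all slaves at once.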

For example, with 3 processes (1 master, 2 slaves) and two rounds of orders sent to the slaves, I want the program to print:

Master initialization done.
Master sends order to slave 1
Master sends order to slave 2
Slave 1 got the order from master
Slave 2 got the order from master
Master got response from Slave 1
Master got response from Slave 2
_________________________________
Master sends order to slave 1
Master sends order to slave 2
Slave 1 got the order from master
Slave 2 got the order from master
Master got response from Slave 1
Master got response from Slave 2
All done.

Here is what I have so far.

int count = 0;
int number;
if (procnum == 0) {
    // initialize master, slaves shouldn't be working until this ends
    std::cout << "Master initialization done." << endl;
    while (count < 2) {
        for (int i = 1; i < numprocesses; i++) {
            number = i * 2;
            std::cout << "Master sends order to slave " << i << endl;
            MPI_Send(&number, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
            MPI_Recv(&number, 1, MPI_INT, i, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            std::cout << "Master got response from Slave " << i << endl;
        }
        count++;
    }
    std::cout << "All done" << endl;
} else {
    int received;
    MPI_Recv(&received, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    std::cout << "Slave " << procnum << " got the order from master" << endl;
    MPI_Send(&received, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
}

But this is what I get:

Master initialization done.
Master sends order to slave 1
Slave 1 got the order from master
Master got response from Slave 1
Master sends order to slave 2
Slave 2 got the order from master
Master got response from Slave 2
Master sends order to slave 1

Then it hangs. What am I doing wrong?

for (int i = 1; i < size; i++) {

should be

for (int i = 1; i <= size; i++) {

Edit: never mind, the original loop bound is fine, because size is 3 (including the master).

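About the ordering: MPI_Send and MPI_Recv are blocking calls, and the master sends to a slave and then immediately waits for that same slave's reply before moving on to the next one. So the one-slave-at-a-time output is probably what you should expect from this code.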

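If the master blocks in the second round, it is because the slave never responds: each slave receives one order, replies once, and then leaves the else branch, so nobody is listening when the master sends again. The while (count < 2) loop should wrap both the master code and the slave code.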