Restarting RPC service

Keywords: service, RPC, restart      Updated: 2023-10-16

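I have a client process that forks a child process to listen for incoming RPCs via svc_run(). What I need to do is kill the child from the parent process, then re-fork the child, giving it a new CLIENT* to a new RPC server.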

Here is the relevant code:

// Client Main
CLIENT* connectionToServer;
int pipe[2];
int childPID;
int parentPID;
static void usr2Signal()
{
  ServerData sd;
  clnt_destroy(connectionToServer);
  (void) read(pipe[0], &sd, sizeof(sd));

  // Kill child process.
  kill(childPID, SIGTERM);
  close(pipe[0]);

  // RPC connection to the new server
    CLIENT *newServerConn =
        clnt_create(
          sd.ip,
          sd.programNum,
          1,
          "tcp");
    if (!newServerConn)
    {
        // Connection error.
        exit(1);
    }
    connectionToServer = newServerConn;

  // Respawn child process.
  if (pipe(pipe) == -1)
  {
      // Pipe error.
      exit(2);
  }
  childPID = fork();
  if (childPID == -1)
  {
    // Fork error.
    exit(3);
  }
  if (childPID == 0)
  {
    // child closes read pipe and listens for RPCs.
      close(pipe[0]);
      parentPID = getppid();
      svc_run();
  }
  else
  {
    // parent closes write pipe and returns to event loop.
    close(pipe[1]);
  }
}
int main(int argc, char *argv[])
{
    /* Some initialization code */
    transp = svctcp_create(RPC_ANYSOCK, 0, 0);
    if (transp == NULL) {
        // TCP connection error.
        exit(1);
    }
    if (!svc_register(transp, /*other RPC program args*/, IPPROTO_TCP))
    {
        // RPC register error
        exit(1);
    }

  connectionToServer = clnt_create(
        "192.168.x.xxx", // Server IP.
        0x20000123,     // Server RPC Program Number
        1,              // RPC Version
        "tcp");
  if (!connectionToServer)
  {
    // Connection error
    exit(1);
  }
  // Spawn child process first time.
  if (pipe(pipe) == -1) 
  {
    // Pipe error
    exit(1);
  }
  childPID = fork();
  if (childPID == -1)
  {
    // Fork error.
    exit(1);
  }
  if (childPID == 0)
  {
    // Close child's read pipe.
    close(pipe[0]);
    parentPID = getppid();
    // Listen for incoming RPCs.
    svc_run ();
    exit (1);
  }

  /* Signal/Communication Code */
  // Close parent write pipe.
  close(pipe[1]);
  // Parent runs in event loop infinitely until a signal is sent.
  eventLoop();
  cleanup();
}

In my server code, I have a service call that initiates the new connection. This call is triggered by other operations on the server.

// Server Services
void newserverconnection_1_svc(int *unused, struct svc_req *s)
{
    // This service is defined in the server code
    ServerData sd;
    /* Fill sd with data:
         Target IP: 192.168.a.aaa
         RPC Program Number: 0x20000321
         ... other data
    */
    connecttonewserver_1(&sd, connectionToServer); // A client service.
}

Back in my client, I have the following service:

// Client Service
void connecttonewserver_1_svc(ServerData *sd, struct svc_req *s)
{
    // Send the new server connection data to the parent client process
    // via the pipe and signal the parent.
    write(pipe[1], sd, sizeof(sd));
    kill(parentPID, SIGUSR2);
}
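
One thing worth checking in the service above: sd is a pointer there, so sizeof(sd) is the size of a pointer, not of the ServerData struct. The parent's read() asks for sizeof(sd) on a struct variable, so it may get a short read and a partly filled struct. Whether this contributes to the hang is not established. The sketch below shows the difference and a write/read pair that moves the whole struct. ServerDataSketch is a hypothetical stand-in, because the real ServerData layout is not shown, and the helper names are illustrative.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in: the real ServerData layout is not shown in the post. */
typedef struct {
    char ip[16];
    unsigned long programNum;
} ServerDataSketch;

/* sizeof on the pointer parameter: the size of a pointer. */
size_t size_of_pointer(const ServerDataSketch *sd)
{
    return sizeof(sd);
}

/* sizeof on the pointed-to object: the size of the whole struct. */
size_t size_of_struct(const ServerDataSketch *sd)
{
    return sizeof(*sd);
}

/* Write the whole struct to fd, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error. */
int send_server_data(int fd, const ServerDataSketch *sd)
{
    const char *p = (const char *)sd;
    size_t left = sizeof(*sd);

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }
    return 0;
}

/* Read exactly one struct from fd, retrying on short reads and EINTR.
 * Returns 0 on success, -1 on error or if the writer closed early. */
int recv_server_data(int fd, ServerDataSketch *out)
{
    char *p = (char *)out;
    size_t left = sizeof(*out);

    while (left > 0) {
        ssize_t n = read(fd, p, left);
        if (n == 0)
            return -1;
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }
    return 0;
}
```

In the service, the equivalent one-line change would be to pass sizeof(*sd) instead of sizeof(sd) to write().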

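My problem is that everything runs fine until I initiate the new connection. I never hit any of the error branches, but about 5 seconds after the new connection is established, my client becomes unresponsive. It doesn't crash, and the child process still appears to be alive, but my client no longer receives RPCs, and it no longer shows any print statements when mouse clicks trigger events defined in the parent's event loop. I'm probably doing something slightly wrong when spawning this new RPC loop for the child process, but I can't see what. Any ideas?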
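
A side note on the kill-then-refork step: the handler sends SIGTERM and forks again straight away, without waiting for the old child. A terminated child stays around as a zombie until the parent reaps it, and nothing confirms it has gone before the replacement starts. Whether that plays a part in the hang is not clear from the post. The sketch below shows the usual kill plus waitpid() sequence. spawn_idle_child is a hypothetical stand-in for the child that runs svc_run().

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that just sleeps in pause(); stand-in for the svc_run() child.
 * Returns the child's PID in the parent, or -1 on error. */
pid_t spawn_idle_child(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        for (;;)
            pause();
    }
    return pid;
}

/* Send SIGTERM to the child and block until it has been reaped.
 * Returns the signal that terminated it, or -1 on error or if the
 * child exited normally instead of being killed by a signal. */
int terminate_and_reap(pid_t pid)
{
    int status;

    if (kill(pid, SIGTERM) == -1)
        return -1;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling terminate_and_reap(childPID) before the next fork() would make the hand-over explicit. Note that waitpid() blocks, so it belongs in normal context rather than inside a signal handler that must return quickly.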

So this solution gets me the result I was looking for, but it is definitely far from perfect.

static void usr2Signal()
{
  ServerData sd;
  // clnt_destroy(connectionToServer); // Removed this as it closes the RPC connection.
  (void) read(pipe[0], &sd, sizeof(sd));

  // Removed these. Killing the child process also seems to close the
  // connection. Just let the child run.
  // kill(childPID, SIGTERM);
  // close(pipe[0]);

  // RPC connection to the new server
    CLIENT *newServerConn =
        clnt_create(
          sd.ip,
          sd.programNum,
          1,
          "tcp");
    if (!newServerConn)
    {
        // Connection error.
        exit(1);
    }
    // This is the only necessary line. Note that the old 
    // connectionToServer pointer was not deregistered/deallocated,
    // so this causes a memory leak, but is a quick fix to my issue.
    connectionToServer = newServerConn;

    // Removed the rest of the code that spawns a new child process
    // as it is not needed anymore.
}
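
On the leak mentioned in the comments: one possible way to avoid it is to publish the new handle first and release the old one afterwards, instead of destroying the old one up front as the original handler did. The sketch below shows that ordering with a generic handle type so it can be run without an RPC library. For a CLIENT* the destructor would presumably wrap clnt_destroy(). Whether the old handle can safely be destroyed here depends on whether anything still uses it, which the post does not settle. replace_handle, counting_destroy and destroyed_count are hypothetical names for illustration only.

```c
#include <stddef.h>
#include <stdlib.h>

/* Generic destructor type; for an RPC client handle this would wrap
 * clnt_destroy() (assumption, not shown in the post). */
typedef void (*handle_destroy_fn)(void *handle);

/* Demo destructor: frees the handle and counts how often it ran. */
int destroyed_count = 0;

void counting_destroy(void *handle)
{
    free(handle);
    destroyed_count++;
}

/* Store `fresh` in *slot, then release the previous handle.
 * If `fresh` is NULL the slot is left untouched, so a failed
 * reconnect never throws away the working handle.
 * Returns 0 on success, -1 if nothing was replaced. */
int replace_handle(void **slot, void *fresh, handle_destroy_fn destroy)
{
    void *old;

    if (slot == NULL || fresh == NULL)
        return -1;
    old = *slot;
    *slot = fresh;
    if (old != NULL && old != fresh && destroy != NULL)
        destroy(old);
    return 0;
}
```

With this ordering, a failed clnt_create() would leave connectionToServer pointing at the old, still valid handle rather than forcing an exit.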