PMPI and OTF2: linking C code in a C++ program

Keywords: linking, code, program, CPP, OTF2, PMPI. Updated: 2023-10-16

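I have a C++ program generated by wrap.py. wrap.py produces wrappers for MPI programs: it redirects every ordinary MPI call to the corresponding PMPI call, so that the calls can be intercepted for performance analysis. The generated code can be downloaded here. I use OTF2 to trace the MPI program.

The relevant part of the code: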

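// test4.cpp
#include <otf2/otf2.h>

// File-scope state shared by both hooks (declarations not shown in the
// original excerpt; assumed to look like this).
static OTF2_Archive* archive = NULL;
static bool is_init = false;

// Runs when the shared library is loaded: open the OTF2 archive once.
__attribute__((constructor)) void init(void)
{
  if(!is_init)
  {
    archive = OTF2_Archive_Open( "./",
                                 "ArchiveTest",
                                 OTF2_FILEMODE_WRITE,
                                 1024 * 1024 /* event chunk size */,
                                 4 * 1024 * 1024 /* def chunk size */,
                                 OTF2_SUBSTRATE_POSIX,
                                 OTF2_COMPRESSION_NONE );
    is_init = true;
  }
}

// Runs when the shared library is unloaded: close the archive.
__attribute__((destructor))  void fini(void)
{
  if(is_init)
  {
    OTF2_Archive_Close( archive );
    is_init = false;
  }
}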

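I compile this code into a .so file. When the library is loaded, the constructor is called; when the .so is unloaded, the destructor is called.

Following the official OTF2 documentation (here), I compiled the program: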
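To illustrate the load-time behaviour described above, here is a minimal sketch that does not depend on OTF2 or MPI. It assumes GCC or Clang, since `__attribute__((constructor))` and `__attribute__((destructor))` are compiler extensions; the names `on_load`, `on_unload` and `hook_ran_before_main` are hypothetical and exist only for this demo.

```cpp
#include <cstdio>

// Constant-initialized flag, so no dynamic initializer can overwrite the
// value written by the constructor hook.
static bool hook_ran = false;

// Runs when the containing executable or shared object is loaded,
// that is, before main() starts.
__attribute__((constructor)) static void on_load()
{
    hook_ran = true;
}

// Runs when the containing object is unloaded or the process exits normally.
__attribute__((destructor)) static void on_unload()
{
    std::fprintf(stderr, "on_unload: library is being unloaded\n");
}

// Hypothetical helper: reports whether the constructor hook has already run.
bool hook_ran_before_main()
{
    return hook_ran;
}
```

Calling `hook_ran_before_main()` as the first statement of `main()` should already return `true`, which is the same mechanism that lets an `LD_PRELOAD`ed library set up tracing before the MPI program starts.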

mpic++ -fpic -c `otf2-config --cflags` -o test4.o test4.cpp
mpic++ -shared -o libtest4.so `otf2-config --ldflags` `otf2-config --libs` test4.o

Expanding the backtick substitutions in the commands above gives:

mpic++ -fpic -c -I/usr/include -o test4.o test4.cpp
mpic++ -shared -o libtest4.so -L/usr/lib -lotf2 -lm test4.o

The MPI program being intercepted comes from here.

Running the interception:

$ mpirun -n 2 -x LD_PRELOAD=./libtest4.so ./send_recv
./send_recv: symbol lookup error: ./libtest4.so: undefined symbol: OTF2_Archive_Open
./send_recv: symbol lookup error: ./libtest4.so: undefined symbol: OTF2_Archive_Open
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
  Process name: [[20246,1],0]
  Exit code:    127
--------------------------------------------------------------------------

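So it looks as though mixing C and C++ causes the problem: the linker does not seem to produce the right symbols for the C functions OTF2_Archive_Open and OTF2_Archive_Close.

I added two declarations to tell the linker that these are C functions (download the modified program):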

_EXTERN_C_ OTF2_Archive* OTF2_Archive_Open ( const char *  archivePath,
const char *  archiveName,
const OTF2_FileMode   fileMode,
const uint64_t  chunkSizeEvents,
const uint64_t  chunkSizeDefs,
const OTF2_FileSubstrate  fileSubstrate,
const OTF2_Compression  compression
);
_EXTERN_C_ OTF2_ErrorCode OTF2_Archive_Close ( OTF2_Archive *  archive );

But the problem persists. Any suggestions?

Update 1: OTF2 provides a .a file, not a .so file.

$ nm /usr/lib/libotf2.a| grep -i OTF2_Archive_Open
                 U otf2_archive_open
0000000000000000 T OTF2_Archive_Open
                 U otf2_archive_open_def_files
00000000000032e0 T OTF2_Archive_OpenDefFiles
                 U otf2_archive_open_evt_files
00000000000030e0 T OTF2_Archive_OpenEvtFiles
                 U otf2_archive_open_snap_files
00000000000034e0 T OTF2_Archive_OpenSnapFiles
                 U OTF2_Archive_Open
0000000000001180 T otf2_archive_open
0000000000005a40 T otf2_archive_open_def_files
                 U OTF2_Archive_OpenDefFiles
0000000000005880 T otf2_archive_open_evt_files
                 U OTF2_Archive_OpenEvtFiles
0000000000005c00 T otf2_archive_open_snap_files
                 U OTF2_Archive_OpenSnapFiles

$ ldd ./libtest4.so
    linux-vdso.so.1 =>  (0x00007ffe3a6ce000)
    libmpi_cxx.so.1 => /usr/lib/libmpi_cxx.so.1 (0x00007f4757d67000)
    libmpi.so.12 => /usr/lib/libmpi.so.12 (0x00007f4757a91000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f475770e000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f47574f8000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f475712e000)
    libibverbs.so.1 => /usr/lib/libibverbs.so.1 (0x00007f4756f1e000)
    libopen-rte.so.12 => /usr/lib/libopen-rte.so.12 (0x00007f4756ca4000)
    libopen-pal.so.13 => /usr/lib/libopen-pal.so.13 (0x00007f4756a07000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f47567e9000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f47564e0000)
    /lib64/ld-linux-x86-64.so.2 (0x00005620bef03000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f47562dc000)
    libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007f47560a1000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f4755e99000)
    libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f4755c96000)
    libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007f4755a8a000)
    libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007f4755880000)

$ nm ./libtest4.so | grep -i OTF2_Archive_Open
                 U OTF2_Archive_Open

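Strangely, I do not see libotf2.a anywhere in the ldd output. However, the standard OTF2 MPI writer example from their website works, and the ldd output for that example does not contain libotf2.a either.

You can find the example here.

The order of linking matters. Your own object files must come before the libraries they depend on, for example: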

mpic++ -shared test4.o -o libtest4.so `otf2-config --ldflags` `otf2-config --libs`

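The linker resolves undefined symbols from left to right: a static archive is only searched for symbols that are still undefined at the point where it appears on the command line, so an archive listed before test4.o contributes nothing. See this answer for more details. If otf2.a was not built with -fPIC, this may still not work. I suggest configuring OTF2 with --enable-shared and using the .so instead.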
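The left-to-right rule can be sketched with a toy model. This is not how a real linker is implemented: each archive is treated as a single member, and the `Input` type and `unresolved_after_link` function are hypothetical names made up for the illustration. It only shows why `-lotf2` placed before `test4.o` leaves `OTF2_Archive_Open` undefined. With `-shared`, undefined symbols are allowed by default, so the link itself succeeds and the error only shows up when the library is loaded.

```cpp
#include <set>
#include <string>
#include <vector>

// One linker input in the toy model: an object file or a static archive.
struct Input {
    bool is_archive;                  // true for libfoo.a, false for foo.o
    std::set<std::string> defines;    // symbols this input provides
    std::set<std::string> needs;      // symbols this input references
};

// Scans the inputs left to right and returns the symbols that are still
// undefined after the last input has been processed.
std::set<std::string> unresolved_after_link(const std::vector<Input>& inputs)
{
    std::set<std::string> defined;
    std::set<std::string> undefined;

    for (const Input& in : inputs) {
        if (in.is_archive) {
            // An archive is only consulted for symbols undefined right now.
            bool useful = false;
            for (const std::string& sym : in.defines) {
                if (undefined.count(sym) != 0) {
                    useful = true;
                }
            }
            if (!useful) {
                continue;  // nothing is pulled in; the archive is not revisited
            }
        }
        for (const std::string& sym : in.defines) {
            defined.insert(sym);
            undefined.erase(sym);
        }
        for (const std::string& sym : in.needs) {
            if (defined.count(sym) == 0) {
                undefined.insert(sym);
            }
        }
    }
    return undefined;
}
```

Feeding the model an archive that defines `OTF2_Archive_Open` and `OTF2_Archive_Close` followed by an object that needs them leaves both symbols unresolved, while the reverse order resolves them, which mirrors the difference between the original command and the corrected one.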