C++ Tensorflow API with TensorRT
My goal is to run a TensorRT-optimized TensorFlow graph in a C++ application. I am using TensorFlow 1.8 and TensorRT 4. Using the Python API, I am able to optimize the graph and see a nice performance increase.
Trying to run the graph in C++ fails with the following error:
Not found: Op type not registered 'TRTEngineOp' in binary running on e15ff5301262. Make sure the Op and Kernel are registered in the binary running in this process.
Other, non-TensorRT graphs work. I got a similar error with the Python API, but solved it by importing tensorflow.contrib.tensorrt. From the error I am fairly certain the kernel and op are not registered, but I don't know how to do that in my application after TensorFlow has been built. As a side note, I cannot use Bazel but have to use CMake. So far I link against libtensorflow_cc.so and libtensorflow_framework.so.
Can anyone help me here? Thanks!
Update: Loading _trt_engine_op.so with the C or C++ API raises no error on load, but running the graph fails with:
Invalid argument: No OpKernel was registered to support Op 'TRTEngineOp' with these attrs. Registered devices: [CPU,GPU], Registered kernels:
<no registered kernels>
[[Node: my_trt_op3 = TRTEngineOp[InT=[DT_FLOAT, DT_FLOAT], OutT=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], input_nodes=["tower_0/down_0/conv_0/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer", "tower_0/down_0/conv_skip/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer"], output_nodes=["tower_0/down_0/conv_skip/Relu", "tower_0/down_1/conv_skip/Relu", "tower_0/down_2/conv_skip/Relu", "tower_0/down_3/conv_skip/Relu"], serialized_engine="220{I 00...00 00 00"](tower_0/down_0/conv_0/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, tower_0/down_0/conv_skip/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer)]]
Another way to solve the "Not found: Op type not registered 'TRTEngineOp'" problem on Tensorflow 1.8:
1. In the file tensorflow/contrib/tensorrt/BUILD, add a new section with the following content:
cc_library(
    name = "trt_engine_op_kernel_cc",
    srcs = [
        "kernels/trt_calib_op.cc",
        "kernels/trt_engine_op.cc",
        "ops/trt_calib_op.cc",
        "ops/trt_engine_op.cc",
        "shape_fn/trt_shfn.cc",
    ],
    hdrs = [
        "kernels/trt_calib_op.h",
        "kernels/trt_engine_op.h",
        "shape_fn/trt_shfn.h",
    ],
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        ":trt_logging",
        ":trt_plugins",
        ":trt_resources",
        "//tensorflow/core:gpu_headers_lib",
        "//tensorflow/core:lib_proto_parsing",
        "//tensorflow/core:stream_executor_headers_lib",
    ] + if_tensorrt([
        "@local_config_tensorrt//:nv_infer",
    ]) + tf_custom_op_library_additional_deps(),
    alwayslink = 1,  # buildozer: disable=alwayslink-with-hdrs
)
2. Add //tensorflow/contrib/tensorrt:trt_engine_op_kernel_cc as a dependency to the corresponding BAZEL project you want to build.
PS: With this approach there is no need to load the library _trt_engine_op.so with TF_LoadLibrary.
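For step 2, a minimal sketch of what the consuming BUILD target could look like (the target name and source file are hypothetical, only the tensorrt dependency comes from the answer):

```
tf_cc_binary(
    name = "my_trt_app",       # hypothetical target name
    srcs = ["my_trt_app.cc"],  # hypothetical source file
    deps = [
        "//tensorflow/contrib/tensorrt:trt_engine_op_kernel_cc",
        "//tensorflow/core:tensorflow",
    ],
)
```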
Here are my findings (and some kind of solution) for this problem (Tensorflow 1.8.0, TensorRT 3.0.4):
I wanted to include the TensorRT support in a library which loads a graph from a given *.pb file.
Just adding //tensorflow/contrib/tensorrt:trt_engine_op_kernel to my Bazel BUILD file did not do the trick for me. I still got a message indicating that the Ops were not registered:
2018-05-21 12:22:07.286665: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "TRTCalibOp" device_type: "GPU"') for unknown op: TRTCalibOp
2018-05-21 12:22:07.286856: E tensorflow/core/framework/op_kernel.cc:1242] OpKernel ('op: "TRTEngineOp" device_type: "GPU"') for unknown op: TRTEngineOp
2018-05-21 12:22:07.296024: E tensorflow/examples/tf_inference_lib/cTfInference.cpp:56] Not found: Op type not registered 'TRTEngineOp' in binary running on ***.
Make sure the Op and Kernel are registered in the binary running in this process.
The solution was that I had to load the Ops library (tf_custom_op_library) within my C++ code using the C API:
#include "tensorflow/c/c_api.h"
...
TF_Status* status = TF_NewStatus();
TF_LoadLibrary("_trt_engine_op.so", status);
The shared object _trt_engine_op.so is created for the bazel target //tensorflow/contrib/tensorrt:python/ops/_trt_engine_op.so:
bazel build --config=opt --config=cuda --config=monolithic \
  //tensorflow/contrib/tensorrt:python/ops/_trt_engine_op.so
Now I only have to make sure that _trt_engine_op.so is available whenever it is needed, e.g. via LD_LIBRARY_PATH.
If anybody has an idea how to do this in a more elegant way (why do we have to build two artifacts? Can't we just have one?), I'm happy about every suggestion.
TLDR:
1. Add //tensorflow/contrib/tensorrt:trt_engine_op_kernel as a dependency to the corresponding BAZEL project you want to build.
2. Load the ops library _trt_engine_op.so in your code using the C API.
For Tensorflow r1.8, the additions shown below in two BUILD files, plus building libtensorflow_cc.so with the monolithic option, worked for me.
diff --git a/tensorflow/BUILD b/tensorflow/BUILD
index cfafffd..fb8eb31 100644
--- a/tensorflow/BUILD
+++ b/tensorflow/BUILD
@@ -525,6 +525,8 @@ tf_cc_shared_object(
         "//tensorflow/cc:scope",
         "//tensorflow/cc/profiler",
         "//tensorflow/core:tensorflow",
+        "//tensorflow/contrib/tensorrt:trt_conversion",
+        "//tensorflow/contrib/tensorrt:trt_engine_op_kernel",
     ],
 )

diff --git a/tensorflow/contrib/tensorrt/BUILD b/tensorflow/contrib/tensorrt/BUILD
index fd3582e..a6566b9 100644
--- a/tensorflow/contrib/tensorrt/BUILD
+++ b/tensorflow/contrib/tensorrt/BUILD
@@ -76,6 +76,8 @@ cc_library(
     srcs = [
         "kernels/trt_calib_op.cc",
         "kernels/trt_engine_op.cc",
+        "ops/trt_calib_op.cc",
+        "ops/trt_engine_op.cc",
     ],
     hdrs = [
         "kernels/trt_calib_op.h",
@@ -86,6 +88,7 @@ cc_library(
     deps = [
         ":trt_logging",
         ":trt_resources",
+        ":trt_shape_function",
         "//tensorflow/core:gpu_headers_lib",
         "//tensorflow/core:lib_proto_parsing",
         "//tensorflow/core:stream_executor_headers_lib",
As you mentioned, it should work once you add //tensorflow/contrib/tensorrt:trt_engine_op_kernel to the dependency list. Currently the Tensorflow-TensorRT integration is still in progress and may work well only with the Python API; for C++ you need to call ConvertGraphDefToTensorRT() from tensorflow/contrib/tensorrt/convert/convert_graph.h for the conversion.
Let me know if you have any questions.
Solution: add the import
from tensorflow.python.compiler.tensorrt import trt_convert as trt
Linked discussion: https://github.com/tensorflow/tensorflow/issues/26525
Here is my solution; TensorFlow version is 1.14. In your BUILD file, e.g. tensorflow/examples/your_workspace/BUILD:
In tf_cc_binary:
srcs = [..., "//tensorflow/compiler/tf2tensorrt:ops/trt_engine_op.cc"]
deps = [..., "//tensorflow/compiler/tf2tensorrt:trt_op_kernels"]