Computer vision - Illegal instruction error while running C++ inception-v3 on TensorFlow
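I am trying to run the image recognition example from the C++ API tutorial. After compiling TensorFlow with Bazel, I get an Illegal instruction error when I try to execute label_image.
I followed these steps: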
# After installing the bazel dependencies, I get the bazel installer
$ mkdir ~/bazel-download && cd ~/bazel-download
$ wget https://github.com/bazelbuild/bazel/releases/download/0.3.0/bazel-0.3.0-installer-linux-x86_64.sh -O bazel-0.3.0-installer-linux-x86_64.sh
$ chmod +x bazel-0.3.0-installer-linux-x86_64.sh
# Install bazel in ~/bin
$ ./bazel-0.3.0-installer-linux-x86_64.sh --user
# Add bazel to the path, if not done already
$ printf '\nexport PATH=$PATH:"~/bin/"\n' >> ~/.bashrc
# Before this, I create a new terminal to refresh the bash PATH
$ mkdir ~/inceptionV3 && cd ~/inceptionV3
# Get a stable version of TensorFlow
$ git clone https://github.com/tensorflow/tensorflow -b r0.9
$ cd tensorflow
# Add the InceptionV3 data/models for the C++ api
$ wget https://storage.googleapis.com/download.tensorflow.org/models/inception_dec_2015.zip -O tensorflow/examples/label_image/data/inception_dec_2015.zip
$ unzip tensorflow/examples/label_image/data/inception_dec_2015.zip -d tensorflow/examples/label_image/data/
# Configure tensorflow: set python path, no Google Cloud Platform support, no GPU support
$ ./configure
# Run bazel build with the allocated resources
$ bazel build -c opt --copt=-mavx --verbose_failures --local_resources 2048,2.0,1.0 -j 1 tensorflow/examples/label_image/...
# -- Here's the last log output from bazel --
INFO: From Compiling tensorflow/core/common_runtime/function.cc:
tensorflow/core/common_runtime/function.cc: In lambda function:
tensorflow/core/common_runtime/function.cc:392:60: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
} else if (rets->size() != ctx->num_outputs()) {
^
INFO: Elapsed time: 6929.927s, Critical Path: 69.23s
# It looks like there was no error during compilation, but now, if I run the generated executable:
$ ./bazel-bin/tensorflow/examples/label_image/label_image
Illegal instruction
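Also, I am running an Ubuntu 14.04.4 LTS x86_64 container on Docker (gcc/g++ version 4.8.4).
I tried other setups as well, such as installing bazel with apt-get install, but the executable from the new build still fails with the Illegal instruction error.
That said, the Python part of the tutorial works fine (with Python 2.7.6). Is there a way to solve the problem with the C++ API?
edit1: (adding more information about the CPU) This is the output I get from /proc/cpuinfo.
edit2: (trying to debug tensorflow) I compiled with this command: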
$ bazel build -c dbg --strip=always --copt=-mavx --verbose_failures --local_resources 2048,2.0,1.0 -j 1 tensorflow/examples/label_image/...
I tried to debug it with gdb:
$ gdb -q bazel-bin/tensorflow/examples/label_image/label_image
Reading symbols from bazel-bin/tensorflow/examples/label_image/label_image...(no debugging symbols found)...done.
(gdb) set disable-randomization off
(gdb) run
Starting program: /root/.cache/bazel/_bazel_root/b54d699ba1afcab684f4628c78701dbe/execroot/tensorflow/bazel-out/local-dbg/bin/tensorflow/examples/label_image/label_image
During startup program terminated with signal SIGILL, Illegal instruction.
(gdb) backtrace
No stack.
(gdb) handle SIGILL nostop
Signal        Stop      Print   Pass to program Description
SIGILL        No        Yes     Yes             Illegal instruction
(gdb) run
Starting program: /root/.cache/bazel/_bazel_root/b54d699ba1afcab684f4628c78701dbe/execroot/tensorflow/bazel-out/local-dbg/bin/tensorflow/examples/label_image/label_image
During startup program terminated with signal SIGILL, Illegal instruction.
(gdb) backtrace
No stack.
(gdb) info files
Symbols from "/root/.cache/bazel/_bazel_root/b54d699ba1afcab684f4628c78701dbe/execroot/tensorflow/bazel-out/local-dbg/bin/tensorflow/examples/label_image/label_image".
Local exec file:
`/root/.cache/bazel/_bazel_root/b54d699ba1afcab684f4628c78701dbe/execroot/tensorflow/bazel-out/local-dbg/bin/tensorflow/examples/label_image/label_image', file type elf64-x86-64.
Entry point: 0x434b10
0x0000000000400270 - 0x000000000040028c is .interp
0x000000000040028c - 0x00000000004002ac is .note.ABI-tag
0x00000000004002ac - 0x00000000004002cc is .note.gnu.build-id
0x00000000004002d0 - 0x0000000000400380 is .gnu.hash
0x0000000000400380 - 0x00000000004027e0 is .dynsym
0x00000000004027e0 - 0x0000000000404667 is .dynstr
0x0000000000404668 - 0x0000000000404970 is .gnu.version
0x0000000000404970 - 0x0000000000404b70 is .gnu.version_r
0x0000000000404b70 - 0x0000000000431360 is .rela.dyn
0x0000000000431360 - 0x00000000004334a8 is .rela.plt
0x00000000004334a8 - 0x00000000004334c2 is .init
0x00000000004334d0 - 0x0000000000434b10 is .plt
0x0000000000434b10 - 0x00000000027cfe2f is .text
0x00000000027cfe30 - 0x00000000027cfe39 is .fini
0x00000000027cfe40 - 0x0000000003890ed0 is .rodata
0x0000000003890ed0 - 0x0000000003acc1ec is .eh_frame_hdr
0x0000000003acc1f0 - 0x000000000441fc2c is .eh_frame
0x000000000441fc2c - 0x000000000444474f is .gcc_except_table
0x0000000004644dd0 - 0x0000000004644de0 is .tdata
0x0000000004644de0 - 0x0000000004644df8 is .tbss
0x0000000004644de0 - 0x0000000004645a70 is .init_array
0x0000000004645a70 - 0x0000000004645a78 is .fini_array
0x0000000004645a78 - 0x0000000004645a80 is .jcr
0x0000000004645a80 - 0x00000000046a5d50 is .data.rel.ro
0x00000000046a5d50 - 0x00000000046a5f90 is .dynamic
0x00000000046a5f90 - 0x00000000046a6000 is .got
0x00000000046a6000 - 0x00000000046a6b30 is .got.plt
0x00000000046a6b40 - 0x00000000046a70d0 is .data
0x00000000046a70e0 - 0x00000000046aae18 is .bss
(gdb) break main
Breakpoint 1 at 0x436cc0
(gdb) run
Starting program: /root/.cache/bazel/_bazel_root/b54d699ba1afcab684f4628c78701dbe/execroot/tensorflow/bazel-out/local-dbg/bin/tensorflow/examples/label_image/label_image
During startup program terminated with signal SIGILL, Illegal instruction.
(gdb) backtrace
No stack.
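So far, since the Illegal instruction error is raised through the SIGILL signal, I suspect that the generated machine code does not match my current architecture. However, I do not know how to deal with this particular problem.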
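One way to probe the suspected mismatch is to look at what the compiler actually emitted. The following is a minimal sketch, assuming objdump (from binutils) is installed in the container; count_ymm_lines is a hypothetical helper, not part of TensorFlow or Bazel. It counts disassembly lines that reference the 256-bit YMM registers, which only AVX-class instructions use. It will miss AVX instructions that operate on XMM registers only, so a zero count is not conclusive.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a disassembly on stdin and print how many lines
# reference a YMM register (a sign that AVX code was generated).
count_ymm_lines() {
    # grep -c prints the count; "|| true" keeps a zero count from looking like a failure.
    grep -c 'ymm[0-9]' || true
}

# Path taken from the build steps above; adjust if the binary lives elsewhere.
binary=./bazel-bin/tensorflow/examples/label_image/label_image
if [ -x "$binary" ]; then
    objdump -d "$binary" | count_ymm_lines
fi
```

A non-zero count on a machine whose CPU does not advertise AVX would be consistent with the SIGILL seen at startup.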
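After a few searches, it seems that --copt=-mavx is actually an argument passed to gcc to optimize for the architecture of OS X machines, as shown here. So it may not be able to work on my Linux "PC" machine.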
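For context, -mavx is a GCC option that allows the compiler to emit AVX instructions on x86. It is not tied to a particular operating system, so the more relevant question is whether the CPU seen inside the container advertises AVX. A minimal sketch of that check follows, assuming a Linux /proc/cpuinfo with a "flags" line; cpu_has_flag is a hypothetical helper name introduced only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given CPU feature flag (e.g. "avx")
# appears in the first "flags" line of a cpuinfo-style file.
cpu_has_flag() {
    local flag="$1"
    local cpuinfo="${2:-/proc/cpuinfo}"
    # Split the flag list into one flag per line and require an exact match,
    # so that "avx" does not match "avx2".
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | cut -d: -f2 | tr ' ' '\n' | grep -qx "$flag"
}

if cpu_has_flag avx; then
    echo "avx: reported by the CPU"
else
    echo "avx: not reported; a binary built with -mavx may raise SIGILL here"
fi
```

If the flag is not reported, the obvious next experiment would be the same bazel build command with --copt=-mavx removed.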