Why Eigen vector dot is slower than manual for loop

Keywords: loop, Eigen, vector, why. Updated: 2023-10-16

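I am new to Eigen.

I tested the performance of Eigen's vector dot product and found that it is slower than a hand-written loop.

The code is as follows: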

#include <Eigen/Dense>
#include <ctime>   // clock(), clock_t
#include <iostream>
#include <vector>
int main()
{
  Eigen::VectorXf neu1 = Eigen::VectorXf::Random(100000000);
  std::vector<float> x(100000000);
  for(int i = 0; i < 100000000; ++i)
    x[i] = neu1[i];
  clock_t t1 = clock();
  float r = 0.0f;
  for(int i = 0; i < 100000000; ++i)
    r += x[i]*x[i];
  clock_t t2 = clock();
  std::cout<<"time: "<<t2-t1<<std::endl;
  t1 = clock();
  r = neu1.dot(neu1);
  t2 = clock();
  std::cout<<"time: "<<t2-t1<<std::endl;
}

The results are:

g++ test.cpp -otest -I/usr/local/include/eigen/
time: 1070000
time: 1910000
g++ test.cpp -otest -I/usr/local/include/eigen/ -Ofast -march=native
time: 0
time: 50000

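Also, #define EIGEN_NO_DEBUG does not seem to have any effect.

I would expect Eigen to be optimized, so there should be no reason for it to be slower than the loop.

Am I doing something wrong?

Or, how can I optimize Eigen's performance?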

thx

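You never do anything with the result of the first computation before assigning to r again, so the first computation is optimized away entirely. You can fix this by printing the value of r after each computation: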

#include <iostream>
#include <vector>
#include <eigen3/Eigen/Core>
#include <time.h>
int main()
{
  Eigen::VectorXf neu1 = Eigen::VectorXf::Random(100000000);
  std::vector<float> x(100000000);
  for(int i = 0; i < 100000000; ++i)
    x[i] = neu1[i];
  clock_t t1 = clock();
  float r = 0.0f;
  for(int i = 0; i < 100000000; ++i)
    r += x[i]*x[i];
  clock_t t2 = clock();
  std::cout<<"time: "<<t2-t1<<std::endl;
  std::cout << r << std::endl;
  t1 = clock();
  r = neu1.dot(neu1);
  t2 = clock();
  std::cout<<"time: "<<t2-t1<<std::endl;  
  std::cout << r << std::endl;
  return 0;
}

Here is the output of a sample run:

/tmp $ g++ -Wall -Wextra -pedantic -O3 -std=c++14 bla.cpp
/tmp $ ./a.out 
time: 272958
1.67772e+07
time: 29003
3.29441e+07

or

/tmp $ g++ -Wall -Wextra -pedantic -Ofast -std=c++14 bla.cpp
/tmp $ ./a.out 
time: 29953
3.23292e+07
time: 28853
3.29441e+07

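This change does not make your benchmark any good, but the results are no longer horribly wrong.

You should still consider taking the average over multiple runs with different data sets. Also, do not generate different test data on every run of the program, because your results would not be reproducible.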

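Finally, as noted below, the difference between the two results is probably due to overflow and/or rounding errors. The first sum, 1.67772e+07, is consistent with 2^24 = 16777216, the point at which adding a value no larger than 1 no longer changes a float accumulator. Consider switching to double precision or shortening the array, then run the test again.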

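You are running into over-optimization: the compiler is smarter than you and optimizes the loop computation away.

I get these timings on my machine: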

time: 0
time: 23422

If you need to make sure that something is actually read or written in a benchmark, use volatile:

#include <Eigen/Dense>
#include <ctime>   // clock(), clock_t
#include <iostream>
#include <vector>
int main()
{
  Eigen::VectorXf neu1 = Eigen::VectorXf::Random(100000000);
  std::vector<float> x(100000000);
  for(int i = 0; i < 100000000; ++i)
    x[i] = neu1[i];
  clock_t t1 = clock();
  float temp = 0.0f;
  for(int i = 0; i < 100000000; ++i)
    temp += x[i]*x[i];
  volatile float result = temp;  // volatile store: temp must really be computed
  clock_t t2 = clock();
  std::cout<<"time: "<<t2-t1<<std::endl;
  t1 = clock();
  result = neu1.dot(neu1);
  t2 = clock();
  std::cout<<"time: "<<t2-t1<<std::endl;
}

With that change, the timings on my machine become:

time: 79060
time: 21542