How to calculate GFLOPS for a function in a C++ program?

Keywords: calculate, function, GFLOPS, program, C++. Last updated: 2023-10-16

I have some C++ code that computes the factorial of an int, the sum of two floats, and the execution time of each function, as shown below:

#include <windows.h> // Sleep()
#include <ctime>
#include <iostream>
using namespace std;

class Sample_C
{
public:
    long factorial(int n);
    float add(float a, float b);
};

long Sample_C::factorial(int n)
{
    long fact = 1;
    for (int counter = 1; counter <= n; counter++)
    {
        fact = fact * counter;
    }
    Sleep(100);
    return fact;
}
float Sample_C::add(float a, float b)
{
    return a + b;
}
int main()
{
    Sample_C object;
    clock_t start = clock();
    object.factorial(6);
    clock_t end = clock();
    // Execution time of factorial(), in clock ticks (divide by CLOCKS_PER_SEC for seconds)
    double time = (double)(end - start);
    cout << time << endl;
    clock_t starts = clock();
    object.add(1.1f, 5.5f);
    clock_t ends = clock();
    // Execution time of add(), in clock ticks
    double total_time = (double)(ends - starts);
    cout << total_time << endl;
    return 0;
}

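Now I want to measure the GFLOPS of the add function, so please suggest how I can calculate it. I am completely new to GFLOPS, so please also tell me: can GFLOPS only be calculated for functions that use the float data type? And does the GFLOPS value vary from function to function?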

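If I were interested in estimating the execution time of the addition operation, I might start with the following program. However, I would still only trust the number it produces to within a factor of 10 to 100 at best (i.e. I don't really trust the output of this program).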

#include <iostream>
#include <ctime>
int main (int argc, char** argv)
{
  // Declare these as volatile so the compiler (hopefully) doesn't
  // optimise them away.
  volatile float a = 1.0;
  volatile float b = 2.0;
  volatile float c;
  // Perform the calculation multiple times to account for a clock()
  // implementation that doesn't have a sufficient timing resolution to
  // measure the execution time of a single addition.
  const unsigned iter = 1000;
  // Estimate the execution time of adding a and b and storing the
  // result in the variable c.
  // Depending on the compiler we might need to count this as 2 additions
  // if we count the loop variable.
  clock_t start = clock();
  for (unsigned i = 0; i < iter; ++i)
  {
    c = a + b;
  }
  clock_t end = clock();
  // Write the time for the user
  std::cout << (end - start) / ((double) CLOCKS_PER_SEC * iter)
      << " seconds" << std::endl;
  return 0;
}

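If you know how your particular architecture executes this code, you could then try to estimate FLOPS from the execution time, but the estimate of FLOPS (for this kind of operation) probably won't be very accurate.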
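As a rough illustration of that conversion: GFLOPS is the number of floating-point operations divided by the elapsed seconds, divided again by 1e9. The sketch below is only one way to do it. The helper names `gflops` and `time_additions` are made up for this example, and it uses `std::chrono::steady_clock` instead of `clock()`, which is my substitution rather than part of the program above. It counts one floating-point operation per loop iteration and ignores loop overhead, so treat the result as an order-of-magnitude figure.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: convert an operation count and an elapsed time
// into GFLOPS (1 GFLOPS = 1e9 floating-point operations per second).
double gflops(std::uint64_t flop_count, double seconds)
{
    assert(seconds > 0.0);
    return static_cast<double>(flop_count) / seconds / 1e9;
}

// Hypothetical helper: run `iter` float additions and return the elapsed
// wall-clock time in seconds. The operands are volatile so the compiler
// (hopefully) keeps every addition.
double time_additions(std::uint64_t iter)
{
    volatile float a = 1.0f;
    volatile float b = 2.0f;
    volatile float c = 0.0f;
    auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iter; ++i)
    {
        c = a + b;
    }
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

Calling `gflops(iter, time_additions(iter))` with a large `iter` gives an estimate for this one loop only. A different function, data type or compiler flag can give a very different number, which is why GFLOPS varies from function to function.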

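One improvement to this program would be to replace the for loop with a macro implementation, or to make sure the compiler unrolls the loop inline. Otherwise, you may also be including the addition on the loop index in your measurement.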
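A minimal sketch of the macro idea, assuming it is acceptable to keep a short outer loop and amortise its cost over many additions. `REPEAT_10`, `REPEAT_100` and `seconds_per_addition` are names invented for this example, and the timing again uses `std::chrono::steady_clock` rather than `clock()`.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical macros: paste a statement 10 or 100 times with no loop.
#define REPEAT_10(stmt) stmt stmt stmt stmt stmt stmt stmt stmt stmt stmt
#define REPEAT_100(stmt) REPEAT_10(REPEAT_10(stmt))

// Time `blocks` groups of 100 unrolled additions and return the average
// number of seconds per addition. The loop index is incremented only once
// per 100 additions, so its cost is a small part of the measurement.
double seconds_per_addition(unsigned blocks)
{
    assert(blocks > 0);
    volatile float a = 1.0f;
    volatile float b = 2.0f;
    volatile float c = 0.0f;
    auto start = std::chrono::steady_clock::now();
    for (unsigned i = 0; i < blocks; ++i)
    {
        REPEAT_100(c = a + b;)
    }
    auto end = std::chrono::steady_clock::now();
    double elapsed = std::chrono::duration<double>(end - start).count();
    return elapsed / (100.0 * blocks);
}
```

Because the operands are volatile, each addition still involves loads and a store, so this measures more than the bare floating-point add.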

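I think the error probably does not scale linearly with problem size. For example, if the operation you are trying to time took 1e9 to 1e15 times longer, you might be able to get a decent estimate of GFLOPS. But unless you know exactly what your compiler and architecture are doing with your code, I would not feel confident trying to estimate GFLOPS in a high-level language like C++. Assembly might work better (just a hunch).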
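One way to act on the point about problem size is to keep growing the workload until the timed region is long compared with the timer's resolution. The sketch below doubles the iteration count until the loop runs for at least a requested duration. The `Measurement` struct, the `measure_for_at_least` function and the starting count of 1000 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical result type: how many additions were run and how long they took.
struct Measurement
{
    std::uint64_t iterations;
    double seconds;
};

// Double the iteration count until one timed run lasts at least
// `min_seconds`, then return that run's iteration count and duration.
Measurement measure_for_at_least(double min_seconds)
{
    assert(min_seconds > 0.0);
    volatile float a = 1.0f;
    volatile float b = 2.0f;
    volatile float c = 0.0f;
    std::uint64_t iter = 1000;
    for (;;)
    {
        auto start = std::chrono::steady_clock::now();
        for (std::uint64_t i = 0; i < iter; ++i)
        {
            c = a + b;
        }
        auto end = std::chrono::steady_clock::now();
        double elapsed = std::chrono::duration<double>(end - start).count();
        if (elapsed >= min_seconds)
        {
            return Measurement{iter, elapsed};
        }
        iter *= 2; // run too short to trust: try a larger workload
    }
}
```

Dividing `iterations` by `seconds` and by 1e9 then gives a GFLOPS figure for this loop, with the same caveats about loop overhead and memory traffic as before.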

I'm not saying it can't be done, but for an accurate estimate there are a lot of things you may need to take into account.