constexpr performing worse at runtime

Keywords: performance, runtime, constexpr. Last updated: 2023-10-16

I wrote the code below to compare the time taken by a constexpr factorial with that of a normal implementation:

#include<iostream>
#include<chrono>
constexpr long int factorialC(long int x){  return x*(x <2?1 : factorialC(x-1));}
using ns = std::chrono::nanoseconds;
using get_time = std::chrono::steady_clock;
void factorial(long int x){
    long int suma=1;
    for(long int i=1; i<=x;i++)
    {
        suma=suma*i;
    }
    std::cout<<suma<<std::endl;
}
int main(){
    long int x = 13;
    std::cout<<"Now calling the constexpr"<<std::endl;
    auto start1 = get_time::now();
    std::cout<<factorialC(x)<<std::endl;
    auto end1 = get_time::now();
    std::cout<<"Now calling the normal"<<std::endl;
    auto start2 = get_time::now();
    factorial(x);
    auto end2 = get_time::now();
    std::cout<<"Elapsed time for constexpr is "<<std::chrono::duration_cast<ns>(end1-start1).count()
    <<" Elapsed time for normal is "<<std::chrono::duration_cast<ns>(end2-start2).count()<<std::endl;
}

When I run the code, I get:

Now calling the constexpr                                                                                                   
1932053504                                                                                                                  
Now calling the normal                                                                                                      
1932053504                                                                                                                  
Elapsed time for constexpr is 81812 Elapsed time for normal is 72428

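But the time for the constexpr version should be close to 0, since the value should already have been computed at compile time.

Surprisingly, the constexpr computation takes longer than the normal factorial. I tried to follow a related question, but I could not understand the answer in my context.

Please help me understand this.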

I compiled the code (the file name is constexpr.cpp) with:

g++ --std=c++11 constexpr.cpp

V2:

After @rici's input, I changed line 18 (the declaration of x in main) to:

const long int x =13;

The result is now:

Now calling the constexpr                                                                                                   
1932053504                                                                                                                  
Now calling the normal                                                                                                      
1932053504                                                                                                                  
Elapsed time for constexpr is 114653 Elapsed time for normal is 119052

It seems that once I declare x as const, the compiler computes factorialC at compile time.

I am using g++ version 4.9.3 from MinGW32 on Windows.

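The problem is that a constexpr function is not guaranteed to be evaluated at compile time. The keyword constexpr only says that it can be; the compiler remains free to evaluate the call at run time if it sees fit.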

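The difference in run time is probably because 1) you are not doing enough work (a single call is nothing) and 2) recursion is not as fast as iteration (I think, although the difference is small).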

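To guarantee compile-time evaluation, you have to use the call in a context where the compiler is required to evaluate it at compile time, such as a template argument: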

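template<unsigned long long n>
unsigned long long evaluate() { return n; } // explicit return type: a deduced `auto` return type needs C++14
//...
// x must be usable in a constant expression here, e.g. `constexpr long int x = 13;`
auto start1 = get_time::now();
std::cout << evaluate<factorialC(x)>() << std::endl; // factorialC is evaluated
                                                     // at compile time
auto end1 = get_time::now();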

The standard library also provides an equivalent of evaluate: the class template std::integral_constant.

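long int x = 13; does not make x a constant expression, so the compiler is not required to compute factorialC(x); at compile time.

Try passing it a constant value, such as a constexpr variable, so that the computation can be done at compile time: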

int main(){
    long int x = 13;
    constexpr long y = 13;
    std::cout << "Now calling the constexpr" << std::endl;
    auto start1 = get_time::now();
    // Notice the use of a constexpr value here!
    std::cout << factorialC(y) << std::endl;
    auto end1 = get_time::now();

    std::cout << "Now calling the normal" << std::endl;
    auto start2 = get_time::now();
    // Simply call your function with a runtime value.
    // Try to ensure that the compiler doesn't inline the obvious value of x.
    std::cout << factorialC(x) << std::endl;
    auto end2 = get_time::now();
    std::cout << "Elapsed time for constexpr is "
        << std::chrono::duration_cast<ns>(end1-start1).count()
        << " Elapsed time for normal is "
        << std::chrono::duration_cast<ns>(end2-start2).count()
        << std::endl;
}

By the way, when talking about performance, you should compare apples with apples.

Note that it cannot be computed at compile time, because the compiler does not know the value of long int x inside this function:

 constexpr long int factorialC(long int x)

If you need a compile-time factorial, you can use templates instead. For example:

 #include <iostream>
 template<int N> inline long long factorial(){ return N*factorial<N-1>(); } // long long: 13! does not fit in a 32-bit int
 template<> inline long long factorial<1>(){ return 1; }
 int main()
 {
     std::cout << factorial<13>() << std::endl;
     return 0;
 }