constexpr depth limit with clang (fconstexpr-depth doesn't seem to work)

Keywords: fconstexpr-depth, doesn't work, clang, depth, constexpr. Updated: 2023-10-16

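Is there any way to configure the constexpr instantiation depth? I am running with -fconstexpr-depth=4096 (using clang/Xcode).

But it still fails to compile, with the error: Constexpr variable fib_1 must be initialized by a constant expression. The code fails whether or not the option -fconstexpr-depth=4096 is set.

Is this a bug in clang, or is it the expected behavior? Note: this works fine up to fib_cxpr(26); 27 is where it starts to fail.

Code: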

constexpr int fib_cxpr(int idx) {
    return idx == 0 ? 0 :
           idx == 1 ? 1 :
           fib_cxpr(idx-1) + fib_cxpr(idx-2); 
}
int main() {
    constexpr auto fib_1 = fib_cxpr(27);
    return 0; 
}

TL;DR:

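For clang, you need the command-line argument -fconstexpr-steps=1271242, and you do not need more than -fconstexpr-depth=27.

The recursive method of computing Fibonacci numbers does not require much recursion depth. The depth required for fib(n) is in fact no more than n, because the longest call chain runs through the fib(i-1) recursive calls.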

constexpr auto fib_1 = fib_cxpr(3); // fails with -fconstexpr-depth=2, works with -fconstexpr-depth=3
constexpr auto fib_1 = fib_cxpr(4); // fails with -fconstexpr-depth=3, works with -fconstexpr-depth=4

So we can conclude that -fconstexpr-depth is not the setting that matters here.
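To make the depth argument concrete, here is a small sketch that models the deepest call chain of the question's fib_cxpr. The helpers max_of and fib_call_depth are hypothetical names introduced only for this illustration, and the model assumes that every nested call counts as one level toward the limit, which is consistent with the two experiments above.

```cpp
// Hypothetical helper: the larger of two ints (kept C++11-friendly).
constexpr int max_of(int a, int b) { return a > b ? a : b; }

// Models the deepest chain of nested calls started by fib_cxpr(idx).
// Assumption: the outermost call counts as level 1.
constexpr int fib_call_depth(int idx) {
    return idx <= 1
        ? 1
        : 1 + max_of(fib_call_depth(idx - 1), fib_call_depth(idx - 2));
}

static_assert(fib_call_depth(3) == 3, "fib_cxpr(3) nests 3 calls deep");
static_assert(fib_call_depth(4) == 4, "fib_cxpr(4) nests 4 calls deep");
static_assert(fib_call_depth(15) == 15, "the depth grows linearly with idx");
```

Under this model the depth for idx = 27 is only 27, far below the 4096 from the question.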

In addition, the error messages also point to a difference:

constexpr auto fib_1 = fib_cxpr(27);

Compiled with -fconstexpr-depth=26, to make sure we hit that limit, clang produces the message:

note: constexpr evaluation exceeded maximum depth of 26 calls

But compiling with -fconstexpr-depth=27, which is enough depth, produces this message:

note: constexpr evaluation hit maximum step limit; possible infinite loop?

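So we know that clang distinguishes between two failures: recursion depth and a "step limit".

The top Google results for "clang maximum step limit" lead to pages about the clang patch that implements this feature, including the implementation of the command-line option -fconstexpr-steps. Searching further for this option suggests that there is no user-level documentation.

So there is no documentation about what clang counts as a "step" or how many "steps" fib(27) requires. We could just set the limit really high, but I think that is a bad idea. Instead, some experimentation shows: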

n : steps
0 : 2
1 : 2
2 : 6
3 : 10
4 : 18

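This indicates that steps(fib(n)) == steps(fib(n-1)) + steps(fib(n-2)) + 2. A bit of calculation shows that, based on this, fib(27) should require 1,271,242 of clang's steps. So compiling with -fconstexpr-steps=1271242 should allow the program to compile, which indeed it does. Compiling with -fconstexpr-steps=1271241 results in the same error as before, so we know we have an exact limit.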
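The calculation can be reproduced with a short sketch. It only encodes the recurrence observed above, with the measured base values 2 and 2; it does not query clang in any way, and steps_from and modeled_steps are names made up for this illustration.

```cpp
// a = steps(k) and b = steps(k + 1) for some k.
// Shifting the pair n times leaves steps(k + n) in a.
constexpr long long steps_from(int n, long long a, long long b) {
    return n == 0 ? a : steps_from(n - 1, b, a + b + 2);
}

// Modeled step count for fib_cxpr(n), assuming
// steps(n) = steps(n - 1) + steps(n - 2) + 2 and steps(0) = steps(1) = 2.
constexpr long long modeled_steps(int n) {
    return steps_from(n, 2, 2);
}

static_assert(modeled_steps(2) == 6, "matches the measured table");
static_assert(modeled_steps(4) == 18, "matches the measured table");
static_assert(modeled_steps(27) == 1271242, "value passed to -fconstexpr-steps");
```

With that number, the build line would look something like clang++ -fconstexpr-steps=1271242 main.cpp, where main.cpp is a placeholder file name.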

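Another, less precise approach is to observe from the patch that the default step limit is 1,048,576 (2²⁰), which is obviously enough for fib(26). Intuitively, doubling that should be plenty, and from the earlier analysis we know that two million is plenty. A tight limit would be about φ · steps(fib(26)), where φ is the golden ratio; with steps(fib(26)) = 785,670 that comes to roughly 1,271,241, within a step or two of the exact 1,271,242.

Another thing to note is that these results clearly show that clang does not do any memoization of constexpr evaluation. GCC does, but it appears that this is not implemented in clang at all. Although memoization increases the memory requirements, sometimes, as in this case, it can vastly reduce the time required for evaluation. The two conclusions I draw from this are that writing constexpr code that depends on memoization for good compile times is not a good idea for portable code, and that clang could be improved with support for constexpr memoization and a command-line option to enable or disable it.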
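One way to avoid depending on compiler memoization, shown here as an illustration that is not part of the original question, is to carry two consecutive Fibonacci values through the recursion so that the number of calls grows linearly with idx. The names fib_linear and fib_linear_step are hypothetical, and int overflow for large idx is deliberately not handled in this sketch.

```cpp
// current = F(k) and next = F(k + 1) for some k.
// Each call advances the pair by one position until nothing remains.
constexpr int fib_linear_step(int remaining, int current, int next) {
    return remaining == 0
        ? current
        : fib_linear_step(remaining - 1, next, current + next);
}

// Linear-call replacement for the doubly recursive fib_cxpr.
constexpr int fib_linear(int idx) {
    return fib_linear_step(idx, 0, 1);
}

static_assert(fib_linear(0) == 0, "F(0)");
static_assert(fib_linear(1) == 1, "F(1)");
static_assert(fib_linear(27) == 196418, "F(27), the value from the question");
```

Because each call makes only one further call, the evaluation needs on the order of idx calls rather than the exponential number made by the original.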

You can also refactor your Fibonacci algorithm to include explicit memoization, which will work in clang.

// Copyright 2021 Google LLC.
// SPDX-License-Identifier: Apache-2.0
#include <iostream>
template <int idx>
constexpr int fib_cxpr();
// This constexpr template value acts as the explicit memoization for the fib_cxpr function.
template <int i>
constexpr int kFib = fib_cxpr<i>();
// Arguments cannot be used in constexpr contexts (like the if constexpr),
// so idx is refactored as a template value argument instead.
template <int idx>
constexpr int fib_cxpr() {
    if constexpr (idx == 0 || idx == 1) {
        return idx;
    } else {
        return kFib<idx-1> + kFib<idx-2>;
    }      
}
int main() {
    constexpr auto fib_1 = fib_cxpr<27>();
    std::cout << fib_1 << "\n";
    return 0; 
}

This version works for arbitrary inputs to fib_cxpr and compiles with only 4 steps. https://godbolt.org/z/9cvz3hbaE

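This does not directly answer the question, but I apparently do not have enough reputation to add it as a comment…

It is not related to the "depth limit", but it is strongly related to the calculation of Fibonacci numbers.

Recursion is probably the wrong approach and is not needed.

There is an ultra-fast solution with a very low memory footprint.

We can simply precompute, at compile time, all Fibonacci numbers that fit into a 64-bit value.

An important property of the Fibonacci series is that its values grow exponentially. So all existing built-in integer data types overflow rather quickly.

With Binet's formula you can calculate that the 93rd Fibonacci number is the last one that fits into a 64-bit unsigned value.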
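The bound can also be checked directly instead of via Binet's formula. The following sketch swaps in a plain overflow test: it walks the sequence in 64-bit unsigned arithmetic and stops just before the next value would wrap around. The names last_fitting_index and kLastFittingIndex are hypothetical, and the indexing assumes F(1) = F(2) = 1.

```cpp
#include <cstdint>
#include <limits>

// prev = F(idx - 1) and cur = F(idx). The next value prev + cur would
// overflow exactly when cur > max - prev, so report idx in that case.
constexpr int last_fitting_index(std::uint64_t prev, std::uint64_t cur, int idx) {
    return cur > std::numeric_limits<std::uint64_t>::max() - prev
        ? idx
        : last_fitting_index(cur, prev + cur, idx + 1);
}

// Start from F(0) = 0 and F(1) = 1.
constexpr int kLastFittingIndex = last_fitting_index(0, 1, 1);

static_assert(kLastFittingIndex == 93,
              "F(93) is the last Fibonacci number that fits in 64 bits");
```

The test is written as a subtraction so that the check itself never overflows.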

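Calculating 93 values during compilation is a really simple task.

We first define the default approach for calculating a Fibonacci number as a constexpr function: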

// Constexpr function to calculate the nth Fibonacci number
constexpr unsigned long long getFibonacciNumber(size_t index) noexcept {
    // Initialize the first two Fibonacci numbers, F(0) = 0 and F(1) = 1
    unsigned long long f1{ 0 }, f2{ 1 };
    // calculating Fibonacci value 
    while (index--) {
        // get next value of Fibonacci sequence 
        unsigned long long f3 = f2 + f1;
        // Move to next number
        f1 = f2;
        f2 = f3;
    }
    // After `index` iterations f2 holds F(index + 1), so index 0 yields 1
    return f2;
}

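With that, Fibonacci numbers can easily be calculated at compile time. Next, we fill a std::array with all the Fibonacci numbers. For this we again use constexpr, and make it a template with a variadic parameter pack.

We use std::integer_sequence to create one Fibonacci number for each of the indices 0, 1, 2, 3, 4, 5, ...

That is straightforward and not complicated: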

template <size_t... ManyIndices>
constexpr auto generateArrayHelper(std::integer_sequence<size_t, ManyIndices...>) noexcept {
    return std::array<unsigned long long, sizeof...(ManyIndices)>{ { getFibonacciNumber(ManyIndices)... } };
}

This function is fed an integer sequence 0, 1, 2, 3, 4, ... and returns a std::array<unsigned long long, ...> with the corresponding Fibonacci numbers.
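For readers new to the pattern, here is the same pack expansion applied to a simpler function. This is an illustrative sketch that needs C++14 or later: square and makeSquares are hypothetical names, and std::index_sequence<...> is used as the standard shorthand for std::integer_sequence<std::size_t, ...>.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical per-element function, standing in for getFibonacciNumber.
constexpr std::size_t square(std::size_t i) { return i * i; }

// The pack Is... expands to square(0), square(1), ..., one call per index.
template <std::size_t... Is>
constexpr std::array<std::size_t, sizeof...(Is)>
makeSquares(std::index_sequence<Is...>) {
    return {{ square(Is)... }};
}

// make_index_sequence<5> produces the indices 0, 1, 2, 3, 4.
constexpr auto kSquares = makeSquares(std::make_index_sequence<5>{});

static_assert(kSquares.size() == 5, "one element per index");
static_assert(kSquares[0] == 0, "the sequence starts at index 0");
static_assert(kSquares[4] == 16, "the last index is 4, not 5");
```

The size of the resulting array is taken from the pack itself via sizeof...(Is), so the element count and the index list cannot drift apart.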

We know that we can store at most 93 values. So we write a next function that calls the one above with the integer sequence 0, 1, 2, ..., 91, 92, like so:

constexpr auto generateArray() noexcept {
    return generateArrayHelper(std::make_integer_sequence<size_t, MaxIndexFor64BitValue>());
}

And now, finally,

constexpr auto FIB = generateArray();

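gives us a compile-time std::array<unsigned long long, 93> named FIB that contains all the Fibonacci numbers. If we need the i-th Fibonacci number, we can simply write FIB[i] (with this function's indexing, FIB[0] and FIB[1] are both 1). No calculation happens at runtime.

I do not think there is a faster way to calculate the n-th Fibonacci number.

Please see the complete program below: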

#include <iostream>
#include <array>
#include <utility>
// ----------------------------------------------------------------------
// All the following will be done during compile time
// Constexpr function to calculate the nth Fibonacci number
constexpr unsigned long long getFibonacciNumber(size_t index) {
    // Initialize the first two Fibonacci numbers, F(0) = 0 and F(1) = 1
    unsigned long long f1{ 0 }, f2{ 1 };
    // calculating Fibonacci value 
    while (index--) {
        // get next value of Fibonacci sequence 
        unsigned long long f3 = f2 + f1;
        // Move to next number
        f1 = f2;
        f2 = f3;
    }
    // After `index` iterations f2 holds F(index + 1), so index 0 yields 1
    return f2;
}
// We will automatically build an array of Fibonacci numbers at compile time
// Generate a std::array with n elements 
template <size_t... ManyIndices>
constexpr auto generateArrayHelper(std::integer_sequence<size_t, ManyIndices...>) noexcept {
    return std::array<unsigned long long, sizeof...(ManyIndices)>{ { getFibonacciNumber(ManyIndices)... } };
}
// Number of Fibonacci values that fit in a 64-bit unsigned value (Binet's formula)
constexpr size_t MaxIndexFor64BitValue = 93;
// Generate the required number of elements
constexpr auto generateArray() noexcept {
    return generateArrayHelper(std::make_integer_sequence<size_t, MaxIndexFor64BitValue>());
}
// This is a constexpr array of all Fibonacci numbers
constexpr auto FIB = generateArray();
// ----------------------------------------------------------------------
// Test
int main() {
    // Print all possible Fibonacci numbers
    for (size_t i{}; i < MaxIndexFor64BitValue; ++i)
        std::cout << i << "\t--> " << FIB[i] << '\n';
    return 0;
}

Developed and tested with Microsoft Visual Studio Community 2019, Version 16.8.2.

Additionally compiled and tested with clang 11.0 and gcc 10.2.

Language: C++17