Will memory be allocated on the heap to support nested binding of temporary objects to const references?

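Keywords: const reference, binding, nested, temporary object, support, allocation, memory. Updated: 2023-10-16

Consider the following code, which binds a temporary object to a const reference in a "nested" manner: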

#include <string>
std::string foo()
{
    return "abc";
}
std::string goo()
{
    const std::string & a = foo();
    return a;
}
int main()
{
    // Is a temporary allocated on the heap to support this, even for a moment?
    const std::string & b = goo();
}

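I've been trying to understand what the compiler must do in terms of memory storage in order to support this kind of "nested" structure.

I suspect that for the call to foo(), memory allocation is straightforward: storage for the std::string will be allocated on the stack when function foo() exits.

But what must the compiler do to support storage for the object referenced by b? The stack for function goo must unwind and be "replaced with" the object on the stack that b refers to, but in order to unwind goo's stack, will the compiler be required to momentarily create a copy of the object on the heap (before copying it back to the stack at a different location)?

Or is it possible for the compiler to meet the requirements of this construct without allocating any storage on the heap, even for a moment?

Or is it even possible for the compiler to use the same storage location for the object referenced by b as for the object referenced by a, without doing any extra allocation on either the stack or the heap?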

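I think there's an intermediate step that you're not considering, which is that you're not binding b to a, but to a copy of a. And this isn't due to any fancy memory shenanigans!

goo returns by value and so, by all the usual mechanisms, that value is available within the full-expression inside main. It will be either in main's stack frame, or somewhere else, or (in this contrived case) possibly optimised out entirely.

The only magic here is that the value is kept alive in main until b goes out of scope, because b is a reference-to-const (rather than being destroyed straight away at the end of the full-expression).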

So, does the heap come into it in any way? Well, if you have a heap, no. If you mean the free store, then, still, no.

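Here is an example of what the C++ standard permits a compiler to rebuild your code as. I'm using full NRVO. Note the use of placement new, which is a moderately obscure C++ feature: you pass new a pointer, and it constructs the result there instead of in the free store.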

#include <new>     // placement new
#include <string>
void __foo(void* __construct_std_string_at)
{
    new(__construct_std_string_at) std::string("abc");
}
void __goo(void* __construct_std_string_at)
{
    __foo(__construct_std_string_at);
}
int main()
{
    // Raw storage in main's frame, suitably aligned for a std::string.
    alignas(std::string) unsigned char __buff[sizeof(std::string)];
    // Is a temporary allocated on the heap to support this, even for a moment?
    __goo(&__buff[0]);
    const std::string & b = *reinterpret_cast<std::string*>(&__buff[0]);
    // ... more code here using b I assume
    // end of scope destructor (std::string is std::basic_string<char>):
    reinterpret_cast<std::string*>(&__buff[0])->~basic_string();
}

If we blocked NRVO in goo, it would instead look like this:

#include <new>     // placement new
#include <string>
void __foo(void* __construct_std_string_at)
{
    new(__construct_std_string_at) std::string("abc");
}
void __goo(void* __construct_std_string_at)
{
    // Local storage for the temporary that a is bound to.
    alignas(std::string) unsigned char __buff[sizeof(std::string)];
    __foo(&__buff[0]);
    const std::string & a = *reinterpret_cast<std::string*>(&__buff[0]);
    // The return statement copy-constructs into the caller's buffer.
    new(__construct_std_string_at) std::string(a);
    // end of scope destructor (std::string is std::basic_string<char>):
    reinterpret_cast<std::string*>(&__buff[0])->~basic_string();
}
int main()
{
    alignas(std::string) unsigned char __buff[sizeof(std::string)];
    // Is a temporary allocated on the heap to support this, even for a moment?
    __goo(&__buff[0]);
    const std::string & b = *reinterpret_cast<std::string*>(&__buff[0]);
    // ... more code here using b I assume
    // end of scope destructor:
    reinterpret_cast<std::string*>(&__buff[0])->~basic_string();
}

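Basically, the compiler knows the lifetime of the references. So it can create "anonymous variables" that store the actual instance of the object, and then create references to it.

I'd also note that when you call a function, you effectively (implicitly) pass in a pointer to the buffer where the return value goes. So the called function constructs the object "in place" in the caller's scope.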

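With NRVO, a named variable in the called function's scope is actually constructed in the calling function's "where the return value goes" location, which makes returning easy. Without it, you do everything locally, and then in the return statement you copy your return value into the storage addressed by the implicit return-value pointer, via the equivalent of placement new.

Nothing needs to be done on the heap (also known as the free store), because the lifetimes are all easily provable and stack-ordered.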

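The original foo and goo with the expected signatures must still exist, because they have external linkage, until they are possibly discarded once it is discovered that nobody uses them.

All variables and functions starting with __ exist for exposition only. The compiler or execution environment no more needs a named variable than you need a name for a red blood cell. (In theory, since __ is reserved, a compiler that did such a translation pass before compiling would probably be legal, and if you actually used those variable names and your code failed to compile, it would be your fault and not the compiler's, but... that would be a pretty hacky compiler. ;)

In theory, since goo (and foo) return by value, a copy of the object that a refers to will be returned (and placed on the stack). That copy will have its lifetime extended by b until the end of b's scope.

I think the main point you're missing is that you're returning by value. That means that after foo or goo returns, whatever was inside them makes no difference: you're left with a temporary string, which you then bind to a const reference.

In practice, it's all quite likely to be optimized away.

No, there will be no dynamic allocation for the lifetime extension. The common implementation is equivalent to the following code transformation: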

std::string goo()
{
    std::string __compiler_generated_tmp = foo();
    const std::string & a = __compiler_generated_tmp;
    return a;
}

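There is no need for dynamic allocation, because the lifetime is only extended for as long as the reference is alive, and by the C++ lifetime rules that ends at the end of the current scope. By placing an unnamed variable (__compiler_generated_tmp in the code above) in the scope, the usual lifetime rules apply and do what you expect.

In std::string goo(), the std::string is returned by value.

When the compiler sees you call this function in main(), it notes that the return value is a std::string and allocates space on main's stack for a std::string.

When goo() returns, the reference a inside goo() is no longer valid, but the std::string that a refers to is copied into the space reserved on the stack in main().

Several optimizations are possible in this case; you can read about what a compiler can do here.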