What's the cost of calling a virtual function in a non-polymorphic way?

Last updated: 2023-10-16

I have a pure abstract base class and two derived classes:

#include <iostream>
#include <vector>

struct B { virtual void foo() = 0; };
struct D1 : B { void foo() override { std::cout << "D1::foo()" << std::endl; } };
struct D2 : B { void foo() override { std::cout << "D2::foo()" << std::endl; } };

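Does calling foo at Point A cost the same as calling a non-virtual member function? Or is it more expensive than it would be if D1 and D2 did not derive from B?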

int main() {
 D1 d1; D2 d2; 
 std::vector<B*> v = { &d1, &d2 };
 d1.foo(); d2.foo(); // Point A (polymorphism not necessary)
 for(auto&& i : v) i->foo(); // Polymorphism necessary.
 return 0;
}

Answer: Andy Prowl's answer is correct; I just want to add gcc's assembly output (tested on godbolt: gcc-4.7 -O2 -march=native -std=c++11). The direct function calls compile to:

mov rdi, rsp
call    D1::foo()
mov rdi, rbp
call    D2::foo()

And the polymorphic calls compile to:

mov rdi, QWORD PTR [rbx]
mov rax, QWORD PTR [rdi]
call    [QWORD PTR [rax]]
mov rdi, QWORD PTR [rbx+8]
mov rax, QWORD PTR [rdi]
call    [QWORD PTR [rax]]

However, if the objects do not derive from B and you just make the direct call, gcc inlines the function calls:

mov esi, OFFSET FLAT:.LC0
mov edi, OFFSET FLAT:std::cout
call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)

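Inlining could enable further optimizations if D1 and D2 do not derive from B, so my guess is that no, the two cases are not equivalent (at least for this version of gcc with these optimizations; -O3 produced similar output, without inlining). Is there something that prevents the compiler from inlining when D1 and D2 do derive from B?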

"Fix": use delegates (that is, reimplement virtual functions yourself):

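#include <functional>

struct DG { // Delegate
 std::function<void(void)> foo;
 // c is captured by reference, so the wrapped object must outlive the delegate.
 template<class C> DG(C&& c) { foo = [&](void){c.foo();}; }
};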

Then create a vector of delegates:

std::vector<DG> v = { d1, d2 };

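This allows inlining when you access the methods in a non-polymorphic way. However, I suspect that calling through the vector will be slower than simply using virtual functions, or at best equally fast, because std::function typically relies on virtual functions (or an equivalent indirect call) for type erasure. I have not been able to test this with godbolt yet.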
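The delegate idea above can be sketched as a complete, compilable unit. This is a minimal sketch and not the answer's original code: the names Cat, Dog, Speaker, and speak_all are hypothetical, the methods return strings instead of printing, and the constructor is constrained so that it accepts only lvalues and does not take over copy construction of the delegate itself.

```cpp
#include <functional>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical unrelated types: no common base class, no virtual functions.
struct Cat { std::string speak() const { return "meow"; } };
struct Dog { std::string speak() const { return "woof"; } };

// Hypothetical delegate: type erasure through std::function.
struct Speaker {
    std::function<std::string()> speak;

    // Takes an lvalue only, so a temporary cannot be captured by accident.
    // Speaker itself is excluded so that copying a Speaker uses the
    // implicit copy constructor. The wrapped object must outlive the Speaker.
    template <class T,
              class = typename std::enable_if<
                  !std::is_same<typename std::decay<T>::type, Speaker>::value>::type>
    Speaker(T& obj) : speak([&obj] { return obj.speak(); }) {}
};

// Calls every delegate in order and joins the results with ';'.
std::string speak_all(const std::vector<Speaker>& speakers) {
    std::string out;
    for (const auto& s : speakers) out += s.speak() + ";";
    return out;
}
```

With `Cat c; Dog d; std::vector<Speaker> v = { c, d };`, a direct call such as `c.speak()` involves no indirection at all, while `speak_all(v)` goes through std::function for each element.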

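Does calling foo at Point A cost the same as calling a non-virtual member function?

Yes.

Or is it more expensive than if D1 and D2 did not derive from B?

No.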

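The compiler will resolve these function calls statically, because they are not made through a pointer or a reference. Since the type of the object on which the function is invoked is known at compile time, the compiler knows which implementation of foo() has to be called.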
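The distinction can be illustrated with a small sketch. The types Base, Derived1, and Derived2 and the three helper functions are hypothetical names chosen for this example, and the functions return a tag instead of printing. Whether a particular compiler actually emits a direct call in each case depends on the compiler and its optimization settings.

```cpp
#include <string>

// Hypothetical hierarchy mirroring B, D1, and D2 from the question.
struct Base {
    virtual ~Base() = default;
    virtual std::string name() const = 0;
};
struct Derived1 : Base { std::string name() const override { return "Derived1"; } };
struct Derived2 : Base { std::string name() const override { return "Derived2"; } };

// Call on a named object: the dynamic type is necessarily Derived1,
// so the compiler is free to bind the call at compile time.
std::string call_on_object() {
    Derived1 d;
    return d.name();
}

// Call through a reference to the base: the dynamic type is not known
// in general, so this normally goes through the vtable.
std::string call_through_reference(const Base& b) {
    return b.name();
}

// A qualified call explicitly suppresses virtual dispatch,
// even when made through a reference.
std::string call_qualified(const Derived1& d) {
    return d.Derived1::name();
}
```

All three forms produce the same result for a given object; they differ only in how the callee may be selected.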

The simplest way to find out is to look at the compiler's internals. In Clang, we find canDevirtualizeMemberFunctionCall in lib/CodeGen/CGClass.cpp:

/// canDevirtualizeMemberFunctionCall - Checks whether the given virtual member
/// function call on the given expr can be devirtualized.
static bool canDevirtualizeMemberFunctionCall(const Expr *Base, 
                                              const CXXMethodDecl *MD) {
  // If the most derived class is marked final, we know that no subclass can
  // override this member function and so we can devirtualize it. For example:
  //
  // struct A { virtual void f(); }
  // struct B final : A { };
  //
  // void f(B *b) {
  //   b->f();
  // }
  //
  const CXXRecordDecl *MostDerivedClassDecl = getMostDerivedClassDecl(Base);
  if (MostDerivedClassDecl->hasAttr<FinalAttr>())
    return true;
  // If the member function is marked 'final', we know that it can't be
  // overridden and can therefore devirtualize it.
  if (MD->hasAttr<FinalAttr>())
    return true;
  // Similarly, if the class itself is marked 'final' it can't be overridden
  // and we can therefore devirtualize the member function call.
  if (MD->getParent()->hasAttr<FinalAttr>())
    return true;
  Base = skipNoOpCastsAndParens(Base);
  if (const DeclRefExpr *DRE = dyn_cast<DeclRefExpr>(Base)) {
    if (const VarDecl *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
      // This is a record decl. We know the type and can devirtualize it.
      return VD->getType()->isRecordType();
    }
    return false;
  }
  // We can always devirtualize calls on temporary object expressions.
  if (isa<CXXConstructExpr>(Base))
    return true;
  // And calls on bound temporaries.
  if (isa<CXXBindTemporaryExpr>(Base))
    return true;
  // Check if this is a call expr that returns a record type.
  if (const CallExpr *CE = dyn_cast<CallExpr>(Base))
    return CE->getCallReturnType()->isRecordType();
  // We can't devirtualize the call.
  return false;
}

I believe the code (and the accompanying comments) is self-explanatory :)
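As a rough illustration of the cases that function checks for, here is a minimal sketch with hypothetical types (Shape, Triangle, Quad). It shows calls where the `final` specifier or a temporary object pins down the callee, which is what would let a compiler skip the vtable lookup. It does not claim that any given compiler version performs the optimization.

```cpp
// Hypothetical hierarchy for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

// 'final' on the class: no type can derive from Triangle.
struct Triangle final : Shape {
    int sides() const override { return 3; }
};

// 'final' on the member function: subclasses of Quad may exist,
// but none of them can override sides().
struct Quad : Shape {
    int sides() const final { return 4; }
};

// Through a pointer, yet the dynamic type can only be Triangle.
int triangle_sides(const Triangle* t) { return t->sides(); }

// Through a reference, yet the callee can only be Quad::sides.
int quad_sides(const Quad& q) { return q.sides(); }

// Call on a temporary object expression: the type is known exactly.
int temporary_sides() { return Triangle().sides(); }
```

In each of these functions the target of the call is fixed by the language rules, so the call can be treated like a non-virtual one.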