Testing Function for speed performance in CPP

Keywords: testing, function, performance, speed, CPP. Last updated: 2023-10-16

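>I've created a simple function that converts any lowercase letter a-z to uppercase. This might not be a real problem, but every test returns 0. If I add a system("pause"), I see a new value that reflects the length of the pause.

Is there a more accurate way to test for speed, or is this result actually correct? I'd like to compare it against other functions to see whether it converts faster than the standard ones.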

char* ToUppercase(char* Input)
{
    int Len = Length(Input);
    for (int i = 0; i < Len; i++)
    {
        short keycode = static_cast<short>(Input[i]);
        if (keycode >= 97 && keycode <= 122)
            Input[i] -= 32;
    }
    return Input;
}

The timer I'm currently using for the test (written by someone else) is:

template<typename TimeT = std::chrono::milliseconds>
struct measure
{
    template<typename F, typename ...Args>
    static typename TimeT::rep execution(F func, Args&&... args)
    {
        auto start = std::chrono::system_clock::now();
        func(std::forward<Args>(args)...);
        auto duration = std::chrono::duration_cast< TimeT>
            (std::chrono::system_clock::now() - start);
        return duration.count();
    }
};

To call it I use:

void Debug()
{
    char Buffer[10000] = "aaaa /..../ aaaa";
    MyStringControl::ToUppercase(Buffer);
}
int main()
{
    std::cout << measure<std::chrono::nanoseconds>::execution(Debug);
}

Have you looked at std::chrono::high_resolution_clock?

Here's an example:

#include <iostream>
#include <chrono>
#include <utility>   // std::forward
template<typename TimeT = std::chrono::milliseconds>
struct measure
{
    template<typename F, typename ...Args>
    static typename TimeT::rep execution(F func, Args&&... args)
    {
        auto start = std::chrono::high_resolution_clock::now();
        func(std::forward<Args>(args)...);
        auto duration = std::chrono::duration_cast< TimeT>
                (std::chrono::high_resolution_clock::now() - start);
        return duration.count();
    }
};
int total = 0;
void test()
{
    int foo = 0;
    for (int i=0; i<1000; ++i) ++foo;
    total += foo;
}
int main ()
{
    using namespace std::chrono;
    for (int i = 0; i < 30; ++i)
    {
        total = 0;
        auto t = measure<std::chrono::nanoseconds>::execution(test);
        std::cout << "Calculated total = " << total << " in " << t << " ns." << std::endl;
    }    
    return 0;
}

This gives:

Calculated total = 1000 in 64 ns.
Calculated total = 1000 in 21 ns.
Calculated total = 1000 in 22 ns.
Calculated total = 1000 in 21 ns.
Calculated total = 1000 in 14 ns.
Calculated total = 1000 in 15 ns.
Calculated total = 1000 in 13 ns.
Calculated total = 1000 in 14 ns.
Calculated total = 1000 in 13 ns.
Calculated total = 1000 in 14 ns.
Calculated total = 1000 in 13 ns.
Calculated total = 1000 in 21 ns.
Calculated total = 1000 in 14 ns.
Calculated total = 1000 in 15 ns.
Calculated total = 1000 in 14 ns.
Calculated total = 1000 in 15 ns.
Calculated total = 1000 in 22 ns.
Calculated total = 1000 in 21 ns.
Calculated total = 1000 in 20 ns.
Calculated total = 1000 in 14 ns.
Calculated total = 1000 in 14 ns.
Calculated total = 1000 in 14 ns.
Calculated total = 1000 in 20 ns.
Calculated total = 1000 in 20 ns.
Calculated total = 1000 in 21 ns.
Calculated total = 1000 in 20 ns.
Calculated total = 1000 in 15 ns.
Calculated total = 1000 in 15 ns.
Calculated total = 1000 in 15 ns.
Calculated total = 1000 in 14 ns.

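Run the function 1,000,000 times, then divide the total by 1,000,000. You could use a high-precision timer instead, but hardware quirks make a single reading more prone to inaccuracy.

Edit:

You want to make 1,000,000 calls to the function itself and read the timer only around the whole loop, not around each call: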

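    auto start = std::chrono::system_clock::now();
    for (size_t counter = 0; counter < 1000000; ++counter)
        func(args...);  // no std::forward: forwarding in a loop could move from args repeatedly
    // Divide in the clock's native resolution before casting,
    // so a coarse TimeT does not truncate the total first.
    auto duration = std::chrono::duration_cast<TimeT>(
        (std::chrono::system_clock::now() - start) / 1000000);
    return duration.count();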
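
As a self-contained variant of the same idea, here is a minimal sketch that returns the average time per call as a floating-point number of nanoseconds. The name average_ns_per_call is hypothetical and not part of the original measure struct, and std::chrono::steady_clock is swapped in for system_clock on the assumption that a monotonic clock is preferable for measuring intervals.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not from the original post): runs func `iterations`
// times and returns the average time per call in nanoseconds.
template <typename F>
double average_ns_per_call(F func, std::size_t iterations)
{
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
        func();
    const auto elapsed = clock::now() - start;
    // Convert to floating-point nanoseconds so the division keeps fractions.
    const std::chrono::duration<double, std::nano> total = elapsed;
    return total.count() / static_cast<double>(iterations);
}
```

A call such as average_ns_per_call(Debug, 1000000) would then report a fractional per-call figure instead of truncating to a whole number of ticks.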

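Your Debug function has no observable effect, and your compiler can probably figure that out, so all you end up timing is how fast two consecutive calls to now() run.

Do something to make sure the code you are trying to time is not optimized away. For example, use its output in some way, or mark it __attribute__((noinline)) (a GCC/Clang-specific attribute, and only if you don't mind also timing the cost of the actual function call), or something similar.

(Also, if you want any useful precision from the timing, the function needs to run for much longer than the clock's resolution.)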
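
To get a feel for what the clock resolution is on a given machine, something like the following rough sketch may help. Both helper names are hypothetical. tick_ns only reports the nominal unit of the clock's representation, while smallest_observed_step estimates empirically how finely the clock actually advances; it is intended for a monotonic clock such as std::chrono::steady_clock.

```cpp
#include <chrono>

// Nominal tick length of a clock in nanoseconds, from its compile-time period.
// This is the unit of the clock's representation, not a guarantee of how often
// the underlying hardware counter actually advances.
template <typename Clock>
constexpr double tick_ns()
{
    using period = typename Clock::period;
    return 1e9 * static_cast<double>(period::num) / static_cast<double>(period::den);
}

// Hypothetical helper: smallest non-zero step seen between consecutive now()
// calls, a rough empirical estimate of the usable resolution.
template <typename Clock>
typename Clock::duration smallest_observed_step(int samples = 1000)
{
    auto best = Clock::duration::max();
    for (int i = 0; i < samples; ++i) {
        const auto t0 = Clock::now();
        auto t1 = Clock::now();
        while (t1 == t0)  // spin until the clock visibly advances
            t1 = Clock::now();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}
```

If smallest_observed_step<std::chrono::steady_clock>() comes out at tens of nanoseconds, a function that itself takes tens of nanoseconds cannot be timed meaningfully with a single call, which is the reason for looping many times.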