
How to measure the time of switching process context in Linux using C++?


I need to measure the time of a context switch using C++ means. I know I can simply call the C functions from C++ code, but the task is to avoid using C where possible. I have searched the Internet and have only found approaches that use C. Are there any ways to work with the OS in C++? Are there any analogs of pipe(...) from unistd.h or sched_setaffinity(...) from sched.h?

Update 2017-06-30: example code added

Are there any ways to work with OS in C++?

All the C functions you cite are accessible with a direct include. Example:

#include "pthread.h"

In a C++ compile they auto-magically get extern "C"'d.

Your link will need -lrt and -pthread on Linux.

Any analogs of pipe(...) from unistd.h, sched_setaffinity(...) 

They are not analogs; your build links to the real "C" Linux functions.
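
As an illustration (mine, not part of the original question or answer), here is a minimal sketch of calling the two functions named above directly from C++. The Linux headers already carry extern "C" guards, so no wrapper is needed; compile with something like g++ -O2 demo.cpp -o demo.

#include <sched.h>      // sched_setaffinity(), cpu_set_t, CPU_ZERO, CPU_SET
#include <unistd.h>     // pipe(), close()
#include <cerrno>
#include <cstring>
#include <iostream>

int main()
{
    // g++ predefines _GNU_SOURCE, which sched_setaffinity() and the CPU_* macros require
    int fds[2];
    if (::pipe(fds) != 0) {                        // a plain Linux C call from C++
        std::cerr << "pipe: " << std::strerror(errno) << '\n';
        return 1;
    }

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);                              // pin the calling thread to CPU 0
    if (::sched_setaffinity(0, sizeof(set), &set) != 0) {
        std::cerr << "sched_setaffinity: " << std::strerror(errno) << '\n';
        return 1;
    }

    std::cout << "pipe fds " << fds[0] << ',' << fds[1] << "  pinned to CPU 0\n";
    ::close(fds[0]);
    ::close(fds[1]);
    return 0;
}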


I need to measure the time of context switching using C++ means.

I measure a duration by repeating some action for 1 to 10 seconds and counting how many times the loop completes.

In my most recent minor benchmark, written entirely in C++ (but without C++11 features), I

  • build a linked list of nodes
  • each node has its own thread
  • each thread owns 2 pointers to pthread_mutex-based semaphores (input and output)
  • each thread body waits for its input semaphore to be signaled (semTake())
  • upon waking, the thread body signals its output semaphore (semGive()) and does almost nothing else
  • the semaphores of the N threads are handed out to the node threads, and the loop is closed at the end of the list (i.e. the last list node's output semaphore handle points to the first list node's input semaphore handle)

  • the main task starts the chain reaction with semGive(), waits 10 seconds (using usleep), then sets a flag that every thread can see. (A minimal sketch of such a ring follows this list.)
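
A minimal sketch of such a ring, with the caveat that this is my reconstruction rather than the original benchmark code: it uses POSIX sem_t and std::thread in place of the original pthread_mutex-based semTake()/semGive() wrappers and raw pthreads. Build with something like g++ -std=c++11 ring.cpp -o ring -pthread.

#include <semaphore.h>   // POSIX sem_t, sem_init, sem_wait, sem_post
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>
#include <vector>

static const int kNodes = 4;                       // ring size; the original used a linked list of N nodes
static sem_t sems[kNodes];                         // sems[i] is node i's input; node i signals sems[(i+1)%kNodes]
static std::atomic<bool>          done(false);     // main -> threads: cease and desist
static std::atomic<unsigned long> switches(0);     // wake-ups, i.e. enforced context switches

static void node(int id)
{
    while (!done.load(std::memory_order_relaxed)) {
        sem_wait(&sems[id]);                       // 'semTake()': wait for the input signal
        switches.fetch_add(1, std::memory_order_relaxed);
        sem_post(&sems[(id + 1) % kNodes]);        // 'semGive()': signal the next node in the ring
    }
}

int main()
{
    for (int i = 0; i < kNodes; ++i) sem_init(&sems[i], 0, 0);   // all start 'taken'

    std::vector<std::thread> ring;
    for (int i = 0; i < kNodes; ++i) ring.emplace_back(node, i);

    auto t0 = std::chrono::steady_clock::now();
    sem_post(&sems[0]);                            // main starts the chain reaction
    std::this_thread::sleep_for(std::chrono::seconds(10));
    done = true;                                   // the flag every thread can see
    for (int i = 0; i < kNodes; ++i) sem_post(&sems[i]);         // unblock any waiter so it can exit

    for (auto& t : ring) t.join();
    auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                  std::chrono::steady_clock::now() - t0).count();

    std::cout << switches.load() << " switches in " << us << " us, "
              << static_cast<double>(us) / static_cast<double>(switches.load())
              << " us per switch\n";

    for (int i = 0; i < kNodes; ++i) sem_destroy(&sems[i]);
    return 0;
}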


An example run on my 6 year old Dell.


Compilation started at Wed Jan 15 22:31:33
./lmbm101
lmbm101: context-switch duration .. wait up to 10 seconds while measuring.
switch enforced using pthread_mutex semaphores
C5   bogomips:  5210.77   5210.77  
686.56  kilo  m_thread_switch invocations in 10.88 sec   (10000088 us)
68.6554  kilo  m_thread_switch events per second
14.5655  u seconds per m_thread_switch event
pid = 12188
now (52d760af): 22:31:43
bdtod 2014/01/15 22:31:43  minod=1351  iod=91  secod=81103  soi=104

I did this minor benchmark before the C++11 release. The code compiles as C++11, but does not use C++11 tasks...


Update 2017-06-30 - an overdue update...

I wrote this example code in 2017-04. I now tend to use std::vector for all sorts of things; the earlier measurement did not. Similar technique, but with simplified result reporting.

#include <chrono>
#include <iomanip>
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>
// see EngFormat-Cpp-master.zip
#ifndef                 ENG_FORMAT_H_INCLUDED
#include "../../bag/src/eng_format.hpp"        // to_engineering_string(), from_engineering_string()
#endif
#include <cassert>

#include <semaphore.h>  // Note 1 - Ubuntu / Posix feature access, see PPLSEM_t

namespace DTB // doug's test box
{
// Note 2 - typedefs to simplify chrono access
// 'compressed' chrono access --------------vvvvvvv
typedef std::chrono::high_resolution_clock  HRClk_t; // std-chrono-hi-res-clk
typedef HRClk_t::time_point                 Time_t;  // std-chrono-hi-res-clk-time-point
typedef std::chrono::nanoseconds            NS_t;    // std-chrono-nanoseconds
typedef std::chrono::microseconds           US_t;    // std-chrono-microseconds
typedef std::chrono::milliseconds           MS_t;    // std-chrono-milliseconds
using   namespace std::chrono_literals;          // support suffixes like 100ms, 2s, 30us
// examples:
//   Time_t testStart_us = HRClk_t::now();
//   auto  testDuration_us = std::chrono::duration_cast<US_t>(HRClk_t::now() - testStart_us);
//   auto         count_us =       testDuration_us.count();
//   or
//   std::cout << "  complete " << testDuration_us.count() << " us" << std::endl;

// C++ access to Linux semaphore via Posix
// Posix Process Semaphore, set to Local mode (unnamed, unshared)
class PPLSem_t
{
public:               // shared-between-threads--v  v--initial-value is unlocked
PPLSem_t()   { assert(0 == ::sem_init(&m_sem, 0, 1)); } // ctor
~PPLSem_t()  { assert(0 == ::sem_destroy(&m_sem));    } // dtor
int lock()   { return (::sem_wait(&m_sem)); }   // returns 0 when success, else -1
int unlock() { return (::sem_post(&m_sem)); }   // returns 0 when success, else -1
void wait()  { assert(0 == lock());   }
void post()  { assert(0 == unlock()); }
private:
::sem_t m_sem;
};
// POSIX is an api, this C++ class simplifies use
//    sem_wait and sem_post are possibly assembly for best performance

// Note 3 - locale what now?
// insert commas from right to left -- change 1234567890 to 1,234,567,890
// input 's' is the digits-to-the-left-of-the-decimal-point
// returns s contents with inserted comma's
std::string digiComma(std::string s)
{  //vvvvv--sSize must be signed int of sufficient size
int32_t sSize = static_cast<int32_t>(s.size());
if (sSize > 3)
for (int32_t indx = (sSize - 3); indx > 0; indx -= 3)
s.insert(static_cast<size_t>(indx), 1, ',');
return(s);
}

const std::string dashLine("  --------------------------------------------------------------\n");

// Note 5 - thread sync to wall clock
// action: pauses a thread, resume thread action at next wall-clock-start-of-second
void sleepToWallClockStartOfSec(std::time_t t0 = 0)
{
if (0 == t0) { t0 = std::time(nullptr); }
while(t0 == std::time(nullptr))         {
std::this_thread::sleep_for(100ms);  } // good-neighbor-thread
}
// a good-neighbor-thread delay does not 'hog' a processor

// Note 4 - typedef examples to simplify
// create new types based on vector ... suffix '_t' reminds that this is a type
typedef std::vector<uint>           UintVec_t;
typedef std::vector<uint>           TIDSeqVec_t;
typedef std::vector<std::thread*>   Thread_pVec_t;

// measure -std=C++14 std::thread average context switch duration
//                                enforced with one PPLSem_t
class Q6_t
{
// private data
const uint        MaxThreads;        // thread count
const uint        MaxSecs;           // seconds of test
const std::string m_TIDSeqPFN;     // capture tid seq to ram (write to file later)
//
uint           m_thrdSwtchCount;   // count incremented by all threads
//
bool           m_done;             // main to threads: cease and desist
uint           m_rdy;              // threads to main: thread is ready! (running)
PPLSem_t       m_sem;              // one semaphore shared by all threads
//
UintVec_t      m_thrdRunCountVec;  // counts incremented per thread
TIDSeqVec_t    m_TIDSeq_Vec;       // sequence (order) of thread execution
Thread_pVec_t  m_thread_pVec;      // vector of thread pointers
public:
Q6_t()  // default ctor
: MaxThreads(10)           // total threads
, MaxSecs(10)              // controlled seconds of test
, m_TIDSeqPFN("./Q6.txt")  // where put data file
//
, m_thrdSwtchCount(0)
//
, m_done(false)            // main() to threads: cease and desist
, m_rdy(0)                 // threads to main(): thread is ready!
// m_sem                 // default ctor ok
//
// m_thrdRunCountVec     // default ctor ok
// m_TIDSeq_Vec          // default ctor ok
// m_thread_pVec         // default ctor ok
{
for (size_t i = 0; i < MaxThreads; ++i) {
m_thrdRunCountVec.push_back(0);   // 0 each per-thread counter
}
// your results -----vvvvvvvv----will vary
m_TIDSeq_Vec.reserve(45000000);  // observed as many as 42,000,000 on my old Dell
m_thread_pVec.reserve(MaxThreads);
// DO NOT start threads (m_thread_pVec) yet
} // Q6_t()

~Q6_t()
{
// m_TIDSeq_Vec,
while(m_thread_pVec.size()) {              // more to pop and delete
std::thread* t = m_thread_pVec.back();  // return last element
m_thread_pVec.pop_back();               // remove last element
delete t;                               // delete thread
}
// m_thrdRunCountVec;
// m_TIDSeqPFN, m_sem, m_rdy; m_done;
// m_thrdSwtchCount; MaxSecs; MaxThreads;
} // ~Q6_t()

// Q6_t::main(..)  runs in context thread 'main()', invoked in function main()
int main(std::string label)
{
std::cout << dashLine << "  " << MaxSecs << " second measure of "
<< MaxThreads << " threads, 1 PPLSem_t " << label << "\n"
<< "  output: " << m_TIDSeqPFN << '\n' << std::endl;
assert(0 == m_sem.lock());    // take possession of m_sem
// now all threads will block at critical section entry (in onceThru())
std::cout << "\n  block threads at crit sect   " << std::endl;
createAndActivateThreads();
long int durationUS = 0;
releaseThreadsAndWait(durationUS); // run threads run
std::cout << "n" << std::endl
<< report(" 'thread context switch' ",
m_thrdSwtchCount, durationUS);
reportThreadActionCounts();
writeTIDSeqToQ6_txt();
reportMainStackSize();
measure_LockUnlock();        // with no context switch, no collision
return(0);
} // int main() // in 'main' context

private:

void onceThru(uint id)  // a crit section
{
assert(0 == m_sem.lock());      // critical section entry
{
m_thrdSwtchCount      += 1;     // 'work'
m_thrdRunCountVec[id] += 1;     // diagnostic - thread work-balance
m_TIDSeq_Vec.push_back(id);     // thread sequence capture
}
assert(0 == m_sem.unlock());    // critical section exit
}

// thread entry point
void threadRun(uint id)
{
std::cout << '.' << id << std::flush;  //  ".0.1.2.3.4.5.6.7.8.9"
m_rdy |= (1 << id);     // thread to main: i am ready
do {
onceThru(id);
if (m_done) break; // exit when done   tbr - FIXME -- rare hang
}while(true);
}

// main() context: create and activate std::thread's with new
void createAndActivateThreads() // main() context
{
std::cout << "  createAndActivateThreads()  ";
Time_t start_us = HRClk_t::now();
for (uint id = 0; id < MaxThreads; ++id)
{
// std::thread activates when instance created
std::thread*  thrd = new
std::thread(&Q6_t::threadRun, this, id);
// method-------^^^^^^^^^^^^^^^        ^^--single param for method
// instance*---------------------^^^^
assert(nullptr != thrd);
// create handshake mask for unique 'id' bit of m_rdy
uint mask = (1 << id);
// wait for bit set in m_rdy by thread
while ( ! (mask & m_rdy) ) {
std::this_thread::sleep_for(100ms); // not a poll
}
// thread has confirmed to main() that it is running
// capture pointer to invoke join's
m_thread_pVec.push_back(thrd);
}
auto  duration_us =
std::chrono::duration_cast<US_t>(HRClk_t::now() - start_us);
std::cout << "   (" << digiComma(std::to_string(duration_us.count()))
<< " us)" << std::endl;
sleepToWallClockStartOfSec(); // start-of-second
} // void createAndActivateThreads()

// main() context: measure average context switch duration
//    by releasing threads to run
void releaseThreadsAndWait(long int& count_us)
{
Time_t testStart_us = HRClk_t::now();
// thread 'main()' is current owner of this semaphore - see "Q6_t::main()"
assert(0 == m_sem.unlock()); // release the hounds
std::cout << "  releaseThreadsAndWait        " << std::flush;
// progress indicator to user
for (size_t i = 0; i < MaxSecs; ++i) // let threads switch for 10 seconds
{
sleepToWallClockStartOfSec();    // 'main()' sync's to wall clock
std::cout << (MaxSecs-i-1) << ' ' << std::flush; // "9 8 7 6 5 4 3 2 1 0"
}
// tbr - dedicated mutex for this single-write / multiple read ?  or std::atomic ?
m_done = true;      // command threads to exit - all threads can see m_done
auto  testDuration_us =
std::chrono::duration_cast<US_t>(HRClk_t::now() - testStart_us);
count_us = testDuration_us.count();
// tbr - main() shall confirm all threads complete
// tbr - measure how long to detect m_done

Time_t joinStart_us = HRClk_t::now();
std::cout << "n  join threads                 ";
for (size_t i = 0; i < MaxThreads; ++i)
{
m_thread_pVec[i]->join();           // main() waits here for thread[i] completion
std::cout << ". " << std::flush;
}
auto  joinDuration_us =
std::chrono::duration_cast<US_t>(HRClk_t::now() - joinStart_us);
std::cout << "   (" << digiComma(std::to_string(joinDuration_us.count()))
<< " us)" << std::endl;
} // void releaseThreadsAndWait(long int& count_us)

void reportThreadActionCounts()
{
std::cout << "n  each thread run count: n ";
uint      sum = 0;
for (auto it : m_thrdRunCountVec)
{
std::cout << std::setw(11) << digiComma(std::to_string(it));
sum += it;
}
std::cout << std::endl;
uint diff = (sum - m_thrdSwtchCount);

std::cout << ' ';
double maxPC = 0.0;
double minPC = 100.0;
for (auto it : m_thrdRunCountVec)
{
double percent = static_cast<double>(it) / static_cast<double>(sum);
if(percent > maxPC) maxPC = percent;
if(percent < minPC) minPC = percent;
std::cout << std::setw(11) << (percent * 100);
}
std::cout << "  (% of total)nn  total : " << digiComma(std::to_string(sum));
if (diff) std::cout << "  (diff: " << diff << ")";
std::cout << "   note variability --   min : " << (minPC*100)
<< "%    max : " << (maxPC*100) << "%" << std::endl;
} // void reportThreadActionCounts()

void writeTIDSeqToQ6_txt() //  m_TIDSeq_Vec - record sequence of thread access to critsect
{
size_t sz = m_TIDSeq_Vec.size();
std::cout << '\n' << dashLine << "  writing Thread ID sequence of "
<< digiComma(std::to_string(sz)) << " values to "
<< m_TIDSeqPFN << std::endl;
Time_t writeStart_us = HRClk_t::now();
do {
std::ofstream Q6cout(m_TIDSeqPFN);
if ( ! Q6cout.good() )
{
std::cerr << "not able to open for write: " << m_TIDSeqPFN << std::endl;
break;
}
size_t lnSz = 0;
for (auto it : m_TIDSeq_Vec)
{
// encode Thread ID  uints:           0   1   2   3   4   5   6   7   8   9
// to letters 'A' thru 'J': vvvvvv   'A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' 'I' 'J'
Q6cout << static_cast<char>(it+'A');
// whitespace not needed
if (++lnSz > 100) { Q6cout << std::endl; lnSz = 0; } // 100 chars per line
}
Q6cout << '\n' << std::endl;
Q6cout.close();
} while(0);
auto wDuration_us = std::chrono::duration_cast<US_t>
( HRClk_t::now() - writeStart_us );
std::cout << "  complete: "
<< digiComma(std::to_string(wDuration_us.count()))
<< " us" << std::endl;
} // writeTIDSeqToQ6_txt

std::string report(std::string lbl, uint64_t eventCount, uint64_t duration_us)
{
std::stringstream ss;
ss << "  " << to_engineering_string(static_cast<double>(eventCount),9,eng_prefixed)
<< lbl << " events in " << digiComma(std::to_string(duration_us)) << " us" << std::endl;
double eventsPerSec = (1000000.0*(static_cast<double>(eventCount))/
static_cast<double>(duration_us));
ss << "  " << to_engineering_string(eventsPerSec,9,eng_prefixed)
<< lbl << " events per secondn  "
<< to_engineering_string((1.0/eventsPerSec), 9, eng_prefixed)
<< " sec per " << lbl << " event " << std::endl;
return(ss.str());
} // std::string report(std::string lbl, uint64_t eventCount, uint64_t duration_us)

// Note 6 - stack size -> use POSIX 'pthread_attr_...' API
void reportMainStackSize()
{
pthread_attr_t tattr;
int stat = pthread_attr_init (&tattr);
assert(0 == stat);
size_t size;
stat = pthread_attr_getstacksize(&tattr, &size);
assert(0 == stat);
std::cout << '\n' << dashLine << "  Stack Size: "
<< digiComma(std::to_string(size))
<< "    [of 'main()' by pthread_attr_getstacksize]\n"
<< std::endl;
stat = pthread_attr_destroy(&tattr);
assert(0 == stat);
} // void reportMainStackSize()

// Note 7 - semaphore API performance
// measure duration when no context switch (i.e. no thread 'collision')
void measure_LockUnlock()
{
//PPLSem_t*  sem1 = new PPLSem_t;
//assert(nullptr != sem1);
PPLSem_t sem1;
size_t   count1 = 0;
size_t   count2 = 0;
std::cout << dashLine << "  3 second measure of lock()/unlock()"
<< " (no collision) " << std::endl;
time_t t0 = time(0) + 3;
Time_t start_us = HRClk_t::now();
do {
assert(0 == sem1.lock());   count1 += 1;
assert(0 == sem1.unlock()); count2 += 1;
if(time(0) > t0)  break;
}while(1);
auto  duration_us = std::chrono::duration_cast<US_t>(HRClk_t::now() - start_us);
assert(count1 == count2);
std::cout << report (" 'sem lock()+unlock()' ", count1, duration_us.count());
std::cout << "n";
} // void measure_LockUnlock()
};  // class Q6_t
} // namespace DTB

int main(int argc, char* argv[] )
{
std::cout << "nargc: " << argc << 'n' << std::endl;
for (int i=0; i<argc; i+=1) std::cout  << argv[i] << "    ";
std::cout << "n" << std::endl;
setlocale(LC_ALL, "");
std::ios::sync_with_stdio(false);
{
std::time_t t0 = std::time(nullptr);
std::cout << "  " << std::asctime(std::localtime(&t0)) << std::endl;;
DTB::sleepToWallClockStartOfSec(t0);
}
DTB::Time_t main_start_us = DTB::HRClk_t::now();
int retVal = 0;
{
DTB::Q6_t  q6;
retVal  =  q6.main(" Q6::main() ");
}
auto duration_us = std::chrono::duration_cast<DTB::US_t>
(DTB::HRClk_t::now() - main_start_us);
std::cout << "  FINI  "
<< DTB::digiComma(std::to_string(duration_us.count()))
<< " us" << std::endl;
return(retVal);
}

Typical output on my old Dell.

Fri Jun 30 15:30:13 2017
--------------------------------------------------------------
10 second measure of 10 threads, 1 PPLSem_t  Q6::main() 
output: ./Q6.txt

block threads at crit sect   
createAndActivateThreads()  .0.1.2.3.4.5.6.7.8.9   (1,002,120 us)
releaseThreadsAndWait        9 8 7 6 5 4 3 2 1 0 
join threads                 . . . . . . . . . .    (2,971 us)

31.07730700 M 'thread context switch'  events in 10,021,447 us
3.101079814 M 'thread context switch'  events per second
322.4683207 n sec per  'thread context switch'  event 
each thread run count: 
3,182,496  3,252,929  3,245,473  3,150,344  3,411,918  2,936,982  2,978,690  3,029,319  3,004,926  2,884,230
10.2406    10.4672    10.4432    10.1371    10.9788    9.45057    9.58478    9.74769     9.6692    9.28082  (% of total)
total : 31,077,307   note variability --   min : 9.28082%    max : 10.9788%
--------------------------------------------------------------
writing Thread ID sequence of 31,077,307 values to ./Q6.txt
complete: 3,025,289 us
--------------------------------------------------------------
Stack Size: 8,720,384    [of 'main()' by pthread_attr_getstacksize]
--------------------------------------------------------------
3 second measure of lock()/unlock() (no collision) 
173.2359360 M 'sem lock()+unlock()'  events in 3,902,491 us
44.39111737 M 'sem lock()+unlock()'  events per second
22.52702926 n sec per  'sem lock()+unlock()'  event 
FINI  18,957,304 us

Example lines from Q6.txt; each line is 100 characters long.

AABABABABAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

The last few lines:

BJBJBJBJBJBJBJBJBBHHHHHHHHHHHHHHHHHBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAAAA
AAAAAAAAAAAAAAAAAAAAAABABABABAABABBAAAAAAAAAAAAAAAAAAAAAAAAAAAAABABABBGGGGGGGGGGGGGGGBGBGBGBGBGBGBGBG
BGBGBGBGBGBGBGBGBGBBGBGBGBGBGBGBGBBHHHHHHHHHHHHHHHHBHBHBHBHBHBHBHBHBHBHBHBHBHBHBHBBHBHBHBHBBJJJJJJJJJ
JJJJJJJJJBBJBBBJBJBJBJBJBJBBJBJBJBJBJBJBJBJBJBBEEEEEEEEEEEEEEEEEBEBEBEBEBEBEBEBEBEBEBEBEBEBEBBEBEBEBE
BEBEBEBEBEBEBEBEBEBEBEBEBEBEBBEBEBEBBBBEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEBEBBEBEBEBEEEEEEEEEEEEEEEEEEEEEE
EEEEEEEEBEBEBEBEEBEBEBEBEBBIIIIIIIIIIIIIIIBBIIIBIBBFFFFFFFFFFFFFFFBBFFBBFBFBFBFBFFBBGGGGGGGGGGGGGGGGG
BBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGBGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
GGBCIHFJDAE
All the C functions you cite are accessible by direct include.

Thanks, I know this, but the task is to avoid C and use C++ wherever possible.

There is no C code in my C++ code; it simply invokes the extern "C" functions that Linux provides. There is no separate set of C++ Linux function calls. The Linux API (to the OS services) is provided by C libraries and header files. I do not know how to avoid or bypass the Linux API, so perhaps I do not understand what you are suggesting / asking.

I measure a duration by repeating some action for 1 to 10 seconds and counting how many times the loop completes.

Can you explain that?

Consider the snippet:

{
// local_tm, linear_time and getSystemMicroSecond() are assumed declared/defined elsewhere
uint64_t microsecStart = getSystemMicroSecond();
//convert linear time to broken-down/calendar time
local_tm = *localtime_r (&linear_time, &local_tm);
uint64_t microsecDuration = getSystemMicroSecond() - microsecStart;
}

This kind of operation is generally too fast to measure in such a simple way: the result is essentially a delta of a microsecond or so, and the conversion itself completes within a microsecond and will vary from run to run.

To measure something that fast, we loop around the action of interest, count the loop iterations, and kick out after, say, 3 seconds.

uint64_t microsecStart = getSystemMicroSecond();
uint32_t loopCount = 0;
time_t t0 = time(0) + 3; // loop for < 3 seconds
do
{
//convert linear time to broken-down/calendar time
local_tm = *localtime_r (&linear_time, &local_tm);
time_t t1 = time(0);
if(t1 >= t0) break;
loopCount += 1;
} while(1);
uint64_t microsecDuration = getSystemMicroSecond() - microsecStart;

In this loop, the time(0) function is surprisingly fast:

  • time(0) takes about 75 nanoseconds (on my Dell desktop),

so time(0) does not noticeably lengthen the measurement,

yet it is fast enough that the duration of localtime_r can be measured accurately:

  • localtime_r takes about 335 nanoseconds.

When this spinning is done, the test has produced a loopCount, and the duration measured outside the loop provides a more consistent measured duration... from which we can then compute an "average" duration per event.
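
For example, a hypothetical continuation of the snippet above (microsecDuration and loopCount are the variables from that loop):

// average cost of one localtime_r call, in microseconds
// (the ~75 ns per-iteration cost of time(0) is treated as noise)
double avgUsPerEvent =
    static_cast<double>(microsecDuration) / static_cast<double>(loopCount);
std::cout << loopCount << " events in " << microsecDuration << " us,  "
          << avgUsPerEvent << " us per event" << std::endl;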

Do you ignore the time the process itself runs?

Yes. Because I know that a context switch is orders of magnitude slower than a function call, it is not difficult to minimize the thread activity so that it has no / minimal impact on the measurement.

Is it minor compared to the switching time?

In this test, a thread increments a number, tests a flag, and then behaves as a good neighbor (i.e. the threads give up the processor as soon as possible). These small actions are negligible compared to the cost of a context switch.

The numbers from my 6 year old Dell differ by 3 orders of magnitude:

a simple function call, i.e. time(0): < 75e-9 seconds

a thread context switch enforced with semaphores: < 15e-6 seconds

Other activities might affect the results, but I think only in an insignificant way. My result of about 14 us per "thread switch with semaphore give" is longer than the best possible result, but not long enough to affect my design decisions. It would be possible to improve this measurement, but I cannot afford the hardware.

Linux provides some notion of thread or task priorities, but I have not explored them. If I really wanted a "better" measurement, I suppose I would disconnect the Ethernet and shut down any busy work... but I was not running a compile, copying files, running a backup, or doing anything else that obviously consumes CPU cycles while I measured. The machine was essentially idle: just the clock ticking, timers expiring, memory being refreshed, and the few other things that must keep going.

For fun or interest, you can bring up the System Monitor utility and click the %CPU column header once or twice to sort the busiest tasks to the top. You should find that the busiest task is System Monitor itself, at about 3% of one of the 2 CPUs. All the other tasks are essentially waiting for something and register 0% load.

Finally, you might think about it this way...

Are you writing a program to run on an atypical machine, or is your target similar to your development machine?

Do you plan to turn off interrupts? I/O channels? Ethernet? Control priorities? Or would any of that even be useful on your target?

IMHO, the tasks running on my useful (Linux) system, when the system is doing nothing but waiting for my next keystroke, typically do nothing at all for most of a 10-second test.

I think the most important takeaway from these efforts is:

function calls are more than 100x faster than context switches.