Why does valgrind (helgrind) report "Possible data race" when a virtual function is called on my thread struct?

Updated: 2023-10-16

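When I started learning the valgrind (helgrind) tool, I ran into a tricky problem that I have not been able to solve.

In short, a user-defined thread class is created with a virtual function that is called from the thread's entry routine. In that case helgrind reports a possible data race. If I simply omit the virtual keyword, no such error is reported. Why is that? Is something wrong with my code, or is there a workaround?

Below is a simple threaded application that demonstrates the problem, including the cpp file, the Makefile, and the messages reported by helgrind.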

/* main.cpp */
#include <memory.h>
#include <pthread.h>
class thread_s {
public:
  pthread_t       th;
  thread_s(void);
  ~thread_s(void);
  virtual void* routine(); /* if virtual is omitted, helgrind reports no error */
  void stop(void);
};
static void* routine(void*);
int main(int, const char*[])
{
  thread_s s_v;
  pthread_create(&s_v.th, 0, routine, &s_v);
  return 0;
}
static void* routine(void* arg)
{
  thread_s *pV = reinterpret_cast<thread_s*>(arg);
  pV->routine();
  return 0;
}
void* thread_s::routine(void)
{
  return 0;
}
thread_s::thread_s(void)
{
  th = 0;
}
thread_s::~thread_s(void)
{
  stop();
}
void thread_s::stop(void)
{
  void *v = 0;
  pthread_join(th, &v);
}

====================================

/* Makefile */
all: main test_helgrind
main: main.cpp
	g++ -o main main.cpp \
	-g -Wall -O0 \
	-lpthread
test_helgrind:
	valgrind \
		--tool=helgrind \
		./main
clean:
        rm -f main
.PHONY: clean

====================================

g++ -o main main.cpp \
        -g -Wall -O0 \
        -lpthread
valgrind \
                --tool=helgrind \
                ./main
==7477== Helgrind, a thread error detector
==7477== Copyright (C) 2007-2010, and GNU GPL'd, by OpenWorks LLP et al.
==7477== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==7477== Command: ./main
==7477==
==7477== Thread #1 is the program's root thread
==7477==
==7477== Thread #2 was created
==7477==    at 0x4259728: clone (clone.S:111)
==7477==    by 0x40484B5: pthread_create@@GLIBC_2.1 (createthread.c:256)
==7477==    by 0x4026E2D: pthread_create_WRK (hg_intercepts.c:257)
==7477==    by 0x4026F8B: pthread_create@* (hg_intercepts.c:288)
==7477==    by 0x8048560: main (main.cpp:18)
==7477==
==7477== Possible data race during write of size 4 at 0xbeab24c8 by thread #1
==7477==    at 0x80485C9: thread_s::~thread_s() (main.cpp:35)
==7477==    by 0x8048571: main (main.cpp:17)
==7477==  This conflicts with a previous read of size 4 by thread #2
==7477==    at 0x804858B: routine(void*) (main.cpp:24)
==7477==    by 0x4026F60: mythread_wrapper (hg_intercepts.c:221)
==7477==    by 0x4047E98: start_thread (pthread_create.c:304)
==7477==    by 0x425973D: clone (clone.S:130)
==7477==
==7477==
==7477== For counts of detected and suppressed errors, rerun with: -v
==7477== Use --history-level=approx or =none to gain increased speed, at
==7477== the cost of reduced accuracy of conflicting-access information
==7477== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 1 from 1)

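I don't know whether this is what helgrind is complaining about, but your program has a serious problem. You create a thread and pass it a pointer to a local thread_s instance (s_v in main()).

However, main() returns almost immediately without any synchronization with the thread, so nothing ensures that s_v is still valid when the thread function routine() takes the pointer and uses it to call pV->routine().

See whether adding the following after the pthread_create() call stops helgrind from complaining: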

pthread_join( s_v.th, NULL);

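In fact, looking more closely at the helgrind output, this will almost certainly remove the complaint, because the log points at the thread_s destructor as one of the participants in the data race.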
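
One way to apply this suggestion without joining the same thread twice (the original destructor also reaches pthread_join through stop()) is sketched below. It is an illustration under stated assumptions, not the asker's code: start(), the started flag, value(), the thread_entry name, and run_demo() are all hypothetical additions. The idea is simply that the join completes before the destructor begins, since the helgrind log attributes the conflicting write to ~thread_s() itself rather than to anything after the join inside stop().

```cpp
#include <cassert>
#include <pthread.h>

static void* thread_entry(void* arg);    // hypothetical name for the entry routine

class thread_s {
public:
  thread_s() : th(), started(false), result(0) {}
  virtual ~thread_s() { stop(); }        // safety net; normally already joined

  // Hypothetical helper: create the thread and remember that it needs a join.
  bool start() {
    if (started) return false;
    started = (pthread_create(&th, 0, thread_entry, this) == 0);
    return started;
  }

  // Idempotent: joins at most once, so calling it explicitly and then
  // again from the destructor is safe.
  void stop() {
    if (!started) return;
    int rc = pthread_join(th, 0);
    assert(rc == 0);
    (void)rc;                            // silence unused warning under NDEBUG
    started = false;
  }

  virtual void* routine() { result = 42; return 0; }
  int value() const { return result; }

private:
  pthread_t th;
  bool started;
  int result;
};

static void* thread_entry(void* arg) {
  // The virtual call reads the vptr; the object must be fully alive here.
  return static_cast<thread_s*>(arg)->routine();
}

int run_demo() {
  thread_s s_v;
  if (!s_v.start()) return -1;
  s_v.stop();          // join while s_v is still fully alive
  return s_v.value();  // the join orders the worker's write before this read
}
```

Calling run_demo() from main() and running the binary under helgrind would be the way to check whether the report goes away; no helgrind output is claimed here.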

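In one case the vptr is written, and in the other it is read. Neither access holds a lock. Helgrind has no way of knowing whether something else in your program makes it impossible for the two to happen at the same time in two threads, so it flags them. If you can guarantee that the object is not destroyed while another thread is trying to call a function on it, you can generate a suppression for this.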
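
The vptr write in the destructor can be made visible in plain C++. The sketch below is an illustration with hypothetical names (base, derived, name(), destruction_order(), check_destruction_order()). It relies on the language rule that a virtual call made from a destructor dispatches to the class whose destructor is currently running. Common implementations, GCC included, achieve this by storing that class's vtable pointer into the object when each destructor starts, which is presumably the size-4 write that helgrind attributes to ~thread_s().

```cpp
#include <cassert>
#include <string>

struct base {
  std::string* log;
  explicit base(std::string* l) : log(l) {}
  // By the time this body runs the object is "only a base" again,
  // so name() resolves to base::name().
  virtual ~base() { *log += name(); }
  virtual const char* name() const { return "base"; }
};

struct derived : base {
  explicit derived(std::string* l) : base(l) {}
  // Here the dynamic type is still derived, so name() is derived::name().
  ~derived() { *log += name(); *log += ">"; }
  const char* name() const { return "derived"; }
};

// Records which override each destructor reached, in order.
std::string destruction_order() {
  std::string log;
  { derived d(&log); }   // d is destroyed at the end of this scope
  return log;
}

// Usage: the derived destructor runs first, then the base destructor.
void check_destruction_order() {
  assert(destruction_order() == "derived>base");
}
```

The change in dispatch between the two destructor bodies is the observable effect of the vptr being rewritten. A thread that reads the vptr for a virtual call while another thread is entering the destructor therefore touches the same memory without ordering, which is what helgrind flags.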
