Multi-threaded file reading produces the same result for each thread

Keywords: result, thread, read, file, multithreading      Updated: 2023-10-16

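Basically, the problem is the one in the title: I'm trying to create a multi-threaded application that reads and sums the contents of a file. It works fine with one thread. However, when more threads are introduced, they all produce the same output. How can I fix this?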

Code

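#include <pthread.h>
#include <algorithm>
#include <cstdlib>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
using namespace std;
// Not shown in the original post; 4 matches the four threads (0-3) in the sample output.
#define NTHREADS 4
void *sumThread(void *);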
pthread_mutex_t keepOut = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t keepOutSum = PTHREAD_MUTEX_INITIALIZER;
int counter = 0, line_count = 0;
char* loc;
double total = 0;
void split(const string& s, char c, vector<string>& v)
{
    string::size_type i = 0;
    string::size_type j = s.find(c);
    while (j != string::npos)
    {
        v.push_back(s.substr(i, j - i));
        i = ++j;
        j = s.find(c, j);
        if (j == string::npos)
            v.push_back(s.substr(i, s.length()));
    }
}
int main(int argc, char* argv[])
{
    if (argc < 2)
    {
        cerr << "Usage: " << argv[0] << " filename" << endl;
        return 1;
    }
    string line;
    loc = argv[1];
    ifstream myfile(argv[1]);
    myfile.unsetf(ios_base::skipws);
    line_count = std::count(std::istream_iterator<char>(myfile),
                            std::istream_iterator<char>(),
                            '\n');
    myfile.clear();
    myfile.seekg(-1, ios::end);
    char lastChar;
    myfile.get(lastChar);
    if (lastChar != '\r' && lastChar != '\n')
        line_count++;
    myfile.setf(ios_base::skipws);
    myfile.clear();
    myfile.seekg(0, ios::beg);
    pthread_t thread_id[NTHREADS];
    for (int i = 0; i < NTHREADS; ++i)
    {
        pthread_create(&thread_id[i], NULL, sumThread, NULL);
    }
    for (int i = 0; i < NTHREADS; ++i)
    {
        pthread_join(thread_id[i], NULL);
    }
    cout << setprecision(2) << fixed << total << endl;
    return 0;
}
void *sumThread(void *)
{
    pthread_mutex_lock(&keepOut);
    int threadNo = counter;
    counter++;
    pthread_mutex_unlock(&keepOut);
    ifstream myfile(loc);
    double runningTotal = 0;
    string line;
    if (myfile.is_open())
    {
        for (int i = threadNo; i < line_count; i += NTHREADS)
        {
            vector < string > parts;
            getline(myfile, line);
            // ... and split the CSV line; the value to sum is the third field (parts[2]).
            split(line, ',', parts);
            if (parts.size() != 3)
            {
                cerr << "Unable to process line " << i
                        << ", line is malformed. " << parts.size()
                        << " parts found." << endl;
                continue;
            }
            // Add this value to the account running total.
            runningTotal += atof(parts[2].c_str());
        }
        myfile.close();
    }
    else
    {
        cerr << "Unable to open file";
    }
    pthread_mutex_lock(&keepOutSum);
    cout << threadNo << ":  " << runningTotal << endl;
    total += runningTotal;
    pthread_mutex_unlock(&keepOutSum);
    pthread_exit (NULL);
}

Sample output

 2:  -46772.4
 0:  -46772.4
 1:  -46772.4
 3:  -46772.4
 -187089.72

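Each thread should read and sum numbers from the file, and the per-thread sums are added together when the threads finish. However, all threads seem to return the same number, even though the threadNo variable is clearly different, as the output shows.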

Your problem is here:

for (int i = threadNo; i < line_count; i += NTHREADS) {
    vector<string> parts;
    getline(myfile, line);

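getline() has no knowledge of the value of i, so it still reads consecutive lines from the file without skipping any. As a result, every thread reads the same first few lines of the file.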
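
A minimal sketch of one possible fix, assuming each thread keeps its own stream and simply discards the lines that belong to other threads. The names splitFields and sumStridedLines are hypothetical helpers that do not appear in the original post, and the malformed-line handling is reduced to a silent skip for brevity.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): split a line on `delim`.
static std::vector<std::string> splitFields(const std::string& line, char delim)
{
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, delim))
        fields.push_back(field);
    return fields;
}

// Hypothetical helper: sum the third CSV field of every line whose zero-based
// index satisfies index % nThreads == threadNo. Every line is still read,
// but lines assigned to other threads are discarded instead of being summed.
double sumStridedLines(std::istream& in, int threadNo, int nThreads)
{
    double runningTotal = 0;
    std::string line;
    for (int lineNo = 0; std::getline(in, line); ++lineNo)
    {
        if (lineNo % nThreads != threadNo)
            continue;  // this line belongs to another thread
        std::vector<std::string> parts = splitFields(line, ',');
        if (parts.size() != 3)
            continue;  // malformed line; the original code reports an error here
        runningTotal += std::atof(parts[2].c_str());
    }
    return runningTotal;
}
```

Inside sumThread, the loop over i could then be replaced by something like runningTotal = sumStridedLines(myfile, threadNo, NTHREADS);. Note that with this approach every thread still reads the whole file, so it corrects the sums but does not reduce the amount of I/O per thread.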