C++ program that counts repeated words on a .txt file

Keywords: word, program, txt, file, count      Updated: 2023-10-16

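I am trying to build a program that counts repeated words in a .txt file and outputs each repeated word along with how many times it is repeated. I have a way to count how many words there are in total, but not the repeated ones. Here is the code: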

#include <iostream>
#include <string>
#include <math.h>
#include <iomanip>
#include <fstream>
#include <vector>
#include "ProcessStatistics.h"
using namespace std;
ProcessStatistics::ProcessStatistics()
{
    //constructor
}
// Counts how many words are made up of each specific number of characters.
void ProcessStatistics::Length(std::vector<std::string> ArrayOfWords, int numberOfWords)
{
    cout << "=== COMPUTING WORDS LENGTH ==== " << endl;
    int vectorLength[30] = {0};
    for(int i = 0; i < numberOfWords; i++)
    {
        // Bucket j holds the number of words with j+1 characters.
        // Empty words and words longer than 30 characters are skipped
        // so the index never leaves the array.
        size_t len = ArrayOfWords[i].length();
        if (len >= 1 && len <= 30)
            vectorLength[len - 1] = vectorLength[len - 1]+1;
    }
    ofstream varlocal;
    remove("WORDS_LENGTH.txt");
    varlocal.open("WORDS_LENGTH.txt");
    if(varlocal.is_open())
    {
        varlocal << "Total: " << numberOfWords << endl;
        for(int i=0; i < 30; i++)
        {
            if(vectorLength[i] != 0)
            {
                // Multiply by 100.0 so the percentage is not truncated by integer division.
                varlocal << vectorLength[i] << " W " << i+1 << " CHAR " << " % " << setprecision(3) << vectorLength[i]*100.0/numberOfWords << endl;
            }
        }
    }
    varlocal.close();
}

Here is some sample code that demonstrates word statistics on a text file using std::map.

#include <map>
#include <string>
#include <fstream>
#include <iostream>
using std::ifstream;
using std::cout;
using std::string;
using std::cin;
using std::map;
int main()
{
  static const char filename[] = "my_data.txt";
  ifstream input(filename);
  if (!input)
  {
    cout << "Error opening data file " << filename << "\n";
    return 1;
  }
  map<string, unsigned int> word_data;
  string word;
  while (input >> word)
  {
     if (word_data.find(word) != word_data.end())
     {
       word_data[word]++;
     }
     else
     {
       word_data[word] = 1;
     }
  }
  map<string, unsigned int>::iterator iter;
  for (iter = word_data.begin(); iter != word_data.end(); ++iter)
  {
    cout << iter->second << "\t" << iter->first << "\n";
  }
  return 0;
}

In the code above, the word is the key in the map. The number of times the word occurs (its count, or frequency) is the value in the map.

If the word already exists in the map, its count is incremented. If the word does not exist, it is added to the map with a count of 1.

After the file has been read, the statistics are printed, with each count followed by its word.
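The program above prints every word, including those that occur only once. Since the question asks only for the repeated words, one possible approach is to filter a map like word_data for counts greater than 1. This is a rough sketch under that assumption; the helper name repeated_words is hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (illustrative only): returns the (word, count)
// pairs whose count is greater than 1, i.e. words seen more than once.
std::vector<std::pair<std::string, unsigned int>>
repeated_words(const std::map<std::string, unsigned int>& counts)
{
    std::vector<std::pair<std::string, unsigned int>> repeated;
    for (const auto& entry : counts)
    {
        if (entry.second > 1)
        {
            repeated.emplace_back(entry.first, entry.second);
        }
    }
    return repeated;
}
```

Because std::map iterates its keys in sorted order, the returned words come out alphabetically. Printing each pair with the same count-then-word format as the loop above would give output limited to the repeated words.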