Catching errors in text file lines, skipping and reporting

Keywords: report, error, text, file      Last updated: 2023-10-16

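I have a text file that I need to read. Each line holds an IP, a URL, and finally a date, all separated by whitespace. I want to assign each piece of information to the appropriate member of a "Visitor" class object, and I have an array to store the objects in. My problem is that when I try to go through the text file and its lines… I always see blank entries printed between the real ones.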

class Visitors{
public:
string IP;
string URL;
string dateAccessed;
};
int main(){
Visitors hits[N];
string filename, theLine;
ifstream infile;
cout << "Enter file name (with extension):" << flush;
while(true){
    string infilename;
    getline(cin, infilename);
    infile.open(infilename.c_str());
    if(infile) break;
    cout << "Invalid file. Please enter valid file name: " << flush;
}
cout << "\n";
while(!infile.eof()){
    getline(infile, theLine);
    istringstream iss(theLine);
    do{
        string ip;
        string url;
        string date;
        iss >> ip;
        iss >> url;
        iss >> date;
        if(ip != "\n"){
             cout << "The IP: " << ip << endl;
        }
        if(url != "\n"){
             cout << "The URL: " << url << endl;
        }
        if(date != "\n"){
            cout << "The DA: " << date << endl;
        }

    }while(iss);
}
return 0;
}

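I tried using if statements to catch every string that is just a "newline" and ignore it, but that does not work, so I am not sure how to skip them. I would also like to add checks for whether any of the information is wrong (the date is not 2/2/4 characters long, the www. is missing).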

Here is some sample output that shows the problem more clearly:

Enter file name (with extension):hits.txt
The IP: 192.168.1.101
The URL: www.cs.stonybrook.edu
The DA: 01/01/2013
The IP:
The URL:
The DA:
The IP: 192.168.1.101
The URL: www.cs.stonybrook.edu
The DA: 01/01/2013
The IP:
The URL:
The DA:
The IP: 123.112.15.151
The URL: www.cs.stonybrook.edu
The DA: 01/01/2013

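Edit: OK, I have figured out how to go through each line, split it up, and store each string in the matching member of the class objects in my array. The problem now is that I want to check every string on every line for errors (for example, the date is impossible, or one of the numbers in the IP is 256, and so on). When such an error is found, I want to skip to the next line and run the same checks there; if everything is fine, the class members at the appropriate position in the array get initialized. Here is my code, to give an idea of what I am trying to do: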

#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#define N 50
using namespace std;
class Visitors{
public:
string IP;
string URL;
string dateAccessed;
};
int main(){
Visitors hits[N];
string infilename, filename, ip, url, date;
ifstream infile;
int i = 0;
cout << "Enter file name (with extension):" << flush;
while(true){
    infilename = "";
    getline(cin, infilename);
    infile.open(infilename.c_str());
    if(infile) break;
    cout << "Invalid file. Please enter valid file name: " << flush;
}
cout << "Loading " << infilename << "..." << endl;
cout << "\n";
while(infile.good()){
    string line;
    getline(infile, line);
    stringstream ss(line);
    if(ss >> ip >> url >> date){
        cout << "The IP: " << ip << endl;
        hits[i].IP = ip;
        cout << "The URL: " << url << endl;
        hits[i].URL = url;
        cout << "The DA: " << date << endl;
        hits[i].dateAccessed = date;
        i++;
    }
    else{
        cerr << "error" << std::endl;
    }
    /*
    if(ip.length() > 15 || ip.length() < 7){
        cout << "Found a record with an invalid IP format (not XXX.XXX.XXX.XXX)...ignoring entry";
    }
    //if(any of the numbers in the IP address are greater than 255)
        //INVALID IP...IGNORE ENTRY
    else{
        cout << "The IP: " << ip << endl;
        hits[i].IP = ip;
    }
    //if(url doesnt start with www. or doesnt end with .xxx)
        //INVALID URL...IGNORE ENTRY
    else{
        cout << "The URL: " << url << endl;
        hits[i].URL = url;
    }
    //if(date.length != 10)
        //INVALID DATE FORMAT...IGNORE ENTRY
    //if(first 2 numbers in date arent between 01 and 12
         //OR if second 2 numbers arent between 01 and 31 depending on month OR etc.)
         //INVALID....IGNORE ENTRY
    else{
        cout << "The DA: " << date << endl;
        hits[i].dateAccessed = date;
    }
    i++;*/
}
return 0;
}

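This is obviously not organized or put together the way it would be in the real program, but it shows the general idea of what I want to accomplish. My biggest problem is how to skip a line in the file without disturbing anything else, such as my position in the array, and so that every bad line is caught even if all lines contain errors.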

There is no need for the inner loop around the stringstream:

#include <string>
#include <iostream>
#include <fstream>
#include <sstream>
int main(){
    std::string filename, theLine;
    std::ifstream infile;
    std::cout << "Enter file name (with extension):" << std::flush;
    while(true){
        std::string infilename;
        getline(std::cin, infilename);
        infile.open(infilename.c_str());
        if(infile) break;
        std::cout << "Invalid file. Please enter valid file name: " 
            << std::flush;
    }
    std::cout << "\n";
    std::string ip, url, date;
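    std::string line;
    // Read one line per iteration; stop as soon as getline fails (EOF or error),
    // so the end of the file is not reported as a malformed line.
    while (std::getline(infile, line)) {
        std::stringstream ss(line);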
        if (ss >> ip >> url >> date) {
            std::cout << "The IP: " << ip << std::endl;
            std::cout << "The URL: " << url << std::endl;
            std::cout << "The DA: " << date << std::endl;
        } else {
            std::cerr << "error" << std::endl;
        }
    }
    return 0;
}

The inner loop tries to parse the same line again without reading a new line from the file.
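To see why the blank "The IP:" entries appear, the inner loop can be isolated. The following is only a minimal sketch: `count_passes` is a hypothetical helper (not part of the original code) that counts how often the body of the question's do/while loop runs for one line.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: counts how many times the body of the question's
// do { ... } while (iss); loop runs for a single line of input.
int count_passes(const std::string& line) {
    std::istringstream iss(line);
    int passes = 0;
    do {
        // Fresh, empty strings on every pass, as in the question.
        std::string ip, url, date;
        // On the second pass nothing is left to read, so all three stay empty
        // and the stream enters a failed state.
        iss >> ip >> url >> date;
        ++passes;
    } while (iss);  // still true after the first pass: no extraction has failed yet
    return passes;
}
```

For a well-formed line the body runs twice: once with real data and once with three empty strings, which is where the blank output comes from. Testing the extraction itself, as in `if (ss >> ip >> url >> date)`, avoids the second pass.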

cout << "Enter file name (with extension):" << flush;
string infilename;
getline(cin, infilename);
infile.open(infilename.c_str());
// You don't need to loop here. You can just exit the program. 
// But this is optional.
if(!infile) {
   cout << "Invalid file. Please enter valid file name: " << endl;
   exit(1);
}
cout << endl;
int line_nr = 1;
while(getline(infile, theLine)){
    istringstream iss(theLine);
    string ip;
    string url;
    string date;
    // A line is expected to have ip url date format. Otherwise it is error.
    if(iss >> ip >> url >> date) {
        cout << "The IP: " << ip << endl;
        cout << "The URL: " << url << endl;
        cout << "The DA: " << date << endl;
     }
     else {
        cout << "Error reading data on line: " << line_nr << flush;
        break;
     }
     ++line_nr;
}
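The edited question also asks how to skip a bad line without losing the position in the array. One possible approach, sketched here under assumptions, is to advance the array index only when a line passes every check and to keep a separate line counter for reporting. The names `Visitor`, `valid_ip`, `valid_url`, `parse_line` and `load_visitors` are illustrative and not from the original code, and the URL rule (starts with "www." and has a non-empty suffix after the last dot) is just one reading of the checks described in the question.

```cpp
#include <cctype>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>

struct Visitor {
    std::string IP, URL, dateAccessed;
};

// True if ip consists of exactly four dot-separated numbers in 0..255.
bool valid_ip(const std::string& ip) {
    std::istringstream in(ip);
    std::string part;
    int count = 0;
    while (std::getline(in, part, '.')) {
        if (part.empty() || part.size() > 3) return false;
        int value = 0;
        for (char c : part) {
            if (!std::isdigit(static_cast<unsigned char>(c))) return false;
            value = value * 10 + (c - '0');
        }
        if (value > 255) return false;
        ++count;
    }
    // getline silently drops a trailing '.', so reject it explicitly.
    return count == 4 && ip.back() != '.';
}

// Assumed rule: the URL starts with "www." and ends with ".<something>".
bool valid_url(const std::string& url) {
    if (url.compare(0, 4, "www.") != 0) return false;
    std::size_t dot = url.rfind('.');
    return dot > 4 && dot + 1 < url.size();
}

// Fills out only when the whole line is acceptable; out is untouched otherwise.
bool parse_line(const std::string& line, Visitor& out) {
    std::istringstream iss(line);
    std::string ip, url, date;
    if (!(iss >> ip >> url >> date)) return false;
    if (!valid_ip(ip) || !valid_url(url)) return false;
    out.IP = ip;
    out.URL = url;
    out.dateAccessed = date;
    return true;
}

// Reads records into hits[0..capacity) and returns how many were stored.
// The array index (count) only moves forward for valid lines.
std::size_t load_visitors(std::istream& in, Visitor* hits, std::size_t capacity) {
    std::size_t count = 0;
    std::size_t line_nr = 0;
    std::string line;
    while (count < capacity && std::getline(in, line)) {
        ++line_nr;
        if (parse_line(line, hits[count])) {
            ++count;
        } else {
            std::cerr << "Skipping invalid record on line " << line_nr << '\n';
        }
    }
    return count;
}
```

Because the loop uses `continue`-style skipping rather than `break`, every bad line is reported, even if all lines in the file are invalid. A call such as `load_visitors(infile, hits, N)` would fit the question's `hits` array if `Visitors` were used in place of the illustrative `Visitor` struct.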

The simplest solution (in terms of how much code you have to change) is to compare against "" rather than "\n":

if (ip   != "") { cout << "The IP: "  << ip   << endl; }
if (url  != "") { cout << "The URL: " << url  << endl; }
if (date != "") { cout << "The DA: "  << date << endl; }

(tested)
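Checking for empty strings hides the blank output, but it does not cover the date check the question asks for. The sketch below assumes the dates are in MM/DD/YYYY form, which matches the sample data and the "2/2/4 characters" description; `valid_date` is a hypothetical helper name, not something from the original code.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Assumes MM/DD/YYYY. Returns true only for dates that can actually exist.
bool valid_date(const std::string& date) {
    // Shape check: exactly 10 characters with slashes at positions 2 and 5.
    if (date.size() != 10 || date[2] != '/' || date[5] != '/') return false;
    for (std::size_t i = 0; i < date.size(); ++i) {
        if (i == 2 || i == 5) continue;
        if (!std::isdigit(static_cast<unsigned char>(date[i]))) return false;
    }
    // All remaining characters are digits, so std::stoi cannot throw here.
    int month = std::stoi(date.substr(0, 2));
    int day = std::stoi(date.substr(3, 2));
    int year = std::stoi(date.substr(6, 4));
    if (month < 1 || month > 12 || day < 1) return false;

    static const int days_in_month[] = {31, 28, 31, 30, 31, 30,
                                        31, 31, 30, 31, 30, 31};
    int limit = days_in_month[month - 1];
    // Gregorian leap-year rule: February has 29 days in leap years.
    bool leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    if (month == 2 && leap) limit = 29;
    return day <= limit;
}
```

A line would then be accepted only if `valid_date(date)` holds in addition to the IP and URL checks; otherwise it is reported and skipped like any other malformed line.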