Read a text file between specific lines of text in C++

Keywords: text, between, file, c++, read. Last updated: 2023-10-16

I am currently parsing a file into program memory.

The file to be parsed looks like this:

    file info
    second line of file info
    date
    # col1    col2    col3    col4   col5      col6      col7    col8     col9   col10    col11     col12     col13     col14  
    firstSetOfElements
    4
           2       1       1       0       3       1       2       3       0 -49.4377        0 0       -26.9356 -24.5221
           3       2       1       0       3       2       4       3       13.7527 -43.2619       0 0       -19.3462 -28.0525
           4       3       1       0       3       4       1       3       14.2459 43.5163       0 0       33.3506 15.2988
           5       4       1       0       3       2       1       4       49.4377 0       0 0       25.0818 38.3082

    # col1      col2      col3      col4      col5      col6
    secondSetOfElements
    1
         1 4 3 4 1 2

What I want to do:

    file.open(FILENAME, ios::in); // Open file
    if (file.is_open())
    {
        // Get the line number where "firstSetOfElements" is located. Store the line number.
        // Go to the line after that line, and store the integer listed there as a variable (`int noFirstElems`).
        // (This is the number of rows I will be parsing into the first array.)
        // Start parsing at the first line of the array (the line under the previous one).
        while (getline(file, firstLineOfFirstSetToParse)) // starting at the first row of data array elements, begin parsing (this I've got handled)
        {
            // Add a row vector until you get to a blank line (which will be after 4 rows of data).
        }
    }

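After doing the above for the "firstSetOfElements" array, I will do the same thing to parse the "secondSetOfElements" data into an array.

I can already parse the data fine, but I have not found a resource I can understand that shows how to set start and end points for the lines I want to parse.

Thanks in advance!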

#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <vector>
using namespace std;
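// Returns a copy of input without leading and trailing whitespace.
string trimmed(string const& input)
{
    size_t firstNonWSIdx = input.find_first_not_of(" \t\r");
    if (firstNonWSIdx == string::npos)
        return string(); // empty or all-whitespace line
    size_t lastNonWSIdx = input.find_last_not_of(" \t\r");
    // substr takes (position, count), not (first, last).
    return input.substr(firstNonWSIdx, lastNonWSIdx - firstNonWSIdx + 1);
}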
void resetStrStream(istringstream& sstr, string const& newInput)
{
    sstr.clear();
    sstr.str(newInput);
}
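// Reads every "...SetOfElements" block in the file; each block becomes one flat
// vector of values in data.
void readFile(char const* fileName, vector<vector<float> >& data)
{
    ifstream infile(fileName);
    string input;
    istringstream iss;
    string const marker = "SetOfElements";
    while (getline(infile, input)) // stops cleanly at end of file
    {
        string token = trimmed(input);
        // Match "firstSetOfElements", "secondSetOfElements", ... by their common suffix.
        if (token.size() >= marker.size() &&
            token.compare(token.size() - marker.size(), marker.size(), marker) == 0)
        {
            // The line after the marker holds the number of rows in the set.
            getline(infile, input);
            resetStrStream(iss, trimmed(input));
            int arraySize = 0;
            iss >> arraySize;
            vector<float> values;
            int i = 0;
            while (i++ < arraySize && getline(infile, input))
            {
                resetStrStream(iss, trimmed(input));
                float val;
                while (iss >> val)
                {
                    values.push_back(val);
                }
            }
            data.push_back(values);
        }
    }
}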
// Prints each parsed set on its own line.
void testData(vector<vector<float> > const& data)
{
    for (size_t i = 0; i < data.size(); i++)
    {
        for (size_t j = 0; j < data[i].size(); j++)
        {
            cout << data[i][j] << " ";
        }
        cout << endl;
    }
}
int main()
{
    vector< vector<float> > data;
    readFile("textfile.txt", data);
    testData(data);
    return 0;
}

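Based on how you describe it, the problem is easy to solve: read line by line, scan for a token named "SetOfElements", treat the next line as the size of the 2D array, then read that many lines and store them in your array.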
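The line-by-line approach described above could be sketched roughly as follows. This is only an illustration under some assumptions: the names `trim`, `readSet` and the `Table` alias are hypothetical, the marker is matched by exact name, and each row is kept as its own vector. It finds the marker line (the start point), reads the row count from the next line, and stops after that many rows (the end point).

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::vector<double> > Table; // hypothetical alias: one inner vector per row

// Strips leading and trailing spaces, tabs and carriage returns.
std::string trim(const std::string& s)
{
    const char* ws = " \t\r";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return std::string();
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: finds the line equal to `marker`, reads the row count
// from the following line, then appends that many rows to `out`.
// Returns false if the marker, the count, or any row is missing.
bool readSet(std::istream& in, const std::string& marker, Table& out)
{
    std::string line;
    while (std::getline(in, line))            // start point: the marker line
        if (trim(line) == marker) break;
    if (!in) return false;                    // marker never found

    if (!std::getline(in, line)) return false;
    std::istringstream countStream(line);
    int rowCount = 0;
    if (!(countStream >> rowCount) || rowCount < 0) return false;

    for (int i = 0; i < rowCount; ++i)        // end point: after rowCount rows
    {
        if (!std::getline(in, line)) return false;
        std::istringstream rowStream(line);
        std::vector<double> row;
        double value;
        while (rowStream >> value) row.push_back(value);
        out.push_back(row);
    }
    return true;
}
```

Calling `readSet(file, "firstSetOfElements", first)` and then `readSet(file, "secondSetOfElements", second)` on the same open stream would read both sets in file order, since the second call simply continues from where the first one stopped.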

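However, your file layout is confusing: your comment line (starting with #) says there will be 8 columns, but the rows that follow contain 14. Should your program ignore the last 6 columns? Also, your elements appear to be a mix of integers and floating-point numbers, unless columns 1 to 8 are also floats that just happen to have no fractional part.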

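If the data types are in fact mixed, with columns 1-8 read as integers and the rest as floats, your parser code will be more complicated; otherwise it should be a simple getline... scan line... continue type of algorithm.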

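Also, will your set delimiter change? The first set is identified by "firstSetOfElements" and the second set's id is "secondSetOfElements". It would be easier if they were both simply "SetOfElements"; otherwise you will have to construct the delimiter string before reading each new set of elements.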
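If the set names do vary, one option is to treat any line that ends in `SetOfElements` as a delimiter, so that no per-set delimiter string has to be built. This is an assumption about the file format rather than something it guarantees, and `isSetHeader` is a hypothetical helper name.

```cpp
#include <string>

// Hypothetical helper: true if `token` ends with the common suffix "SetOfElements",
// e.g. "firstSetOfElements" or "secondSetOfElements".
// Expects a token that has already been trimmed of surrounding whitespace.
bool isSetHeader(const std::string& token)
{
    static const std::string suffix = "SetOfElements";
    return token.size() >= suffix.size() &&
           token.compare(token.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

The match is case-sensitive, so a line such as `setOfElements` (lower-case `s`) would not count as a delimiter.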

This might give you some ideas:

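// Assumes: std::istream& input; std::map<std::string, std::vector<std::vector<float>>> sets;
// Needs <istream>, <limits>, <map>, <sstream>, <stdexcept>, <string>, <vector>.
const std::streamsize toEol = std::numeric_limits<std::streamsize>::max();
std::string line, set_name;
int rows = 0;
float num = 0;
for (int i = 0; i < 3; ++i)
    if (!getline(input, line)) throw std::runtime_error("missing header"); // ignore 3 lines
while (input >> set_name)
{
    if (set_name == "#") { input.ignore(toEol, '\n'); continue; } // skip the column-header comment
    std::vector<std::vector<float>>& set = sets[set_name];
    if (input >> rows)
    {
        input.ignore(toEol, '\n'); // discard the rest of the row-count line
        for (int i = 0; i < rows && getline(input, line); ++i)
        {
            std::istringstream iss(line);
            set.push_back(std::vector<float>());
            while (iss >> num) set.back().push_back(num);
        }
    }
}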