Parsing a text file in Python and C++, given a specific format

Keywords: c++, file, text, python, format      Updated: 2023-10-16

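I have a file in the format shown below. I want to parse it with Python and C++ and extract the number that follows ImpVarNo. The file contains many lines in this format.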

sample.txt

Start:
abc pqr
(FF_GGGGG_CONFIRM_TR):TC:20222,SeqNum:86,ImpVarNo:1000000008234436,Id:12,oneId:66454,a/c:1,ImpValue:905,Impvar:25,actualValue:905,actualVar:25,abc pqr xyz
Impquantity:0,pgb ncr yepp
Start:
abc pqr
(FF_GGGGG_CONFIRM_TR):TC:20222,SeqNum:86,ImpVarNo:1000000008234436,Id:12,oneId:66454,a/c:1,ImpValue:905,Impvar:25,actualValue:905,actualVar:25,abc pqr xyz
Impquantity:0,pgb ncr yepp
Start:
abc pqr
(FF_GGGGG_CONFIRM_TR):TC:20222,SeqNum:86,ImpVarNo:1000000008234436,Id:12,oneId:66454,a/c:1,ImpValue:905,Impvar:25,actualValue:905,actualVar:25,abc pqr xyz
Impquantity:0,pgb ncr yepp
Start:
abc pqr
(FF_GGGGG_CONFIRM_TR):TC:20222,SeqNum:86,ImpVarNo:1000000008234436,Id:12,oneId:66454,a/c:1,ImpValue:905,Impvar:25,actualValue:905,actualVar:25,abc pqr xyz
Impquantity:0,pgb ncr yepp

So I wrote the following code:

#!/usr/bin/env python
import sys
import re
hand = open('newlogfile.txt')
for line in hand:
    r = re.compile(r"ExOrderNo:(\d+),")
    print(r)  # prints the compiled pattern object, not the number

Compile the pattern once and apply it to each line with findall:

import re
with open('newlogfile.txt') as f:
    r = re.compile(r"ImpVarNo:(\d+),")
    for line in f:
        inp = r.findall(line)
        if inp:
            print(int(inp[0]))

Output:

1000000008234436
1000000008234436
1000000008234436
1000000008234436

If the lines of interest always start with the same prefix:

import re
with open('newlogfile.txt') as f:
    r = re.compile(r"ImpVarNo:(\d+),")
    for line in f:
        if line.startswith("(FF_GGGGG_CONFIRM_TR)"):
            print(r.findall(line))
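
A minimal sketch that wraps the same regex in a reusable generator, so the numbers can be collected instead of printed. The names `IMPVARNO_RE` and `iter_impvarno`, and the shortened inline sample lines, are illustrative assumptions and not part of the original post.

```python
import re

# Compile once; \d+ matches the run of digits that follows "ImpVarNo:".
IMPVARNO_RE = re.compile(r"ImpVarNo:(\d+)")

def iter_impvarno(lines):
    """Yield the ImpVarNo value of every line that contains one (hypothetical helper)."""
    for line in lines:
        match = IMPVARNO_RE.search(line)
        if match:
            # int keeps all 16 digits exactly and prints without a decimal point.
            yield int(match.group(1))

# Shortened stand-in for the contents of sample.txt (assumption for illustration).
sample_lines = [
    "Start:",
    "abc pqr",
    "(FF_GGGGG_CONFIRM_TR):TC:20222,SeqNum:86,ImpVarNo:1000000008234436,Id:12,oneId:66454",
    "Impquantity:0,pgb ncr yepp",
]
numbers = list(iter_impvarno(sample_lines))
print(numbers)
```

Because the helper accepts any iterable of strings, an open file object can be passed to it directly in place of `sample_lines`.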

However, this is the most trivial approach; you can use a regex to make the code cleaner.

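with open('sample.txt') as sample_file:
    for line in sample_file:
        if line.startswith("(FF_GGGGG_CONFIRM_TR)"):
            # The third comma-separated field is "ImpVarNo:<digits>";
            # [9:] drops the 9-character "ImpVarNo:" prefix.
            number_after_impvarno = int(line.split(",")[2][9:])
            print(number_after_impvarno)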
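
The positional split above depends on ImpVarNo always being the third field. A hedged alternative, assuming every field on the record line has the form key:value and fields are separated by commas, is to build a dictionary and look the value up by name. `PREFIX` and `parse_record` are hypothetical names introduced for this sketch.

```python
# Prefix that marks a record line in the sample file (taken from sample.txt).
PREFIX = "(FF_GGGGG_CONFIRM_TR):"

def parse_record(line):
    """Split a record line into a dict of key -> value strings (hypothetical helper).

    Assumes the line starts with PREFIX; callers should check that first.
    """
    body = line.strip()[len(PREFIX):]
    fields = {}
    for item in body.split(","):
        key, sep, value = item.partition(":")
        if sep:  # skip trailing free text such as "abc pqr xyz"
            fields[key] = value
    return fields

line = ("(FF_GGGGG_CONFIRM_TR):TC:20222,SeqNum:86,ImpVarNo:1000000008234436,"
        "Id:12,oneId:66454,a/c:1,ImpValue:905,Impvar:25,actualValue:905,"
        "actualVar:25,abc pqr xyz")
if line.startswith(PREFIX):
    record = parse_record(line)
    print(int(record["ImpVarNo"]))
```

Looking fields up by key costs a little more per line than slicing, but it keeps working if fields are added or reordered.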