How to write a regex to match ((11 3) (96 15) (2 3) )

Updated: 2023-10-16

I am trying to write a regex that will match:

((11 3) (96 15) (2 3) )

So far I have:

([^(|^)| |[A-Za-z])+

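But it only captures the 11 and not the rest. The real string is also much longer; I am showing only a small piece of it, and it repeats in the same format with different numbers. So far, this is at least part of my program: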

regex expression("([^(|^)| |[A-Za-z])+");
string line2 = "((11 3) (96 15) (2 3) )";
if(regex_match(line2, expression))
    cout << "yes";
else
    cout << "no";

Your example string contains digits, but your regex uses letters. Is that intentional? I think I would use a regex like this:

\((\([0-9]+ [0-9]+\) )+\)

If we break it down, here is my thought process:

\(     // start with a literal "("
(      // create a group
\(     // another literal "("
[0-9]+ // one or more digits
       // a space (hard to spell out here)
[0-9]+ // one or more digits
\)     // a literal ")" to match the opening
       // a space (hard to spell out here)
)      // close the group
+      // group must repeat one or more times
\)     // final closing ")"
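To try the pattern above from C++, here is a minimal sketch using std::regex. The helper name matches_pair_list is made up for illustration, and the pattern is written as a raw string literal so that the backslashes do not need to be doubled.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the whole line is "(" followed by one or more
// "(digits digits) " groups and a final ")".
bool matches_pair_list(const std::string& line) {
    // Raw string literal: \( and \) reach the regex engine unchanged.
    static const std::regex expression(R"(\((\([0-9]+ [0-9]+\) )+\))");
    // regex_match requires the entire string to match, not just a prefix.
    return std::regex_match(line, expression);
}
```

With this sketch, matches_pair_list("((11 3) (96 15) (2 3) )") should return true, while a string without the space after each pair, such as "((11 3)(96 15))", should be rejected, because the pattern demands exactly one space after every closing paren.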

Edit: OK, since you say the second number is sometimes not a number, we can easily adjust the regex to look like this:

\((\([0-9]+ [A-Za-z0-9]+\) )+\)

If you need to keep letters and digits from being mixed in the second field, you can do this:

\((\([0-9]+ ([A-Za-z]+|[0-9]+)\) )+\)
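The difference between the two edited patterns is easiest to see side by side. The following is only a sketch with std::regex; the function names matches_loose and matches_strict are invented for this example.

```cpp
#include <regex>
#include <string>

// Second field may be any mix of letters and digits, e.g. "a1".
bool matches_loose(const std::string& line) {
    static const std::regex re(R"(\((\([0-9]+ [A-Za-z0-9]+\) )+\))");
    return std::regex_match(line, re);
}

// Second field must be all letters or all digits, never a mix.
bool matches_strict(const std::string& line) {
    static const std::regex re(R"(\((\([0-9]+ ([A-Za-z]+|[0-9]+)\) )+\))");
    return std::regex_match(line, re);
}
```

Both functions should accept "((11 3) (96 abc) )", but only matches_loose should accept "((11 a1) )", since the alternation in the strict pattern forces the second field to be purely alphabetic or purely numeric.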

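Let's build your expression "from scratch".

Keeping in mind your final goal of matching ((11 3) (96 15) (2 3) ), we will start by matching a much simpler pattern and advance one step at a time: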

\d        matches "1"
\d+       matches "11", or "3", or "96"
\d+ *\d+  matches "11 3" or "96 15"
\(\d+ *\d+\)           matches "(11 3)" or "(96 15)"
(\(\d+ *\d+\) *)*      matches "(11 3)(96 15) (2 3)"
\((\(\d+ *\d+\) *)*\)  matches "((11 3) (96 15) (2 3) )"

Note: I have not tested this answer. I am relying on Boost.
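Since the steps above are described as untested, one way to check them is to run each intermediate pattern against its sample text. The sketch below uses std::regex rather than Boost (an assumption on my part; the two should agree on these simple patterns), and the names Step, steps, full_match and count_matching_steps are hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

struct Step {
    std::string pattern;  // regex built up to this step
    std::string sample;   // text it is expected to match in full
};

// The six steps from the list above, written as raw string literals.
const std::vector<Step> steps = {
    {R"(\d)", "1"},
    {R"(\d+)", "96"},
    {R"(\d+ *\d+)", "11 3"},
    {R"(\(\d+ *\d+\))", "(96 15)"},
    {R"((\(\d+ *\d+\) *)*)", "(11 3)(96 15) (2 3)"},
    {R"(\((\(\d+ *\d+\) *)*\))", "((11 3) (96 15) (2 3) )"},
};

// True if `text` as a whole matches `pattern`.
bool full_match(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}

// Returns how many steps match their own sample text.
int count_matching_steps() {
    int matched = 0;
    for (const Step& step : steps) {
        if (full_match(step.pattern, step.sample)) ++matched;
    }
    return matched;
}
```

One design point worth knowing: because the space is written as " *" (zero or more), the final pattern also accepts input such as "((113))", where the two numbers run together. Use " +" instead if at least one separating space is required.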

I recently solved this problem while trying to match a syntax like 1-4,5,9,20-25. Admittedly, the resulting regex is not simple:

/\G([0-9]++)(?:-([0-9]++))?+(?:,(?=[-0-9,]+$))?+/

This expression lets me collect all the matches in the string incrementally.
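For readers using std::regex, note that its default ECMAScript grammar supports neither \G nor possessive quantifiers such as ++, so the expression above cannot be used there as written. A rough approximation, not the same technique, is to iterate over the items with std::sregex_iterator. The function name parse_ranges is made up, and unlike the \G version this sketch does not check that the matches are contiguous.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects each "N" or "N-M" item as a (first, last) pair.
// A single number N is reported as (N, N).
std::vector<std::pair<int, int>> parse_ranges(const std::string& text) {
    static const std::regex item(R"(([0-9]+)(?:-([0-9]+))?)");
    std::vector<std::pair<int, int>> ranges;
    for (std::sregex_iterator it(text.begin(), text.end(), item), end; it != end; ++it) {
        const std::smatch& m = *it;
        int first = std::stoi(m[1].str());
        // Group 2 only participates when the item has the "-M" part.
        int last = m[2].matched ? std::stoi(m[2].str()) : first;
        ranges.emplace_back(first, last);
    }
    return ranges;
}
```

For "1-4,5,9,20-25" this should yield four pairs: (1, 4), (5, 5), (9, 9) and (20, 25). Anything between the items is silently skipped, so a separate validation pass is needed if malformed input must be rejected.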

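We could apply the same approach to your problem, but validating and matching the given input at the same time is extremely difficult. (I don't know how to do it. If someone does, I would love to see it!) You can, however, validate the input separately: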

/\(\s*(\s*(\(\s*\d+\s+\d+\s*\))\s*)+\s*\)/

See Evan's answer for how this works. \d is equivalent to [0-9], and \s is equivalent to [\r\n\t ].

Here is an incremental match to extract the numbers:

/\G\(?\s*(?:\(\s*(\d+)\s+(\d+)\s*\))(?:(?=\s*\(\s*\d+\s+\d+\s*\))|\s*\))/

It breaks down like this:

/\G     # matches incrementally. \G marks the beginning of the string or the beginning of the next match.
 \(?\s* # matches first open paren; safely ignores it and following whitespace if this is not the first match.
 (?:    # begins a new grouping - it does not save matches.
   \(\s* # first subgroup open paren and optional whitespace.
   (\d+) # matches first number in the pair and stores it in a match variable.
   \s+   # separating whitespace
   (\d+) # matches second number in the pair and stores it in a match variable.
   \s*\) # ending whitespace and close paren
 )      # ends subgroup
 (?:    # new subgroup
   (?=  # positive lookahead - this is optional and checks that subsequent matches will work.
     \s*\(\s*\d+\s+\d+\s*\)  # look familiar?
   )    # end positive lookahead
   |    # if positive lookahead fails, try another match
   \s*\)\s* # optional ending whitespace, close paren
 )/     # ... and end subgroup.

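I have not tested it, but I believe it will work. Each time you apply the expression to the given string, it extracts the next pair of numbers, until it sees the final closing paren. It will either consume the whole string or stop when it hits an input error. You may need to fine-tune it for Boost::regex.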
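If the target is std::regex rather than Boost, \G is not available in its ECMAScript grammar, so the incremental expression cannot be ported directly. A simpler two-pass sketch follows the same idea of validating separately: first check the whole string with the validation pattern, then walk over the pairs with std::sregex_iterator. The names Pair and extract_pairs are hypothetical.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;

// Hypothetical helper: fills `out` with the number pairs and returns true
// if the whole line is well-formed; returns false (and leaves `out` empty) otherwise.
bool extract_pairs(const std::string& line, std::vector<Pair>& out) {
    out.clear();

    // Pass 1: validate the overall shape with the validation pattern.
    static const std::regex whole(R"(\(\s*(\s*(\(\s*\d+\s+\d+\s*\))\s*)+\s*\))");
    if (!std::regex_match(line, whole)) return false;

    // Pass 2: visit each "(a b)" pair. Validation already guarantees that
    // only whitespace sits between the pairs.
    static const std::regex pair_re(R"(\(\s*(\d+)\s+(\d+)\s*\))");
    for (std::sregex_iterator it(line.begin(), line.end(), pair_re), end; it != end; ++it) {
        // Note: std::stoi throws std::out_of_range if a number does not fit in int.
        out.emplace_back(std::stoi((*it)[1].str()), std::stoi((*it)[2].str()));
    }
    return true;
}
```

For "((11 3) (96 15) (2 3) )" this should return true with three pairs, and it should return false for a string that is cut off before the final closing paren. The validation pattern has several adjacent \s* terms, so on very long malformed input a backtracking engine may be slow; that is a property of the pattern, not of this wrapper.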