How can boost::spirit be used to parse nested for-loops?

Last updated: 2023-10-16

I am trying to parse loops that have this kind of syntax:

for(loop = 1:10) {

}

In my grammar I have these rules:

genericString %= lexeme[+(char_("a-zA-Z"))];
intRule %= int_;
commandString %= lexeme[+(char_ - '}')];
forLoop %= string("for")
        >> '('
        >> genericString // variable e.g. c
        >> '='
        >> (intRule | genericString) // variable e.g. i
        >> ':'
        >> (intRule | genericString) // variable e.g. j
        >> ')' >> '{'
        >> (forLoop | commandString)
        >> '}';

While this works for the simple example above, it fails to parse the following nested example:

for(loop = 1:10) {
    for(inner = 1:10) {
    }
}

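I guess this is because the parser gets "confused" by the placement of the braces. I think I need to do something similar to what is shown at http://boost-spirit.com/distrib/spirit_1_7_0/libs/spirit/example/fundamental/lazy_parser.cpp (which, alas, I find hard to follow).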

Cheers,

Ben

Edit 1:

I now think it is better to handle the recursion in commandString (called nestedBlock below) rather than in forLoop, i.e.:

forLoop %= string("for")
        >> '('
        >> genericString // variable e.g. c
        >> '='
        >> (intRule | genericString) // variable e.g. i
        >> ':'
        >> (intRule | genericString) // variable e.g. j
        >> ')' 
        >> nestedBlock;
nestedBlock %= lexeme['{' >> -(char_ - '}' - '{')
                          >> -nestedBlock
                          >> -(char_ - '}' - '{')
                          >> '}'];

This fails with a large number of boost::spirit errors. The rules are declared as:

    qi::rule<Iterator, std::string(), ascii::space_type> nestedBlock;
    qi::rule<Iterator, Function(), ascii::space_type> forLoop;

Function is a struct of boost::variants.

Edit 2:

So this is what I have now (designed to work with or without nested structures):

commandCollection %= *start;
forLoop %= string("for")
        >> '('
        >> genericString // variable e.g. c
        >> '='
        >> (intRule | genericString) // variable e.g. i
        >> ':'
        >> (intRule | genericString) // variable e.g. j
        >> ')'
        >> '{'
        >>       commandCollection
        >> '}';
start %= loadParams  | restoreGenomeData | openGenomeData | initNeat | initEvo |
                 initAllPositions | initAllAgents | initCoreSimulationPointers |
                 resetSimulationKernel | writeStats | restoreSimState |
                 run | simulate | removeObjects | setGeneration |
                 setParam | getParam | pause | create | reset |
                 loadAgents | getAgent | setAgent | listParams | loadScript | forLoop
                 | wait | commentFunc | var | add | sub | mult | div | query;

I declare the commandCollection rule as follows:

qi::rule<Iterator, boost::fusion::vector<Function>, ascii::space_type> commandCollection;

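I thought this would do what I want. commandCollection is defined as zero or more commands, which should be stored in a boost::fusion::vector. However, when I come to extract the vector from the Function() struct (bearing in mind that the start rule uses a Function() iterator), the type is for some reason not identified as a boost::fusion::vector, so it cannot be extracted. I am not sure why...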

However, if I just have

commandCollection %= start;

and declare the rule as

qi::rule<Iterator, Function(), ascii::space_type> commandCollection;

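and then try to extract the data as a single Function() struct, it works fine. But I would like it to store multiple commands (i.e. *start) in some sort of container. I also tried using a std::vector, but this failed too.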

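Your commandString does not like the empty body in the inner loop: + requires at least one character, and there is none before the closing brace.

Change + to *: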

commandString %= lexeme[*(char_ - '}')];
Alternatively, if you want to match an optional block rather than a potentially empty block, consider the fix mentioned by @llonesmiz.

Test case:

#define BOOST_SPIRIT_DEBUG
#include <iostream>
#include <string>
#include <utility>
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
// #include <boost/spirit/include/phoenix.hpp>
namespace qi    = boost::spirit::qi;
namespace karma = boost::spirit::karma;
namespace phx   = boost::phoenix;
typedef boost::variant<int, std::string> Value;
typedef std::pair<Value, Value> Range;
typedef std::pair<std::string, Range> Iteration;
typedef Iteration attr_t;
template <typename It, typename Skipper = qi::space_type>
    struct parser : qi::grammar<It, attr_t(), Skipper>
{
    parser() : parser::base_type(start)
    {
        using namespace qi;
        genericString %= lexeme[+(char_("a-zA-Z"))];// variable e.g. c
        intRule %= int_;
        commandString %= lexeme[*(char_ - '}')];
        value = intRule | genericString;
        range = value >> ':' >> value;
        forLoop %= lit("for")
                >> '(' >> genericString >> '=' >> range >> ')' 
                >> '{'
                >>      (forLoop | commandString)
                >> '}';
        start = forLoop;
        BOOST_SPIRIT_DEBUG_NODES(
                (start)(intRule)(genericString)(commandString)(forLoop)(value)(range)
                 );
    }
  private:
    qi::rule<It, std::string(), Skipper> genericString, commandString;
    qi::rule<It, int(), Skipper> intRule;
    qi::rule<It, Value(), Skipper> value;
    qi::rule<It, Range(), Skipper> range;
    qi::rule<It, attr_t(), Skipper> forLoop, start;
};
bool doParse(const std::string& input)
{
    typedef std::string::const_iterator It;
    auto f(begin(input)), l(end(input));
    parser<It, qi::space_type> p;
    attr_t data;
    try
    {
        bool ok = qi::phrase_parse(f,l,p,qi::space,data);
        if (ok)   
        {
            std::cout << "parse success\n";
        }
        else      std::cerr << "parse failed: '" << std::string(f,l) << "'\n";
        if (f!=l) std::cerr << "trailing unparsed: '" << std::string(f,l) << "'\n";
        return ok;
    } catch(const qi::expectation_failure<It>& e)
    {
        std::string frag(e.first, e.last);
        std::cerr << e.what() << "'" << frag << "'\n";
    }
    return false;
}
int main()
{
    bool ok = doParse(
            "for(loop = 1:10) {\n"
            "   for(inner = 1:10) {\n"
            "   }\n"
            "}"
            );
    return ok? 0 : 255;
}

I heartily recommend looking at the DEBUG output, which shows why the parse failed:

<forLoop>
  <try>\n   }\n}</try>
  <fail/>
</forLoop>
<commandString>
  <try>\n   }\n}</try>
  <fail/>
</commandString>
<fail/>