Parsing recursive structure on boost::spirit

Keywords: recursion, structure, spirit, boost      Updated: 2023-10-16

I want to parse a structure like "text{<>}". The Spirit documentation contains a similar AST example. For parsing a string like this

<tag1>text1<tag2>text2</tag1></tag2>

this code works:

    templ     = (tree | text)       [_val = _1];
    start_tag = '<' 
            >> !lit('/') 
            >> lexeme[+(char_- '>') [_val += _1]] 
            >>'>'; 
    end_tag   =  "</" 
            >> string(_r1) 
            >> '>'; 
    tree =  start_tag          [at_c<1>(_val) = _1]
            >> *templ          [push_back(at_c<0>(_val), _1) ]
            >> end_tag(at_c<1>(_val) )
            ;

For parsing a string like this

<tag<tag>some_text>

this code does not work:

    templ     = (tree | text)       [_val = _1];

    tree =  '<'
            >> *templ          [push_back(at_c<0>(_val), _1) ]
            >> '>'
            ;

templ parses into the following structure, which uses recursive_wrapper internally:

namespace client {
   struct tmp;
   typedef boost::variant <
        boost::recursive_wrapper<tmp>,
        std::string
   > tmp_node;
   struct tmp {
     std::vector<tmp_node> content;
     std::string text;
   };
}
BOOST_FUSION_ADAPT_STRUCT(
     client::tmp,
     (std::vector<client::tmp_node>, content)
     (std::string, text)
)

Can anyone explain why this happens? Or does anyone know of a similar parser written with boost::spirit?

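Just guessing that you don't actually want to parse XML at all, but rather some kind of mixed-content markup language for hierarchical text, I'd do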

        simple = +~qi::char_("><");
        nested = '<' >> *soup >> '>';
        soup   = nested|simple;

with the AST/rules defined as

typedef boost::make_recursive_variant<
        boost::variant<std::string, std::vector<boost::recursive_variant_> > 
    >::type tag_soup;
qi::rule<It, std::string()>           simple;
qi::rule<It, std::vector<tag_soup>()> nested;
qi::rule<It, tag_soup()>              soup;
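
To make the mutual recursion between the three rules concrete, here is a minimal hand-written recursive-descent sketch of the same grammar in plain C++, without Spirit. It is only an illustration, not the Spirit implementation: `Node`, `parse_simple`, `parse_nested`, `parse_soup` and `serialize` are hypothetical names, and a struct with a flag stands in for the recursive variant.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical AST node: either a run of text or a bracketed list of children.
struct Node {
    bool nested = false;         // true: bracketed group, false: plain text
    std::string text;            // used when nested == false
    std::vector<Node> children;  // used when nested == true
};

bool parse_soup(const std::string& in, std::size_t& pos, Node& out);

// simple = +~char_("><")
bool parse_simple(const std::string& in, std::size_t& pos, Node& out) {
    const std::size_t start = pos;
    while (pos < in.size() && in[pos] != '<' && in[pos] != '>') ++pos;
    if (pos == start) return false;  // '+' needs at least one character
    out.nested = false;
    out.text = in.substr(start, pos - start);
    out.children.clear();
    return true;
}

// nested = '<' >> *soup >> '>'
bool parse_nested(const std::string& in, std::size_t& pos, Node& out) {
    const std::size_t save = pos;
    if (pos >= in.size() || in[pos] != '<') return false;
    ++pos;
    Node result;
    result.nested = true;
    Node child;
    while (parse_soup(in, pos, child)) result.children.push_back(child);
    if (pos >= in.size() || in[pos] != '>') {
        pos = save;  // backtrack so the caller can try another alternative
        return false;
    }
    ++pos;
    out = result;
    return true;
}

// soup = nested | simple
bool parse_soup(const std::string& in, std::size_t& pos, Node& out) {
    return parse_nested(in, pos, out) || parse_simple(in, pos, out);
}

// Rebuild the text from the tree, re-emitting the brackets.
std::string serialize(const Node& n) {
    if (!n.nested) return n.text;
    std::string s = "<";
    for (const Node& c : n.children) s += serialize(c);
    return s + ">";
}
```

Calling `parse_soup` on the question's input `<tag<tag>some_text>` with `pos` starting at 0 should succeed and leave `pos` at the end of the string, since text and bracketed groups may alternate freely inside a group.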

See it live on Coliru:

// #define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/variant/recursive_variant.hpp>
#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
namespace client
{
    typedef boost::make_recursive_variant<
            boost::variant<std::string, std::vector<boost::recursive_variant_> > 
        >::type tag_soup;
    namespace qi = boost::spirit::qi;
    template <typename It>
    struct parser : qi::grammar<It, tag_soup()>
    {
        parser() : parser::base_type(soup)
        {
            simple = +~qi::char_("><");
            nested = '<' >> *soup >> '>';
            soup   = nested|simple;
            BOOST_SPIRIT_DEBUG_NODES((simple)(nested)(soup))
        }
      private:
        qi::rule<It, std::string()>           simple;
        qi::rule<It, std::vector<tag_soup>()> nested;
        qi::rule<It, tag_soup()>              soup;
    };
}
namespace boost { // leverage ADL on variant<>
    static std::ostream& operator<<(std::ostream& os, std::vector<client::tag_soup> const& soup)
    {
        os << "<";
        std::copy(soup.begin(), soup.end(), std::ostream_iterator<client::tag_soup>(os));
        return os << ">";
    }
}
int main(int argc, char **argv)
{
    if (argc < 2) {
        std::cerr << "Error: No input file provided.\n";
        return 1;
    }
    std::ifstream in(argv[1]);
    std::string const storage(std::istreambuf_iterator<char>(in), {}); // We will read the contents here.
    if (!(in || in.eof())) {
        std::cerr << "Error: Could not read from input file\n";
        return 1;
    }
    static const client::parser<std::string::const_iterator> p;
    client::tag_soup ast; // Our tree
    bool ok = parse(storage.begin(), storage.end(), p, ast);
    if (ok) std::cout << "Parsing succeeded\nData: " << ast << "\n";
    else    std::cout << "Parsing failed\n";
    return ok? 0 : 1;
}

If you define BOOST_SPIRIT_DEBUG, you get verbose output of the parsing process.

For the input

<some text with nested <tags <etc...> >more text>

it prints

Parsing succeeded
Data: <some text with nested <tags <etc...> >more text>

Note that the output is printed from the variant, not the original text.
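
Printing from the variant means a visitor walks the tree and re-emits the brackets around each nested list. The sketch below shows that idea with C++17 `std::variant` swapped in for `boost::variant`, so it builds without Boost. `Soup`, `SoupList`, `Printer` and `render` are hypothetical names for illustration and not part of the program above.

```cpp
#include <sstream>
#include <string>
#include <variant>
#include <vector>

// Assumption: C++17, where std::vector may be declared with an incomplete
// element type, so a node can hold a vector of itself.
struct Soup;
using SoupList = std::vector<Soup>;

struct Soup {
    std::variant<std::string, SoupList> value;  // text, or a list of children
};

// Visitor with one overload per alternative of the variant.
struct Printer {
    std::ostream& os;

    void operator()(const std::string& text) const { os << text; }

    void operator()(const SoupList& children) const {
        os << '<';
        for (const Soup& child : children) std::visit(*this, child.value);
        os << '>';
    }
};

// Render a tree back to text by visiting it.
std::string render(const Soup& soup) {
    std::ostringstream os;
    std::visit(Printer{os}, soup.value);
    return os.str();
}
```

Because the brackets come from the list overload rather than from the input, any text the grammar dropped while parsing would not reappear in the rendered output.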