Parsing null character with boost spirit qi

Keywords: character      Updated: 2023-10-16

I am trying to parse a string with Boost Spirit Qi that has the following form:

"\0help@masonlive.gmu.edu\0test\r\n"

using the following grammar. This is the hpp:

class EmailGrammar :
    public boost::spirit::qi::grammar< const boost::uint8_t*,
        boost::tuple< boost::iterator_range< const boost::uint8_t* >,
                      boost::iterator_range< const boost::uint8_t* > >() >
{
public:
    const static EmailGrammar instance;
    EmailGrammar();
    /* omitting uninteresting stuff here i.e. constructors and assignment */
private:
    boost::spirit::qi::rule< const boost::uint8_t*,
        boost::tuple<
            boost::iterator_range< const boost::uint8_t* >,
            boost::iterator_range< const boost::uint8_t* > >() > m_start;
};

The cpp for the grammar looks like this:

EmailGrammar::EmailGrammar() :
    EmailGrammar::base_type(m_start),
    m_start()
{
    namespace qi = boost::spirit::qi;
    m_start =
        (
            qi::lit('\0')                               // leading NUL
            >> (
                qi::raw[*(qi::char_ - qi::lit('\0'))]   // first field, up to the next NUL
            )
            >> qi::lit('\0')                            // separating NUL
            >> (
                qi::raw[*(qi::char_ - qi::eol)]         // second field, up to end of line
            )
            >> qi::eol >> qi::eoi
        );
}

I intend to use this to parse the two strings and split them into two separate iterator ranges.

It is then called like this:

int main()
{
    typedef typename EmailGrammar::start_type::attr_type attr;
    std::string testStr("\0help@masonlive.gmu.edu\0test\r\n");
    // this is not done this way in the real code just as a test
    boost::iterator_range<const boost::uint8_t*> data =
        boost::make_iterator_range(
            reinterpret_cast< const boost::uint8_t* >(testStr.data()),
            reinterpret_cast< const boost::uint8_t* >(testStr.data() + testStr.length()));
    attr exposedAttribute;
    if (boost::spirit::qi::parse(data.begin(),
                                 data.end(),
                                 EmailGrammar::instance,
                                 exposedAttribute))
    {
        std::cout << "success" << std::endl;
    }
}

The problem seems to be in parsing the null terminator. I believe this because when I add debug(m_rule); to the code, I get this XML output:

<unnamed-rule>
<try></try>
<fail/>
</unnamed-rule>

However, if I explicitly erase, for example, the first null terminator, I get this output:

<unnamed-rule>
<try>help@masonlive.gmu.e</try>
<fail/>
</unnamed-rule>

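This leads to the following questions:

  • How does one go about parsing null terminators with Spirit? I have been looking through the docs and cannot find anything about it, other than a mention at the very bottom of this page of null-terminated strings being the default model in Spirit.

  • Does Spirit actually look ahead, such that if the parser sees in the lookahead that the input does not end correctly, it fails automatically?

  • Is there any documentation I am missing that I could read about this kind of behavior?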

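The whole problem most likely originates here:

std::string testStr("\0help@masonlive.gmu.edu\0test\r\n");

This doesn't do what you think. It creates an empty string, because the const char* constructor stops at the first NUL, which here is the very first character. Instead, specify the length of the raw literal/buffer (30 bytes here, not counting the literal's implicit terminating NUL):

std::string testStr("\0help@masonlive.gmu.edu\0test\r\n", 30);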
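
To make the difference between the two constructors concrete, here is a minimal sketch using only the standard library. The function names length_without_count and length_with_count are made up for this illustration and are not part of the original code:

```cpp
#include <cstddef>
#include <string>

// const char* constructor: copies up to, but not including, the first NUL.
// The literal starts with '\0', so the resulting string is empty.
inline std::size_t length_without_count() {
    std::string s("\0help@masonlive.gmu.edu\0test\r\n");
    return s.size();
}

// (pointer, count) constructor: copies exactly `count` bytes,
// embedded NULs included.
inline std::size_t length_with_count() {
    std::string s("\0help@masonlive.gmu.edu\0test\r\n", 30);
    return s.size();
}
```

Called from any test or main, the first function should report a length of 0 and the second a length of 30, which is why the parser in the question was handed an empty range.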

Bonus

If you don't want to do the math/counting (you shouldn't!), make a helper:

template <typename Char, size_t N>
std::string bake(Char const (&p)[N], bool include_terminator = false) {
    // N counts the literal's implicit terminating NUL; drop it unless asked to keep it
    return { p, p + N - (include_terminator?0:1) };
}

which you can then use like this:

std::string const testStr = bake("\0help@masonlive.gmu.edu\0test\r\n");
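
As an aside that is not part of the original answer: assuming a C++14 (or later) compiler, the standard `s` literal suffix may serve the same purpose as a hand-written helper, because the compiler passes the literal's full length to `operator""s`. The function name make_test_string below is hypothetical:

```cpp
#include <string>

// operator""s receives (pointer, length) from the compiler,
// so embedded NULs are kept and the implicit terminator is not counted.
inline std::string make_test_string() {
    using namespace std::string_literals;
    return "\0help@masonlive.gmu.edu\0test\r\n"s;
}
```

The result should be the same 30-byte string that the helper produces, with NULs at offsets 0 and 23.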

Live On Coliru

#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted/boost_tuple.hpp>
#include <cstdint>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;

using It = uint8_t const*;
using Range = boost::iterator_range<It>;
using Attribute = boost::tuple<Range, Range>;

class EmailGrammar : public qi::grammar<It, Attribute()> {
public:
    const static EmailGrammar instance;

    EmailGrammar() : EmailGrammar::base_type(m_start)
    {
        using namespace qi;
        m_start =
            '\0' >> raw[*(char_ - '\0')] >>   // NUL, then first field up to the next NUL
            '\0' >> raw[*(char_ - eol)] >>    // NUL, then second field up to end of line
            eol >> eoi
            ;
        BOOST_SPIRIT_DEBUG_NODES((m_start))
    }
private:
    qi::rule<It, Attribute()> m_start;
};

const EmailGrammar EmailGrammar::instance {};

template <typename Char, size_t N>
std::string bake(Char const (&p)[N], bool include_terminator = false) {
    return { p, p + N - (include_terminator?0:1) };
}

int main() {
    std::string const testStr = bake("\0help@masonlive.gmu.edu\0test\r\n");
    It f = reinterpret_cast<It>(testStr.data()),
       l = f + testStr.length();
    Attribute exposedAttribute;
    if (boost::spirit::qi::parse(f, l, EmailGrammar::instance, exposedAttribute)) {
        std::cout << "success" << std::endl;
    }
}

Prints

<m_start>
<try></try>
<success></success>
<attributes>[[[h, e, l, p, @, m, a, s, o, n, l, i, v, e, ., g, m, u, ., e, d, u], [t, e, s, t]]]</attributes>
</m_start>
success