Spirit X3, How to fail parse on non-ascii input?

Keywords: ASCII, input, fail, X3, Spirit. Last updated: 2023-10-16

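The goal is to not tolerate characters from 80h to FFh in the input string. I was under the impression that

using ascii::char_;

would take care of this. But as you can see in the example code, it happily prints "Parsing succeeded".

In the following Spirit mailing list post, Joel suggested letting the parse fail on these non-ASCII characters, but I'm not sure whether he went ahead with it. [Spirit-general] ASCII encoding assert on invalid input...

Here is my example code: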

#include <iostream>
#include <string>
#include <boost/spirit/home/x3.hpp>
namespace client::parser
{
    namespace x3 = boost::spirit::x3;
    namespace ascii = boost::spirit::x3::ascii;
    using ascii::char_;
    using ascii::space;
    using x3::lexeme;
    using x3::skip;
    // A double-quoted string; the body is any char_ other than '"'.
    const auto quoted_string = lexeme[char_('"') >> *(char_ - '"') >> char_('"')];
    const auto entry_point = skip(space) [ quoted_string ];
}
int main()
{
    // The literal is split in two so that "\x80" is not read as a longer hex escape.
    for(std::string const input : { "\"naughty \x80" "bla bla bla\"" }) {
        std::string output;
        if (parse(input.begin(), input.end(), client::parser::entry_point, output)) {
            std::cout << "Parsing succeeded\n";
            std::cout << "input:  " << input << "\n";
            std::cout << "output: " << output << "\n";
        } else {
            std::cout << "Parsing failed\n";
        }
    }
}

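How can I change the example so that Spirit fails on this invalid input?

Furthermore, and closely related, I'd like to know how to use the character parser that defines a char_set encoding. You know, char_(charset) from the X3 docs: Character Parsers, develop branch.

The documentation falls well short of describing the basic functionality. Why can't the people at the top of Boost force library authors to provide documentation at least at the level of cppreference.com?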

There is nothing wrong with the documentation here. This is simply a library bug.

The code for any_char says:

template <typename Char, typename Context>
bool test(Char ch_, Context const&) const
{
    return ((sizeof(Char) <= sizeof(char_type)) || encoding::ischar(ch_));
}

where it should say

template <typename Char, typename Context>
bool test(Char ch_, Context const&) const
{
    return ((sizeof(Char) <= sizeof(char_type)) && encoding::ischar(ch_));
}

That makes your program behave as expected and required. The behaviour then also matches what Qi does:

#include <boost/spirit/include/qi.hpp>
#include <cassert>
int main() {
    namespace qi = boost::spirit::qi;
    char const* input = "\x80";
    // Qi's ascii::char_ rejects the byte 80h.
    assert(!qi::parse(input, input+1, qi::ascii::char_));
}

Filed a bug here: https://github.com/boostorg/spirit/issues/520
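Until a fixed Spirit version is available, one possible stopgap is to reject 80h to FFh before the parser ever sees the input. The following is a minimal sketch under that assumption; is_pure_ascii is a hypothetical helper written for this illustration and is not part of Boost.Spirit.

```cpp
#include <string_view>

// Hypothetical helper, not part of Boost.Spirit: returns true only if every
// byte of `input` lies in the 7-bit ASCII range 00h..7Fh.
constexpr bool is_pure_ascii(std::string_view input)
{
    for (char ch : input) {
        // Convert to unsigned char first so that bytes 80h..FFh do not turn
        // into negative values on platforms where plain char is signed.
        if (static_cast<unsigned char>(ch) > 0x7F)
            return false;
    }
    return true;
}

// Compile-time checks of the helper itself.
static_assert(is_pure_ascii("\"bla bla bla\""));
static_assert(!is_pure_ascii("\"naughty \x80\""));
```

In the question's main(), this could be used by calling is_pure_ascii(input) before parse(...) and treating a false result as a parse failure. It needs C++17 for std::string_view.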

You can achieve that by using the print parser:
#include <iostream>
#include <string>
#include <boost/spirit/home/x3.hpp>
namespace client::parser
{
    namespace x3 = boost::spirit::x3;
    namespace ascii = boost::spirit::x3::ascii;
    using ascii::char_;
    using ascii::print;
    using ascii::space;
    using x3::lexeme;
    using x3::skip;
    // The string body now uses print instead of char_.
    const auto quoted_string = lexeme[char_('"') >> *(print - '"') >> char_('"')];
    const auto entry_point = skip(space) [ quoted_string ];
}
int main()
{
    for(std::string const input : { "\"naughty \x80\"", "\"bla bla bla\"" }) {
        std::string output;
        std::cout << "input:  " << input << "\n";
        if (parse(input.begin(), input.end(), client::parser::entry_point, output)) {
            std::cout << "output: " << output << "\n";
            std::cout << "Parsing succeeded\n";
        } else {
            std::cout << "Parsing failed\n";
        }
    }
}

Output:

input:  "naughty �"
Parsing failed
input:  "bla bla bla"
output: "bla bla bla"
Parsing succeeded

https://wandbox.org/permlink/hsob8uqmc3wme5yi
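Swapping char_ for print does narrow the accepted set in another way, which may or may not be wanted. Assuming the conventional definition of printable ASCII (20h to 7Eh), control characters such as tab are rejected along with the non-ASCII bytes. The sketch below models that range with a hypothetical predicate; it is not taken from Spirit's source.

```cpp
// Hypothetical model of a printable-ASCII test. The range 20h..7Eh is the
// conventional definition and is an assumption here, not Spirit's own code.
constexpr bool is_printable_ascii(int ch)
{
    return ch >= 0x20 && ch <= 0x7E;
}

static_assert(is_printable_ascii('a'));
static_assert(is_printable_ascii(' '));
static_assert(!is_printable_ascii('\t'));    // control characters are excluded
static_assert(!is_printable_ascii(0x7F));    // DEL is excluded as well
static_assert(!is_printable_ascii('\x80'));  // -128 or 128, out of range either way
```

If quoted strings are supposed to carry tabs or other control characters, a plain range check on 00h to 7Fh would be the closer match to the original intent.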


The surprising fact is that, for some reason, the check in char_ is performed only when sizeof(iterator char type) > sizeof(char):
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <string>
#include <boost/core/demangle.hpp>
#include <typeinfo>
namespace x3 = boost::spirit::x3;
// Parses a one-character string of the given character type with ascii::char_.
template <typename Char>
void test(Char const* str)
{
    std::basic_string<Char> s = str;
    std::cout << boost::core::demangle(typeid(Char).name()) << ":\t";
    Char c;
    auto it = s.begin();
    if (x3::parse(it, s.end(), x3::ascii::char_, c) && it == s.end())
        std::cout << "OK: " << int(c) << "\n";
    else
        std::cout << "Failed\n";
}
int main()
{
    test("\x80");
    test(L"\x80");
    test(u8"\x80");  // char8_t requires C++20
    test(u"\x80");
    test(U"\x80");
}

Output:

char:   OK: -128
wchar_t:    Failed
char8_t:    OK: 128
char16_t:   Failed
char32_t:   Failed

https://wandbox.org/permlink/j9pqervngzqeelfa
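The pattern in this table follows from the `||` in the any_char code quoted in the other answer. The sketch below is a simplified model of that expression with char_type fixed to char; is_ascii_code_point is a stand-in for encoding::ischar and is an assumption made for illustration, not Spirit's actual implementation.

```cpp
// Stand-in for encoding::ischar in the ASCII encoding (an assumption made
// for illustration): true only for code points 00h..7Fh.
constexpr bool is_ascii_code_point(long ch)
{
    return ch >= 0 && ch <= 0x7F;
}

// Simplified model of the quoted any_char::test with char_type == char.
template <typename Char>
constexpr bool accepts(Char ch)
{
    // For one-byte character types the left operand is already true, so `||`
    // short-circuits and the ASCII check on the right is never evaluated.
    return (sizeof(Char) <= sizeof(char)) || is_ascii_code_point(ch);
}

static_assert(accepts('\x80'));     // char: accepted without any range check
static_assert(!accepts(L'\x80'));   // wchar_t: the range check runs and rejects
static_assert(!accepts(u'\x80'));   // char16_t: rejected
static_assert(!accepts(U'\x80'));   // char32_t: rejected
static_assert(accepts(L'a'));       // wide ASCII characters still pass
```

char8_t is also one byte wide, so in this model it takes the same short-circuit path as char, which is consistent with the "OK: 128" line above.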