How to split a string using shell-like rules in C++?

Keywords: rules, string, split      Updated: 2023-10-16

I have strings that look like shell command lines:

string c = "/path/to/binary arg1 arg2 \"arg3 has multiple words\"";
string c2 = "/path/to/binary arg1 'arg2 could be single-quoted also'";

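My goal is simple: I just want to split these strings the way a command-line shell would. I'm not looking for fancy features such as wildcard or environment-variable expansion (yet). I just want to split each string into its parts: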

vector<string> pieces = split_shell(c);
// pieces[0] == "/path/to/binary"
// pieces[1] == "arg1"
// pieces[2] == "arg2"
// pieces[3] == "arg3 has multiple words"
vector<string> pieces2 = split_shell(c2);
// pieces2[0] == "/path/to/binary"
// pieces2[1] == "arg1"
// pieces2[2] == "arg2 could be single-quoted also"

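Obviously it wouldn't be too hard to do this by splitting the string on whitespace and then iterating over the tokens to merge the ranges that are surrounded by quotes, but I'd rather not reinvent the wheel unless I have to. Is there a clean way to do this (in C++03)? I'm open to using Boost libraries; I suspect there may be a simple implementation using Boost.Spirit, but I'm not familiar enough with it to know for sure.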

Take a look at Boost.Program_options.

Actually, you can do this with a regular expression. Since C++03 has no regex support (C++11 does), we can use Boost.Regex to do the job.

#include <string>
#include <vector>
#include <iostream>
#include "boost/regex.hpp"
int main()
{
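    // Alternative input that uses double quotes:
    //std::string str = "/path/to/binary arg1 arg2 \"arg3 has multiple words\"";
    std::string str = "/path/to/binary arg1 'arg2 could be single-quoted also'";
    // std::regex version of the same pattern (C++11):
    //std::regex rx("([^(\"|')]\\S*|(\"|').+?(\"|'))\\s*");
    // Match either an unquoted word or a quoted run, then optional whitespace.
    boost::regex rx("([^(\"|')]\\S*|(\"|').+?(\"|'))\\s*");
    boost::smatch res;
    while (boost::regex_search(str, res, rx))
    {
        // res[0] is the whole match, so it still contains the quotes
        // and any trailing whitespace.
        std::cout << res[0] << std::endl;
        str = res.suffix().str();  // continue searching after this match
    }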
    return 0;
}
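
For comparison, the hand-rolled approach mentioned in the question (walk the string, split on whitespace, keep quoted ranges together) can be sketched with only the standard library. This is a minimal sketch, not a full shell parser: the function name `split_shell` is taken from the question, but the implementation is an assumption. It handles single and double quotes only, does not process backslash escapes or any kind of expansion, and silently accepts an unterminated quote. It is written to compile as C++03.

```cpp
#include <string>
#include <vector>

// Hypothetical implementation of the split_shell() named in the question.
// Assumptions: tokens are separated by unquoted spaces, tabs, or newlines;
// '...' and "..." group text into one token; no backslash escapes,
// globbing, or variable expansion.
std::vector<std::string> split_shell(const std::string& input)
{
    std::vector<std::string> pieces;
    std::string current;
    char quote = 0;         // active quote character, or 0 outside quotes
    bool in_token = false;  // true once the current token has started

    for (std::string::size_type i = 0; i < input.size(); ++i) {
        const char ch = input[i];
        if (quote != 0) {
            if (ch == quote)
                quote = 0;        // closing quote: back to unquoted state
            else
                current += ch;    // inside quotes, keep everything verbatim
        } else if (ch == '"' || ch == '\'') {
            quote = ch;           // opening quote starts or continues a token
            in_token = true;
        } else if (ch == ' ' || ch == '\t' || ch == '\n') {
            if (in_token) {       // unquoted whitespace ends the token
                pieces.push_back(current);
                current.clear();
                in_token = false;
            }
        } else {
            current += ch;        // ordinary character
            in_token = true;
        }
    }
    if (in_token)
        pieces.push_back(current);  // flush the last token
    return pieces;
}
```

With the two strings from the question, this sketch yields four pieces for `c` (the last being `arg3 has multiple words`) and three for `c2`, with the quotes themselves removed. Unlike the regex version, adjacent quoted and unquoted text such as `a"b c"d` is joined into a single token, which is closer to what a shell does.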