Printing all substrings between/after occurrence of a character in C++

Keywords: after, string, between, C++, character, print. Updated: 2023-10-16

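I'm trying to grab all the substrings that lie between, or after, occurrences of a given character. Specifically, I want to grab the options of a search query URL. For example, if I have: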

std::string url = "https://www.google.com/search?q=i+need+help&rlz=1C1CHBF_enUS851US851&oq=i+need+help&aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7&sourceid=chrome&ie=UTF-8";

I need to output the strings between the "&" characters, plus the one after the last "&", so the output would be:

rlz=1C1CHBF_enUS851US851 
oq=i+need+help
aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7
sourceid=chrome 
ie=UTF-8

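I understand how to do this for a single substring, but I'm stuck trying to put it into a loop. It has to work for many URLs of different lengths and with different numbers of options.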

So far I can only grab the one substring between the first and second occurrence of the character, but I need to grab all of them in any given URL.

int a = url.find("&") + 1;
int b = url.find("&", url.find("&") + 1);
int c = (b - a);
std::string option = url.substr(a, c);

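Just search in a loop for the next & after the previous one, exit the loop when no more & is found, and handle the first element separately: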

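#include <string>
#include <vector>
using namespace std;

vector<string> foo(const string& url)
{
    vector<string> result;
    auto a = url.find("?");
    if (a == string::npos) return result;   // no query part at all
    auto b = url.find("&", a + 1);          // first '&' after the '?'
    if (b == string::npos)
    {
        // only one option: everything after '?'
        result.push_back(url.substr(a + 1, string::npos));
        return result;
    }
    // first option, between '?' and the first '&'
    result.push_back(url.substr(a + 1, b - a - 1));
    do
    {
        a = b;
        b = url.find("&", a + 1);
        // when b is npos the count is huge and substr clamps it to the end
        result.push_back(url.substr(a + 1, b - a - 1));
    } while (b != string::npos);
    return result;
}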

Works for your example: https://ideone.com/SiRZQB

Tbh, I really think you should use a proper URI parser for this job, since there can be a lot of edge cases. But here you go:

#include <iostream>
#include <string>

int main()
{
    std::string url = "https://www.google.com/search?q=i+need+help&rlz=1C1CHBF_enUS851US851&oq=i+need+help&aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7&sourceid=chrome&ie=UTF-8";
    char delimiter = '&';
    size_t start = url.find(delimiter);
    size_t end;
    while (start != std::string::npos) {
        end = url.find(delimiter, start + 1);
        // if end is npos, substr clamps the count to the end of the string
        std::cout << url.substr(start + 1, end - start - 1) << std::endl;
        start = end;
    }
}

Playground: http://cpp.sh/8pshy7
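To make the "edge cases" remark a bit more concrete, here is a minimal sketch (not a full URI parser) that only looks at the query part: it starts after the first '?', stops at a '#' fragment if there is one, and skips empty pieces such as the one in "a=1&&b=2". The function name query_options is made up for this illustration. Note that, unlike the loop above, it also returns the first option (q=...), because that one follows the '?' rather than an '&'.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of the answer above: returns the options of
// the query part only (after the first '?', before any '#').
std::vector<std::string> query_options(const std::string& url)
{
    std::vector<std::string> options;

    // Anything from '#' onwards is the fragment, not the query.
    const std::size_t hash = url.find('#');
    const std::size_t stop = (hash == std::string::npos) ? url.size() : hash;

    const std::size_t question = url.find('?');
    if (question == std::string::npos || question >= stop)
        return options;                       // no query part

    std::size_t start = question + 1;
    while (start < stop) {
        std::size_t end = url.find('&', start);
        if (end == std::string::npos || end > stop)
            end = stop;                       // last option runs up to the end of the query
        if (end > start)                      // skip empty pieces like the one in "&&"
            options.push_back(url.substr(start, end - start));
        start = end + 1;
    }
    return options;
}
```

Calling query_options(url) on the URL from the question should give the five options listed there, preceded by q=i+need+help. Percent-decoding and ';' separators are still not handled, which is where a real URI library earns its keep.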

You can try the following code, which uses a regular expression to parse the URL.

#include <regex>
#include <iostream>
#include <string>
using namespace std;

int main() {
    string url = "https://www.google.com/search?q=i+need+help&rlz=1C1CHBF_enUS851US851&oq=i+need+help&aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7&sourceid=chrome&ie=UTF-8";
    // group 1 is the whole "key=value" pair, groups 2 and 3 are its two halves
    regex rg("[?&](([^&]+)=([^&]+))");
    // after each match, continue searching in the remainder of the string
    for (smatch sm; regex_search(url, sm, rg); url = sm.suffix())
        cout << sm[1] << endl;
    return 0;
}
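As a variation on the same idea, the matches can also be walked with std::sregex_iterator, which avoids overwriting url on every pass. This is only a sketch: the function name options_by_regex and the slightly stricter pattern (no '=' in the key, empty values allowed, stop at '#') are my own assumptions, not part of the answer above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every "key=value" piece that follows '?' or '&'.
std::vector<std::string> options_by_regex(const std::string& url)
{
    // [^&=#]+ keeps '=' out of the key; [^&#]* lets the value be empty.
    static const std::regex option_re("[?&]([^&=#]+=[^&#]*)");

    std::vector<std::string> options;
    // A default-constructed sregex_iterator is the end-of-sequence marker.
    for (std::sregex_iterator it(url.begin(), url.end(), option_re), last; it != last; ++it)
        options.push_back((*it)[1].str());    // capture group 1: "key=value"
    return options;
}
```

Like the original regex, this matches after '?' as well, so the q=... option is included in the result.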

You can just use an ordinary for loop, for example

#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>

int main()
{
    std::string url = "https://www.google.com/search?q=i+need+help"
                      "&rlz=1C1CHBF_enUS851US851"
                      "&oq=i+need+help"
                      "&aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7"
                      "&sourceid=chrome&ie=UTF-8";
    char c = '&';

    // count the delimiters so the vector can be sized up front
    size_t n = std::count_if( std::begin( url ), std::end( url ),
                              [=]( const auto &item )
                              {
                                  return item == c;
                              } );

    std::vector<std::string> v;
    v.reserve( n );

    for ( auto pos = url.find( c, 0 ); pos != std::string::npos; )
    {
        auto next = url.find( c, ++pos );   // ++pos steps past the current '&'
        auto len = ( next == std::string::npos ? url.size() : next ) - pos;
        v.push_back( url.substr( pos, len ) );
        pos = next;
    }

    for ( const auto &s : v ) std::cout << s << '\n';
}

The program output is

rlz=1C1CHBF_enUS851US851
oq=i+need+help
aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7
sourceid=chrome
ie=UTF-8

Or you can write a separate function, for example

#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>

std::vector<std::string> split_url( const std::string &url, char c = '&' )
{
    // count the delimiters so the vector can be sized up front
    size_t n = std::count_if( std::begin( url ), std::end( url ),
                              [=]( const auto &item )
                              {
                                  return item == c;
                              } );

    std::vector<std::string> v;
    v.reserve( n );

    for ( auto pos = url.find( c, 0 ); pos != std::string::npos; )
    {
        auto next = url.find( c, ++pos );   // ++pos steps past the current delimiter
        auto len = ( next == std::string::npos ? url.size() : next ) - pos;
        v.push_back( url.substr( pos, len ) );
        pos = next;
    }

    return v;
}

int main()
{
    std::string url = "https://www.google.com/search?q=i+need+help"
                      "&rlz=1C1CHBF_enUS851US851"
                      "&oq=i+need+help"
                      "&aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7"
                      "&sourceid=chrome&ie=UTF-8";

    auto v = split_url( url );
    for ( const auto &s : v ) std::cout << s << '\n';
}
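If the name and the value of each option are needed separately, every string returned by the function above can be cut once more at its first '='. A minimal sketch, with split_option being a hypothetical name introduced only for this example:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: splits one option such as "oq=i+need+help" at the
// first '=' into {name, value}. Without any '=', the value is left empty.
std::pair<std::string, std::string> split_option(const std::string& option)
{
    const std::size_t eq = option.find('=');
    if (eq == std::string::npos)
        return {option, std::string()};
    // Only the first '=' separates name and value; later ones stay in the value.
    return {option.substr(0, eq), option.substr(eq + 1)};
}
```

It could be used as for ( const auto &s : v ) { auto kv = split_option( s ); ... } with kv.first holding the name and kv.second the value. The value is still URL-encoded (for example '+' for a space), and decoding it is outside the scope of this sketch.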