Printing all substrings between/after occurrence of a character in C++

Keywords: after, string, between, C++, character, print      Updated: 2023-10-16

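I am trying to grab all the substrings between or after occurrences of a certain character, specifically in search query URLs (grabbing the options). For example, if I have: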

std::string url = "https://www.google.com/search?q=i+need+help&rlz=1C1CHBF_enUS851US851&oq=i+need+help&aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7&sourceid=chrome&ie=UTF-8";

I need to output the strings between the "&" characters and after the last one, so the output would be:

rlz=1C1CHBF_enUS851US851 
oq=i+need+help
aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7
sourceid=chrome 
ie=UTF-8

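I understand how to do this for a single substring, but I am stuck trying to implement it in a loop. It has to work for multiple URLs of different lengths and with different numbers of options.

So far I can only grab the one substring between the first and second occurrence of the character, but I need to grab all of them in any given URL.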

int a = url.find("&") + 1;
int b = url.find("&", url.find("&") + 1);
int c = (b - a);
std::string option = url.substr(a, c);

Just find the next & after the previous one in a loop, exit the loop when no more & can be found, and handle the first element separately:

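#include <string>
#include <vector>
using namespace std;

vector<string> foo(const string& url)
{
    vector<string> result;
    auto a = url.find("?");
    if (a == string::npos) return result;  // no query string at all
    auto b = url.find("&", a + 1);         // only look for '&' after the '?'
    if (b == string::npos)
    {
        // a single option: everything after '?'
        result.push_back(url.substr(a + 1, string::npos));
        return result;
    }
    // first element: between '?' and the first '&'
    result.push_back(url.substr(a + 1, b - a - 1));
    do
    {
        a = b;
        b = url.find("&", a + 1);
        // when b is npos the count exceeds the remaining length,
        // so substr returns everything up to the end of the string
        result.push_back(url.substr(a + 1, b - a - 1));
    } while (b != string::npos);
    return result;
}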

Works for your example: https://ideone.com/SiRZQB
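As an alternative to tracking positions by hand, the same split can probably be written with std::getline and '&' as the delimiter. This is only a minimal sketch, not the method used in the answer above: the name split_query is hypothetical, and it assumes that everything after the first '?' is the query string, with no '#fragment' part and no percent-decoding needed.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the '&'-separated pieces after the first '?'.
// Assumes the URL has no '#fragment' part and needs no percent-decoding.
std::vector<std::string> split_query(const std::string& url)
{
    std::vector<std::string> parts;
    const auto q = url.find('?');
    if (q == std::string::npos) return parts;  // no query string at all

    std::istringstream query(url.substr(q + 1));
    std::string piece;
    // std::getline with a custom delimiter reads up to the next '&' (or the end).
    while (std::getline(query, piece, '&'))
        parts.push_back(piece);
    return parts;
}
```

Like foo above, this keeps the first option (the one right after '?'); drop the first element of the result if only the pieces after an '&' are wanted.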

Tbh, I really think you should use a proper URI parser for this job, since there can be a lot of edge cases. But here you go:

#include <iostream>
#include <string>

int main()
{
    std::string url = "https://www.google.com/search?q=i+need+help&rlz=1C1CHBF_enUS851US851&oq=i+need+help&aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7&sourceid=chrome&ie=UTF-8";
    char delimiter = '&';
    size_t start = url.find(delimiter);
    size_t end;
    while (start != std::string::npos) {
        end = url.find(delimiter, start + 1);
        std::cout << url.substr(start + 1, end - start - 1) << std::endl;
        start = end;
    }
}

Playground: http://cpp.sh/8pshy7
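To illustrate a few of the edge cases mentioned above, here is a rough sketch that stops at a trailing '#fragment', skips empty pieces such as "&&", and accepts keys without '='. The name parse_query and the (key, value) return type are assumptions made for illustration. It is not a substitute for a real URI parser: it does no percent-decoding and assumes the first '?' in the string starts the query.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not a full URI parser: splits the query string into
// (key, value) pairs. No percent-decoding or '+' handling is attempted.
std::vector<std::pair<std::string, std::string>> parse_query(const std::string& url)
{
    std::vector<std::pair<std::string, std::string>> params;

    auto begin = url.find('?');
    if (begin == std::string::npos) return params;
    ++begin;                                      // skip the '?'

    // The query ends at the fragment marker '#', or at the end of the string.
    auto end = url.find('#', begin);
    if (end == std::string::npos) end = url.size();

    while (begin < end) {
        auto amp = url.find('&', begin);
        if (amp == std::string::npos || amp > end) amp = end;

        const std::string item = url.substr(begin, amp - begin);
        if (!item.empty()) {                      // skip empty pieces such as "&&"
            const auto eq = item.find('=');
            if (eq == std::string::npos)
                params.emplace_back(item, "");    // a key with no '='
            else
                params.emplace_back(item.substr(0, eq), item.substr(eq + 1));
        }
        begin = amp + 1;
    }
    return params;
}
```

Returning pairs rather than raw "key=value" strings is a design choice of this sketch; callers that only need the raw pieces can join first and second back together with '='.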

You can try the following code, which uses a regular expression to parse the URL.

#include <regex>
#include <iostream>
#include <string>
using namespace std;

int main(){
    string url = "https://www.google.com/search?q=i+need+help&rlz=1C1CHBF_enUS851US851&oq=i+need+help&aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7&sourceid=chrome&ie=UTF-8";
    regex rg("[?&](([^&]+)=([^&]+))");
    for(smatch sm; regex_search(url, sm, rg); url=sm.suffix())
        cout << sm[1] <<endl;
    return 0;
}
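Note that the loop above overwrites url with the remaining suffix on every iteration, so the original string is gone afterwards. A possible variant that leaves the input untouched is to walk the matches with std::sregex_iterator. This is a sketch only; the function name regex_params is hypothetical, and the pattern is the one from the answer above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every "key=value" piece that follows '?' or '&'.
std::vector<std::string> regex_params(const std::string& url)
{
    // Same pattern as in the answer above; group 1 is the whole "key=value" text.
    static const std::regex rg("[?&](([^&]+)=([^&]+))");

    std::vector<std::string> result;
    // std::sregex_iterator visits successive non-overlapping matches in url.
    for (std::sregex_iterator it(url.begin(), url.end(), rg), last; it != last; ++it)
        result.push_back((*it)[1].str());
    return result;
}
```

Because the pattern also accepts '?', the first option (q=...) is included in the result, just as in the loop above.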

You can just use an ordinary for loop, for example

#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>

int main() 
{
    std::string url = "https://www.google.com/search?q=i+need+help"
                      "&rlz=1C1CHBF_enUS851US851"
                      "&oq=i+need+help"
                      "&aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7"
                      "&sourceid=chrome&ie=UTF-8";
    char c = '&';

    // the number of delimiters equals the number of substrings to store
    size_t n = std::count_if( std::begin( url ), std::end( url ),
                              [=]( const auto &item )
                              {
                                  return item == c;
                              } );

    std::vector<std::string> v;
    v.reserve( n );

    for ( auto pos = url.find( c, 0 );  pos != std::string::npos; )
    {
        auto next = url.find( c, ++pos );
        auto len = ( next == std::string::npos ? url.size() : next ) - pos;
        v.push_back( url.substr( pos, len ) ); 
        pos = next;                    
    }

    for ( const auto &s : v ) std::cout << s << '\n';
}

The program output is

rlz=1C1CHBF_enUS851US851
oq=i+need+help
aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7
sourceid=chrome
ie=UTF-8

Or you can write a separate function, for example

#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>

std::vector<std::string> split_url( const std::string &url, char c = '&' )
{
    // the number of delimiters equals the number of substrings to store
    size_t n = std::count_if( std::begin( url ), std::end( url ),
                              [=]( const auto &item )
                              {
                                  return item == c;
                              } );

    std::vector<std::string> v;
    v.reserve( n );

    for ( auto pos = url.find( c, 0 );  pos != std::string::npos; )
    {
        auto next = url.find( c, ++pos );
        auto len = ( next == std::string::npos ? url.size() : next ) - pos;
        v.push_back( url.substr( pos, len ) ); 
        pos = next;                    
    }

    return v;
}

int main() 
{
    std::string url = "https://www.google.com/search?q=i+need+help"
                      "&rlz=1C1CHBF_enUS851US851"
                      "&oq=i+need+help"
                      "&aqs=chrome.0.69i59j0l3j69i60l2.4646j0j7"
                      "&sourceid=chrome&ie=UTF-8";

    auto v = split_url(url );
    for ( const auto &s : v ) std::cout << s << '\n';
}