PCRE cannot support multiple subgroups

Updated: 2023-10-16

This is about multiple subgroups in PCRE. The subject is:

const char* subject = "http://mail.google.com:443";

I want to extract the protocol, domain, and port. My regex looks like this, but pcre_exec returns 0:

const char* regex_str = "([^/]+)//([^:]+):(\\d+)";

But when I change it like this, pcre_exec returns 2:

const char* regex_str = "[^/]+//([^:]+):\\d+";

What is wrong?

#include <stdio.h>
#include <string.h>
#include <pcre.h>
#define VECTORSIZE 6
int main()
{
    const char* subject = "http://mail.google.com:443";
    const char* regex_str = "([^/]+)//([^:]+):(\\d+)";
    const char* error = NULL;
    int erroffset = 0;
    int ovector[VECTORSIZE];
    char match[50];
    int matchlen = 0;
    pcre* regex = pcre_compile(regex_str, PCRE_CASELESS, &error, &erroffset, NULL);
    if(regex == NULL)
    {
        printf("error=%s,offset=%d\n", error, erroffset);
        return -1;
    }
    int matches = pcre_exec(regex, NULL, subject, strlen(subject), 0, 0, ovector, VECTORSIZE);
    printf("matches=%d\n", matches);
    if(matches == -1)
    {
        printf("no matches\n");
        return -1;
    }
    for(int i=0; i<matches; i++)
    {
        memset(match, 0, sizeof(match));
        matchlen = ovector[2*i + 1] - ovector[2*i];
        printf("start=%d, length=%d\n", ovector[2*i], matchlen);
        memcpy(match, subject + ovector[2*i], matchlen);
        printf("match=%s\n", match);
    }
    pcre_free(regex);
    return 0;
}

This line:

#define VECTORSIZE 6

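Change it to around 30 (ints are cheap) and it will work. pcre_exec needs 3 elements for each subgroup, plus 3 more for the whole match, so in your example it needs to be at least 12. When the vector is too small to hold all the captured offsets, pcre_exec fills in what fits and returns 0, which is what you saw with three subgroups. With a single subgroup, the 6 elements are just enough, hence the return value of 2.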
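
To make the arithmetic concrete, here is a minimal sketch of the sizing rule. It does not call PCRE at all. The three helper names are made up for illustration, and `expected_return` assumes the match succeeds and the last subgroup takes part in it.

```c
/* Hypothetical helper: number of ints pcre_exec's ovector needs for a
   pattern with `groups` capturing subgroups (3 per subgroup, plus 3 for
   the whole match). */
int ovector_size_for(int groups)
{
    return 3 * (groups + 1);
}

/* Hypothetical helper: how many (start, end) offset pairs an ovector of
   `size` ints can report. Only the first two-thirds of the vector hold
   offsets, two ints per pair, so that is size / 3 pairs. */
int pairs_that_fit(int size)
{
    return size / 3;
}

/* Hypothetical helper: the value to expect back from a successful
   pcre_exec call. One pair is needed for the whole match plus one per
   subgroup; if they do not all fit, pcre_exec returns 0. */
int expected_return(int groups, int size)
{
    int pairs_needed = groups + 1;
    return pairs_needed <= pairs_that_fit(size) ? pairs_needed : 0;
}
```

With VECTORSIZE 6, `expected_return(3, 6)` gives 0 and `expected_return(1, 6)` gives 2, which matches the two results in the question. `ovector_size_for(3)` gives the minimum of 12.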