Find the start and end indices of the shortest substring of a string that matches a pattern
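Given two strings, text and pattern, find the start and end indices of the shortest substring of text that matches pattern. A substring matches when all the characters of pattern appear in it in the same order as in pattern, although other characters may appear between them.
If such a substring can be found in text, print its start and end indices; otherwise print -1 -1. If there are several shortest matching substrings, return the indices of the one with the smallest start index.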
Sample input:
axxxbcaxbcaxxbc abc
abcd x
axxxbaxbab ab
Sample output:
6 9
-1 -1
8 9
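To make the matching rule concrete, here is a minimal brute-force sketch that can serve as a reference when checking faster solutions. The names `is_subsequence` and `brute_force_shortest` are hypothetical, and treating an empty pattern as "no match" is an assumption, since the problem statement does not cover that case.

```python
def is_subsequence(pattern, window):
    """Return True if the characters of pattern appear in window in order."""
    it = iter(window)
    # `ch in it` consumes the iterator up to the first occurrence of ch,
    # so each later character is searched only after the previous match.
    return all(ch in it for ch in pattern)


def brute_force_shortest(text, pattern):
    """Try every window, shortest first and leftmost first."""
    if not pattern:  # assumption: an empty pattern counts as no match
        return (-1, -1)
    n = len(text)
    for length in range(len(pattern), n + 1):
        for start in range(n - length + 1):
            if is_subsequence(pattern, text[start:start + length]):
                return (start, start + length - 1)
    return (-1, -1)


if __name__ == "__main__":
    samples = [("axxxbcaxbcaxxbc", "abc"), ("abcd", "x"), ("axxxbaxbab", "ab")]
    for text, pattern in samples:
        print(*brute_force_shortest(text, pattern))
```

Because windows are tried in order of increasing length and, within one length, increasing start index, the first hit is already the shortest match with the smallest start. This is far too slow for long inputs and is only meant for checking results.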
Does anyone have a good algorithm for solving this problem without using the built-in regular expression support in C++ or Python?
python
def shortest_match(text, pattern):
    if not pattern:  # guard: an empty pattern has no first character to match
        return (-1, -1)
    stack = []  # to store matches
    for i in range(len(text) - len(pattern) + 1):
        # if we match the first character of pattern in
        # text then we start to search for the rest of it
        if pattern[0] == text[i]:
            j = 1  # pattern[0] already matched, let's check from 1 onwards
            k = i + 1  # text[i] == pattern[0], let's check from text[i+1] onwards
            # while pattern[j] could still match text[k]
            while j < len(pattern) and k < len(text):
                if pattern[j] == text[k]:
                    j += 1  # pattern[j] matched. Let's move to the next character
                k += 1
            if j == len(pattern):  # if the match was found add it to the stack
                stack.append((i, k - 1))
            else:  # otherwise break the loop (we won't find any other match)
                break
    if not stack:  # no match found
        return (-1, -1)
    lengths = [y - x for x, y in stack]  # list of match lengths
    return stack[lengths.index(min(lengths))]  # return the shortest (first one on ties)
C++
#include <cstring>
#include <iostream>
#include <vector>
using namespace std;
struct match_pair
{
    int start;
    int end;
    int length;
};
void
print_match (match_pair m)
{
    cout << "(" << m.start << ", " << m.end << ")";
}
match_pair
shortest_match (const char * text, const char * pattern)
{
    // compute the lengths once instead of calling strlen in every loop test
    const int n = static_cast<int>(strlen(text));
    const int m = static_cast<int>(strlen(pattern));
    vector <match_pair> stack; // to store matches
    for (int i = 0; i + m <= n; ++i)
    {
        // if we match the first character of pattern in
        // text then we start to search for the rest of it
        if (pattern[0] == text[i])
        {
            int j = 1; // pattern[0] already matched, let's check from 1 onwards
            int k = i + 1; // text[i] == pattern[0], let's check from text[i+1] onwards
            // while pattern[j] could still match text[k]
            while (j < m && k < n)
            {
                if (pattern[j] == text[k])
                {
                    ++j; // pattern[j] matched. Let's move to the next character
                }
                ++k;
            }
            if (j == m) // if the match was found add it to the stack
            {
                match_pair current_match;
                current_match.start = i;
                current_match.end = k - 1;
                current_match.length = current_match.end - current_match.start;
                stack.push_back(current_match);
            } else // otherwise break the loop (we won't find any other match)
            {
                break;
            }
        }
    }
    match_pair shortest;
    if (stack.empty()) // no match, return (-1, -1)
    {
        shortest.start = -1;
        shortest.end = -1;
        shortest.length = 0;
        return shortest;
    }
    // search for shortest match; the strict < keeps the earliest one on ties
    shortest = stack[0];
    for (size_t i = 1; i < stack.size(); ++i)
    {
        if (stack[i].length < shortest.length)
        {
            shortest = stack[i];
        }
    }
    return shortest;
}
// overload << for printing match_pair
std::ostream&
operator<< (std::ostream& os, const match_pair& m)
{
    return os << "(" << m.start << ", " << m.end << ")";
}
int
main ()
{
    char text[] = "axxxbcaxbcaxxbc";
    char pattern[] = "abc";
    cout << shortest_match(text, pattern) << endl;
    return 0;
}
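Loop over the characters of the text and find the first character of the pattern in it. Once you find it, search the rest of the text for the second character of the pattern, and repeat this for every character in the pattern, skipping the unneeded characters in the text. When that is done, start over from the next occurrence of the pattern's first character in the text.
It may be easier to visualize with the pattern abc: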
axxxbcaxbcaxxbc
[axxx|b|c] -> 6 chars
[ax|b|c] -> 4 chars
[axx|b|c] -> 5 chars
or
aababaccccccc
[aa|baba|c] -> 7 chars
[a|baba|c] -> 6 chars
[a|ba|c] -> 4 chars
[accccccc] -> -1 chars as the substring does not match the pattern
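The walk-through above can be sketched as a function that lists one candidate window per occurrence of the pattern's first character. This is only an illustration of the described scan; `candidate_windows` is a hypothetical name, and each entry is `(start, end, length)` with inclusive indices.

```python
def candidate_windows(text, pattern):
    """List (start, end, length) for the greedy match from each start."""
    windows = []
    if not pattern:  # assumption: an empty pattern yields no windows
        return windows
    for start, ch in enumerate(text):
        if ch != pattern[0]:
            continue
        j = 1          # next pattern character to find
        k = start + 1  # next text position to inspect
        while j < len(pattern) and k < len(text):
            if text[k] == pattern[j]:
                j += 1
            k += 1     # skip or consume text[k] either way
        if j < len(pattern):
            # The rest of the text cannot complete the pattern,
            # so no later start can succeed either.
            break
        windows.append((start, k - 1, k - start))
    return windows


if __name__ == "__main__":
    for window in candidate_windows("axxxbcaxbcaxxbc", "abc"):
        print(window)
```

For the first example this yields the three bracketed windows with lengths 6, 4 and 5. Picking the entry with the smallest length, and the earliest one on ties, gives the answer.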
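Edit: you should try implementing this algorithm starting from the end of the text, because that is where the substring you are looking for is located.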
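One possible reading of this edit, offered only as a sketch: take each occurrence of the pattern's last character as a candidate end, starting from the end of the text, and match the pattern backward from there. The backward greedy match finds the latest possible start for that end, which is the shortest window ending there. The name `shortest_match_backward` and the handling of an empty pattern are assumptions.

```python
def shortest_match_backward(text, pattern):
    """Scan candidate end positions from the end of text, matching backward."""
    if not pattern:  # assumption: an empty pattern counts as no match
        return (-1, -1)
    best = None
    last = len(pattern) - 1
    for end in range(len(text) - 1, -1, -1):
        if text[end] != pattern[last]:
            continue
        j = last - 1  # next pattern character to find, going backward
        k = end - 1   # next text position to inspect, going backward
        while j >= 0 and k >= 0:
            if text[k] == pattern[j]:
                j -= 1
            k -= 1
        if j >= 0:
            # The text before this end cannot complete the pattern,
            # so no earlier end can succeed either.
            break
        start = k + 1  # k moved one step past the matched first character
        # `<=` lets a later-visited (more leftward) window win ties,
        # which gives the smallest start index among the shortest windows.
        if best is None or end - start <= best[1] - best[0]:
            best = (start, end)
    return best if best is not None else (-1, -1)


if __name__ == "__main__":
    print(*shortest_match_backward("axxxbcaxbcaxxbc", "abc"))
```

Like the forward version, this still takes time proportional to the text length times the number of candidate ends in the worst case; scanning backward changes the direction of the search, not its cost.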