Recursive binary search for a string - C++
I am trying to implement the function findMatchesInDict, which checks whether a word matches any word in a pre-sorted dictionary. Here is my current implementation:
void findMatchesInDict(string word, int start, const string dict[], int end, string results[], int& totalResults)
{
    // initial start = 0 index
    // initial end = last index of dict array
    int middle = start + (end - start) / 2;
    if (end < start)
        return;
    if (word == dict[middle]) // if we found a match
        storeUniqueMatches(word, 0, results, totalResults);
    else if (word < dict[middle])
        findMatchesInDict(word, start, dict, middle - 1, results, totalResults);
    else
        findMatchesInDict(word, middle + 1, dict, end, results, totalResults);
}
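The storeUniqueMatches function works correctly (it just stores the matched word in the results array and makes sure no duplicates are stored).
The function only matches some of the words in the dictionary and misses others.
Any ideas why this might not be working?
For reference, the following implementation works, but it is extremely inefficient and causes a stack overflow error.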
void findMatchesInDict(string word, int start, const string dict[], int end, string results[], int& totalResults)
{
    if (start > end)
        return;
    if (word == dict[start]) // if we found a match
        storeUniqueMatches(word, 0, results, totalResults);
    // linear scan: continue with the next entry, keeping the same last index
    findMatchesInDict(word, start + 1, dict, end, results, totalResults);
}
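A plausible reason for the stack overflow: the linear version makes one nested call per dictionary entry (and copies `word` by value in every frame), whereas a binary search needs only about log2(n) nested calls. The following is a rough sketch of that difference, not a measurement. `linearCalls`, `binaryCalls` and `checkCallCounts` are hypothetical helpers, and the dictionary sizes are assumptions, since the post does not state the real size.

```cpp
#include <cassert>
#include <cstddef>

// Nested calls made by the linear version for n entries:
// one call per entry plus the terminating call with start > end.
std::size_t linearCalls(std::size_t n)
{
    return n + 1;
}

// Worst-case nested calls for the binary version on n entries: each call
// discards the middle entry and keeps at most floor(n / 2) entries.
std::size_t binaryCalls(std::size_t n)
{
    std::size_t calls = 1;      // the terminating call on an empty range
    while (n > 0) {
        n /= 2;                 // size of the larger remaining half
        ++calls;
    }
    return calls;
}

// Sanity check for a 13-entry dictionary (size chosen for illustration only).
void checkCallCounts()
{
    assert(linearCalls(13) == 14);
    assert(binaryCalls(13) == 5);
}
```

For an assumed dictionary of 100000 words this gives 100001 nested calls for the linear version against 18 for the binary one, which is why only the former is likely to exhaust the stack.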
I still believe the OP made an off-by-one error.
I strongly suspect that
    findMatchesInDict(word, start, dict, middle - 1, results, totalResults);
should be
    findMatchesInDict(word, start, dict, middle, results, totalResults);
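Whether `middle - 1` or `middle` is the right upper bound for the left half depends on what `end` means. If `end` is the last valid index (inclusive, which is how the comments in the question describe it), the left half is `[start, middle - 1]`. If `end` is one past the last element (exclusive), the left half is `[start, middle)`. Mixing the two conventions is a typical source of off-by-one errors. A minimal sketch of both conventions follows; `searchInclusive`, `searchExclusive` and `checkConventions` are hypothetical names that do not appear in the original post.

```cpp
#include <cassert>
#include <string>

// Inclusive convention: 'end' is the index of the last candidate.
// Returns the index of 'word', or -1 if it is absent.
int searchInclusive(const std::string& word, const std::string dict[], int start, int end)
{
    if (end < start) return -1;                           // empty range
    const int middle = start + (end - start) / 2;
    if (word == dict[middle]) return middle;
    return word < dict[middle]
        ? searchInclusive(word, dict, start, middle - 1)  // middle was already checked
        : searchInclusive(word, dict, middle + 1, end);
}

// Exclusive (half-open) convention: 'end' is one past the last candidate.
int searchExclusive(const std::string& word, const std::string dict[], int start, int end)
{
    if (start >= end) return -1;                          // empty range
    const int middle = start + (end - start) / 2;
    if (word == dict[middle]) return middle;
    return word < dict[middle]
        ? searchExclusive(word, dict, start, middle)      // 'middle' itself is excluded
        : searchExclusive(word, dict, middle + 1, end);
}

// For a sorted, duplicate-free array both conventions must find every entry
// at its own index when they are called with matching bounds.
void checkConventions(const std::string dict[], int n)
{
    for (int i = 0; i < n; ++i) {
        assert(searchInclusive(dict[i], dict, 0, n - 1) == i);
        assert(searchExclusive(dict[i], dict, 0, n) == i);
    }
}
```

Note the different initial calls: `n - 1` as the upper bound for the inclusive version, `n` for the exclusive one.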
I made a small sample of my own. (In doing so, I redesigned the code a bit, as I was not happy with the OP's version.)
#include <iostream>
#include <string>

// Searches 'word' in dict[i0, i0 + size); returns its index or (size_t)-1.
size_t find(const std::string &word, const std::string dict[], size_t i0, size_t size)
{
    if (!size) return (size_t)-1; // bail out with invalid index
    const size_t i = i0 + size / 2;
    return word == dict[i]
        ? i
        : word < dict[i]
            ? find(word, dict, i0, i - i0)
            : find(word, dict, i + 1, i0 + size - (i + 1));
}

int main()
{
    const std::string dict[] = {
        "Ada", "BASIC", "C", "C++",
        "D", "Haskell", "INTERCAL", "Modula2",
        "Oberon", "Pascal", "Scala", "Scratch",
        "Vala"
    };
    const size_t sizeDict = sizeof dict / sizeof *dict;
    unsigned nErrors = 0;
    // brute-force tests: every entry that is in the dictionary must be found
    for (size_t n = 1; n <= sizeDict; ++n) {
        for (size_t i = 0; i < n; ++i) {
            if (find(dict[i], dict, 0, n) >= n) {
                std::cerr << "ALERT! Unable to find entry " << i << " in " << n << " entries!\n";
                ++nErrors;
            }
        }
    }
    // brute-force tests: words that are not in the dictionary must not be found
    for (size_t n = 1; n <= sizeDict; ++n) {
        if (find("", dict, 0, n) < n) {
            std::cerr << "ALERT! Able to find entry '' in " << n << " entries!\n";
            ++nErrors;
        }
        for (size_t i = 0; i < n; ++i) {
            if (find(dict[i] + " + Assembler", dict, 0, n) < n) {
                std::cerr << "ALERT! Able to find entry '" << dict[i] << " + Assembler' in " << n << " entries!\n";
                ++nErrors;
            }
        }
    }
    // report
    if (!nErrors) std::cout << "All tests passed OK.\n";
    else std::cerr << nErrors << " tests failed!\n";
    // done
    return nErrors > 0;
}
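Live demo on coliru
Most of this code is brute-force test code:
- Every length of dict from 1 to its full size is tested. For each length, every entry of dict is searched.
- Every length of dict from 1 to its full size is tested. For each length, the empty string (which sorts before any other entry) and a modified version of every entry are searched. (The modification guarantees that the word sorts between the unmodified entry and its successor, or after the last entry.)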
Output:
All tests passed OK.
So far, so good.
Then I replaced
    find(word, dict, i0, i - i0)
with
    find(word, dict, i0, i - i0 > 0 ? i - i0 - 1 : 0)
which resembles (in my opinion) the bug in the OP's code.
Output:
ALERT! Unable to find entry 0 in 2 entries!
ALERT! Unable to find entry 0 in 3 entries!
ALERT! Unable to find entry 1 in 4 entries!
ALERT! Unable to find entry 1 in 5 entries!
ALERT! Unable to find entry 3 in 5 entries!
ALERT! Unable to find entry 0 in 6 entries!
ALERT! Unable to find entry 2 in 6 entries!
ALERT! Unable to find entry 4 in 6 entries!
ALERT! Unable to find entry 0 in 7 entries!
ALERT! Unable to find entry 2 in 7 entries!
ALERT! Unable to find entry 4 in 7 entries!
ALERT! Unable to find entry 0 in 8 entries!
ALERT! Unable to find entry 3 in 8 entries!
ALERT! Unable to find entry 5 in 8 entries!
ALERT! Unable to find entry 0 in 9 entries!
ALERT! Unable to find entry 3 in 9 entries!
ALERT! Unable to find entry 6 in 9 entries!
ALERT! Unable to find entry 1 in 10 entries!
ALERT! Unable to find entry 4 in 10 entries!
ALERT! Unable to find entry 7 in 10 entries!
ALERT! Unable to find entry 1 in 11 entries!
ALERT! Unable to find entry 4 in 11 entries!
ALERT! Unable to find entry 7 in 11 entries!
ALERT! Unable to find entry 9 in 11 entries!
ALERT! Unable to find entry 1 in 12 entries!
ALERT! Unable to find entry 3 in 12 entries!
ALERT! Unable to find entry 5 in 12 entries!
ALERT! Unable to find entry 8 in 12 entries!
ALERT! Unable to find entry 10 in 12 entries!
ALERT! Unable to find entry 1 in 13 entries!
ALERT! Unable to find entry 3 in 13 entries!
ALERT! Unable to find entry 5 in 13 entries!
ALERT! Unable to find entry 7 in 13 entries!
ALERT! Unable to find entry 9 in 13 entries!
ALERT! Unable to find entry 11 in 13 entries!
35 tests failed!
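Hmm. Actually, this does not prove anything about the OP's code.
However, it shows
- how an "off by 1" can essentially break a binary search
- how brute-force tests can be designed to discover such bugs.
So this will hopefully help the OP to find the bug in his algorithm himself (which is actually more valuable for him).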
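The same brute-force idea can be applied to any candidate search function by comparing it against `std::binary_search` from `<algorithm>` as an oracle. The sketch below assumes a sorted, duplicate-free array. `Search`, `countMismatches`, `searchRange`, `goodSearch`, `brokenSearch` and `runChecks` are hypothetical names introduced for illustration, and the `shrink` parameter exists only to inject an off-by-one on purpose.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A candidate returns true if 'word' is among the first n entries of dict.
typedef bool (*Search)(const std::string& word, const std::string dict[], std::size_t n);

// Counts disagreements between 'candidate' and std::binary_search for every
// prefix length of 'dict', probing "", each entry and each entry with a suffix.
unsigned countMismatches(Search candidate, const std::string dict[], std::size_t size)
{
    std::vector<std::string> probes(1, std::string());
    for (std::size_t i = 0; i < size; ++i) {
        probes.push_back(dict[i]);
        probes.push_back(dict[i] + " + Assembler");
    }
    unsigned mismatches = 0;
    for (std::size_t n = 0; n <= size; ++n) {             // every prefix length
        for (std::size_t p = 0; p < probes.size(); ++p) {
            const bool expected = std::binary_search(dict, dict + n, probes[p]);
            if (candidate(probes[p], dict, n) != expected) ++mismatches;
        }
    }
    return mismatches;
}

// Recursive search over the half-open range [start, end). 'shrink' is taken
// off the upper bound of the left half: 0 is correct, 1 is an off-by-one.
bool searchRange(const std::string& word, const std::string dict[],
                 std::size_t start, std::size_t end, std::size_t shrink)
{
    if (start >= end) return false;
    const std::size_t middle = start + (end - start) / 2;
    if (word == dict[middle]) return true;
    if (word < dict[middle]) {
        const std::size_t leftEnd = middle - start > shrink ? middle - shrink : start;
        return searchRange(word, dict, start, leftEnd, shrink);
    }
    return searchRange(word, dict, middle + 1, end, shrink);
}

bool goodSearch(const std::string& w, const std::string d[], std::size_t n)
{
    return searchRange(w, d, 0, n, 0);
}

bool brokenSearch(const std::string& w, const std::string d[], std::size_t n)
{
    return searchRange(w, d, 0, n, 1);
}

// Example use on a small sorted sample.
void runChecks()
{
    const std::string sample[] = { "Ada", "BASIC", "C", "C++", "D" };
    assert(countMismatches(goodSearch, sample, 5) == 0);   // agrees with the oracle
    assert(countMismatches(brokenSearch, sample, 5) > 0);  // the off-by-one is detected
}
```

Using the standard library as the oracle avoids having to reason by hand about which probes should be found for which prefix length.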