Recursive merge sort, segmentation fault for big arrays

Keywords: segmentation fault, arrays, merge sort, recursion. Updated: 2023-10-16

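I have implemented merge sort recursively. It works for arrays up to a certain size and then crashes with a "segmentation fault". On an Intel Xeon with 16 GB of RAM, the largest float array that sorts successfully has 17352 elements; the limit is higher for int arrays and lower for double arrays. On an AMD A10 with 16 GB, the limit for float is 2068. There is obviously a memory problem. Other (non-recursive) sorting algorithms I have written for arrays handle ~2e6 elements without trouble. The compiler is GCC 4.4.7. How can I improve this merge sort so that it works for larger arrays?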

#include <iostream>
#include <stdlib.h>
#include <cmath>
#include <vector>
using namespace std;
// --------------------------------------------------------
// merge 2 subarrays of 1 array around its middle im
template <class C>
void merge(C* arr, int ilow, int imid, int ihigh)
{
vector<C> temp; // array seg faults earlier than vector
for(int i=ilow; i<=ihigh; i++) temp.push_back(arr[i]); // copy array
int i1=ilow, i2=imid+1, ai=ilow; // starting positions
while(i1<=imid && i2<=ihigh) // compare 1st and 2nd halves
{
    if(temp[i1]<=arr[i2])
    {
        arr[ai] = temp[i1];
        i1++; // leave smaller val behind
    }
    else
    {
        arr[ai] = temp[i2];
        i2++; // leave smaller val behind
    }
    ai++; // move forward
}
if(i2>ihigh) while(i1<=imid) // if 2nd is done, copy the rest from 1st
{
    arr[ai] = temp[i1];
    i1++;
    ai++;
}
if(i1>imid) while(i2<=ihigh) // if 1st is done, copy the rest from 2nd
{
    arr[ai] = temp[i2];
    i2++;
    ai++;
}
} // merge()

// --------------------------------------------------------
// merge sort algorithm for arrays
template <class C>
void sort_merge(C* arr, int ilow, int ihigh)
{
if(ilow < ihigh)
{
    int imid = (ilow+ihigh)/2; // get middle point
    sort_merge(arr, ilow,   imid); // do 1st half
    sort_merge(arr, imid+1, ihigh); // do 2nd half
    merge(arr, ilow, imid, ihigh); // merge 1st and 2nd halves
}
return;
} // sort_merge()

///////////////////////////////////////////////////////////
int main(int argc, char *argv[])
{
// crashes at 17353 on Intel Xeon, and at 2069 on AMD A10, both 16Gb of ram
const int N=17352+0;
float arr[N]; // with arr[double] crashes sooner, with arr[int] crashes later
// fill array
for(long int i=0; i<N; i++)
{
    //arr[i] = rand()*1.0/RAND_MAX; // random
    arr[i] = sin(i*10)+cos(i*10); // partially sorted
    //arr[i] = i; // sorted
    //arr[i] = -i; // reversed
}
sort_merge(arr, 0, N-1);
return 0;
}

Consider how you copy the array:

vector<C> temp; // array seg faults earlier than vector
for(int i=ilow; i<=ihigh; i++) temp.push_back(arr[i]); // copy array

After this, temp contains ihigh - ilow + 1 values, accessible as temp[0] through temp[ihigh - ilow]. In other words, every value in temp is offset by -ilow relative to its position in arr.

However, the rest of the code accesses temp using indices of the source array, for example:

if(temp[i1]<=arr[i2]) // i1 isn't a valid index into temp, should be (i1 - ilow)

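Whenever ilow > 0, these indices run past the end of temp, hence the crash. With the proper offset applied to every access into temp, your code seems to work fine.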
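The offset fix described above can be sketched as follows. This is a minimal illustration rather than the only way to write it: the names merge_fixed and sort_merge_fixed are hypothetical (chosen to avoid clashing with the question's functions), the range constructor replaces the push_back loop, the comparison reads both operands from temp, and the assert is an added precondition check.

```cpp
#include <cassert>
#include <vector>

// Merge the sorted halves arr[ilow..imid] and arr[imid+1..ihigh] in place.
// (merge_fixed is a hypothetical name for this sketch.)
template <class C>
void merge_fixed(C* arr, int ilow, int imid, int ihigh)
{
    assert(ilow <= imid && imid <= ihigh); // precondition on the index range

    // Copy arr[ilow..ihigh]; temp[k] holds the old value of arr[ilow + k].
    std::vector<C> temp(arr + ilow, arr + ihigh + 1);

    int i1 = ilow, i2 = imid + 1, ai = ilow; // positions expressed in arr's indices
    while (i1 <= imid && i2 <= ihigh)
    {
        // Subtract ilow on every access into temp.
        if (temp[i1 - ilow] <= temp[i2 - ilow])
        {
            arr[ai] = temp[i1 - ilow];
            i1++;
        }
        else
        {
            arr[ai] = temp[i2 - ilow];
            i2++;
        }
        ai++;
    }
    // At most one of these two loops runs: copy whatever is left over.
    while (i1 <= imid)  { arr[ai] = temp[i1 - ilow]; i1++; ai++; }
    while (i2 <= ihigh) { arr[ai] = temp[i2 - ilow]; i2++; ai++; }
}

// Recursive driver, same shape as the question's sort_merge.
template <class C>
void sort_merge_fixed(C* arr, int ilow, int ihigh)
{
    if (ilow < ihigh)
    {
        int imid = ilow + (ihigh - ilow) / 2; // midpoint without overflowing ilow+ihigh
        sort_merge_fixed(arr, ilow, imid);
        sort_merge_fixed(arr, imid + 1, ihigh);
        merge_fixed(arr, ilow, imid, ihigh);
    }
}
```

Keeping i1 and i2 in arr's index space and subtracting ilow only at the point of access keeps the change close to the original code; an equivalent alternative would be to run both counters from 0 within temp.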

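If N is large enough, the following on its own is sufficient to cause a segmentation fault, because of the infamous stack overflow. If declaring the array alone does not crash, filling it should.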

int main(int argc, char *argv[])
{
    // crashes at 17353 on Intel Xeon, and at 2069 on AMD A10, both 16Gb of ram
    const int N=17352+0;
    float arr[N];
}

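The reason is that local variables tend to be allocated on the stack, but the stack is limited in size and is not designed for large allocations. If you change the declaration to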

float *arr = new float[N]; // probably should be unique_ptr instead....

or

std::vector<float> arr(N);

you won't have any problems, because both approaches allocate the memory on the heap.
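As a rough sketch of the heap-based variant, the helper below builds the question's "partially sorted" test data inside a std::vector. The function name make_test_data is hypothetical and introduced only for this illustration; the fill formula is the one from the question's main().

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Build the question's "partially sorted" input in heap storage.
// (make_test_data is a hypothetical helper name for this sketch.)
std::vector<float> make_test_data(std::size_t n)
{
    std::vector<float> arr(n); // the elements live on the heap, not on the stack
    for (std::size_t i = 0; i < n; ++i)
    {
        // Same formula as in the question, computed in double and stored as float.
        arr[i] = static_cast<float>(std::sin(i * 10.0) + std::cos(i * 10.0));
    }
    return arr;
}
```

A non-empty vector built this way can be handed to the question's pointer-based interface as sort_merge(&arr[0], 0, static_cast<int>(arr.size()) - 1). Taking &arr[0] instead of arr.data() is a deliberate choice here so that the call also compiles with pre-C++11 compilers such as the GCC 4.4.7 mentioned in the question.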