C++ histogram bin sorting

Keywords: sorting, bin, histogram, c++      Updated: 2023-10-16

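I am writing a function to clone the histogram feature of the Data Analysis add-in in Excel. Basically, sample data is supplied as input, along with the bin ranges. The bin ranges must be monotonically increasing, and in my case they need to be exactly [0 20 40 60 80 100]. Excel counts a sample as belonging to a bin if it is greater than the lower bound (left edge) and less than or equal to the upper bound (right edge).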

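I wrote the bin sorting algorithm below. It gives incorrect (but very close) output for data0, and correct output for data1 and data2. Here, correct means that the algorithm's output exactly matches the table Excel generates, in which the sample count is listed next to each bin. Any help is appreciated!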

#include <iostream>
int main(int argc, char **argv)
{
    const int SAMPLE_COUNT      = 21;
    const int BIN_COUNT         = 6;
    int binranges[BIN_COUNT]    = {0, 20, 40, 60, 80, 100};
    int bins[BIN_COUNT]         = {0, 0, 0, 0, 0, 0};
    int data0[SAMPLE_COUNT] =  {4,82,49,17,89,73,93,86,74,36,74,55,81,61,88,94,72,65,35,25,79};
    // for data0 Excel's bins read:
    // 0    0
    // 20   2
    // 40   3
    // 60   2
    // 80   7
    // 100  7
    //
    // instead the output of bins is: 2 0 3 2 7 7
    int data1[SAMPLE_COUNT] = {88,83,0,0,95,86,0,94,92,77,94,73,93,90,50,95,93,83,0,95,91};
    // for data1 Excel and this algorithm both yield:
    // 0    4
    // 20   0
    // 40   0
    // 60   1
    // 80   2
    // 100  14  (correct)
    int data2[SAMPLE_COUNT] = {58,48,75,68,85,78,74,83,83,75,67,58,75,58,84,68,57,88,55,79,72};
    // for data2 Excel and this algorithm both yield:
    // 0    0
    // 20   0
    // 40   0
    // 60   6
    // 80   10
    // 100  5   (correct)
    for (unsigned int binNum = 1; binNum < BIN_COUNT; ++binNum)
    {
        const int leftEdge = binranges[binNum - 1];
        const int rightEdge = binranges[binNum];
        for (unsigned int sampleNum = 0; sampleNum < SAMPLE_COUNT; ++sampleNum)
        {
            const int sample = data0[sampleNum];
            if (binNum == 1)
            {
                if (sample >= leftEdge && sample <= rightEdge)
                    bins[binNum - 1]++;
            }
            else if (sample > leftEdge && sample <= rightEdge)
            {
                bins[binNum]++;
            }
        }
    }
    for (int i = 0; i < BIN_COUNT; ++i)
        std::cout << bins[i] << " " << std::flush;
    std::cout << std::endl << std::endl;
    return 0;
}

Assuming the edges are always in increasing order, all you need is:

    unsigned int bin;
    for (unsigned int sampleNum = 0; sampleNum < SAMPLE_COUNT; ++sampleNum)
    {
        const int sample = data0[sampleNum];
        bin = BIN_COUNT;
        for (unsigned int binNum = 0; binNum < BIN_COUNT; ++binNum)
        {
            const int rightEdge = binranges[binNum];
            if (sample <= rightEdge)
            {
                bin = binNum;
                break;
            }
        }
        bins[bin]++;
    }

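However, for this code to work, bins needs one more element than binranges (BIN_COUNT + 1 in total). Values equal to or below the first edge (0) land in bins[0], so a sample above the last edge would otherwise index past the end of the array.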

The rationale is that if you have n separators, then you have n+1 intervals.
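
As a rough sketch of the n+1 rule, the same counting can be written as a small function that sizes the result from the edges. The name histogram and the use of std::vector and std::lower_bound are assumptions made for illustration, not part of the original answer; std::lower_bound finds the first edge that is not less than the sample, which is exactly the bin whose right edge contains it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts samples per bin using (left, right] intervals.
// Returns edges.size() + 1 counts:
//   index 0            -> samples <= edges[0]
//   index i (0 < i < n) -> samples with edges[i-1] < s <= edges[i]
//   index n            -> samples above the last edge
std::vector<int> histogram(const std::vector<int>& samples,
                           const std::vector<int>& edges)
{
    // The bin edges must be in increasing order for the search to be valid.
    assert(std::is_sorted(edges.begin(), edges.end()));

    std::vector<int> counts(edges.size() + 1, 0);
    for (int sample : samples)
    {
        // First edge that is >= sample; edges.end() if the sample exceeds them all.
        const auto it = std::lower_bound(edges.begin(), edges.end(), sample);
        ++counts[static_cast<std::size_t>(it - edges.begin())];
    }
    return counts;
}
```

With the question's data0 and edges {0, 20, 40, 60, 80, 100}, this returns 0 2 3 2 7 7 0: the six counts from the Excel table plus an empty slot for values above 100. The original loop prints 2 0 3 2 7 7 because its binNum == 1 branch credits the samples 4 and 17 to bins[0] rather than bins[1]. data1 and data2 hide the problem because neither contains a sample in the range (0, 20].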