Finding the cluster with the highest number of elements using Kmean

Keywords: largest cluster, Kmean. Updated: 2023-10-16

I use the kmeans function to group 8-D vectors into a set of clusters, as follows:

 kmeans(Vectors, clusterCount, labels, TermCriteria(CV_TERMCRIT_EPS+CV_TERMCRIT_ITER, 100, 2), 10, KMEANS_PP_CENTERS, centers);

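For me, the most successful cluster is the one that contains the most vectors. So my question is: how do I find the most populated cluster? The labels parameter indicates which cluster each vector belongs to, but I suspect that counting frequencies from it will take a while. Can anyone suggest an idea?

Conventionally, I do it like this: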

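int max = -1;
int index = -1;
// One counter per cluster; resize() value-initializes the ints to 0.
vector<int> classes;
classes.resize(clusterCount);
for (int i = 0; i < labels.rows; i++)
{
  int idx = labels.at<int>(i, 0);
  classes[idx]++;
  // Track the running maximum while counting.
  if (classes[idx] > max)
  {
    max = classes[idx];
    index = idx;
  }
}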

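Is there a faster solution than this?

I was looking for the same thing and have not found anything substantially different, but you can speed up your code:

Do not update your maximum on every iteration
Avoid intermediate variables (such as int idx)

Here is my code: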

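int max = -1, index = -1;
// One zero-initialized counter per cluster (a vector avoids the non-standard variable-length array).
vector<int> classes(clusterCount, 0);
// Assumes labels is a continuous single-column CV_32S matrix, so a raw pointer walk is valid.
const int * labels_ptr = labels.ptr<int>(0);
// First pass: count the members of each cluster.
for (int i = 0; i < labels.rows; ++i)
    classes[*labels_ptr++]++;
// Second pass: pick the largest count.
for (int i = 0; i < clusterCount; ++i)
    {
    if (classes[i] > max)
        {
        max = classes[i];
        index = i;
        }
    }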

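This code gives the same result as yours, and on my computer (Intel Core i7) it runs about 5 times faster than the code you provided (tested 1000 times on different images).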
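
For reference, the same count-then-scan idea can be sketched as a small standalone function using only the standard library. The name largestCluster and its signature are hypothetical (they are not part of OpenCV), and the sketch assumes the labels are available as a contiguous array of int with values in [0, clusterCount). When several clusters share the maximum count, this version returns the lowest cluster index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical helper (not an OpenCV function): returns {cluster index, member count}
// for the most populated cluster. Ties resolve to the lowest cluster index.
std::pair<int, int> largestCluster(const int* labels, std::size_t n, int clusterCount)
{
    assert(clusterCount > 0);
    std::vector<int> counts(clusterCount, 0);

    // First pass: histogram of labels, one counter per cluster.
    for (std::size_t i = 0; i < n; ++i) {
        assert(labels[i] >= 0 && labels[i] < clusterCount);
        ++counts[labels[i]];
    }

    // Second pass: std::max_element returns an iterator to the first largest counter.
    auto best = std::max_element(counts.begin(), counts.end());
    return { static_cast<int>(std::distance(counts.begin(), best)), *best };
}
```

With the labels matrix from the question, a call might look like largestCluster(labels.ptr<int>(0), labels.rows, clusterCount), assuming labels is a continuous single-column CV_32S matrix. Both passes are linear, so the total cost is proportional to the number of vectors plus the number of clusters.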