Finding the closest number that factors given a list of primes

Last updated: 2023-10-16

Say I have a number; I can find all the prime factors that make it up. For instance, 6000 is 2^4 * 3 * 5^3.
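As a small illustration of that factoring step, here is a minimal sketch that divides a number by each allowed prime as often as possible. The helper name factorOver and its remainder out-parameter are assumptions made for this example, not part of the original code.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical helper: strips every allowed prime out of n.
// Returns the exponent of each prime that divides n; `remainder` receives
// whatever is left over (1 means n factors completely over the list).
std::map<std::uint32_t, int> factorOver(std::uint32_t n,
                                        const std::vector<std::uint32_t>& primes,
                                        std::uint32_t& remainder)
{
    std::map<std::uint32_t, int> exponents;
    for (std::uint32_t p : primes)
    {
        // n > 1 also guards against an endless loop when n is 0
        while (n > 1 && n % p == 0)
        {
            n /= p;
            ++exponents[p];
        }
    }
    remainder = n;
    return exponents;
}
```

With the primes 2, 3, 5, 7, calling factorOver on 6000 yields the exponents 4, 1 and 3 for 2, 3 and 5 with a remainder of 1, while 5917 comes back untouched as the remainder because none of the listed primes divides it.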

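If I have a number that doesn't factor well (given a list of acceptable primes), how can I find the next closest number that does? For example, given the number 5917, what is the closest number that factors using only the primes 2, 3, 5, 7? In this example it is 6000.

I have something that finds the answer by brute force, but there has to be a more elegant solution.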

const UInt32 num = 5917;
const CVector<UInt32> primes = { 2, 3, 5, 7 };
const size_t size = primes.size();
UInt32 x = num;
while (x < num * 2)
{
    const UInt32 y = x;
    for(size_t i = 0; i < size && x > 1; ++i)
    {
        while(x % primes[i] == 0)
        {
            x /= primes[i];
        }
    }
    if (x == 1)
    {
        cout << "Found " << y << endl;
        break;
    }
    else
    {
        x = y + 1;
    }
}

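Edit

I created a test using the brute force method and the 3 methods provided as answers, and got somewhat surprising results. All 4 versions produce correct answers (so thanks for your contributions); however, the brute force method appears to be the fastest, by an order of magnitude. I tried a few different systems, compilers, and architectures, and they all yielded mostly consistent results.

The test code can be found here: http://ideone.com/HAgDsF. Please feel free to make suggestions.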

I suggest the following solution. I assume that the primes are ordered from lowest to greatest. I also used the convenient vector and int types.

vector<int> primes = { 2, 3, 5, 7 };
int num = 5917;
// initialize bestCandidate as a power of some prime greater than num
int bestCandidate = 1;
while (bestCandidate < num) bestCandidate *= primes[0];
set<int> s;
s.insert(1);
while (s.size()) {
    int current = *s.begin();
    s.erase(s.begin());
    for (auto p : primes) { // generate new candidates
        int newCandidate = current * p;
        if (newCandidate < num) {
            // new lower candidates should be stored.
            if (s.find(newCandidate) == s.end())
                s.insert(newCandidate);
        }
        else {
            if (newCandidate < bestCandidate) bestCandidate = newCandidate;
            break; // further iterations will generate only larger numbers
        }
    }
}
cout << bestCandidate;

Demo

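Next, I would like to analyse the proposed solutions. Let np be the number of primes, n the number to find the closest result to, and minP the smallest prime in the list.

My solution generates all possible values that are lower than n. New values are generated from old ones. Each value is used only once as a generation source. If a new value is at least n, it is considered a valid candidate. Even if the list contained all the primes below n, the algorithm would still perform well. I don't know the exact time complexity formula for the algorithm, but it is the number of valid values below n multiplied by the logarithm of that count. The log comes from the set data structure operations. If n is small enough to allocate an array of size n that flags which values have already been generated, we can get rid of the log factor, and a simple list can hold the generation source values instead of a set.
Your initial solution is O(n(np + log_minP(n))). You check every number for validity, taking them one by one from n to 2n and paying np + log_minP(n) for each check.
The recursive solution by @anatolyg has a big flaw: it "visits" some valid numbers many times, which is very inefficient. It can be fixed by introducing a flag indicating that a number has already been "visited". For example, 12 = 2*2*3 will be visited from both 6 = 2*3 and 4 = 2*2. Minor flaws are the many context switches and the cost of maintaining the state of each call. The solution also has a global variable that clutters the global namespace; this can be solved by adding a function parameter.
The solution by @dasblinkenlight lacks efficiency because candidates that have already been "used" are taken again to generate new candidates, producing numbers that are already present in the set. That said, I borrowed the idea of using a set from it.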
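The array-flag variant mentioned in the analysis of my own solution could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name closestAtOrAbove is made up for this example, the primes are assumed to be sorted in ascending order, n is assumed small enough that a vector<bool> of that size fits in memory, and n times the largest prime is assumed not to overflow 64 bits.

```cpp
#include <cstdint>
#include <limits>
#include <queue>
#include <vector>

// Sketch only: a vector<bool> replaces the std::set lookup and a FIFO queue
// holds the values that still have to serve as generation sources.
// Returns the smallest product of the given primes that is >= num, or the
// maximum uint64_t value if the prime list is empty and num > 1.
std::uint64_t closestAtOrAbove(std::uint64_t num,
                               const std::vector<std::uint64_t>& primes)
{
    if (num <= 1) return 1; // the empty product already qualifies
    std::uint64_t best = std::numeric_limits<std::uint64_t>::max();
    std::vector<bool> seen(num, false); // seen[v]: v was already generated
    std::queue<std::uint64_t> sources;
    seen[1] = true;
    sources.push(1);
    while (!sources.empty())
    {
        const std::uint64_t current = sources.front();
        sources.pop();
        for (std::uint64_t p : primes)
        {
            const std::uint64_t candidate = current * p;
            if (candidate < num)
            {
                // still below num: keep it as a future source, once
                if (!seen[candidate])
                {
                    seen[candidate] = true;
                    sources.push(candidate);
                }
            }
            else
            {
                if (candidate < best) best = candidate;
                break; // larger primes only produce larger candidates
            }
        }
    }
    return best;
}
```

For num = 5917 and the primes 2, 3, 5, 7 this returns 6000. Each value below num enters the queue exactly once, so the only per-value cost left is the loop over the primes.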

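Based on the logarithm-based answer, I have created a C++ solution that does seem to be asymptotically more efficient because there is no log factor. However, I declined to work with double logarithms and kept the solution in integers. The idea is simple. We keep a list of products lower than num. Each product is generated from the first primesUsed primes. We then try to generate new products using the next prime. This approach is guaranteed to generate unique products: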

vector<int> primes = { 2, 3, 5, 7, 11, 17, 23 };
int num = 100005917;
int bestCandidate = INT_MAX;
list<pair<int, int> > ls;
ls.push_back(make_pair(1, 0));
while (ls.size()) {
    long long currentProd = ls.front().first;
    int primesUsed = ls.front().second;
    ls.pop_front();
    int currentPrime = primes[primesUsed];
    while (currentProd < num) {
        if(primesUsed < primes.size() - 1)
            ls.push_back(make_pair(currentProd, primesUsed + 1));
        currentProd *= currentPrime;
    }
    bestCandidate = min((long long)bestCandidate, currentProd);
}
cout << bestCandidate;

Demo

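Rather than trying to arrive at the answer by repeated factoring, you could try generating all possible products until you have enumerated every product under target*minPrime, where minPrime is the smallest prime in your set.

Start with a set consisting of 1. Each iteration tries multiplying every number in the current set by every prime. If a new number under the max is found, it is added to the current set. The process repeats until no new numbers can be added.

In your case, the first generation would be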

1 2 3 5 7

The next generation would be

1 2 3 4 5 6 7 9 10 14 15 21 25 35 49

After that you would see

Generation 3

1 2 3 4 5 6 7 8 9 10 12 14 15 18 20 21 25 27 28 30 35 42 45 49 50 63 70 75 98 105 125 147 175 245 343

Generation 4

1 2 3 4 5 6 7 8 9 10 12 14 15 16 18 20 21 24 25 27 28 30 35 36 40 42 45 49 50 54 56 60 63 70 75 81 84 90 98 100 105 125 126 135 140 147 150 175 189 196 210 225 245 250 294 315 343 350 375 441 490 525 625 686 735 875 1029 1225 1715 2401

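and so on. After twelve generations your set would no longer grow, at which point you could find the smallest value above the target.

Demo.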
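The generation process described above can be sketched in C++ as follows. This is a minimal illustration rather than the linked demo: the function name closestByGenerations is an assumption, and the code assumes target >= 1, a non-empty prime list, and that target*minPrime fits in 64 bits.

```cpp
#include <algorithm>
#include <cstdint>
#include <set>
#include <vector>

// Sketch only: grow a set of products generation by generation until a full
// pass adds nothing new, then pick the smallest member that is >= target.
std::uint64_t closestByGenerations(std::uint64_t target,
                                   const std::vector<std::uint64_t>& primes)
{
    const std::uint64_t minPrime = *std::min_element(primes.begin(), primes.end());
    const std::uint64_t limit = target * minPrime; // products must stay under this
    std::set<std::uint64_t> products = { 1 };
    bool grew = true;
    while (grew)
    {
        grew = false;
        // Collect this generation's new numbers first so the set is not
        // modified while it is being iterated.
        std::vector<std::uint64_t> fresh;
        for (std::uint64_t value : products)
        {
            for (std::uint64_t p : primes)
            {
                const std::uint64_t next = value * p;
                if (next < limit && products.count(next) == 0)
                    fresh.push_back(next);
            }
        }
        for (std::uint64_t v : fresh)
            if (products.insert(v).second) grew = true;
    }
    // Some power of minPrime lies in [target, limit), so the bound exists.
    return *products.lower_bound(target);
}
```

For target 5917 and the primes 2, 3, 5, 7 the function returns 6000. Every generation rescans the whole set, which is the repeated work criticised elsewhere in this thread; the sketch favours closeness to the description over speed.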

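The idea is to check all possible products of the acceptable primes and choose the best one.

To implement that, it's easiest, though probably not most efficient, to use recursion. Make a recursive function that "checks" a temporary product by multiplying in each of the acceptable primes, one by one. To remember the best result, it's easiest to use a global variable.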

int g_result;
void check(int num, int product, const vector<int>& primes)
{
    if (product >= num)
    {
        g_result = std::min(g_result, product);
    }
    else
    {
        for (int prime: primes)
            check(num, product * prime, primes);
    }
}
...
int main()
{
    g_result = INT_MAX;
    vector<int> primes = { 2, 3, 5, 7 };
    check(5917, 1, primes);
    std::cout << g_result;
}

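Using a global variable is an ugly hack; it's good enough in this simple example, but not good for complicated (multi-threaded) systems. To eliminate the global variable, put the function into a class and make it a method, and use a member variable instead of the global one.

Note: I used vector<int> instead of CVector<UInt32> for convenience.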
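The class-based refactoring suggested above might look like the following sketch. The class name ClosestProductFinder and the member names are assumptions chosen for illustration; the recursion itself is unchanged, except that the running product is held in a long long as a precaution against overflow.

```cpp
#include <algorithm>
#include <climits>
#include <utility>
#include <vector>

// Hypothetical wrapper: the best result lives in a member variable, so
// separate finder objects do not share any state.
class ClosestProductFinder
{
public:
    explicit ClosestProductFinder(std::vector<int> primes)
        : m_primes(std::move(primes)) {}

    // Returns the smallest product of the allowed primes that is >= num
    // (INT_MAX if no such product fits in an int).
    int find(int num)
    {
        m_best = INT_MAX;
        check(num, 1);
        return m_best;
    }

private:
    void check(int num, long long product)
    {
        if (product >= num)
        {
            // m_best never exceeds INT_MAX, so the cast back is safe
            m_best = static_cast<int>(std::min<long long>(m_best, product));
        }
        else
        {
            for (int prime : m_primes)
                check(num, product * prime);
        }
    }

    std::vector<int> m_primes;
    int m_best = INT_MAX;
};
```

Constructing a finder with the primes 2, 3, 5, 7 and calling find(5917) returns 6000. Like the original function, this still visits the same product through several multiplication orders.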

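Taking logarithms, we can look at this as a variant of the subset sum problem. Here is a JavaScript example that enumerates the distinct combinations that just pass the target mark.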

function f(target,primes){
  target = Math.log(target);
  primes = primes.map(function(x){ return Math.log(x); });
  var best = primes[0] * Math.ceil(target / primes[0]);
  var stack = [[0,0]];
  while (stack[0] !== undefined){
    var params = stack.pop();
    var t = params[0];
    var i = params[1];
    if (t > target){
      if (t < best){
        best = t;
      }
    } else if (i == primes.length - 1){
      var m = Math.ceil((target - t) / primes[i]);
      stack.push([t + m * primes[i],i + 1]);
    } else {
      t -= primes[i];
      while (t < target){
        t += primes[i];
        stack.push([t,i + 1]);
      }
    }
  }
  return Math.round(Math.pow(Math.E,best));
}
console.log(f(5917,[2,3,5,7]));