Algorithm to efficiently send requests to nodes

Keywords: request, algorithm, nodes, efficiently. Updated: 2023-10-16

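I want to distribute traffic across nodes according to each node's configuration. There can be up to 100 nodes, and the percentage of traffic assigned to each node is configurable.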

For example, if there are 4 nodes:

node 1 - 20
node 2 - 50
node 3 - 10
node 4 - 20
------------
sum    - 100
------------

The values of all nodes must sum to 100. Another example:

node 1 - 50
node 2 - 1
node 3 - 1
node 4 - 1
.
.
.
node 51 - 1

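The configuration above has 51 nodes in total. Node 1 is configured as 50 and each of the remaining 50 nodes is configured as 1.

In one scenario, requests could be distributed in the following pattern: node1, node2, node3, node4, node5, ..., node51, node1, node1, node1, node1, node1, node1, node1, ...

That distribution is inefficient because we send too much traffic to node1 in a row, which can cause node1 to reject requests.

In another scenario, requests could be distributed in the following pattern: node1, node2, node1, node3, node1, node4, node1, node5, node1, node6, node1, node7, node1, node8, ...

In that scenario, requests are distributed more efficiently.

I found the code below but can't understand the idea behind it.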

func()
{
  for(int itr=1;itr<=total_requests+1;itr++)
  {
      myval = 0;           
      // Search the node that needs to be incremented
       // to best approach the rates of all branches                      
      for(int j=0;j<Total_nodes;j++)
      {
         if((nodes[j].count*100/itr > nodes[j].value) ||
           ((nodes[j].value - nodes[j].count*100/itr) < myval) ||
           ((nodes[j].value==0 && nodes[j].count ==0 )))
              continue;
            cand = j;
            myval = abs((long)(nodes[j].count*100/itr - nodes[j].value));
       }
       nodes[cand].count++;
  }
  return nodes[cand].nodeID;
}

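In the code above, total_requests is the total number of requests received so far. total_requests is incremented on every request; for ease of understanding, treat it as a global value.

Total_nodes is the total number of configured nodes. nodes is an array in which each node is represented by the following struct: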

struct node{
  int count;
  int value;
  int nodeID;
};

For example, if 4 nodes are configured:
node 1 - 20
node 2 - 50
node 3 - 10
node 4 - 20
------------
sum    - 100
------------

The array nodes[4] will then be created with the following values:

node1{
  count = 0;
  value = 20;
  nodeID = 1;
};
node2{
  count = 0;
  value = 50;
  nodeID = 2;
};
node3{
  count = 0;
  value = 10;
  nodeID = 3;
};
node4{
  count = 0;
  value = 20;
  nodeID = 4;
};

Can you explain the algorithm, or the idea behind how it distributes the requests efficiently?

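nodes[j].count*100/itr is the floor of the percentage of requests so far that node j has answered. nodes[j].value is the percentage of requests that node j should answer. The code you posted looks for the node that is furthest behind its target percentage (more or less, subject to the vagaries of integer division) and assigns the next request to it.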
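The idea can be sketched in a few lines of C. This is not the posted code but a minimal illustration under some assumptions: the function name pick_next is hypothetical, ties go to the lowest index, and the comparison is cross-multiplied (value*total against count*100) to sidestep the truncation in count*100/itr.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int count;    /* requests sent to this node so far   */
    int value;    /* target share of traffic, in percent */
    int nodeID;
};

/* Choose the node for request number `total` (1-based): the node whose
 * achieved share lags furthest behind its target share.
 * deficit = value*total - count*100 is the same test as
 * value - count*100/total, but without integer-division rounding. */
static int pick_next(struct node nodes[], size_t n, long total)
{
    assert(n > 0);
    size_t best = 0;
    long best_deficit = (long)nodes[0].value * total - (long)nodes[0].count * 100;
    for (size_t j = 1; j < n; ++j) {
        long deficit = (long)nodes[j].value * total - (long)nodes[j].count * 100;
        if (deficit > best_deficit) {   /* strictly greater: ties keep the lower index */
            best = j;
            best_deficit = deficit;
        }
    }
    nodes[best].count++;
    return nodes[best].nodeID;
}
```

Called with total = 1, 2, 3, ... on the 20/50/10/20 configuration, this sketch picks nodes 2, 1, 4, 2, 2, 3, 2, 1, 4, 2 and then repeats, so every block of ten requests matches the configured shares exactly and no node receives a long run of consecutive requests.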

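Hmm. It seems that when you reach 100 nodes, each of them must take 1% of the traffic?

I don't really know what the function you give is doing. I assume it is trying to find the node that is furthest from its long-term average load. But if total_requests is the total to date, then I don't see what the outer for(int itr=1;itr<=total_requests+1;itr++) loop is doing, unless it is actually part of some test to show how it distributes the load?

Anyway, what you are doing is basically akin to constructing a non-uniform random sequence. For up to 100 nodes, if I can (for the moment) assume that 0..999 gives sufficient resolution, then you can use an "id_vector[]" of 1000 node IDs, containing n1 copies of node 1's ID, n2 copies of node 2's ID, and so on, where node 1 is to receive n1/1000 of the traffic, and so on. The decision process is then very, very simple: pick id_vector[random() % 1000]. Over time the nodes will receive the appropriate share of the traffic.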
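As a minimal sketch of that table lookup (the names W_RES, fill_id_vector and pick_random are made up for illustration, and standard rand() stands in for random() to keep the fragment portable):

```c
#include <assert.h>
#include <stdlib.h>

enum { W_RES = 1000 };   /* assumed resolution: weights are in 1/1000ths */

/* Fill id_vector with weights[k] copies of node ID k+1.
 * Returns the number of entries written, which should equal W_RES
 * when the weights add up to W_RES. */
static int fill_id_vector(const int weights[], int n_nodes, int id_vector[])
{
    int iv = 0;
    for (int k = 0; k < n_nodes; ++k)
        for (int i = 0; i < weights[k] && iv < W_RES; ++i)
            id_vector[iv++] = k + 1;
    return iv;
}

/* Pick the node for the next request: every slot is equally likely,
 * so node k+1 is chosen with probability weights[k] / W_RES. */
static int pick_random(const int id_vector[])
{
    return id_vector[rand() % W_RES];
}
```

The caller would check that fill_id_vector() returned W_RES before using the table. Note that this variant only matches the configured shares on average; any individual run of requests can still favour one node.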

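If you are not happy with randomly distributed traffic, then you can seed id_vector with the node ids so that you can select by "round-robin" and still get the right frequency for each node. One way to do that is to randomly shuffle the id_vector constructed as above (and perhaps reshuffle occasionally, so that if one shuffle is "bad" you are not stuck with it). Or you could do a one-time leaky-bucket run and fill id_vector from that. Either way, each pass around the id_vector guarantees that every node receives its allotted number of requests.

The finer the granularity of id_vector, the better the control over the short-term frequency of requests to each node.

Note that all of the above assumes that the relative load for the nodes is constant. If it is not, then you will need to adjust id_vector (every now and then?).

Edited to add more detail as requested...

...suppose we have just 5 nodes, but we express the "weight" of each one as n/1000, to allow for up to 100 nodes. Suppose they have ids 1..5 and weights: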

  ID=1, weight = 100
  ID=2, weight = 375
  ID=3, weight = 225
  ID=4, weight = 195
  ID=5, weight = 105

Obviously, these add up to 1000.

So we construct an id_vector[1000] such that:

  id_vector[  0.. 99] = 1   -- first 100 entries = ID 1
  id_vector[100..474] = 2   -- next  375 entries = ID 2
  id_vector[475..699] = 3   -- next  225 entries = ID 3
  id_vector[700..894] = 4   -- next  195 entries = ID 4
  id_vector[895..999] = 5   -- last  105 entries = ID 5

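Now, if we shuffle id_vector[] we get a random sequence of node selections, but over each run of 1000 requests every node gets exactly the right average frequency.

For amusement value I had a go at a "leaky bucket", filling id_vector using one leaky bucket per node, to see how well it can maintain a steady frequency of requests to each node. The code below does this, and reports how well it does and how well the simple random version does.

Each leaky bucket has a cc count, which is the number of requests that should be sent (to other nodes) before the next request is sent to this node. Each time a request is dispatched, the cc count of every bucket is decremented, and the request goes to the node whose bucket has the smallest cc (or the smallest id, if cc is equal), at which point that node's bucket is recharged. (Each request causes all buckets to drip once, and the bucket of the chosen node to be recharged.)

cc is the integer part of the bucket's "contents". The initial value of cc is q = 1000 / w, where w is the node's weight. Each time the bucket is recharged, q is added to cc. To do things precisely, however, we need to deal with the remainder r = 1000 % w. To put it another way, the "contents" have a fractional part, which is where cr comes in. The true value of the contents is cc + cr/w (where cr/w is a true fraction, not an integer division). The initial values are cc = q and cr = r. Each time the bucket is recharged, q is added to cc and r is added to cr. When cr/w >= 1/2 we round up, so cc += 1 and cr -= w (adding 1 to the integer part is balanced by subtracting 1, i.e. w/w, from the fractional part). To test for cr/w >= 1/2 the code actually tests (cr * 2) >= w. Hopefully the bucket_recharge() function will (now) make sense.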
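To make the rounding concrete, here is a small stand-alone sketch of the recharge step. The struct and function names (bucket, bucket_init, recharge) are illustrative only; like the listing, the sketch starts the bucket empty and charges it by calling the recharge step, so the rounding rule is applied to the very first charge as well.

```c
#include <assert.h>

struct bucket {
    int cc;   /* integer part of the contents           */
    int cr;   /* remainder: fractional part is cr / w   */
    int q;    /* recharge quotient,  res / w            */
    int r;    /* recharge remainder, res % w            */
    int w;    /* node weight                            */
};

static void bucket_init(struct bucket *b, int w, int res)
{
    assert(w > 0);
    b->w = w;
    b->q = res / w;
    b->r = res % w;
    b->cc = 0;
    b->cr = 0;
}

/* Add q + r/w to the contents, rounding to nearest.
 * Returns how much was added to cc by this recharge. */
static int recharge(struct bucket *b)
{
    int before = b->cc;
    b->cc += b->q;
    b->cr += b->r;
    if (b->cr * 2 >= b->w) {   /* fractional part >= 1/2: round up */
        b->cc += 1;
        b->cr -= b->w;
    }
    return b->cc - before;
}
```

For w = 375 and a resolution of 1000 this gives q = 2 and r = 250. Three successive recharges add 3, 2 and 3 to cc, leaving cr at -125, 125 and 0. The total of 8 equals 3 * 1000 / 375, so the node is picked on average once every 2.67 requests without any rounding error accumulating.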

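The leaky bucket is run 1000 times to fill id_vector[]. A small amount of testing shows that this maintains a pretty steady frequency for all nodes, and an exact number of packets per node on every pass around the id_vector[] cycle.

A small amount of testing shows that the random() shuffle approach has a much more variable frequency within each id_vector[] cycle, but still delivers the exact number of packets per node for each cycle.

The steadiness of the leaky bucket assumes a steady stream of incoming requests, which may be an entirely unrealistic assumption. If the requests arrive in large bursts (large compared with the id_vector[] period, 1000 in this example), then the variability of the (simple) random() shuffle approach may be dwarfed by the variability in request arrival!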

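#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>     /* random() is POSIX, declared here */

enum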
{
  n_nodes  =    5,        /* number of nodes      */
  w_res    = 1000,        /* weight resolution    */
} ;
struct node_bucket
{
  int   id ;            /* 1 origin                 */
  int   cc ;            /* current count            */
  int   cr ;            /* current remainder        */
  int   q ;             /* recharge -- quotient     */
  int   r ;             /* recharge -- remainder    */
  int   w ;             /* weight                   */
} ;
static void bucket_recharge(struct node_bucket* b) ;
static void node_checkout(int weights[], int id_vector[], bool rnd) ;
static void node_shuffle(int id_vector[]) ;
/*------------------------------------------------------------------------------
 * To begin at the beginning...
 */
int
main(int argc, char* argv[])
{
  int node_weights[n_nodes] = { 100, 375, 225, 195, 105 } ;
  int id_vector[w_res] ;
  int cx ;
  struct node_bucket buckets[n_nodes] ;
  /* Initialise the buckets -- charged
   */
  cx = 0 ;
  for (int id = 0 ; id < n_nodes ; ++id)
    {
      struct node_bucket* b ;
      b = &buckets[id] ;
      b->id = id + 1 ;              /* 1 origin     */
      b->w  = node_weights[id] ;
      cx += b->w ;
      b->q  = w_res / b->w ;
      b->r  = w_res % b->w ;
      b->cc = 0 ;
      b->cr = 0 ;
      bucket_recharge(b) ;
    } ;
  assert(cx == w_res) ;
  /* Run the buckets for one cycle to fill the id_vector
   */
  for (int i = 0 ; i < w_res ; ++i)
    {
      int id ;
      id = 0 ;
      buckets[id].cc -= 1 ;         /* drip     */
      for (int jd = 1 ; jd < n_nodes ; ++jd)
        {
          buckets[jd].cc -= 1 ;     /* drip     */
          if (buckets[jd].cc < buckets[id].cc)
            id = jd ;
        } ;
      id_vector[i] = id + 1 ;       /* '1' origin   */
      bucket_recharge(&buckets[id]) ;
    } ;
  /* Diagnostics and checking
   *
   * First, check that the id_vector contains exactly the right number of
   * each node, and that the bucket state at the end is the same (apart from
   * cr) as it is at the beginning.
   */
  int nf[n_nodes] = { 0 } ;
  for (int i = 0 ; i < w_res ; ++i)
    nf[id_vector[i] - 1] += 1 ;
  for (int id = 0 ; id < n_nodes ; ++id)
    {
      struct node_bucket* b ;
      b = &buckets[id] ;
      printf("ID=%2d weight=%3d freq=%3d  (cc=%3d  cr=%+4d  q=%3d  r=%3d)\n",
                                b->id, b->w, nf[id], b->cc, b->cr, b->q, b->r) ;
    } ;
  node_checkout(node_weights, id_vector, false /* not random */) ;
  /* Try the random version -- with shuffled id_vector.
   */
  int iv ;
  iv = 0 ;
  for (int id = 0 ; id < n_nodes ; ++id)
    {
      for (int i = 0 ; i < node_weights[id] ; ++i)
        id_vector[iv++] = id + 1 ;
    } ;
  assert(iv == 1000) ;
  for (int s = 0 ; s < 17 ; ++s)
    node_shuffle(id_vector) ;
  node_checkout(node_weights, id_vector, true /* random */) ;
  return 0 ;
} ;
static void
bucket_recharge(struct node_bucket* b)
{
  b->cc += b->q ;
  b->cr += b->r ;
  if ((b->cr * 2) >= b->w)
    {
      b->cc += 1 ;
      b->cr -= b->w ;
    } ;
} ;
static void
node_checkout(int weights[], int id_vector[], bool rnd)
{
  struct node_test
  {
    int   last_t ;
    int   count ;
    int   cycle_count ;
    int   intervals[w_res] ;
  } ;
  struct node_test tests[n_nodes] = { { 0 } } ;
  printf("\n---Test Run: %s ---\n", rnd ? "Random Shuffle" : "Leaky Bucket") ;
  /* Test run
   */
  int s ;
  s = 0 ;
  for (int t = 1 ; t <= (w_res * 5) ; ++t)
    {
      int id ;
      id = id_vector[s++] - 1 ;
      if (tests[id].last_t != 0)
        tests[id].intervals[t - tests[id].last_t] += 1 ;
      tests[id].count += 1 ;
      tests[id].last_t = t ;
      if (s == w_res)
        {
          printf("At time %4d\n", t) ;
          for (id = 0 ; id < n_nodes ; ++id)
            {
              struct node_test*   nt ;
              long   total_intervals ;
              nt = &tests[id] ;
              total_intervals = 0 ;
              for (int i = 0 ; i < w_res ; ++i)
                total_intervals += (long)i * nt->intervals[i] ;
              printf("  ID=%2d weight=%3d count=%4d(+%3d)  av=%6.2f vs %6.2f\n",
                        id+1, weights[id], nt->count, nt->count - nt->cycle_count,
                                          (double)total_intervals / nt->count,
                                          (double)w_res / weights[id]) ;
              nt->cycle_count = nt->count ;
              for (int i = 0 ; i < w_res ; ++i)
                {
                  if (nt->intervals[i] != 0)
                    {
                      int h ;
                      printf("  %6d x %4d ", i, nt->intervals[i]) ;
                      h = ((nt->intervals[i] * 75) + ((nt->count + 1) / 2))/
                                                                     nt->count ;
                      while (h-- != 0)
                        printf("=") ;
                      printf("\n") ;
                    } ;
                } ;
            } ;
          if (rnd)
            node_shuffle(id_vector) ;
          s = 0 ;
        } ;
    } ;
} ;
static void
node_shuffle(int id_vector[])
{
  for (int iv = 0 ; iv < (w_res - 1) ; ++iv)
    {
      int is, s ;
      is = (int)(random() % (w_res - iv)) + iv ;
      s             = id_vector[iv] ;
      id_vector[iv] = id_vector[is] ;
      id_vector[is] = s ;
    } ;
} ;