Truncate a floating-point number to the leading N decimal digits

Keywords: decimal, floating-point number. Updated: 2023-10-16

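What is the best way to get the n leftmost non-zero digits of a floating-point number (number >= 0.0)?

For example,

If n = 1:

0.246456 -> 0.2

If n = 2:

0.246456 -> 0.24

Following @schil227's comment: currently, I multiply and divide by 10 as needed so that the n digits end up in the decimal field.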

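Code can use sprintf(buf, "%e", ...) to do most of the heavy lifting.

There are many corner cases where other, more direct code may fail, so sprintf() is at least likely to serve as a good, reliable reference solution.

This code prints the double to DBL_DECIMAL_DIG places so that no rounded digit makes a difference, then zeros out digits according to n.

See @Mark Dickinson's comment for reasons to use more than DBL_DECIMAL_DIG digits, perhaps on the order of DBL_DECIMAL_DIG*2. As noted above, there are many corner cases.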

#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

double foo(double x, int n) {
  if (!isfinite(x)) {
    return x;
  }
  printf("%g\n", x);
  // Room for sign, leading digit, '.', DBL_DECIMAL_DIG digits, exponent, and '\0'.
  char buf[DBL_DECIMAL_DIG + 11];
  sprintf(buf, "%+.*e", DBL_DECIMAL_DIG, x);
  //puts(buf);
  assert(n >= 1 && n <= DBL_DECIMAL_DIG + 1);
  // buf looks like "+d.ddd...e+xx": keep the first n digits, overwrite the rest with '0'.
  memset(buf + 2 + n, '0', DBL_DECIMAL_DIG - n + 1);
  //puts(buf);
  char *endptr;
  x = strtod(buf, &endptr);
  printf("%g\n", x);
  return x;
}
int main() {
 foo(0.014568, 1);
 foo(0.246456, 1);
 foo(0.014568, 2);
 foo(0.246456, 2);
 return 0;
}

Output

0.014568
0.01
0.246456
0.2
0.014568
0.014
0.246456
0.24

This answer assumes the OP does not want a rounded answer. Re: 0.246456 -> 0.24

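If you want the result as a string, you should probably print to a string with extra precision and then truncate it yourself. (See @chux's answer for details on how much extra precision an IEEE 64-bit double needs to avoid rounding up from a string of 9s. You want truncation, but all the usual to-string functions round to nearest.)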
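
A minimal sketch of that string approach follows. The name truncate_to_string and the choice of 2 * DBL_DECIMAL_DIG fractional digits are assumptions made for illustration. The sketch also assumes the C library prints that many digits accurately, which not every implementation does.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original answers): write x, truncated
   rather than rounded to n significant digits, into out in exponent
   notation such as "2.4e-01". Returns what snprintf() returns. */
int truncate_to_string(double x, int n, char *out, size_t size)
{
    assert(isfinite(x) && n >= 1 && n <= DBL_DECIMAL_DIG);

    /* Print with far more digits than needed so that rounding of the last
       printed digit is very unlikely to reach the digits we keep. */
    char buf[2 * DBL_DECIMAL_DIG + 16];
    snprintf(buf, sizeof buf, "%.*e", 2 * DBL_DECIMAL_DIG, x);

    char *first = buf + (buf[0] == '-');    /* leading digit, after any sign */
    char *exp_part = strchr(buf, 'e');      /* start of the "e+xx" suffix */

    /* Keep the leading digit, plus '.' and n-1 more digits when n > 1. */
    char *cut = (n == 1) ? first + 1 : first + 1 + n;
    memmove(cut, exp_part, strlen(exp_part) + 1);

    return snprintf(out, size, "%s", buf);
}
```

With this sketch, truncate_to_string(0.246456, 2, out, sizeof out) leaves "2.4e-01" in out. Dropping characters from the printed text is what makes this a truncation, so the result reflects the stored binary value rather than the decimal literal in the source code.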

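If you want a double result, are you sure you really want this? Rounding or truncating in the middle of a calculation usually just reduces the accuracy of the final result. Of course, floor/ceil, trunc, and nearbyint do have uses in real algorithms, and this is just a scaled version of trunc.

If you just want a double, you can get fairly good results without ever going through a string. Use ndigits and floor(log10(fabs(x))) to compute a scale factor, truncate the scaled value to an integer, then scale back.

Tested and working (with and without -ffast-math). See the asm on the Godbolt compiler explorer. This might run reasonably efficiently, especially with -ffast-math -msse4.1 (so floor and trunc can inline to roundsd).

If you care about speed, consider replacing pow() with something that takes advantage of the exponent being a small integer. I'm not sure how fast library pow() implementations are in that case. GNU C __builtin_powi(x, n) trades accuracy for speed: for integer exponents it does a multiplication tree, which is less accurate than what pow() does.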
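
One way to exploit the small integer exponent is a lookup table, sketched here. The helper pow10_int and its limit of 22 are assumptions for illustration. The limit is chosen because powers of ten from 1e0 to 1e22 are exactly representable in an IEEE 754 double.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical replacement for pow(10., e) when e is a small integer
   (not part of the original answer). */
double pow10_int(int e)
{
    /* Every entry is exactly representable in an IEEE 754 double. */
    static const double table[] = {
        1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
        1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
        1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
    };
    assert(e >= -22 && e <= 22);
    /* For negative exponents, divide by the exact positive power so the
       result is the correctly rounded value of 10^e. */
    return e >= 0 ? table[e] : 1.0 / table[-e];
}
```

In this answer's truncate_n_digits, the scale could then be computed as pow10_int((int)floor(l10) + 1 - digits) whenever that exponent stays within the table's range. Inputs outside that range, such as the subnormal test cases, would still need pow().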

#include <float.h>
#include <math.h>
#include <stdio.h>
double truncate_n_digits(double x, int digits)
{
    if (x==0 || !isfinite(x))
        return x;   // good idea stolen from Chux's answer :)
    double l10 = log10(fabs(x));
    double scale = pow(10.,  floor(l10) + (1 - digits));  // floor rounds towards -Inf
    double scaled = x / scale;
    double scaletrunc = trunc(scaled);  // trunc rounds towards zero
    double truncated = scaletrunc * scale;
#if 1    // debugging code
    printf("%2d %24.14g =>\t%24.14g\t scale=%g, scaled=%.30g\n", digits, x, truncated, scale, scaled);
    // print with more accuracy to reveal the real behaviour
    printf("   %24.20g =>\t%24.20g\n", x, truncated);
#endif
    return truncated;
}

Test cases:

int main() {
 truncate_n_digits(0.014568, 1);
 truncate_n_digits(0.246456, 1);
 truncate_n_digits(0.014568, 2);
 truncate_n_digits(-0.246456, 2);
 truncate_n_digits(1234567, 2);
 truncate_n_digits(99999999999, 6);
 truncate_n_digits(-99999999999, 6);
 truncate_n_digits(99999, 10);
 truncate_n_digits(-0.0000000001234567, 3);
 truncate_n_digits(1000, 6);
 truncate_n_digits(0.001, 6);
 truncate_n_digits(1e-312, 2);  // denormal, and not exactly representable: 9.999...e-313
 truncate_n_digits(nextafter(1e-312, INFINITY), 2);  // denormal, just above 1.00000e-312
 return 0;
}

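Each result is shown twice: first with only %.14g, so that rounding gives the string we want, then again with %.20g to show enough places to reveal the reality of floating-point math. Most numbers are not exactly representable, so even with perfect rounding it is impossible to return a double that exactly represents the truncated decimal string. (Integers up to about the size of the mantissa are exactly representable, as are fractions whose denominator is a power of 2.)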

 1                 0.014568 =>                      0.01         scale=0.01, scaled=1.45679999999999987281285029894
    0.014567999999999999353 =>   0.010000000000000000208
 1                 0.246456 =>                       0.2         scale=0.1, scaled=2.46456000000000008398615136684
      0.2464560000000000084 =>     0.2000000000000000111
 2                 0.014568 =>                     0.014         scale=0.001, scaled=14.5679999999999996163069226895
    0.014567999999999999353 =>   0.014000000000000000291
 2                -0.246456 =>                     -0.24         scale=0.01, scaled=-24.6456000000000017280399333686
     -0.2464560000000000084 =>   -0.23999999999999999112
 3               1234.56789 =>                      1230         scale=10, scaled=123.456789000000000555701262783
       1234.567890000000034 =>                      1230
 6               1234.56789 =>                   1234.56         scale=0.01, scaled=123456.789000000004307366907597
       1234.567890000000034 =>     1234.5599999999999454
 6              99999999999 =>               99999900000         scale=100000, scaled=999999.999990000040270388126373
                99999999999 =>               99999900000
 6             -99999999999 =>              -99999900000         scale=100000, scaled=-999999.999990000040270388126373
               -99999999999 =>              -99999900000
10                    99999 =>                     99999         scale=1e-05, scaled=9999900000
                      99999 =>     99999.000000000014552
 3            -1.234567e-10 =>                 -1.23e-10         scale=1e-12, scaled=-123.456699999999983674570103176
   -1.234566999999999879e-10 => -1.2299999999999998884e-10
 6                     1000 =>                      1000         scale=0.01, scaled=100000
                       1000 =>                      1000
 6                    0.001 =>                     0.001         scale=1e-08, scaled=100000
   0.0010000000000000000208 =>  0.0010000000000000000208
 2     9.9999999999847e-313 =>      9.9999999996388e-313         scale=1e-314, scaled=100.000000003458453079474566039
   9.9999999999846534143e-313 =>        9.9999999996388074622e-313
 2     1.0000000000034e-312 =>      9.0000000001196e-313         scale=1e-313, scaled=9.9999999999011865980946822674
   1.0000000000034059979e-312 =>        9.0000000001195857973e-313

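Since the result you want usually isn't exactly representable (and because of other rounding errors), the resulting double will sometimes be below what you wanted, so printing it at full precision might give 1.19999999 instead of 1.20000011. You might want to use nextafter(result, copysign(INFINITY, original)) to get a result that is more likely to be larger in magnitude than what you wanted.

Of course, that could make things worse in some cases. But since we truncate towards zero, most of the time we get a result that is slightly below (in magnitude) the unrepresentable exact value.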

OK, here is another one, similar to @Peter Cordes's but more generic.

#include <cmath>
#include <concepts>
#include <iomanip>
#include <iostream>
#include <type_traits>
using namespace std;

/** Return \c digits significant digits of number \c x.
    \tparam T Type of number \c x; can be floating point or integral.
    \param x The number.
    \param digits The requested number of significant digits of number \c x.
    \return The number with only \c digits significant digits of number \c x. */
template<typename T>
requires(std::integral<T> || std::floating_point<T>)
T roundn(T x, unsigned int digits)
{
    if (!x || !std::isfinite(x)) return x;
    typedef std::conditional_t<std::floating_point<T>, T, double> Tp;
    Tp mul = pow(10, floor(digits - log10(abs(x))));
    Tp y = round(x * mul) / mul;  // note: rounds to nearest rather than truncating
    if constexpr (std::floating_point<T>) return y;
    else return round(y);
}
int main()
{
    cout << setprecision(100);
    cout << roundn(123.456789, 1) << "\n";
    cout << roundn(123.456789, 2) << "\n";
    cout << roundn(123.456789, 3) << "\n";
    cout << roundn(123.456789, 4) << "\n";
    cout << roundn(123.456789, 5) << "\n";
    cout << roundn(-123.456789, 1) << "\n";
    cout << roundn(-123.456789, 2) << "\n";
    cout << roundn(-123.456789, 3) << "\n";
    cout << roundn(-123.456789, 4) << "\n";
    cout << roundn(-123.456789, 5) << "\n";
    cout << roundn(-123.456789, 15) << "\n";
    cout << roundn(123456, 1) << "\n";
    cout << roundn(123456, 2) << "\n";
    cout << roundn(123456, 3) << "\n";
    cout << roundn(123456, 10) << "\n";
    cout << roundn(-123456, 1) << "\n";
    cout << roundn(-123456, 2) << "\n";
    cout << roundn(-123456, 3) << "\n";
    cout << roundn(-123456, 10) << "\n";
    cout << roundn(0.0123456789, 1) << "\n";
    cout << roundn(0.0123456789, 2) << "\n";
    cout << roundn(-0.0123456789, 1) << "\n";
    cout << roundn(-0.0123456789, 2) << "\n";
    return 0;
}

This prints:

99.9999999999999857891452847979962825775146484375
120
123
123.5
123.4599999999999937472239253111183643341064453125
-99.9999999999999857891452847979962825775146484375
-120
-123
-123.5
-123.4599999999999937472239253111183643341064453125
-123.4567890000000005557012627832591533660888671875
100000
120000
123000
123456
-100000
-120000
-123000
-123456
0.01000000000000000020816681711721685132943093776702880859375
0.0120000000000000002498001805406602215953171253204345703125
-0.01000000000000000020816681711721685132943093776702880859375
-0.0120000000000000002498001805406602215953171253204345703125