Loss of significance

Figure (Catastrophic cancellation.svg): Example of loss of significance when computing two forms of the same function.

Loss of significance is an undesirable effect in calculations using finite-precision arithmetic such as floating-point arithmetic. It occurs when an operation on two numbers increases relative error substantially more than it increases absolute error, for example in subtracting two nearly equal numbers (known as catastrophic cancellation). The effect is that the number of significant digits in the result is reduced unacceptably. Ways to avoid this effect are studied in numerical analysis.

Demonstration of the problem

Subtraction

The effect can be demonstrated with decimal numbers. The following example demonstrates catastrophic cancellation for a decimal floating-point data type with 10 significant digits:

Consider the decimal number

x = 0.1234567891234567890

A floating-point representation of this number on a machine that keeps 10 floating-point digits would be

y = 0.1234567890

which is fairly close when the error is measured as a percentage of the value. It is very different when measured in order of precision: the value x is accurate to 10⁻¹⁹, while the value y is only accurate to 10⁻¹⁰.

Now perform the calculation

x − y = 0.1234567891234567890 − 0.1234567890000000000

The answer, accurate to 20 significant digits, is

0.0000000001234567890

However, on the 10-digit floating-point machine, the calculation yields

0.1234567891 − 0.1234567890 = 0.0000000001

In both cases the result is accurate to the same order of magnitude as the inputs (orders −20 and −10, respectively). In the second case, the answer seems to have only one significant digit, which would amount to loss of significance. However, in computer floating-point arithmetic, all operations can be viewed as being performed on antilogarithms, for which the rules for significant figures indicate that the number of significant figures remains the same as the smallest number of significant figures in the mantissas. The way to indicate this and represent the answer to 10 significant figures is

1.000000000 × 10⁻¹⁰
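
This demonstration can be reproduced with Python's standard decimal module, which lets the working precision be set to 10 significant digits as above (a minimal sketch; the variable names are ours):

    from decimal import Decimal, getcontext

    getcontext().prec = 10  # emulate a 10-digit decimal floating-point machine

    x = +Decimal("0.1234567891234567890")  # unary + rounds to 10 digits: 0.1234567891
    y = +Decimal("0.1234567890000000000")  # already 10 digits:           0.1234567890

    print(x - y)  # 1E-10 -- a single significant digit survives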

Loss of significant bits

Let x and y be positive normalized floating-point numbers.

In the subtraction x − y (assuming x > y), r significant bits are lost, where

q ≤ r ≤ p,
2^−p ≤ 1 − y/x ≤ 2^−q,

for some positive integers p and q.
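
These bounds follow directly from the binary logarithm of 1 − y/x, which a short sketch can compute (the helper name is illustrative):

    import math

    def bits_lost_bounds(x: float, y: float) -> tuple[int, int]:
        # Assumes x > y > 0. Returns (q, p) such that between q and p
        # significant bits are lost in computing x - y, per the bound above.
        t = 1.0 - y / x
        return math.floor(-math.log2(t)), math.ceil(-math.log2(t))

    print(bits_lost_bounds(1.000001, 1.0))  # (19, 20): roughly 20 bits lost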

Workarounds

It is possible to do computations using an exact fractional representation of rational numbers and keep all significant digits, but this is often prohibitively slow compared with floating-point arithmetic.
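
Python's standard fractions module is one such exact representation; the subtraction from the demonstration above loses nothing when carried out this way:

    from fractions import Fraction

    x = Fraction(1234567891234567890, 10**19)  # 0.1234567891234567890 exactly
    y = Fraction(1234567890, 10**10)           # 0.1234567890 exactly

    print(x - y)  # 123456789/1000000000000000000, exactly 1.23456789e-10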

One of the most important parts of numerical analysis is to avoid or minimize loss of significance in calculations. If the underlying problem is well-posed, there should be a stable algorithm for solving it.

Sometimes clever algebra tricks can change an expression into a form that circumvents the problem. One such trick is to use the well-known identity

a − b = (a² − b²) / (a + b).

So with the expression √(x + 1) − √x, multiplying numerator and denominator by √(x + 1) + √x gives

√(x + 1) − √x = 1 / (√(x + 1) + √x),

in which the troublesome subtraction has disappeared. Now, can an expression always be rewritten to eliminate the subtraction? Sometimes it can.
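
A quick numerical check of the square-root rewriting above, assuming IEEE 754 double precision:

    import math

    x = 1.0e12
    naive  = math.sqrt(x + 1) - math.sqrt(x)        # subtracts two nearly equal numbers
    stable = 1 / (math.sqrt(x + 1) + math.sqrt(x))  # same value, no cancellation

    print(naive, stable)  # the two agree to only a few significant digits;
                          # the second is correct to nearly full precision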

For example, the expression 1 − cos x can suffer loss of significant bits if x is much smaller than 1. So rewrite the expression as

1 − cos x = 2 sin²(x/2)

or

1 − cos x = sin²x / (1 + cos x).
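
In IEEE 754 double precision the difference is dramatic for small arguments; here cos x rounds to exactly 1, so the naive form returns 0:

    import math

    x = 1.0e-8
    naive     = 1.0 - math.cos(x)           # cos(1e-8) rounds to 1.0, so this is 0.0
    rewritten = 2.0 * math.sin(x / 2) ** 2  # ~5.0e-17, essentially full precision

    print(naive, rewritten)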

Instability of the quadratic equation

For example, consider the quadratic equation

ax² + bx + c = 0,

with the two exact solutions:

x = (−b ± √(b² − 4ac)) / (2a).

This formula may not always produce an accurate result. For example, when c is very small, loss of significance can occur in either of the root calculations, depending on the sign of b.

The case a = 1, b = 200, c = −0.000015 will serve to illustrate the problem:

We have

√(b² − 4ac) = √(200² + 4 × 0.000015) = 200.00000015...

In real arithmetic, the roots are

(−200 − 200.00000015) / 2 = −200.000000075,
(−200 + 200.00000015) / 2 = 0.000000075.

In 10-digit floating-point arithmetic:

(−200 − 200.0000001) / 2 = −200.00000005,
(−200 + 200.0000001) / 2 = 0.00000005.

Notice that the solution of greater magnitude is accurate to ten digits, but the first nonzero digit of the solution of lesser magnitude is wrong.

Because of the subtraction that occurs in the quadratic formula, it does not constitute a stable algorithm for calculating the two roots.
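
The 10-digit computation above can be reproduced with the decimal module (a sketch; the printed forms depend on the default half-even rounding):

    from decimal import Decimal, getcontext

    getcontext().prec = 10  # 10 significant decimal digits

    a, b, c = Decimal(1), Decimal(200), Decimal("-0.000015")
    sqrt_disc = (b * b - 4 * a * c).sqrt()  # 200.0000001 after rounding

    print((-b - sqrt_disc) / (2 * a))  # -200.0000000: fine to ten digits
    print((-b + sqrt_disc) / (2 * a))  # 5E-8: the true root is 7.5E-8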

A better algorithm

A careful floating-point computer implementation combines several strategies to produce a robust result. Assuming that the discriminant b² − 4ac is positive, and b is nonzero, the computation would be as follows: [1]

x₁ = (−b − sgn(b) √(b² − 4ac)) / (2a),
x₂ = c / (a x₁).

Here sgn denotes the sign function, where sgn(b) is 1 if b is positive and −1 if b is negative. This avoids cancellation problems between b and the square root of the discriminant by ensuring that only numbers of the same sign are added.
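
A direct transcription in Python (a sketch assuming real roots, a ≠ 0 and b ≠ 0; the function name is ours):

    import math

    def solve_quadratic(a: float, b: float, c: float) -> tuple[float, float]:
        # Roots of ax^2 + bx + c = 0 for positive discriminant and nonzero b.
        sqrt_disc = math.sqrt(b * b - 4 * a * c)
        x1 = (-b - math.copysign(sqrt_disc, b)) / (2 * a)  # only same-sign additions
        x2 = c / (a * x1)                                  # via Vieta: x1 * x2 = c / a
        return x1, x2

    print(solve_quadratic(1.0, 200.0, -0.000015))
    # approximately (-200.000000075, 7.5e-08): both roots are now accurate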

To illustrate the instability of the standard quadratic formula compared to this formula, consider a quadratic equation with roots 1.786737589984535 and 1.149782767465722 × 10⁻⁸. To 16 significant digits, roughly corresponding to double-precision accuracy on a computer, the monic quadratic equation with these roots may be written as

x² − 1.786737601482363x + 2.054360090947453 × 10⁻⁸ = 0.

Using the standard quadratic formula and maintaining 16 significant digits at each step yields

x₁ = 1.786737589984535,
x₂ = 1.149782600211634 × 10⁻⁸.

Note how cancellation has resulted in x₂ being computed to only 8 significant digits of accuracy.

The variant formula presented here, however, yields the following:

x₁ = 1.786737589984535,
x₂ = 1.149782767465722 × 10⁻⁸.

Note the retention of all significant digits for x₂.

Note that while the above formulation avoids catastrophic cancellation between b and √(b² − 4ac), there remains a form of cancellation between the terms b² and 4ac of the discriminant, which can still lead to loss of up to half of the correct significant digits. [2] [3] The discriminant needs to be computed in arithmetic of twice the precision of the result to avoid this (e.g. quad precision if the final result is to be accurate to full double precision). [4] This can be in the form of a fused multiply-add operation. [2]

To illustrate this, consider the following quadratic equation, adapted from Kahan (2004): [2]

94906265.625x² − 189812534x + 94906268.375 = 0.

This equation has Δ = b² − 4ac = 7.5625 and roots

x₁ = 1.000000028975958,
x₂ = 1.000000000000000.

However, when computed using IEEE 754 double-precision arithmetic, corresponding to 15 to 17 significant digits of accuracy, Δ is rounded to 0.0, and the computed roots are

x₁ = x₂ = 1.000000014487979,

which are both false after the 8th significant digit. This is despite the fact that superficially, the problem seems to require only 11 significant digits of accuracy for its solution.
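
Both the failure and the remedy can be seen in Python, using ordinary floats (IEEE 754 doubles) for the naive computation and the decimal module at 34 digits, roughly twice double precision, for the discriminant (a sketch):

    from decimal import Decimal, getcontext

    a, b, c = 94906265.625, -189812534.0, 94906268.375

    print(b * b - 4 * a * c)  # 0.0 -- the true discriminant 7.5625 is lost in double

    getcontext().prec = 34    # about twice the precision of a double
    A, B, C = map(Decimal, (a, b, c))
    disc = B * B - 4 * A * C
    print(disc)               # 7.5625, exactly

    x1 = (-B + disc.sqrt()) / (2 * A)
    x2 = (-B - disc.sqrt()) / (2 * A)
    print(x1, x2)             # 1.000000028975958... and 1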



References

  1. Press, William Henry; Flannery, Brian P.; Teukolsky, Saul A.; Vetterling, William T. (1992). "Section 5.6: Quadratic and Cubic Equations". Numerical Recipes in C (2nd ed.).
  2. Kahan, William Morton (2004-11-20). "On the Cost of Floating-Point Computation Without Extra-Precise Arithmetic" (PDF). Retrieved 2012-12-25.
  3. Higham, Nicholas John (2002). Accuracy and Stability of Numerical Algorithms (2nd ed.). SIAM. p. 10. ISBN 978-0-89871-521-7.
  4. Hough, David (March 1981). "Applications of the proposed IEEE 754 standard for floating point arithmetic". Computer. IEEE. 14 (3): 70–74. doi:10.1109/C-M.1981.220381.