Unit in the last place

In computer science and numerical analysis, unit in the last place or unit of least precision (ulp) is the spacing between two consecutive floating-point numbers, i.e., the value the least significant digit (rightmost digit) represents if it is 1. It is used as a measure of accuracy in numeric calculations. [1]

Definition

The most common definition is: in radix $b$ with precision $p$, if $b^e \le |x| < b^{e+1}$, then $\mathrm{ulp}(x) = b^{\max(e,\,e_{\min}) - p + 1}$, [2] where $e_{\min}$ is the minimal exponent of the normal numbers. In particular, $\mathrm{ulp}(x) = b^{e-p+1}$ for normal numbers, and $\mathrm{ulp}(x) = b^{e_{\min}-p+1}$ for subnormals.
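
As a quick sanity check of this formula for binary64 (radix 2, precision 53), the following sketch compares 2^(e − p + 1), computed with Math.getExponent and Math.scalb, against the library value Math.ulp. The class name and the choice of x are only illustrative.

// Minimal sketch: checks ulp(x) = 2^(e - 53 + 1) for a normal binary64 number.
public class UlpFormulaCheck {
    public static void main(String[] args) {
        double x = 3.141592653589793;                     // any normal double works
        int e = Math.getExponent(x);                      // unbiased exponent e (here 1)
        double fromFormula = Math.scalb(1.0, e - 53 + 1); // 2^(e - p + 1) with p = 53
        System.out.println(fromFormula);                  // 4.440892098500626E-16
        System.out.println(Math.ulp(x));                  // same value
    }
}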

Another definition, suggested by John Harrison, is slightly different: $\mathrm{ulp}(x)$ is the distance between the two closest straddling floating-point numbers $a$ and $b$ (i.e., those satisfying $a \le x \le b$ and $a \ne b$), assuming that the exponent range is not upper-bounded. [3] [4] These definitions differ only at signed powers of the radix. [2]
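
To see where the two definitions disagree, consider x = 1.0, a power of the radix: the conventional definition gives ulp(1) = 2^−52, while Harrison's definition gives 2^−53, the distance to the float just below 1. A small Java sketch (class name illustrative) printing both values:

// Sketch: the two definitions of ulp differ by a factor of the radix at x = 1.0 = 2^0.
public class UlpAtPowerOfTwo {
    public static void main(String[] args) {
        double x = 1.0;
        System.out.println(Math.ulp(x));          // 2.220446049250313E-16, i.e. 2^-52 (first definition)
        System.out.println(x - Math.nextDown(x)); // 1.1102230246251565E-16, i.e. 2^-53 (Harrison's definition)
    }
}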

The IEEE 754 specification, followed by all modern floating-point hardware, requires that the result of an elementary arithmetic operation (addition, subtraction, multiplication, division, and square root since 1985, and FMA since 2008) be correctly rounded. Under rounding to nearest, this implies that the rounded result is within 0.5 ulp of the mathematically exact result, using John Harrison's definition; conversely, this property implies that the distance between the rounded result and the mathematically exact result is minimized (though for the halfway cases, it is satisfied by two consecutive floating-point numbers). Reputable numeric libraries compute the basic transcendental functions to between 0.5 and about 1 ulp. Only a few libraries compute them within 0.5 ulp, this problem being complex due to the Table-maker's dilemma. [5]
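
As an illustration of the 0.5 ulp bound for a correctly rounded basic operation, the sketch below (class name illustrative) compares the double quotient 1.0/3.0 with the exact value of 1/3 computed via BigDecimal and checks that the error is below half an ulp.

import java.math.BigDecimal;
import java.math.MathContext;

// Sketch: a correctly rounded division differs from the exact result by less than 0.5 ulp.
public class HalfUlpCheck {
    public static void main(String[] args) {
        double q = 1.0 / 3.0;                                             // correctly rounded per IEEE 754
        BigDecimal exact = BigDecimal.ONE.divide(new BigDecimal(3), new MathContext(60));
        BigDecimal error = new BigDecimal(q).subtract(exact).abs();
        BigDecimal halfUlp = new BigDecimal(Math.ulp(q)).divide(new BigDecimal(2));
        System.out.println(error.compareTo(halfUlp) < 0);                 // true
    }
}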

Examples

Example 1

Let $x$ be a positive floating-point number and assume that the active rounding mode is round to nearest, ties to even, denoted $\mathrm{RN}$. If $\mathrm{ulp}(x) \le 1$, then $\mathrm{RN}(x+1) > x$. Otherwise, $\mathrm{RN}(x+1) = x$ or $\mathrm{RN}(x+1) = x + \mathrm{ulp}(x)$, depending on the value of the least significant digit and the exponent of $x$. This is demonstrated in the following Haskell code typed at an interactive prompt: [citation needed]

> until (\x -> x == x+1) (+1) 0 :: Float
1.6777216e7
> it-1
1.6777215e7
> it+1
1.6777216e7

Here we start with 0 in single precision (binary32) and repeatedly add 1 until the operation does not change the value. Since the significand for a single-precision number contains 24 bits, the first integer that is not exactly representable is 2^24 + 1, and this value rounds to 2^24 in round to nearest, ties to even. Thus the result is equal to 2^24.
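
The same experiment can be reproduced in Java, with float playing the role of Haskell's Float; this sketch (class name illustrative) again stops at 2^24 and shows that 2^24 + 1 rounds back to 2^24:

// Sketch mirroring Example 1 in Java: counting up in binary32 stops at 2^24.
public class Binary32Limit {
    public static void main(String[] args) {
        float x = 0.0f;
        while (x != x + 1.0f) {
            x = x + 1.0f;                      // add 1 until the addition no longer changes x
        }
        System.out.println(x);                 // 1.6777216E7, i.e. 2^24
        System.out.println((float) 16777217L); // 2^24 + 1 also rounds to 1.6777216E7
    }
}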

Example 2

The following example in Java approximates π as a floating point value by finding the two double values bracketing π: p0 < π < p1.

// π with 20 decimal digits
BigDecimal π = new BigDecimal("3.14159265358979323846");
// truncate to a double floating point
double p0 = π.doubleValue();
// -> 3.141592653589793  (hex: 0x1.921fb54442d18p1)
// p0 is smaller than π, so find next number representable as double
double p1 = Math.nextUp(p0);
// -> 3.1415926535897936 (hex: 0x1.921fb54442d19p1)

Then ulp(π) is determined as p1 − p0.

// ulp(π) is the difference between p1 and p0
BigDecimal ulp = new BigDecimal(p1).subtract(new BigDecimal(p0));
// -> 4.44089209850062616169452667236328125E-16
// (this is precisely 2**(-51))
// same result when using the standard library function
double ulpMath = Math.ulp(p0);
// -> 4.440892098500626E-16 (hex: 0x1.0p-51)
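
The value 2^−51 agrees with the first definition above: p0 lies between 2 and 4, so its exponent is e = 1 and, with precision p = 53, ulp(p0) = 2^(1 − 53 + 1) = 2^−51.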

Example 3

Another example, in Python, also typed at an interactive prompt, is:

>>> x = 1.0
>>> p = 0
>>> while x != x + 1:
...     x = x * 2
...     p = p + 1
...
>>> x
9007199254740992.0
>>> p
53
>>> x + 2 + 1
9007199254740996.0

In this case, we start with x = 1 and repeatedly double it until x = x + 1. Similarly to Example 1, the result is 2^53 because the double-precision floating-point format uses a 53-bit significand.
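
The final line illustrates the spacing above 2^53: there the ulp is 2, so x + 2 is exactly representable, and the subsequent addition of 1 falls exactly halfway between 9007199254740994 and 9007199254740996, rounding to the latter under round to nearest, ties to even.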

Language support

The Boost C++ libraries provide the functions boost::math::float_next, boost::math::float_prior, boost::math::nextafter and boost::math::float_advance to obtain nearby (and distant) floating-point values, [6] and boost::math::float_distance(a, b) to calculate the floating-point distance between two doubles. [7]

The C language library provides functions to calculate the next floating-point number in some given direction: nextafterf and nexttowardf for float, nextafter and nexttoward for double, nextafterl and nexttowardl for long double, declared in <math.h>. It also provides the macros FLT_EPSILON, DBL_EPSILON, LDBL_EPSILON, which represent the positive difference between 1.0 and the next greater representable number in the corresponding type (i.e. the ulp of one). [8]

The Java standard library provides the functions Math.ulp(double) and Math.ulp(float). They were introduced with Java 1.5.
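
A small usage sketch of these functions (class name illustrative):

// Sketch: querying the ulp of a double and of a float with the Java standard library.
public class UlpUsage {
    public static void main(String[] args) {
        System.out.println(Math.ulp(1.0));     // 2.220446049250313E-16 (2^-52)
        System.out.println(Math.ulp(1.0f));    // 1.1920929E-7 (2^-23)
        System.out.println(Math.nextUp(1.0));  // 1.0000000000000002, i.e. 1.0 + ulp(1.0)
    }
}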

The Swift standard library provides access to the next floating-point number in some given direction via the instance properties nextDown and nextUp. It also provides the instance property ulp and the type property ulpOfOne (which corresponds to C macros like FLT_EPSILON [9] ) for Swift's floating-point types. [10]

Related Research Articles

<span class="mw-page-title-main">Computable number</span> Real number that can be computed within arbitrary precision

In mathematics, computable numbers are the real numbers that can be computed to within any desired precision by a finite, terminating algorithm. They are also known as the recursive numbers, effective numbers or the computable reals or recursive reals. The concept of a computable real number was introduced by Emile Borel in 1912, using the intuitive notion of computability available at the time.

<span class="mw-page-title-main">Floating-point arithmetic</span> Computer approximation for real numbers

In computing, floating-point arithmetic (FP) is arithmetic that represents subsets of real numbers using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. Numbers of this form are called floating-point numbers. For example, 12.345 is a floating-point number in base ten with five digits of precision:

IEEE 754-1985 is a historic industry standard for representing floating-point numbers in computers, officially adopted in 1985 and superseded in 2008 by IEEE 754-2008, and then again in 2019 by minor revision IEEE 754-2019. During its 23 years, it was the most widely used format for floating-point computation. It was implemented in software, in the form of floating-point libraries, and in hardware, in the instructions of many CPUs and FPUs. The first integrated circuit to implement the draft of what was to become IEEE 754-1985 was the Intel 8087.

Double-precision floating-point format is a floating-point number format, usually occupying 64 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point.

<span class="mw-page-title-main">Rounding</span> Replacing a number with a simpler value

Rounding or rounding off means replacing a number with an approximate value that has a shorter, simpler, or more explicit representation. For example, replacing $23.4476 with $23.45, the fraction 312/937 with 1/3, or the expression √2 with 1.414.

<span class="mw-page-title-main">Prime-counting function</span> Function representing the number of primes less than or equal to a given number

In mathematics, the prime-counting function is the function counting the number of prime numbers less than or equal to some real number x. It is denoted by π(x) (unrelated to the number π).

The IEEE Standard for Floating-Point Arithmetic is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). The standard addressed many problems found in the diverse floating-point implementations that made them difficult to use reliably and portably. Many hardware floating-point units use the IEEE 754 standard.

In computing, a roundoff error, also called rounding error, is the difference between the result produced by a given algorithm using exact arithmetic and the result produced by the same algorithm using finite-precision, rounded arithmetic. Rounding errors are due to inexactness in the representation of real numbers and the arithmetic operations done with them. This is a form of quantization error. When using approximation equations or algorithms, especially when using finitely many digits to represent real numbers, one of the goals of numerical analysis is to estimate computation errors. Computation errors, also called numerical errors, include both truncation errors and roundoff errors.

In mathematics and computer algebra, automatic differentiation, also called algorithmic differentiation, computational differentiation, is a set of techniques to evaluate the partial derivative of a function specified by a computer program.

In the C programming language, data types constitute the semantics and characteristics of storage of data elements. They are expressed in the language syntax in form of declarations for memory locations or variables. Data types also determine the types of operations or methods of processing of data elements.

Machine epsilon or machine precision is an upper bound on the relative approximation error due to rounding in floating point arithmetic. This value characterizes computer arithmetic in the field of numerical analysis, and by extension in the subject of computational science. The quantity is also called macheps and it has the symbols Greek epsilon .

<span class="mw-page-title-main">Sine and cosine</span> Fundamental trigonometric functions

In mathematics, sine and cosine are trigonometric functions of an angle. The sine and cosine of an acute angle are defined in the context of a right triangle: for the specified angle, its sine is the ratio of the length of the side that is opposite that angle to the length of the longest side of the triangle, and the cosine is the ratio of the length of the adjacent leg to that of the hypotenuse. For an angle , the sine and cosine functions are denoted simply as and .

<span class="mw-page-title-main">Stability theory</span> Part of mathematics that addresses the stability of solutions

In mathematics, stability theory addresses the stability of solutions of differential equations and of trajectories of dynamical systems under small perturbations of initial conditions. The heat equation, for example, is a stable partial differential equation because small perturbations of initial data lead to small variations in temperature at a later time as a result of the maximum principle. In partial differential equations one may measure the distances between functions using Lp norms or the sup norm, while in differential geometry one may measure the distance between spaces using the Gromov–Hausdorff distance.

Extended precision refers to floating-point number formats that provide greater precision than the basic floating-point formats. Extended precision formats support a basic format by minimizing roundoff and overflow errors in intermediate values of expressions on the base format. In contrast to extended precision, arbitrary-precision arithmetic refers to implementations of much larger numeric types using special software.

In numerical analysis, catastrophic cancellation is the phenomenon that subtracting good approximations to two nearby numbers may yield a very bad approximation to the difference of the original numbers.

In computing, quadruple precision is a binary floating-point–based computer number format that occupies 16 bytes with precision at least twice the 53-bit double precision.

Single-precision floating-point format is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point.

As with other spreadsheets, Microsoft Excel works only to limited accuracy because it retains only a certain number of figures to describe numbers. With some exceptions regarding erroneous values, infinities, and denormalized numbers, Excel calculates in double-precision floating-point format from the IEEE 754 specification. Although Excel allows display of up to 30 decimal places, its precision for any specific number is no more than 15 significant figures, and calculations may have an accuracy that is even less due to five issues: round off, truncation, and binary storage, accumulation of the deviations of the operands in calculations, and worst: cancellation at subtractions resp. 'Catastrophic cancellation' at subtraction of values with similar magnitude.

In mathematics, the Bloch group is a cohomology group of the Bloch–Suslin complex, named after Spencer Bloch and Andrei Suslin. It is closely related to polylogarithm, hyperbolic geometry and algebraic K-theory.

<span class="mw-page-title-main">Minimal residual method</span> Computational method

The Minimal Residual Method or MINRES is a Krylov subspace method for the iterative solution of symmetric linear equation systems. It was proposed by mathematicians Christopher Conway Paige and Michael Alan Saunders in 1975.

References

  1. Goldberg, David (March 1991). "What Every Computer Scientist Should Know About Floating-Point Arithmetic" (PDF). ACM Computing Surveys. 23 (1): 5–48. doi:10.1145/103162.103163. S2CID 222008826. Archived (PDF) from the original on 20 July 2006. Retrieved 20 January 2016.
  2. Muller, Jean-Michel; Brunie, Nicolas; de Dinechin, Florent; Jeannerod, Claude-Pierre; Joldes, Mioara; Lefèvre, Vincent; Melquiond, Guillaume; Revol, Nathalie; Torres, Serge (2018) [2010]. Handbook of Floating-Point Arithmetic (2nd ed.). Birkhäuser. doi:10.1007/978-3-319-76526-6. ISBN 978-3-319-76525-9.
  3. Harrison, John. "A Machine-Checked Theory of Floating Point Arithmetic". Retrieved 17 July 2013.
  4. Muller, Jean-Michel (November 2005). "On the definition of ulp(x)". INRIA Technical Report 5504; also ACM Transactions on Mathematical Software, Vol. V, No. N, November 2005. Retrieved March 2012 from http://ljk.imag.fr/membres/Carine.Lucas/TPScilab/JMMuller/ulp-toms.pdf.
  5. Kahan, William. "A Logarithm Too Clever by Half" . Retrieved 14 November 2008.
  6. Boost float_advance.
  7. Boost float_distance.
  8. ISO/IEC 9899:1999 specification (PDF). p. 237, §7.12.11.3 The nextafter functions and §7.12.11.4 The nexttoward functions.
  9. "ulpOfOne - FloatingPoint | Apple Developer Documentation". Apple Inc. Apple Inc. Retrieved 18 August 2019.
  10. "FloatingPoint - Swift Standard Library | Apple Developer Documentation". Apple Inc. Apple Inc. Retrieved 18 August 2019.

Bibliography