Some programming languages (or compilers for them) provide a built-in (primitive) or library decimal data type to represent non-repeating decimal fractions like 0.3 and -1.17 without rounding, and to do arithmetic on them. Examples are the
decimal.Decimal type of Python, and analogous types provided by other languages.
Fractional numbers are supported on most programming languages as floating-point numbers or fixed-point numbers. However, such representations typically restrict the denominator to a power of two. Most decimal fractions (or most fractions in general) cannot be represented exactly as a fraction with a denominator that is a power of two. For example, the simple decimal fraction 0.3 (3/10) might be represented as 5404319552844595/18014398509481984 (0.299999999999999988897769...). This inexactness causes many problems that are familiar to experienced programmers. For example, the expression
0.1 * 7 == 0.7 might counterintuitively evaluate to false in some systems, due to the inexactness of the representation of decimals.
Although all decimal fractions are fractions, and thus it is possible to use a rational data type to represent it exactly, it may be more convenient in many situations to consider only non-repeating decimal fractions (fractions whose denominator is a power of ten). For example, fractional units of currency worldwide are mostly based on a denominator that is a power of ten. Also, most fractional measurements in science are reported as decimal fractions, as opposed to fractions with any other system of denominators.
A decimal data type could be implemented as either a floating-point number or as a fixed-point number. In the fixed-point case, the denominator would be set to a fixed power of ten. In the floating-point case, a variable exponent would represent the power of ten to which the mantissa of the number is multiplied.
Languages that support a rational data type usually allow the construction of such a value from two integers, instead of a base-2 floating-point number, due to the loss of exactness the latter would cause. Usually the basic arithmetic operations ('+', '−', '×', '/', integer powers) and comparisons ('=', '<', '>', '≤') would be extended to act on them — either natively or through operator overloading facilities provided by the language. These operations may be translated by the compiler into a sequence of integer machine instructions, or into library calls. Support may also extend to other operations, such as formatting, rounding to an integer or floating point value, etc.. An example of this is 123.456
IEEE 754 specifies three standard floating-point decimal data types of different precision:
In computing, floating-point arithmetic (FP) is arithmetic using formulaic representation of real numbers as an approximation to support a trade-off between range and precision. For this reason, floating-point computation is often found in systems which include very small and very large real numbers, which require fast processing times. A number is, in general, represented approximately to a fixed number of significant digits and scaled using an exponent in some fixed base; the base for the scaling is normally two, ten, or sixteen. A number that can be represented exactly is of the following form:
IEEE 754-1985 was an industry standard for representing floating-point numbers in computers, officially adopted in 1985 and superseded in 2008 by IEEE 754-2008, and then again in 2019 by minor revision IEEE 754-2019. During its 23 years, it was the most widely used format for floating-point computation. It was implemented in software, in the form of floating-point libraries, and in hardware, in the instructions of many CPUs and FPUs. The first integrated circuit to implement the draft of what was to become IEEE 754-1985 was the Intel 8087.
A computer number format is the internal representation of numeric values in digital computer and calculator hardware and software. Normally, numeric values are stored as groupings of bits, named for the number of bits that compose them. The encoding between numerical values and bit patterns is chosen for convenience of the operation of the computer; the bit format used by the computer's instruction set generally requires conversion for external use such as printing and display. Different types of processors may have different internal representations of numerical values. Different conventions are used for integer and real numbers. Most calculations are carried out with number formats that fit into a processor register, but some software systems allow representation of arbitrarily large numbers using multiple words of memory.
Double-precision floating-point format is a computer number format, usually occupying 64 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point.
Rounding means replacing a number with an approximate value that has a shorter, simpler, or more explicit representation. For example, replacing $23.4476 with $23.45, the fraction 312/937 with 1/3, or the expression √2 with 1.414.
In computer science, primitive data type is either of the following:
The IEEE Standard for Floating-Point Arithmetic is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). The standard addressed many problems found in the diverse floating-point implementations that made them difficult to use reliably and portably. Many hardware floating-point units use the IEEE 754 standard.
The significand is part of a number in scientific notation or a floating-point number, consisting of its significant digits. Depending on the interpretation of the exponent, the significand may represent an integer or a fraction.
In computing, a fixed-point number representation is a real data type for a number that has a fixed number of digits after the radix point. Fixed-point number representation can be compared to the more complicated floating-point number representation.
In computer architecture, 128-bit integers, memory addresses, or other data units are those that are 128 bits wide. Also, 128-bit CPU and ALU architectures are those that are based on registers, address buses, or data buses of that size.
In the C programming language, data types constitute the semantics and characteristics of storage of data elements. They are expressed in the language syntax in form of declarations for memory locations or variables. Data types also determine the types of operations or methods of processing of data elements.
In C and related programming languages,
long double refers to a floating-point data type that is often more precise than double precision though the language standard only requires it to be at least as precise as
double. As with C's other floating-point types, it may not necessarily map to an IEEE format.
In computer science, a scale factor is a number used as a multiplier to represent a number on a different scale, functioning similarly to an exponent in mathematics. A scale factor is used when a real-world set of numbers needs to be represented on a different scale in order to fit a specific number format. Although using a scale factor extends the range of representable values, it also decreases the precision, resulting in rounding error for certain calculations.
Extended precision refers to floating-point number formats that provide greater precision than the basic floating-point formats. Extended precision formats support a basic format by minimizing roundoff and overflow errors in intermediate values of expressions on the base format. In contrast to extended precision, arbitrary-precision arithmetic refers to implementations of much larger numeric types using special software.
Decimal floating-point (DFP) arithmetic refers to both a representation and operations on decimal floating-point numbers. Working directly with decimal (base-10) fractions can avoid the rounding errors that otherwise typically occur when converting between decimal fractions and binary (base-2) fractions.
IEEE 754-2008 was published in August 2008 and is a significant revision to, and replaces, the IEEE 754-1985 floating-point standard, while in 2019 it was updated with a minor revision IEEE 754-2019. The 2008 revision extended the previous standard where it was necessary, added decimal arithmetic and formats, tightened up certain areas of the original standard which were left undefined, and merged in IEEE 854.
Some programming languages provide a built-in (primitive) rational data type to represent rational numbers like 1/3 and -11/17 without rounding, and to do arithmetic on them. Examples are the
ratio type of Common Lisp, and analogous types provided by most languages for algebraic computation, such as Mathematica and Maple. Many languages that do not have a built-in rational type still provide it as a library-defined type.
In computing, half precision is a binary floating-point computer number format that occupies 16 bits in computer memory.
In computing, quadruple precision is a binary floating point–based computer number format that occupies 16 bytes with precision more than twice the 53-bit double precision.
Single-precision floating-point format is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point.