Numerical tower

In Scheme and in Lisp dialects inspired by it, the numerical tower is a set of data types that represent numbers and a logic for their hierarchical organisation.

[Figure: A representation of the numerical tower with five types of numbers (NumericalTower.svg).]

Each type in the tower conceptually "sits on" a more fundamental type, so an integer is a rational number and a number, but the converse does not hold: not every number is an integer. This asymmetry implies that a language can safely allow implicit coercion of numerical types, without creating semantic problems, in only one direction. Coercing an integer to a rational loses no information and never changes the value a function returns, whereas coercing most reals to an integer would alter any relevant computation (e.g., the real 1/3 equals no integer) and is thus impermissible.
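For instance, at an R7RS Scheme REPL (a minimal sketch; expected results shown as comments):

    (rational? 3)    ; ⇒ #t  : every integer is also a rational
    (integer? 1/3)   ; ⇒ #f  : but not every rational is an integer
    (+ 1/2 1)        ; ⇒ 3/2 : the integer 1 is implicitly treated as the rational 1/1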

In Lisp

Principally, the numerical tower is designed to codify the set-theoretic properties of numbers in an easy-to-implement language facility: every integer is a rational with an implicit denominator of 1, and every real is a complex number with an implicit imaginary part of 0. Practically, an implementation may save time and space by ignoring these properties unless they become arithmetically relevant, and may likewise gain efficiency by reducing numerical values to a canonical representation, eliminating the negligible components of a number.
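For example, a typical Scheme with exact arithmetic reduces results to canonical form as a matter of course (a sketch; behaviour with exact complex numbers varies slightly by implementation):

    (/ 6 3)          ; ⇒ 2    : the quotient 6/3 canonicalizes to an integer
    (/ 4 6)          ; ⇒ 2/3  : rationals are kept in lowest terms
    (+ 1+2i 1-2i)    ; ⇒ 2    : a zero imaginary part is eliminated, yielding a real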

The most generic type, number, is somewhat confusingly named: it exists to capture all mathematical values whose type is more general than complex but which are still usable with standard mathematical operations, as defined by Scheme. Thus it captures, for example, positive and negative infinity (+inf.0 and -inf.0, the .0 here signalling an inexact value, i.e. an approximation accurate only up to cardinality), since those are mathematical objects to which at least some numerical operations validly apply: one can add to or multiply by infinity, yielding infinity, or compare against infinity, with infinity being greater than any finite value. [1] On a more technical level, number in Lisp simply provides a place in the type hierarchy for the kinds of non-strictly-numerical values defined by IEEE 754.
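A few of these operations as they behave at a typical Scheme REPL (a sketch assuming IEEE 754 flonum semantics):

    (+ 1 +inf.0)        ; ⇒ +inf.0
    (* -2 +inf.0)       ; ⇒ -inf.0
    (> +inf.0 1e308)    ; ⇒ #t : infinity is greater than any finite value
    (number? +inf.0)    ; ⇒ #t : a number, and thus usable with arithmetic
    (rational? +inf.0)  ; ⇒ #f : but not a rational, integer, etc.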

The Scheme programming language defines all its arithmetic within this model, as do most other Lisp dialects. [2] [3] Some implementations extend or adapt the tower. Kawa, a Scheme implementation for the JVM, extends it with both quaternions [4] and quantities, [5] where a quantity subtypes a numerical value with a unit; a number of grams cannot meaningfully be added to a number of metres, because quantities carry rules derived from dimensional analysis that govern which arithmetical interactions between them are valid.
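A hedged sketch of what this looks like in Kawa (literal syntax as described in the Kawa manual; the printed results here are illustrative rather than verified output):

    1+2i+3j+4k     ; a quaternion literal with one real and three imaginary parts
    (+ 3cm 2cm)    ; ⇒ 5cm : quantities with compatible units combine
    (+ 3cm 2s)     ; ⇒ an error: length and time are dimensionally incompatible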

Another common variation is to support both exact and inexact versions of the tower, or of parts of it; R7RS Scheme recommends but does not strictly require this of implementations. In this case, similar semantics determine the permissibility of implicit coercion. Inexactness is a contagious property of numbers: [6] any numerical operation involving both exact and inexact values must yield an inexact result of at least the same precision as the most precise inexact number appearing in the expression, unless the precision is practically infinite (e.g. containing a detectable repetend), or unless it can be proven that the precision of the result is independent of the inexactness of any of its operands (for example, a series of multiplications in which at least one multiplicand is 0).
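Contagion in practice, at a typical R7RS Scheme REPL (a sketch; the last case is explicitly implementation-dependent):

    (+ 1/3 2/3)       ; ⇒ 1   : exact operands, exact result
    (+ 1/3 0.5)       ; ⇒ 0.8333333333333333 : one inexact operand infects the result
    (exact? (+ 1 2))  ; ⇒ #t
    (* 0 1.5)         ; ⇒ 0 or 0.0 : an implementation may return exact 0 if the
                      ;   result is provably independent of the inexact operand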

In other languages

Most programming languages and implementations do not provide a Scheme-like numerical tower, though some offer limited or inconsistent support where implementation simplicity permits. Python, for example, provides a similar structure via PEP 3141, [7] citing Scheme's example, though in practice rational numbers (fractions) must be imported from their own module, and both rational and complex numbers use syntax that differs slightly from ordinary number literals, since Python's syntax is less explicit than Lisp's.

Thus in the following Scheme examples we see:

    1                       ⇒ 1
    -2                      ⇒ -2
    +3                      ⇒ 3
    1/3                     ⇒ 1/3
    72/6+8/3i               ⇒ 12+8/3i    ; coercion: canonical form
    (+ 3+2i 2-2i)           ⇒ 5          ; coercion: canonical form
    (- 3-62/32i 1+inf.0i)   ⇒ 2-inf.0i   ; coercion: infinite cardinality
    (> 3+0/2i 3)            ⇒ #f         ; coercion: 3 ≯ 3

While in the following Python examples we see:

    >>> 1; -2; +3
    1
    -2
    3
    >>> 1/3
    0.3333333333333333
    >>> inf = float('inf')    # infinity not first-class
    >>> from fractions import Fraction
    >>> x = Fraction(1, 3)
    >>> y = Fraction(2, 3)
    >>> x + y
    Fraction(1, 1)            # no coercion
    >>> (3+2j)
    (3+2j)
    >>> complex(x, inf)
    (0.3333333333333333+infj) # coercion: equality violated
    >>> a = 1/3
    >>> b = Fraction(1, 3)
    >>> caz = complex(a, 0)
    >>> cbz = complex(b, 0)
    >>> a == b
    False
    >>> caz == cbz
    True                      # proof of equality violation
    >>> complex(x + y, -inf)
    (1-infj)                  # coercion: equality preserved
    >>> (3+0j) > 3            # no coercion: type error
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: '>' not supported between instances of 'complex' and 'int'

In the Python examples, we can see that numerical issues arise readily from the inconsistent application of its type-coercion semantics. While 1/3 in Python is treated as a call to divide 1 by 3, yielding a float, the inclusion of rationals inside a complex number, though clearly permissible, implicitly coerces them from rationals into floats or ints, even in cases where this is incorrect.

Smalltalk is another programming language that follows this model, but it has ArithmeticValue and Magnitude as superclasses of Number.

References

  1. "Revised7 Report on the Algorithmic Language Scheme: 6.2.4: Implementation extensions" (PDF).
  2. "Revised5 Report on the Algorithmic Language Scheme: 6.2.1: Numerical types" (PDF).
  3. "Revised7 Report on the Algorithmic Language Scheme: 6.2.1: Numerical types" (PDF).
  4. "Kawa Reference‌ Documentation: 12.4. Quaternions".
  5. "Kawa Reference‌ Documentation: 12.5 Quantities and Units".
  6. "Revised7 Report on the Algorithmic Language Scheme: 6.2.2: Exactness" (PDF).
  7. "PEP 3141 – A Type Hierarchy for Numbers".