Unum (number format)

Last updated December 20, 2024

Unums (universal numbers^[1]) are a family of number formats and arithmetic for implementing real numbers on a computer, proposed by John L. Gustafson in 2015.^[2] They are designed as an alternative to the ubiquitous IEEE 754 floating-point standard. The latest version is known as posits.^[3]

Type I Unum

The first version of unums, formally known as Type I unum, was introduced in Gustafson's book The End of Error as a superset of the IEEE-754 floating-point format.^[2] The defining features of the Type I unum format are:

a variable-width storage format for both the significand and exponent, and
a u-bit, which determines whether the unum corresponds to an exact number (u = 0), or an interval between consecutive exact unums (u = 1). In this way, the unums cover the entire extended real number line [−∞,+∞].

For computation with the format, Gustafson proposed using interval arithmetic with a pair of unums, what he called a ubound, providing the guarantee that the resulting interval contains the exact solution.

William M. Kahan and Gustafson debated unums at the Arith23 conference.^[4]^[5]^[6]^[7]

Type II Unum

Type II Unums were introduced in 2016^[8] as a redesign of Unums that broke IEEE-754 compatibility : in addition to the sign bit and the interval bit mentioned earlier, the Type II unum uses a bit to indicate inversion. These three operations make it possible, starting from a finite set of points between one and infinity, to quantify the entire projective line except for four points: the two exceptions, 0 and ∞, and then 1 and -1. This set of points is chosen arbitrarily, and arithmetic operations involving them are not performed logically but rather by using a lookup table. The size of such a table becomes prohibitive for an encoding format spanning multiple bytes. This challenge necessitated the development of the Type III unum, known as the posit, discussed below.

Posit (Type III Unum)

In February 2017, Gustafson officially introduced Type III unums (posits), for fixed floating-point-like values and valids for interval arithmetic.^[3] In March 2022, a standard was ratified and published by the Posit Working Group.^[9]

Posits^[3]^[10]^[11] are a hardware-friendly version of unum where difficulties faced in the original type I unum due to its variable size are resolved. Compared to IEEE 754 floats of similar size, posits offer a bigger dynamic range and more fraction bits for values with magnitude near 1 (but fewer fraction bits for very large or very small values), and Gustafson claims that they offer better accuracy.^[12]^[13] Studies^[14]^[15] confirm that for some applications, posits with quire out-perform floats in accuracy. Posits have superior accuracy in the range near one, where most computations occur. This makes it very attractive to the current trend in deep learning to minimize the number of bits used. It potentially helps any application to accelerate by enabling the use of fewer bits (since it has more fraction bits for accuracy) reducing network and memory bandwidth and power requirements.

The format of an n-bit posit is given a label of "posit" followed by the decimal digits of n (e.g., the 16-bit posit format is "posit16") and consists of four sequential fields:

sign: 1 bit, representing an unsigned integer s
regime: at least 2 bits and up to (n − 1), representing an unsigned integer r as described below
exponent: generally 2 bits as available after regime, representing an unsigned integer e
fraction: all remaining bits available after exponent, representing a non-negative real dyadic rational f less than 1

The regime field uses unary coding of k identical bits, followed by a bit of opposite value if any remaining bits are available, to represent an unsigned integer r that is −k if the first bit is 0 or k − 1 if the first bit is 1. The sign, exponent, and fraction fields are analogous to IEEE 754 sign, exponent, and significand fields (respectively), except that the posit exponent and fraction fields may be absent or truncated and implicitly extended with zeroes—an absent exponent is treated as 00₂ (representing 0), a one-bit exponent E₁ is treated as E₁0₂ (representing the integer 0 if E₁ is 0 or 2 if E₁ is 1), and an absent fraction is treated as 0. Negative numbers (s is 1) are encoded as 2's complements.

The two encodings in which all non-sign bits are 0 have special interpretations:

If the sign bit is 1, the posit value is NaR ("not a real")
If the sign bit is 0, the posit value is 0 (which is unsigned and the only value for which the sign function returns 0)

Otherwise, the posit value is equal to ${\textstyle ((1-3s)+f)\times 2^{(1-2s)\times (4r+e+s)}}$ , in which r scales by powers of 16, e scales by powers of 2, f distributes values uniformly between adjacent combinations of (r, e), and s adjusts the sign symmetrically about 0.

Examples


type (positn)	Binary	Value	Notes
Any	`10…`	`NaR`	anything not mathematically definable as a unique real number^[9]
Any	`00…`	0
Any	`010…`	1
Any	`110…`	−1
Any	`00111 0…`	0.5
Any	`00…1`	${\textstyle 2^{-4n+8}}$	smallest positive value
Any	`01…`	${\textstyle 2^{4n-8}}$	largest positive value
posit8	`00000001`	${\textstyle 2^{-24}\approx 6.0\times 10^{-8}}$	smallest positive value
posit8	`01111111`	${\textstyle 2^{24}\approx 1.7\times 10^{7}}$	largest positive value
posit16	`0000000000000001`	${\textstyle 2^{-56}\approx 1.4\times 10^{-17}}$	smallest positive value
posit16	`0111111111111111`	${\textstyle 2^{56}\approx 7.2\times 10^{16}}$	largest positive value
posit32	`00000000000000000000000000000001`	${\textstyle 2^{-120}\approx 7.5\times 10^{-37}}$	smallest positive value
posit32	`01111111111111111111111111111111`	${\textstyle 2^{120}\approx 1.3\times 10^{36}}$	largest positive value

Note: 32-bit posit is expected to be sufficient to solve almost all classes of applications^{[ citation needed ]}.

Quire

For each positn type of precision ${\textstyle n}$ , the standard defines a corresponding "quire" type quiren of precision ${\textstyle 16\times n}$ , used to accumulate exact sums of products of those posits without rounding or overflow in dot products for vectors of up to 2³¹ or more elements (the exact limit is $2^{23+4n}$ ). The quire format is a two's complement signed integer, interpreted as a multiple of units of magnitude $2^{16-8n}$ except for the special value with a leading sign bit of 1 and all other bits equal to 0 (which represents NaR). Quires are based on the work of Ulrich W. Kulisch and Willard L. Miranker.^[16]

Valid

Valids are described as a Type III Unum mode that bounds results in a given range.^[3]

Implementations

Several software and hardware solutions implement posits.^[14]^[17]^[18]^[19]^[20] The first complete parameterized posit arithmetic hardware generator was proposed in 2018.^[21]

Unum implementations have been explored in Julia ^[22]^[23]^[24]^[25]^[26]^[27] and MATLAB.^[28]^[29] A C++ version^[30] with support for any posit sizes combined with any number of exponent bits is available. A fast implementation in C, SoftPosit,^[31] provided by the NGA research team based on Berkeley SoftFloat adds to the available software implementations.

Project author	Type	Precisions	Quire Support?	Speed	Testing	Notes
GP-GPU VividSparks	World's first FPGA GPGPU	32	Yes	~3.2 TPOPS	Exhaustive. No known bugs.	RacEr GP-GPU has 512 cores
SoftPosit ASTAR*	C library based on Berkeley SoftFloat C++ wrapper to override operators Python wrapper using SWIG of SoftPosit	8, 16, 32 published and complete;	Yes	~60 to 110 MPOPS on x86 core (Broadwell)	8: Exhaustive; 16: Exhaustive except FMA, quire 32: Exhaustive test is still in progress. No known bugs.	Open source license. Fastest and most comprehensive C library for posits presently. Designed for plug-in comparison of IEEE floats and posits.
posit4.nb ASTAR*	Mathematica notebook	All	Yes	< 80 KPOPS	Exhaustive for low precisions. No known bugs.	Open source (MIT license). Original definition and prototype. Most complete environment for comparing IEEE floats and posits. Many examples of use, including linear solvers
posit-javascript ASTAR*	JavaScript widget	Convert decimal to posit 6, 8, 16, 32; generate tables 2–17 with es 1–4.	N/A	N/A; interactive widget	Fully tested	Table generator and conversion
Universal Stillwater Supercomputing, Inc	C++ template library C library Python wrapper Golang library	Arbitrary precision posit float valid (p) Unum type 1 (p) Unum type 2 (p)	Arbitrary quire configurations with programmable capacity	posit<4,0> 1 GPOPS posit<8,0> 130 MPOPS posit<16,1> 115 MPOPS posit<32,2> 105 MPOPS posit<64,3> 50 MPOPS posit<128,4> 1 MPOPS posit<256,5> 800 KPOPS	Complete validation suite for arbitrary posits Randoms for large posit configs. Uses induction to prove nbits+1 is correct no known bugs	Open source. MIT license. Fully integrated with C/C++ types and automatic conversions. Supports full C++ math library (native and conversion to/from IEEE). Runtime integrations: MTL4/MTL5, Eigen, Trilinos, HPR-BLAS. Application integrations: G+SMO, FDBB, FEniCS, ODEintV2, TVM.ai. Hardware accelerator integration (Xilinx, Intel, Achronix).
Speedgo Chung Shin Yee	Python library	All	No	~20 MPOPS	Extensive; no known bugs	Open source (MIT license)
softposit-rkt David Thien	SoftPosit bindings for Racket	All	Yes	Unknown	Unknown
sfpy Bill Zorn	SoftPosit bindings for Python	All	Yes	~20–45 MPOPS on 4.9 GHz Skylake core	Unknown
positsoctave Diego Coelho	Octave implementation	All	No	Unknown	Limited Testing; no known bugs	GNU GPL
Sigmoid Numbers Isaac Yonemoto	Julia library	All <32, all ES	Yes	Unknown	No known bugs (posits). Division bugs (valids)	Leverages Julia's templated mathematics standard library, can natively do matrix and tensor operations, complex numbers, FFT, DiffEQ. Support for valids
FastSigmoid Isaac Yonemoto	Julia and C/C++ library	8, 16, 32, all ES	No	Unknown	Known bug in 32-bit multiplication	Used by LLNL in shock studies
SoftPosit.jl Milan Klöwer	Julia library	Based on softposit; 8-bit (es=0..2) 16-bit (es=0..2) 24-bit (es=1..2) 32-bit (es=2)	Yes	Similar to A*STAR "SoftPosit" (Cerlane Leong)	Yes: Posit (8,0), Posit (16,1), Posit (32,2) Other formats lack full functionality	Open source. Issues and suggestions on GitHub. This project was developed due to the fact that SigmoidNumbers and FastSigmoid by Isaac Yonemoto is not maintained currently. Supports basic linear algebra functions in Julia (Matrix multiplication, Matrix solve, Elgen decomposition, etc.)
PySigmoid Ken Mercado	Python library	All	Yes	< 20 MPOPS	Unknown	Open source (MIT license). Easy-to-use interface. Neural net example. Comprehensive functions support.
cppPosit Federico Rossi, Emanuele Ruffaldi	C++ library	4 to 64 (any es value); "Template version is 2 to 63 bits"	No	Unknown	A few basic tests	4 levels of operations working with posits. Special support for NaN types (non-standard)
bfp:Beyond Floating Point Clément Guérin	C++ library	Any	No	Unknown	Bugs found; status of fixes unknown	Supports + – × ÷ √ reciprocal, negate, compare
Verilog.jl Isaac Yonemoto	Julia and Verilog	8, 16, 32, ES=0	No	Unknown	Comprehensively tested for 8-bit, no known bugs	Intended for Deep Learning applications Addition, Subtraction and Multiplication only. A proof of concept matrix multiplier has been built, but is off-spec in its precision
Lombiq Arithmetics Lombiq Technologies	C# with Hastlayer for hardware generation	8, 16, 32. (64bits in progress)	Yes	10 MPOPS Click here for more	Partial	Requires Microsoft .Net APIs
Deepfloat Jeff Johnson, Facebook	SystemVerilog	Any (parameterized SystemVerilog)	Yes	N/A (RTL for FPGA/ASIC designs)	Limited	Does not strictly conform to posit spec. Supports +,-,/,*. Implements both logarithmic posit and normal, "linear" posits License: CC-BY-NC 4.0 at present
Tokyo Tech	FPGA	16, 32, extendable	No	"2 GHz", not translated to MPOPS	Partial; known rounding bugs	Yet to be open-source
PACoGen: Posit Arthmetic Core Generator Manish Kumar Jaiswal	Verilog HDL for Posit Arithmetic	Any precision. Able to generate any combination of word-size (N) and exponent-size (ES)	No	Speed of design is based on the underlying hardware platform (ASIC/FPGA)	Exhaustive tests for 8-bit posit. Multi-million random tests are performed for up to 32-bit posit with various ES combinations	It supports rounding-to-nearest rounding method.
Vinay Saxena, Research and Technology Centre, Robert Bosch, India (RTC-IN) and Farhad Merchant, RWTH Aachen University	Verilog generator for VLSI, FPGA	All	No	Similar to floats of same bit size	N=8 - ES=2 \| N=7,8,9,10,11,12 Selective (20000*65536) combinations for - ES=1 \| N=16	To be used in commercial products. To the best of our knowledge. *First ever integration of posits in RISC-V*
Posit-enabled RISC-V core (Sugandha Tiwari, Neel Gala, Chester Rebeiro, V.Kamakoti, IIT MADRAS)	BSV (Bluespec System Verilog) Implementation	32-bit posit with (es=2) and (es=3)	No	—	Verified against SoftPosit for (es=2) and tested with several applications for (es=2) and (es=3). No known bugs.	First complete posit-capable RISC-V core. Supports dynamic switching between (es=2) and (es=3). More info here.
PERCIVAL David Mallasén	Open-Source Posit RISC-V Core with Quire Capability	Posit<32,2> with 512-bit quire	Yes	Speed of design is based on the underlying hardware platform (ASIC/FPGA)	Functionality testing of each posit instruction.	Application-level posit-capable RISC-V core based on CVA6 that can execute all posit instructions, including the quire fused operations. PERCIVAL is the first work that integrates the complete posit ISA and quire in hardware. It allows the native execution of posit instructions as well as the standard floating-point ones simultaneously.
LibPosit Chris Lomont	Single file C# MIT Licensed	Any size	No		Extensive; no known bugs	Ops: arithmetic, comparisons, sqrt, sin, cos, tan, acos, asin, atan, pow, exp, log
unumjl REX Computing	FPGA version of the "Neo" VLIW processor with posit numeric unit	32	No	~1.2 GPOPS	Extensive; no known bugs	No divide or square root. First full processor design to replace floats with posits.
PNU: Posit Numeric Unit Calligo Tech	World's first posit-enabled ASIC with octa-core RISC-V processor and Quire implemented. PCIe accelerator card with this silicon will be ready June 2024 Fully software stack with compilers, debugger, IDE environment and math libraries for applications. C, C++, Python languages supported Applications tested successfully - image and video compression, more to come	<32, 2> with Quire 512 bits support. <64, 3>	Yes - Fully supported.	500 MHz * 8 Cores	Exhaustive tests completed for 32 bits and 64 bits with Quire support completed. Applications tested and being made available for seamless adoption www.calligotech.com	Fully integrated with C/C++ types and automatic conversions. Supports full C++ math library (native and conversion to/from IEEE). Runtime integrations: GNU Utils, OpenBLAS, CBLAS. Application integrations: in progress. Compiler support extended: C/C++, G++, GFortran & LLVM (in progress).
IBM-TACC Jianyu Chen	Specific-purpose FPGA	32	Yes	16–64 GPOPS	Only one known case tested	Does 128-by-128 matrix-matrix multiplication (SGEMM) using quire.
Deep PeNSieve Raul Murillo	Python library (software)	8, 16, 32	Yes	Unknown	Unknown	A DNN framework using posits
Gosit Jaap Aarts	Pure Go library	16/1 32/2 (included is a generic 32/ES for ES<32)^{[ clarification needed ]}	No	80 MPOPS for div32/2 and similar linear functions. Much higher for truncate and much lower for exp.	Fuzzing against C softposit with a lot of iterations for 16/1 and 32/2. Explicitly testing edge cases found.	(MIT license) The implementations where ES is constant the code is generated. The generator should be able to generate for all sizes {8,16,32} and ES below the size. However, the ones not included into the library by default are not tested, fuzzed, or supported. For some operations on 32/ES, mixing and matching ES is possible. However, this is not tested.

SoftPosit

SoftPosit^[31] is a software implementation of posits based on Berkeley SoftFloat.^[32] It allows software comparison between posits and floats. It currently supports

Add
Subtract
Multiply
Divide
Fused-multiply-add
Fused-dot-product (with quire)
Square root
Convert posit to signed and unsigned integer
Convert signed and unsigned integer to posit
Convert posit to another posit size
Less than, equal, less than equal comparison
Round to nearest integer

Helper functions

convert double to posit
convert posit to double
cast unsigned integer to posit

It works for 16-bit posits with one exponent bit and 8-bit posit with zero exponent bit. Support for 32-bit posits and flexible type (2-32 bits with two exponent bits) is pending validation. It supports x86_64 systems. It has been tested on GNU gcc (SUSE Linux) 4.8.5 Apple LLVM version 9.1.0 (clang-902.0.39.2).

Examples

Add with posit8_t

#include"softposit.h"intmain(intargc,char*argv[]){posit8_tpA,pB,pZ;pA=castP8(0xF2);pB=castP8(0x23);pZ=p8_add(pA,pB);// To check answer by converting it to doubledoubledZ=convertP8ToDouble(pZ);printf("dZ: %.15f\n",dZ);// To print result in binary (warning: non-portable code)uint8_tuiZ=castUI8(pZ);printBinary((uint64_t*)&uiZ,8);return0;}

Fused dot product with quire16_t

// Convert double to positposit16_tpA=convertDoubleToP16(1.02783203125);posit16_tpB=convertDoubleToP16(0.987060546875);posit16_tpC=convertDoubleToP16(0.4998779296875);posit16_tpD=convertDoubleToP16(0.8797607421875);quire16_tqZ;// Set quire to 0qZ=q16_clr(qZ);// Accumulate products without roundingsqZ=q16_fdp_add(qZ,pA,pB);qZ=q16_fdp_add(qZ,pC,pD);// Convert back to positposit16_tpZ=q16_to_p16(qZ);// To check answerdoubledZ=convertP16ToDouble(pZ);

Critique

William M. Kahan, the principal architect of IEEE 754-1985 criticizes type I unums on the following grounds (some are addressed in type II and type III standards):^[6]^[33]

The description of unums sidesteps using calculus for solving physics problems.
Unums can be expensive in terms of time and power consumption.
Each computation in unum space is likely to change the bit length of the structure. This requires either unpacking them into a fixed-size space, or data allocation, deallocation, and garbage collection during unum operations, similar to the issues for dealing with variable-length records in mass storage.
Unums provide only two kinds of numerical exception, quiet and signaling NaN (Not-a-Number).
Unum computation may deliver overly loose bounds from the selection of an algebraically correct but numerically unstable algorithm.
The benefits of unum over short precision floating point for problems requiring low precision are not obvious.
Solving differential equations and evaluating integrals with unums guarantee correct answers but may not be as fast as methods that usually work.

Related Research Articles

In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a signed sequence of a fixed number of digits in some base, called a significand, scaled by an integer exponent of that base. Numbers of this form are called floating-point numbers.

IEEE 754-1985 is a historic industry standard for representing floating-point numbers in computers, officially adopted in 1985 and superseded in 2008 by IEEE 754-2008, and then again in 2019 by minor revision IEEE 754-2019. During its 23 years, it was the most widely used format for floating-point computation. It was implemented in software, in the form of floating-point libraries, and in hardware, in the instructions of many CPUs and FPUs. The first integrated circuit to implement the draft of what was to become IEEE 754-1985 was the Intel 8087.

In computing, NaN, standing for Not a Number, is a particular value of a numeric data type which is undefined as a number, such as the result of 0/0. Systematic use of NaNs was introduced by the IEEE 754 floating-point standard in 1985, along with the representation of other non-finite quantities such as infinities.

Double-precision floating-point format is a floating-point number format, usually occupying 64 bits in computer memory; it represents a wide range of numeric values by using a floating radix point.

The IEEE Standard for Floating-Point Arithmetic is a technical standard for floating-point arithmetic originally established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). The standard addressed many problems found in the diverse floating-point implementations that made them difficult to use reliably and portably. Many hardware floating-point units use the IEEE 754 standard.

The significand is the first (left) part of a number in scientific notation or related concepts in floating-point representation, consisting of its significant digits.

Signed zero is zero with an associated sign. In ordinary arithmetic, the number 0 does not have a sign, so that −0, +0 and 0 are equivalent. However, in computing, some number representations allow for the existence of two zeros, often denoted by −0 and +0, regarded as equal by the numerical comparison operations but with possible different behaviors in particular operations. This occurs in the sign-magnitude and ones' complement signed number representations for integers, and in most floating-point number representations. The number 0 is usually encoded as +0, but can still be represented by +0, −0, or 0.

In C and related programming languages, long double refers to a floating-point data type that is often more precise than double precision though the language standard only requires it to be at least as precise as double. As with C's other floating-point types, it may not necessarily map to an IEEE format.

In computing, minifloats are floating-point values represented with very few bits. This reduced precision makes them ill-suited for general-purpose numerical calculations, but they are useful for special purposes such as:

Extended precision refers to floating-point number formats that provide greater precision than the basic floating-point formats. Extended-precision formats support a basic format by minimizing roundoff and overflow errors in intermediate values of expressions on the base format. In contrast to extended precision, arbitrary-precision arithmetic refers to implementations of much larger numeric types using special software.

Decimal floating-point (DFP) arithmetic refers to both a representation and operations on decimal floating-point numbers. Working directly with decimal (base-10) fractions can avoid the rounding errors that otherwise typically occur when converting between decimal fractions and binary (base-2) fractions.

IEEE 754-2008 is a revision of the IEEE 754 standard for floating-point arithmetic. It was published in August 2008 and is a significant revision to, and replaces, the IEEE 754-1985 standard. The 2008 revision extended the previous standard where it was necessary, added decimal arithmetic and formats, tightened up certain areas of the original standard which were left undefined, and merged in IEEE 854 . In a few cases, where stricter definitions of binary floating-point arithmetic might be performance-incompatible with some existing implementation, they were made optional. In 2019, it was updated with a minor revision IEEE 754-2019.

Offset binary, also referred to as excess-K, excess-N, excess-e, excess code or biased representation, is a method for signed number representation where a signed number n is represented by the bit pattern corresponding to the unsigned number n+K, K being the biasing value or offset. There is no standard for offset binary, but most often the K for an n-bit binary word is K = 2ⁿ⁻¹ (for example, the offset for a four-digit binary number would be 2³=8). This has the consequence that the minimal negative value is represented by all-zeros, the "zero" value is represented by a 1 in the most significant bit and zero in all other bits, and the maximal positive value is represented by all-ones (conveniently, this is the same as using two's complement but with the most significant bit inverted). It also has the consequence that in a logical comparison operation, one gets the same result as with a true form numerical comparison operation, whereas, in two's complement notation a logical comparison will agree with true form numerical comparison operation if and only if the numbers being compared have the same sign. Otherwise the sense of the comparison will be inverted, with all negative values being taken as being larger than all positive values.

In computing, half precision is a binary floating-point computer number format that occupies 16 bits in computer memory. It is intended for storage of floating-point values in applications where higher precision is not essential, in particular image processing and neural networks.

In computing, quadruple precision is a binary floating-point–based computer number format that occupies 16 bytes with precision at least twice the 53-bit double precision.

Single-precision floating-point format is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point.

In computing, decimal32 is a decimal floating-point computer numbering format that occupies 4 bytes (32 bits) in computer memory.

In computing, Microsoft Binary Format (MBF) is a format for floating-point numbers which was used in Microsoft's BASIC languages, including MBASIC, GW-BASIC and QuickBASIC prior to version 4.00.

Floating-point error mitigation is the minimization of errors caused by the fact that real numbers cannot, in general, be accurately represented in a fixed space. By definition, floating-point error cannot be eliminated, and, at best, can only be managed.

The bfloat16 floating-point format is a computer number format occupying 16 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. This format is a shortened (16-bit) version of the 32-bit IEEE 754 single-precision floating-point format (binary32) with the intent of accelerating machine learning and near-sensor computing. It preserves the approximate dynamic range of 32-bit floating-point numbers by retaining 8 exponent bits, but supports only an 8-bit precision rather than the 24-bit significand of the binary32 format. More so than single-precision 32-bit floating-point numbers, bfloat16 numbers are unsuitable for integer calculations, but this is not their intended use. Bfloat16 is used to reduce the storage requirements and increase the calculation speed of machine learning algorithms.

References

↑ Tichy, Walter F. (April 2016). "The End of (Numeric) Error: An interview with John L. Gustafson". Ubiquity – Information Everywhere. 2016 (April). Association for Computing Machinery (ACM): 1–14. doi: 10.1145/2913029 . JG: The word "unum" is short for "universal number," the same way the word "bit" is short for "binary digit."
1 2 Gustafson, John L. (2016-02-04) [2015-02-05]. The End of Error: Unum Computing. Chapman & Hall / CRC Computational Science. Vol. 24 (2nd corrected printing, 1st ed.). CRC Press. ISBN 978-1-4822-3986-7 . Retrieved 2016-05-30.
1 2 3 4 Gustafson, John Leroy; Yonemoto, Isaac (2017). "Beating Floating Point at its Own Game: Posit Arithmetic". Supercomputing Frontiers and Innovations. 4 (2). Publishing Center of South Ural State University, Chelyabinsk, Russia. doi: 10.14529/jsfi170206 . Archived from the original on 2017-11-04. Retrieved 2017-11-04.
↑ "Program: Special Session: The Great Debate: John Gustafson and William Kahan". Arith23: 23rd IEEE Symposium on Computer Arithmetic . Silicon Valley, USA. 2016-07-12. Archived from the original on 2016-05-30. Retrieved 2016-05-30.
↑ Gustafson, John L.; Kahan, William M. (2016-07-12). The Great Debate @ARITH23: John Gustafson and William Kahan (1:34:41) (video). Retrieved 2016-07-20.
1 2 Kahan, William M. (2016-07-16) [2016-07-12]. "A Critique of John L. Gustafson's THE END of ERROR — Unum Computation and his A Radical Approach to Computation with Real Numbers" (PDF). Santa Clara, CA, USA: IEEE Symposium on Computer Arithmetic, ARITH 23. Archived (PDF) from the original on 2016-07-25. Retrieved 2016-07-25.
↑ Gustafson, John L. (2016-07-12). ""The Great Debate": Unum arithmetic position paper" (PDF). Santa Clara, CA, USA: IEEE Symposium on Computer Arithmetic, ARITH 23 . Retrieved 2016-07-20.
↑ Tichy, Walter F. (September 2016). "Unums 2.0: An Interview with John L. Gustafson". Ubiquity.ACM.org. Retrieved 2017-01-30. I started out calling them "unums 2.0," which seemed to be as good a name for the concept as any, but it is really not a "latest release" so much as it is an alternative.
1 2 Posit Working Group (2022-03-02). "Standard for Posit Arithmetic (2022)" (PDF). Archived (PDF) from the original on 2022-09-26. Retrieved 2022-12-21.
↑ John L. Gustafson and I. Yonemoto. (February 2017) Beyond Floating Point: Next Generation Computer Arithmetic. [Online]. Available: https://www.youtube.com/watch?v=aP0Y1uAA-2Y
↑ Gustafson, John Leroy (2017-10-10). "Posit Arithmetic" (PDF). Archived (PDF) from the original on 2017-11-05. Retrieved 2017-11-04.
↑ Feldman, Michael (2019-07-08). "New Approach Could Sink Floating Point Computation". www.nextplatform.com. Retrieved 2019-07-09.
↑ Byrne, Michael (2016-04-24). "A New Number Format for Computers Could Nuke Approximation Errors for Good". Vice. Retrieved 2019-07-09.
1 2 Lindstrom, Peter; Lloyd, Scott; Hittinger, Jeffrey (March 2018). Universal Coding of the Reals: Alternatives to IEEE Floating Point. Conference for Next Generation Arithmetic. Art. 5. ACM. doi:10.1145/3190339.3190344.
↑ David Mallasén; Alberto A. Del Barrio; Manuel Prieto-Matias (2024). "Big-PERCIVAL: Exploring the Native Use of 64-Bit Posit Arithmetic in Scientific Computing". IEEE Transactions on Computers. 73 (6): 1472–1485. arXiv: 2305.06946 . doi:10.1109/TC.2024.3377890.
↑ Kulisch, Ulrich W.; Miranker, Willard L. (March 1986). "The Arithmetic of the Digital Computer: A New Approach". SIAM Rev. 28 (1). SIAM: 1–40. doi:10.1137/1028001.
↑ S. Chung, "Provably Correct Posit Arithmetic with Fixed-Point Big Integer." ACM, 2018.
↑ J. Chen, Z. Al-Ars, and H. Hofstee, "A Matrix-Multiply Unit for Posits in Reconfigurable Logic Using (Open)CAPI." ACM, 2018.
↑ Z. Lehoczky, A. Szabo, and B. Farkas, "High-level .NET Software Implementations of Unum Type I and Posit with Simultaneous FPGA Implementation Using Hastlayer." ACM, 2018.
↑ S. Langroudi, T. Pandit, and D. Kudithipudi, "Deep Learning Inference on Embedded Devices: Fixed-Point vs Posit". In Energy Efficiency Machine Learning and Cognitive Computing for Embedded Applications (EMC), 2018. [Online]. Available: https://sites.google.com/view/asplos-emc2/program
↑ Rohit Chaurasiya, John Gustafson, Rahul Shrestha, Jonathan Neudorfer, Sangeeth Nambiar, Kaustav Niyogi, Farhad Merchant, Rainer Leupers, "Parameterized Posit Arithmetic Hardware Generator." ICCD 2018: 334-341.
↑ Byrne, Simon (2016-03-29). "Implementing Unums in Julia" . Retrieved 2016-05-30.
↑ "Unum arithmetic in Julia: Unums.jl". GitHub . Retrieved 2016-05-30.
↑ "Julia Implementation of Unums: README". GitHub . Retrieved 2016-05-30.
↑ "Unum (Universal Number) types and operations: Unums". GitHub . Retrieved 2016-05-30.
↑ "jwmerrill/Pnums.jl". Github.com. Retrieved 2017-01-30.
↑ "GitHub - ityonemo/Unum2: Pivot Unums". GitHub . 2019-04-29.
↑ Ingole, Deepak; Kvasnica, Michal; De Silva, Himeshi; Gustafson, John L. "Reducing Memory Footprints in Explicit Model Predictive Control using Universal Numbers. Submitted to the IFAC World Congress 2017" . Retrieved 2016-11-15.
↑ Ingole, Deepak; Kvasnica, Michal; De Silva, Himeshi; Gustafson, John L. "MATLAB Prototype of unum (munum)" . Retrieved 2016-11-15.
↑ "GitHub - stillwater-sc/Universal: Universal Number Arithmetic". GitHub . 2019-06-16.
1 2 "Cerlane Leong / SoftPosit · GitLab". GitLab.
↑ "Berkeley SoftFloat". www.jhauser.us.
↑ Kahan, William M. (2016-07-15). "Prof. W. Kahan's Commentary on "THE END of ERROR — Unum Computing" by John L. Gustafson, (2015) CRC Press" (PDF). Archived (PDF) from the original on 2016-08-01. Retrieved 2016-08-01.

External links

"Conference for Next Generation Arithmetic (CoNGA)". 2017. Archived from the original on 2017-11-04. Retrieved 2017-11-04.
"SoftPosit". 2018. Retrieved 2018-06-13.
"Community Source Code Contribution". 2018. Retrieved 2018-06-13.
"Anatomy of a posit number". 2018-04-11. Retrieved 2019-08-09.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[Tichy_2016_1-1] Tichy, Walter F. (April 2016). "The End of (Numeric) Error: An interview with John L. Gustafson". Ubiquity – Information Everywhere. 2016 (April). Association for Computing Machinery (ACM): 1–14. doi: 10.1145/2913029 . JG: The word "unum" is short for "universal number," the same way the word "bit" is short for "binary digit."

[Gustafson_2015-2] 1 2 Gustafson, John L. (2016-02-04) [2015-02-05]. The End of Error: Unum Computing. Chapman & Hall / CRC Computational Science. Vol. 24 (2nd corrected printing, 1st ed.). CRC Press. ISBN 978-1-4822-3986-7 . Retrieved 2016-05-30.

[Gustafson_2017?_Posit-3] 1 2 3 4 Gustafson, John Leroy; Yonemoto, Isaac (2017). "Beating Floating Point at its Own Game: Posit Arithmetic". Supercomputing Frontiers and Innovations. 4 (2). Publishing Center of South Ural State University, Chelyabinsk, Russia. doi: 10.14529/jsfi170206 . Archived from the original on 2017-11-04. Retrieved 2017-11-04.

[ARITH23_2016-4] "Program: Special Session: The Great Debate: John Gustafson and William Kahan". Arith23: 23rd IEEE Symposium on Computer Arithmetic . Silicon Valley, USA. 2016-07-12. Archived from the original on 2016-05-30. Retrieved 2016-05-30.

[Great-Debate_2016-5] Gustafson, John L.; Kahan, William M. (2016-07-12). The Great Debate @ARITH23: John Gustafson and William Kahan (1:34:41) (video). Retrieved 2016-07-20.

[Kahan_2016-6] 1 2 Kahan, William M. (2016-07-16) [2016-07-12]. "A Critique of John L. Gustafson's THE END of ERROR — Unum Computation and his A Radical Approach to Computation with Real Numbers" (PDF). Santa Clara, CA, USA: IEEE Symposium on Computer Arithmetic, ARITH 23. Archived (PDF) from the original on 2016-07-25. Retrieved 2016-07-25.

[Gustafson_ARITH23-7] Gustafson, John L. (2016-07-12). ""The Great Debate": Unum arithmetic position paper" (PDF). Santa Clara, CA, USA: IEEE Symposium on Computer Arithmetic, ARITH 23 . Retrieved 2016-07-20.

[Tichy_2016_2-8] Tichy, Walter F. (September 2016). "Unums 2.0: An Interview with John L. Gustafson". Ubiquity.ACM.org. Retrieved 2017-01-30. I started out calling them "unums 2.0," which seemed to be as good a name for the concept as any, but it is really not a "latest release" so much as it is an alternative.

[Posit_Standard_2022-9] 1 2 Posit Working Group (2022-03-02). "Standard for Posit Arithmetic (2022)" (PDF). Archived (PDF) from the original on 2022-09-26. Retrieved 2022-12-21.

[Gustafson_2017_video-10] John L. Gustafson and I. Yonemoto. (February 2017) Beyond Floating Point: Next Generation Computer Arithmetic. [Online]. Available: https://www.youtube.com/watch?v=aP0Y1uAA-2Y

[Gustafson_2017_Posit-11] Gustafson, John Leroy (2017-10-10). "Posit Arithmetic" (PDF). Archived (PDF) from the original on 2017-11-05. Retrieved 2017-11-04.

[Feldman_2019-12] Feldman, Michael (2019-07-08). "New Approach Could Sink Floating Point Computation". www.nextplatform.com. Retrieved 2019-07-09.

[Byrne_2016-13] Byrne, Michael (2016-04-24). "A New Number Format for Computers Could Nuke Approximation Errors for Good". Vice. Retrieved 2019-07-09.

[conga_paper_lindstrom-14] 1 2 Lindstrom, Peter; Lloyd, Scott; Hittinger, Jeffrey (March 2018). Universal Coding of the Reals: Alternatives to IEEE Floating Point. Conference for Next Generation Arithmetic. Art. 5. ACM. doi:10.1145/3190339.3190344.

[Mallasen_2023-15] David Mallasén; Alberto A. Del Barrio; Manuel Prieto-Matias (2024). "Big-PERCIVAL: Exploring the Native Use of 64-Bit Posit Arithmetic in Scientific Computing". IEEE Transactions on Computers. 73 (6): 1472–1485. arXiv: 2305.06946 . doi:10.1109/TC.2024.3377890.

[kulish_1986-16] Kulisch, Ulrich W.; Miranker, Willard L. (March 1986). "The Arithmetic of the Digital Computer: A New Approach". SIAM Rev. 28 (1). SIAM: 1–40. doi:10.1137/1028001.

[conga_paper_chung-17] S. Chung, "Provably Correct Posit Arithmetic with Fixed-Point Big Integer." ACM, 2018.

[conga_paper_hofstee-18] J. Chen, Z. Al-Ars, and H. Hofstee, "A Matrix-Multiply Unit for Posits in Reconfigurable Logic Using (Open)CAPI." ACM, 2018.

[conga_paper_Lehoczky-19] Z. Lehoczky, A. Szabo, and B. Farkas, "High-level .NET Software Implementations of Unum Type I and Posit with Simultaneous FPGA Implementation Using Hastlayer." ACM, 2018.

[paper_Langroudi-20] S. Langroudi, T. Pandit, and D. Kudithipudi, "Deep Learning Inference on Embedded Devices: Fixed-Point vs Posit". In Energy Efficiency Machine Learning and Cognitive Computing for Embedded Applications (EMC), 2018. [Online]. Available: https://sites.google.com/view/asplos-emc2/program

[iccd_2018_paper-21] Rohit Chaurasiya, John Gustafson, Rahul Shrestha, Jonathan Neudorfer, Sangeeth Nambiar, Kaustav Niyogi, Farhad Merchant, Rainer Leupers, "Parameterized Posit Arithmetic Hardware Generator." ICCD 2018: 334-341.

[Julia_1-22] Byrne, Simon (2016-03-29). "Implementing Unums in Julia" . Retrieved 2016-05-30.

[Julia_2-23] "Unum arithmetic in Julia: Unums.jl". GitHub . Retrieved 2016-05-30.

[Julia_3-24] "Julia Implementation of Unums: README". GitHub . Retrieved 2016-05-30.

[Julia_4-25] "Unum (Universal Number) types and operations: Unums". GitHub . Retrieved 2016-05-30.

[Pnums-26] "jwmerrill/Pnums.jl". Github.com. Retrieved 2017-01-30.

[Pivot_2019-27] "GitHub - ityonemo/Unum2: Pivot Unums". GitHub . 2019-04-29.

[matlab-28] Ingole, Deepak; Kvasnica, Michal; De Silva, Himeshi; Gustafson, John L. "Reducing Memory Footprints in Explicit Model Predictive Control using Universal Numbers. Submitted to the IFAC World Congress 2017" . Retrieved 2016-11-15.

[matlab_2-29] Ingole, Deepak; Kvasnica, Michal; De Silva, Himeshi; Gustafson, John L. "MATLAB Prototype of unum (munum)" . Retrieved 2016-11-15.

[Stillwater_2019-30] "GitHub - stillwater-sc/Universal: Universal Number Arithmetic". GitHub . 2019-06-16.

[SoftPosit-31] 1 2 "Cerlane Leong / SoftPosit · GitLab". GitLab.

[SoftFloat-32] "Berkeley SoftFloat". www.jhauser.us.

[Kahan_2016_Commentary-33] Kahan, William M. (2016-07-15). "Prof. W. Kahan's Commentary on "THE END of ERROR — Unum Computing" by John L. Gustafson, (2015) CRC Press" (PDF). Archived (PDF) from the original on 2016-08-01. Retrieved 2016-08-01.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[18]

[19]

[20]

[21]

[22]

[23]

[24]

[25]

[26]

[27]

[28]

[29]

[30]

[31]

[32]

[33]

Unum (number format)

Contents

Type I Unum

Type II Unum

Posit (Type III Unum)

Examples

Quire

Valid

Implementations

SoftPosit

Helper functions

Examples

Critique

See also

Related Research Articles

References

Further reading

External links