Optimal radix choice

Last updated June 24, 2024

In mathematics and computer science, optimal radix choice is the problem of choosing the base, or radix, that is best suited for representing numbers. Various proposals have been made to quantify the relative costs of using different radices in representing numbers, especially in computer systems. One such measure is the number of digits needed to express a number in a given base, multiplied by the base (the number of possible values each digit can take). This expression also arises in questions regarding organizational structure, networking, and other fields.

Definition

The cost of representing a number N in a given base b can be defined as

E(b,N)=b\lfloor \log _{b}(N)+1\rfloor \,

where $\lfloor \cdot \rfloor$ is the floor function and $\log _{b}$ is the base-b logarithm.

If both b and N are positive integers, then the quantity $E(b,N)$ is equal to the number of digits needed to express the number N in base b, multiplied by base b.^[1] This quantity thus measures the cost of storing or processing the number N in base b if the cost of each "digit" is proportional to b. A base with a lower average $E(b,N)$ is therefore, in some senses, more efficient than a base with a higher average value.

For example, 100 in decimal has three digits, so its cost of representation is 10×3 = 30, while its binary representation has seven digits (1100100₂), so the analogous calculation gives 2×7 = 14. Likewise, in base 3 its representation has five digits (10201₃), for a value of 3×5 = 15, and in base 36 (2S₃₆) one finds 36×2 = 72.
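The figures in this example can be reproduced with a short Python sketch. The function names `digit_count` and `radix_cost` are illustrative and not taken from any library; the digit count uses repeated integer division rather than a floating-point logarithm, on the assumption that N is a positive integer and b is an integer of at least 2.

```python
def digit_count(n: int, base: int) -> int:
    """Number of digits of the positive integer n written in the given base."""
    if n < 1 or base < 2:
        raise ValueError("expects n >= 1 and base >= 2")
    count = 0
    while n > 0:
        n //= base  # drop the least significant digit
        count += 1
    return count


def radix_cost(base: int, n: int) -> int:
    """E(b, N) = b * floor(log_b(N) + 1), i.e. the base times the digit count."""
    return base * digit_count(n, base)


# Costs of representing 100 in bases 10, 2, 3 and 36.
for b in (10, 2, 3, 36):
    print(b, radix_cost(b, 100))
```

Counting digits by division avoids rounding problems that a floating-point logarithm can show when N is an exact power of b.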

If the number is imagined to be represented by a combination lock or a tally counter with $\lfloor \log _{b}(N)+1\rfloor$ wheels, each having b digit faces $0,1,...,b-1$, then $E(b,N)$ is the total number of digit faces needed to represent any integer from 0 to N inclusive.

Asymptotic behavior

The quantity $E(b,N)$ for large N can be approximated as follows:

E(b,N)=b\lfloor \log _{b}(N)+1\rfloor \sim b\ \log _{b}(N)={b \over \ln(b)}\ln(N).

{E(b,N) \over \ln(N)}\sim {b \over \ln(b)}.

The asymptotically best value is obtained for base 3, since $b \over \ln(b)$ attains its minimum at $b=3$ among the integers $b\geq 2$ (the expression is undefined for $b=1$):

{2 \over \ln(2)}\approx 2.88539\,,

{3 \over \ln(3)}\approx 2.73072\,,

{4 \over \ln(4)}\approx 2.88539\,.

For base 10, we have:

{10 \over \ln(10)}\approx 4.34294\,.
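These values can be checked numerically. The sketch below evaluates $b/\ln(b)$ for a few bases; the name `asymptotic_cost` is illustrative. Since the derivative of $b/\ln(b)$ is $(\ln(b)-1)/\ln(b)^{2}$, which vanishes at $b=e$, the minimum over the real numbers greater than 1 lies at e, and 3 is the best integer base.

```python
import math


def asymptotic_cost(b: float) -> float:
    """Leading coefficient b / ln(b) of E(b, N) / ln(N) for large N."""
    return b / math.log(b)


# Tabulate the coefficient for the bases discussed above, plus e.
for b in (2, math.e, 3, 4, 10):
    print(f"{b:.5g}: {asymptotic_cost(b):.5f}")

# Search the integer bases 2..10 for the smallest coefficient.
best_integer_base = min(range(2, 11), key=asymptotic_cost)
print("best integer base:", best_integer_base)
```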

Comparing different bases

The values of $E(b,N)$ of bases b₁ and b₂ may be compared for a large value of N:

{{E(b_{1},N)} \over {E(b_{2},N)}}\approx {{b_{1}{\log _{b_{1}}(N)}} \over {b_{2}{\log _{b_{2}}(N)}}}={\left({\dfrac {b_{1}\ln(N)}{\ln(b_{1})}}\right) \over \left({\dfrac {b_{2}\ln(N)}{\ln(b_{2})}}\right)}={{b_{1}\ln(b_{2})} \over {b_{2}\ln(b_{1})}}\,.

Choosing e for b₂ gives

{{E(b)} \over {E(e)}}\approx {{b\ln(e)} \over {e\ln(b)}}={{b} \over {e\ln(b)}}\,.

The average $E(b,N)$ of various bases up to several arbitrary numbers (avoiding proximity to powers of 2 through 12 and e) is given in the table below, along with the values relative to that of base e. $E(1,N)$ of any number $N$ is just $N$, making unary the most economical for the first few integers, but this no longer holds as N climbs to infinity.

Base b	Avg. E(b,N), N = 1 to 6	Avg. E(b,N), N = 1 to 43	Avg. E(b,N), N = 1 to 182	Avg. E(b,N), N = 1 to 5329	${{E(b)} \over {E(e)}}$
1	3.5	22.0	91.5	2,665.0	$\infty$
2	4.7	9.3	13.3	22.9	1.0615
e	4.5	9.0	12.9	22.1	1.0000
3	5.0	9.5	13.1	22.2	1.0046
4	6.0	10.3	14.2	23.9	1.0615
5	6.7	11.7	15.8	26.3	1.1429
6	7.0	12.4	16.7	28.3	1.2319
7	7.0	13.0	18.9	31.3	1.3234
8	8.0	14.7	20.9	33.0	1.4153
9	9.0	16.3	22.6	34.6	1.5069
10	10.0	17.9	24.1	37.9	1.5977
12	12.0	20.9	25.8	43.8	1.7765
15	15.0	25.1	28.8	49.8	2.0377
16	16.0	26.4	30.7	50.9	2.1230
20	20.0	31.2	37.9	58.4	2.4560
30	30.0	39.8	55.2	84.8	3.2449
40	40.0	43.7	71.4	107.7	3.9891
60	60.0	60.0	100.5	138.8	5.3910
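Entries of this kind can be recomputed with a sketch such as the following, which averages $E(b,N)$ over N = 1 to a chosen limit and evaluates the ratio $b/(e\ln(b))$. The names `average_cost` and `relative_to_e` are illustrative, and the averaging function assumes an integer base of at least 2, so the rows for unary and for base e are not covered.

```python
import math


def radix_cost(base: int, n: int) -> int:
    """E(b, N): the base times the number of base-b digits of n (n >= 1)."""
    digits = 0
    while n > 0:
        n //= base
        digits += 1
    return base * digits


def average_cost(base: int, n_max: int) -> float:
    """Mean of E(b, N) over N = 1, ..., n_max."""
    return sum(radix_cost(base, n) for n in range(1, n_max + 1)) / n_max


def relative_to_e(base: float) -> float:
    """Asymptotic ratio E(b) / E(e) = b / (e * ln(b))."""
    return base / (math.e * math.log(base))


# Recompute a few rows: averages up to 6, 43, 182 and 5329, then the ratio.
for b in (2, 3, 4, 10):
    averages = [round(average_cost(b, m), 1) for m in (6, 43, 182, 5329)]
    print(b, averages, round(relative_to_e(b), 4))
```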

Ternary tree efficiency

One result of the relative economy of base 3 is that ternary search trees offer an efficient strategy for retrieving elements of a database.^[2] A similar analysis suggests that the optimum design of a large telephone menu system to minimise the number of menu choices that the average customer must listen to (i.e. the product of the number of choices per menu and the number of menu levels) is to have three choices per menu.^[1]

In a $d$ -ary heap, a priority queue data structure based on $d$ -ary trees, the worst-case number of comparisons per operation in a heap containing $n$ elements is $d\log _{d}n$ (up to lower-order terms), the same formula used above. It has been suggested that choosing $d=3$ or $d=4$ may offer optimal performance in practice.^[3]

Brian Hayes suggests that $E(b,N)$ may be the appropriate measure for the complexity of an interactive voice response menu: in a tree-structured phone menu with $n$ outcomes and $r$ choices per step, the time to traverse the menu is proportional to the product of $r$ (the time to present the choices at each step) with $\log _{r}n$ (the number of choices that need to be made to determine the outcome). From this analysis, the optimal number of choices per step in such a menu is three.^[1]
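The menu argument can be illustrated with a small sketch. It assumes a complete tree in which every menu offers the same number of choices and the caller listens to all of them at each level; the names `menu_levels` and `menu_cost` and the figure of 729 outcomes are chosen for illustration only.

```python
def menu_levels(outcomes: int, choices: int) -> int:
    """Smallest number of levels k such that choices**k >= outcomes."""
    if outcomes < 1 or choices < 2:
        raise ValueError("expects outcomes >= 1 and choices >= 2")
    levels, reachable = 0, 1
    while reachable < outcomes:
        reachable *= choices
        levels += 1
    return levels


def menu_cost(outcomes: int, choices: int) -> int:
    """Options heard in total: choices per menu times number of levels."""
    return choices * menu_levels(outcomes, choices)


# Compare menu widths for a system with 729 possible outcomes.
for r in (2, 3, 9, 27):
    print(r, menu_levels(729, r), menu_cost(729, r))
```

For 729 outcomes, three choices per menu give the lowest total (18 options heard, against 20 for two choices). Because the number of levels must be a whole number, small cases can deviate from the asymptotic result: with 16 outcomes, two or four choices per menu cost 8, while three cost 9.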

Computer hardware efficiencies

The 1950 reference High-Speed Computing Devices describes a particular situation using contemporary technology. Each digit of a number would be stored as the state of a ring counter composed of several triodes. Whether vacuum tubes or thyratrons, the triodes were the most expensive part of a counter. For small radices r less than about 7, a single digit required r triodes.^[4] (Larger radices required 2r triodes arranged as r flip-flops, as in ENIAC's decimal counters.)^[5]

So the number of triodes in a numerical register with n digits was rn. In order to represent numbers up to 10⁶, the following numbers of tubes were needed:

Radix r	Tubes N = rn
2	39.20
3	38.24
4	39.20
5	42.90
10	60.00

The authors conclude,

Under these assumptions, the radix 3, on the average, is the most economical choice, closely followed by radices 2 and 4. These assumptions are, of course, only approximately valid, and the choice of 2 as a radix is frequently justified on more complete analysis. Even with the optimistic assumption that 10 triodes will yield a decimal ring, radix 10 leads to about one and one-half times the complexity of radix 2, 3, or 4. This is probably significant despite the shallow nature of the argument used here.^[6]

References

1. Hayes, Brian (2001). "Third Base". American Scientist. 89 (6): 490. doi:10.1511/2001.40.3268. Archived from the original on 2014-01-11. Retrieved 2013-07-28.
2. Bentley, Jon; Sedgewick, Bob (1998-04-01). "Ternary Search Trees". Dr. Dobb's Journal. UBM Tech. Retrieved 2013-07-28.
3. Tarjan, R. E. (1983). "3.2. d-heaps". Data Structures and Network Algorithms. CBMS-NSF Regional Conference Series in Applied Mathematics. Vol. 44. Society for Industrial and Applied Mathematics. pp. 34–38.
4. Engineering Research Associates Staff (1950). "3-6 The r-triode Counter, Modulo r". High-Speed Computing Devices. McGraw-Hill. pp. 22–23. Retrieved 2008-08-27.
5. Engineering Research Associates Staff (1950). "3-7 The 2r-triode Counter, Modulo r". High-Speed Computing Devices. McGraw-Hill. pp. 23–25. Retrieved 2008-08-27.
6. Engineering Research Associates Staff (1950). "6-7 Economy Attained by Radix Choice". High-Speed Computing Devices. McGraw-Hill. pp. 84–87. Retrieved 2008-08-27.