Bijective numeration

Last updated

Bijective numeration is any numeral system in which every non-negative integer can be represented in exactly one way using a finite string of digits. The name refers to the bijection (i.e. one-to-one correspondence) that exists in this case between the set of non-negative integers and the set of finite strings using a finite set of symbols (the "digits").

Contents

Most ordinary numeral systems, such as the common decimal system, are not bijective because more than one string of digits can represent the same positive integer. In particular, adding leading zeroes does not change the value represented, so "1", "01" and "001" all represent the number one. Even though only the first is usual, the fact that the others are possible means that the decimal system is not bijective. However, the unary numeral system, with only one digit, is bijective.

A bijective base-k numeration is a bijective positional notation. It uses a string of digits from the set {1, 2, ..., k} (where k ≥ 1) to encode each positive integer; a digit's position in the string defines its value as a multiple of a power of k. Smullyan (1961) calls this notation k-adic, but it should not be confused with the p-adic numbers: bijective numerals are a system for representing ordinary integers by finite strings of nonzero digits, whereas the p-adic numbers are a system of mathematical values that contain the integers as a subset and may need infinite sequences of digits in any numerical representation.

Definition

The base-k bijective numeration system uses the digit-set {1, 2, ..., k} (k ≥ 1) to uniquely represent every non-negative integer, as follows:

anan−1 ... a1a0
is
ankn + an−1kn−1 + ... + a1k1 + a0k0.
anan−1 ... a1a0
where
and
being the least integer not less than x (the ceiling function).

In contrast, standard positional notation can be defined with a similar recursive algorithm where

Extension to integers

For base , the bijective base- numeration system could be extended to negative integers in the same way as the standard base- numeral system by use of an infinite number of the digit , where , represented as a left-infinite sequence of digits . This is because the Euler summation

meaning that

and for every positive number with bijective numeration digit representation is represented by . For base , negative numbers are represented by with , while for base , negative numbers are represented by . This is similar to how in signed-digit representations, all integers with digit representations are represented as where . This representation is no longer bijective, as the entire set of left-infinite sequences of digits is used to represent the -adic integers, of which the integers are only a subset.

Properties of bijective base-k numerals

For a given base ,

For a given base ,

bijective base 1: λ 1 11 111 1111 11111 111111 1111111 11111111 111111111 1111111111 11111111111 111111111111 1111111111111 11111111111111 111111111111111 1111111111111111 ... (unary numeral system)
bijective base 2: λ 1 2 11 12 21 22 111 112 121 122 211 212 221 222 1111 1112 ...
binary: 0 1 10 11 100 101 110 111 1000 1001 1010 1011 1100 1101 1110 1111 10000 ...
bijective base 3: λ 1 2 3 11 12 13 21 22 23 31 32 33 111 112 113 121 ...
ternary: 0 1 2 10 11 12 20 21 22 100 101 102 110 111 112 120 121 ...
bijective base 8: λ 1 2 3 4 5 6 7 8 11 12 13 14 15 16 17 18 ...
octal: 0 1 2 3 4 5 6 7 10 11 12 13 14 15 16 17 20 ...
bijective base 10: λ 1 2 3 4 5 6 7 8 9 A 11 12 13 14 15 16 ...
decimal: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ...
bijective base 12: λ 1 2 3 4 5 6 7 8 9 A B C 11 12 13 14 ...
duodecimal: 0 1 2 3 4 5 6 7 8 9 A B 10 11 12 13 14 ...
bijective base 16: λ 1 2 3 4 5 6 7 8 9 A B C D E F G ...
hexadecimal: 0 1 2 3 4 5 6 7 8 9 A B C D E F 10 ...

Examples

34152 (in bijective base-5) = 3×54 + 4×53 + 1×52 + 5×51 + 2×1 = 2427 (in decimal).
119A (in bijective base-10, with "A" representing the digit value ten) = 1×103 + 1×102 + 9×101 + 10×1 = 1200 (in decimal).
A typical alphabetic list with more than 26 elements is bijective, using the order of A, B, C...X, Y, Z, AA, AB, AC...ZX, ZY, ZZ, AAA, AAB, AAC...

The bijective base-10 system

The bijective base-10 system is a base ten positional numeral system that does not use a digit to represent zero. It instead has a digit to represent ten, such as A.

As with conventional decimal, each digit position represents a power of ten, so for example 123 is "one hundred, plus two tens, plus three units." All positive integers that are represented solely with non-zero digits in conventional decimal (such as 123) have the same representation in the bijective base-10 system. Those that use a zero must be rewritten, so for example 10 becomes A, conventional 20 becomes 1A, conventional 100 becomes 9A, conventional 101 becomes A1, conventional 302 becomes 2A2, conventional 1000 becomes 99A, conventional 1110 becomes AAA, conventional 2010 becomes 19AA, and so on.

Addition and multiplication in this system are essentially the same as with conventional decimal, except that carries occur when a position exceeds ten, rather than when it exceeds nine. So to calculate 643 + 759, there are twelve units (write 2 at the right and carry 1 to the tens), ten tens (write A with no need to carry to the hundreds), thirteen hundreds (write 3 and carry 1 to the thousands), and one thousand (write 1), to give the result 13A2 rather than the conventional 1402.

The bijective base-26 system

In the bijective base-26 system one may use the Latin alphabet letters "A" to "Z" to represent the 26 digit values one to twenty-six. (A=1, B=2, C=3, ..., Z=26)

With this choice of notation, the number sequence (starting from 1) begins A, B, C, ..., X, Y, Z, AA, AB, AC, ..., AX, AY, AZ, BA, BB, BC, ...

Each digit position represents a power of twenty-six, so for example, the numeral WI represents the value 23 × 261 + 9 × 260 = 607 in base 10.

Many spreadsheets including Microsoft Excel use this system to assign labels to the columns of a spreadsheet, starting A, B, C, ..., Z, AA, AB, ..., AZ, BA, ..., ZZ, AAA, etc. For instance, in Excel 2013, there can be up to 16384 columns (214 in binary code), labeled from A to XFD. [3] Malware variants are also named using this system: for example, the first widespread Microsoft Word macro virus, Concept, is formally named WM/Concept.A, its 26th variant WM/Concept.Z, the 27th variant WM/Concept.AA, et seq. A variant of this system is used to name variable stars. [4] It can be applied to any problem where a systematic naming using letters is desired, while using the shortest possible strings.

Historical notes

The fact that every non-negative integer has a unique representation in bijective base-k (k ≥ 1) is a "folk theorem" that has been rediscovered many times. Early instances are Foster (1947) for the case k = 10, and Smullyan (1961) and Böhm (1964) for all k ≥ 1. Smullyan uses this system to provide a Gödel numbering of the strings of symbols in a logical system; Böhm uses these representations to perform computations in the programming language P′′. Knuth (1969) mentions the special case of k = 10, and Salomaa (1973) discusses the cases k ≥ 2. Forslund (1995) appears to be another rediscovery, and hypothesizes that if ancient numeration systems used bijective base-k, they might not be recognized as such in archaeological documents, due to general unfamiliarity with this system.

Notes

  1. "How many digits are in the bijective base-k numeral for n?". Stackexchange. Retrieved 22 September 2018.
  2. Forslund (1995).
  3. Harvey, Greg (2013), Excel 2013 For Dummies, John Wiley & Sons, ISBN   9781118550007 .
  4. Hellier, Coel (2001), "Appendix D: Variable star nomenclature", Cataclysmic Variable Stars - How and Why They Vary, Praxis Books in Astronomy and Space, Springer, p. 197, ISBN   9781852332112 .

Related Research Articles

<span class="mw-page-title-main">Decimal</span> Number in base-10 numeral system

The decimal numeral system is the standard system for denoting integer and non-integer numbers. It is the extension to non-integer numbers of the Hindu–Arabic numeral system. The way of denoting numbers in the decimal system is often referred to as decimal notation.

<span class="mw-page-title-main">Numeral system</span> Notation for expressing numbers

A numeral system is a writing system for expressing numbers; that is, a mathematical notation for representing numbers of a given set, using digits or other symbols in a consistent manner.

<span class="mw-page-title-main">Number</span> Used to count, measure, and label

A number is a mathematical object used to count, measure, and label. The most basic examples are the natural numbers 1, 2, 3, 4, and so forth. Numbers can be represented in language with number words. More universally, individual numbers can be represented by symbols, called numerals; for example, "5" is a numeral that represents the number five. As only a relatively small number of symbols can be memorized, basic numerals are commonly organized in a numeral system, which is an organized way to represent any number. The most common numeral system is the Hindu–Arabic numeral system, which allows for the representation of any non-negative integer using a combination of ten fundamental numeric symbols, called digits. In addition to their use in counting and measuring, numerals are often used for labels, for ordering, and for codes. In common usage, a numeral is not clearly distinguished from the number that it represents.

<i>p</i>-adic number Number system extending the rational numbers

In number theory, given a prime number p, the p-adic numbers form an extension of the rational numbers which is distinct from the real numbers, though with some similar properties; p-adic numbers can be written in a form similar to decimals, but with digits based on a prime number p rather than ten, and extending to the left rather than to the right.

Golden ratio base is a non-integer positional numeral system that uses the golden ratio as its base. It is sometimes referred to as base-φ, golden mean base, phi-base, or, colloquially, phinary. Any non-negative real number can be represented as a base-φ numeral using only the digits 0 and 1, and avoiding the digit sequence "11" – this is called a standard form. A base-φ numeral that includes the digit sequence "11" can always be rewritten in standard form, using the algebraic properties of the base φ — most notably that φ1 + φ0 = φ2. For instance, 11φ = 100φ.

<span class="mw-page-title-main">Arithmetic coding</span> Form of entropy encoding used in data compression

Arithmetic coding (AC) is a form of entropy encoding used in lossless data compression. Normally, a string of characters is represented using a fixed number of bits per character, as in the ASCII code. When a string is converted to arithmetic encoding, frequently used characters will be stored with fewer bits and not-so-frequently occurring characters will be stored with more bits, resulting in fewer bits used in total. Arithmetic coding differs from other forms of entropy encoding, such as Huffman coding, in that rather than separating the input into component symbols and replacing each with a code, arithmetic coding encodes the entire message into a single number, an arbitrary-precision fraction q, where 0.0 ≤ q < 1.0. It represents the current information as a range, defined by two numbers. A recent family of entropy coders called asymmetric numeral systems allows for faster implementations thanks to directly operating on a single natural number representing the current information.

In mathematics, a real number is said to be simply normal in an integer base b if its infinite sequence of digits is distributed uniformly in the sense that each of the b digit values has the same natural density 1/b. A number is said to be normal in base b if, for every positive integer n, all possible strings n digits long have density bn.

The quater-imaginary numeral system is a numeral system, first proposed by Donald Knuth in 1960. Unlike standard numeral systems, which use an integer as their bases, it uses the imaginary number 2i as its base. It is able to uniquely represent every complex number using only the digits 0, 1, 2, and 3. Numbers less than zero, which are ordinarily represented with a minus sign, are representable as digit strings in quater-imaginary; for example, the number −1 is represented as "103" in quater-imaginary notation.

A numerical digit or numeral is a single symbol used alone or in combinations, to represent numbers in a positional numeral system. The name "digit" comes from the fact that the ten digits of the hands correspond to the ten symbols of the common base 10 numeral system, i.e. the decimal digits.

Balanced ternary is a ternary numeral system that uses a balanced signed-digit representation of the integers in which the digits have the values −1, 0, and 1. This stands in contrast to the standard (unbalanced) ternary system, in which digits have values 0, 1 and 2. The balanced ternary system can represent all integers without using a separate minus sign; the value of the leading non-zero digit of a number has the sign of the number itself. The balanced ternary system is an example of a non-standard positional numeral system. It was used in some early computers and has also been used to solve balance puzzles.

<span class="mw-page-title-main">Positional notation</span> Method for representing or encoding numbers

Positional notation usually denotes the extension to any base of the Hindu–Arabic numeral system. More generally, a positional system is a numeral system in which the contribution of a digit to the value of a number is the value of the digit multiplied by a factor determined by the position of the digit. In early numeral systems, such as Roman numerals, a digit has only one value: I means one, X means ten and C a hundred. In modern positional systems, such as the decimal system, the position of the digit means that its value must be multiplied by some value: in 555, the three identical symbols represent five hundreds, five tens, and five units, respectively, due to their different positions in the digit string.

In mathematical notation for numbers, a signed-digit representation is a positional numeral system with a set of signed digits used to encode the integers.

In mathematics, 0.999... is a notation for the repeating decimal consisting of an unending sequence of 9s after the decimal point. This repeating decimal is a numeral that represents the smallest number no less than every number in the sequence ; that is, the supremum of this sequence. This number is equal to 1. In other words, "0.999..." is not "almost exactly 1" or "very, very nearly but not quite 1"; rather, "0.999..." and "1" represent exactly the same number.

A decimal representation of a non-negative real number r is its expression as a sequence of symbols consisting of decimal digits traditionally written with a single separator:

Non-standard positional numeral systems here designates numeral systems that may loosely be described as positional systems, but that do not entirely comply with the following description of standard positional systems:

In mathematics, Midy's theorem, named after French mathematician E. Midy, is a statement about the decimal expansion of fractions a/p where p is a prime and a/p has a repeating decimal expansion with an even period. If the period of the decimal representation of a/p is 2n, so that

A non-integer representation uses non-integer numbers as the radix, or base, of a positional numeral system. For a non-integer radix β > 1, the value of

A negative base may be used to construct a non-standard positional numeral system. Like other place-value systems, each position holds multiples of the appropriate power of the system's base; but that base is negative—that is to say, the base b is equal to −r for some natural number r.

A repeating decimal or recurring decimal is a decimal representation of a number whose digits are eventually periodic ; if this sequence consists only of zeros, the decimal is said to be terminating, and is not considered as repeating. It can be shown that a number is rational if and only if its decimal representation is repeating or terminating. For example, the decimal representation of 1/3 becomes periodic just after the decimal point, repeating the single digit "3" forever, i.e. 0.333.... A more complicated example is 3227/555, whose decimal becomes periodic at the second digit following the decimal point and then repeats the sequence "144" forever, i.e. 5.8144144144.... Another example of this is 593/53, which becomes periodic after the decimal point, repeating the 13-digit pattern "1886792452830" forever, i.e. 11.18867924528301886792452830....

In algebraic geometry, an ℓ-adic sheaf on a Noetherian scheme X is an inverse system consisting of -modules in the étale topology and inducing .

References