Unary numeral system

The unary numeral system is the simplest numeral system to represent natural numbers: [1] to represent a number N, a symbol representing 1 is repeated N times. [2]

In the unary system, the number 0 (zero) is represented by the empty string, that is, the absence of a symbol. Numbers 1, 2, 3, 4, 5, 6, ... are represented in unary as 1, 11, 111, 1111, 11111, 111111, ... [3]
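
As an informal illustration (not part of the cited sources), the correspondence between a natural number and its unary representation can be written out in a few lines of Python; the function names to_unary and from_unary are chosen purely for this example.

```python
def to_unary(n: int, symbol: str = "1") -> str:
    """Represent the natural number n in unary: the symbol repeated n times.

    Zero is represented by the empty string."""
    if n < 0:
        raise ValueError("unary notation represents natural numbers only")
    return symbol * n


def from_unary(s: str) -> int:
    """Recover the number from its unary representation: the value is the string's length."""
    return len(s)


assert to_unary(0) == ""          # zero is the empty string
assert to_unary(5) == "11111"
assert from_unary("1111") == 4
```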

Unary is a bijective numeral system. However, because the value of a digit does not depend on its position, it is not a form of positional notation, and it is unclear whether it would be appropriate to say that it has a base (or "radix") of 1, as it behaves differently from all other bases. [citation needed]

The use of tally marks in counting is an application of the unary numeral system. For example, using the tally mark | (𝍷), the number 3 is represented as |||. In East Asian cultures, the number 3 is represented as 三, a character drawn with three strokes. [4] (One and two are represented similarly.) In China and Japan, the character 正, drawn with 5 strokes, is sometimes used to represent 5 as a tally. [5] [6]

Unary numbers should be distinguished from repunits, which are also written as sequences of ones but have their usual decimal numerical interpretation.

Operations

Addition and subtraction are particularly simple in the unary system, as they involve little more than string concatenation. [7] The Hamming weight or population count operation that counts the number of nonzero bits in a sequence of binary values may also be interpreted as a conversion from unary to binary numbers. [8] However, multiplication is more cumbersome and has often been used as a test case for the design of Turing machines. [9] [10] [11]
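
The simplicity of these operations can be sketched in Python, treating a unary numeral as a string of 1s; the helper names below (unary_add, unary_subtract, unary_multiply) are illustrative and not taken from the cited references.

```python
def unary_add(a: str, b: str) -> str:
    # Addition in unary is string concatenation.
    return a + b


def unary_subtract(a: str, b: str) -> str:
    # Subtraction (a - b, assuming a >= b) removes len(b) symbols from a.
    return a[len(b):]


def unary_multiply(a: str, b: str) -> str:
    # Multiplication repeats one string once for every symbol of the other.
    return a * len(b)


def popcount_as_unary_to_binary(bits: str) -> int:
    # Counting the 1-bits of a binary string reads its 1s as a unary numeral
    # and returns the value as an ordinary (binary) integer.
    return bits.count("1")


assert unary_add("111", "11") == "11111"          # 3 + 2 = 5
assert unary_subtract("11111", "11") == "111"     # 5 - 2 = 3
assert unary_multiply("111", "11") == "111111"    # 3 * 2 = 6
assert popcount_as_unary_to_binary("10110") == 3
```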

Complexity

Compared to standard positional numeral systems, the unary system is inconvenient and hence is not used in practice for large calculations. It occurs in some decision problem descriptions in theoretical computer science (e.g. some P-complete problems), where it is used to "artificially" decrease the run-time or space requirements of a problem. For instance, the problem of integer factorization is suspected to require more than a polynomial function of the length of the input as run-time if the input is given in binary, but it only needs linear run-time if the input is presented in unary. [12] [13] However, this is potentially misleading. Using a unary input is slower for any given number, not faster; the distinction is that a binary (or larger base) input is proportional to the base-2 (or larger base) logarithm of the number, while a unary input is proportional to the number itself. Therefore, while the run-time and space requirements in unary look better as functions of the input size, they do not represent a more efficient solution. [14]
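
A short, hypothetical Python sketch makes the size difference concrete (the function names are invented for this illustration): the length of a unary input grows with the number itself, while the length of a binary input grows only with its logarithm.

```python
def unary_input_length(n: int) -> int:
    # A unary input carries one symbol per unit, so its length is n itself.
    return n


def binary_input_length(n: int) -> int:
    # A binary input needs floor(log2(n)) + 1 bits for n >= 1.
    return n.bit_length()


n = 1_000_000
print(unary_input_length(n))   # 1000000 symbols
print(binary_input_length(n))  # 20 bits

# An algorithm whose running time is proportional to n is therefore linear
# in the unary input length but exponential in the binary input length.
```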

In computational complexity theory, unary numbering is used to distinguish strongly NP-complete problems from problems that are NP-complete but not strongly NP-complete. A problem in which the input includes some numerical parameters is strongly NP-complete if it remains NP-complete even when the size of the input is made artificially larger by representing the parameters in unary. For such a problem, there exist hard instances for which all parameter values are at most polynomially large. [15]

Applications

In addition to the application in tally marks, unary numbering is used as part of some data compression algorithms such as Golomb coding. [16] It also forms the basis for the Peano axioms for formalizing arithmetic within mathematical logic. [17] A form of unary notation called Church encoding is used to represent numbers within lambda calculus. [18]
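
To give a flavor of Church encoding, here is a minimal sketch using Python lambdas standing in for pure lambda calculus; the names zero, succ, plus, and to_int are chosen for this illustration and are not from the cited sources. A numeral n is encoded as a function that applies its argument f exactly n times.

```python
# Church numerals: the number n is the n-fold application of a function f.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))


def to_int(church) -> int:
    # Decode by applying "add one" to 0 the encoded number of times.
    return church(lambda k: k + 1)(0)


three = succ(succ(succ(zero)))
assert to_int(three) == 3
assert to_int(plus(three)(succ(zero))) == 4
```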

Some email spam filters tag messages with a number of asterisks in an email header such as X-Spam-Bar or X-SPAM-LEVEL. The larger the number, the more likely the email is to be spam. Using a unary representation instead of a decimal number lets the user search for messages with a given rating or higher. For example, searching for **** yields messages with a spam rating of at least 4. [19]
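
The following hypothetical sketch (the spam_bar helper is invented for illustration and is not any real filter's API) shows why a plain substring search gives "at least this rating" semantics: the unary string for a higher score contains the string for every lower score.

```python
def spam_bar(score: int) -> str:
    # Hypothetical header value: one asterisk per point of spam score.
    return "*" * score


headers = [spam_bar(s) for s in (1, 3, 4, 6)]

# A substring search for "****" matches every score of at least 4.
assert [h for h in headers if "****" in h] == ["****", "******"]
```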

Related Research Articles

The P versus NP problem is a major unsolved problem in theoretical computer science. Informally, it asks whether every problem whose solution can be quickly verified can also be quickly solved.

In the computer science subfield of algorithmic information theory, a Chaitin constant or halting probability is a real number that, informally speaking, represents the probability that a randomly constructed program will halt. These numbers are formed from a construction due to Gregory Chaitin.

In computer science, the computational complexity or simply complexity of an algorithm is the amount of resources required to run it. Particular focus is given to computation time and memory storage requirements. The complexity of a problem is the complexity of the best algorithms that allow solving the problem.

In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and relating these classes to each other. A computational problem is a task solved by a computer; it is solvable by the mechanical application of mathematical steps, such as an algorithm.

In complexity theory and computability theory, an oracle machine is an abstract machine used to study decision problems. It can be visualized as a Turing machine with a black box, called an oracle, which is able to solve certain problems in a single operation. The problem can be of any complexity class. Even undecidable problems, such as the halting problem, can be used.

In theoretical computer science and mathematics, the theory of computation is the branch that deals with which problems can be solved on a model of computation using an algorithm, how efficiently they can be solved, and to what degree. The field is divided into three major branches: automata theory and formal languages, computability theory, and computational complexity theory, which are linked by the question: "What are the fundamental capabilities and limitations of computers?".

In computational complexity theory, a decision problem is P-complete if it is in P and every problem in P can be reduced to it by an appropriate reduction.

In computer science, a universal Turing machine (UTM) is a Turing machine capable of computing any computable sequence, as described by Alan Turing in his seminal paper "On Computable Numbers, with an Application to the Entscheidungsproblem". Common sense might say that a universal machine is impossible, but Turing proves that it is possible. He suggested that we may compare a man in the process of computing a real number to a machine which is only capable of a finite number of conditions q1, q2, ..., qR, which will be called "m-configurations". He then described the operation of such a machine and argued:

It is my contention that these operations include all those which are used in the computation of a number.

In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. Thus, the amount of time taken and the number of elementary operations performed by the algorithm are taken to be related by a constant factor.

In computational complexity theory, a complexity class is a set of computational problems "of related resource-based complexity". The two most commonly analyzed resources are time and memory.

In computer science, parameterized complexity is a branch of computational complexity theory that focuses on classifying computational problems according to their inherent difficulty with respect to multiple parameters of the input or output. The complexity of a problem is then measured as a function of those parameters. This allows the classification of NP-hard problems on a finer scale than in the classical setting, where the complexity of a problem is only measured as a function of the number of bits in the input. This appears to have been first demonstrated in Gurevich, Stockmeyer & Vishkin (1984). The first systematic work on parameterized complexity was done by Downey & Fellows (1999).

In computational complexity theory, an advice string is an extra input to a Turing machine that is allowed to depend on the length n of the input, but not on the input itself. A decision problem is in the complexity class P/f(n) if there is a polynomial time Turing machine M with the following property: for any n, there is an advice string A of length f(n) such that, for any input x of length n, the machine M correctly decides the problem on the input x, given x and A.

In computational complexity theory, P/poly is a complexity class representing problems that can be solved by small circuits. More precisely, it is the set of formal languages that have polynomial-size circuit families. It can also be defined equivalently in terms of Turing machines with advice, extra information supplied to the Turing machine along with its input, that may depend on the input length but not on the input itself. In this formulation, P/poly is the class of decision problems that can be solved by a polynomial-time Turing machine with advice strings of length polynomial in the input size. These two different definitions make P/poly central to circuit complexity and non-uniform complexity.

In computational complexity, strong NP-completeness is a property of computational problems that is a special case of NP-completeness. A general computational problem may have numerical parameters. For example, the input to the bin packing problem is a list of objects of specific sizes and a size for the bins that must contain the objects—these object sizes and bin size are numerical parameters.

In computational complexity theory and circuit complexity, a Boolean circuit is a mathematical model for combinational digital logic circuits. A formal language can be decided by a family of Boolean circuits, one circuit for each possible input length.

In computational complexity, an NP-complete problem is weakly NP-complete if there is an algorithm for the problem whose running time is polynomial in the dimension of the problem and the magnitudes of the data involved, rather than the base-two logarithms of their magnitudes. Such algorithms are technically exponential functions of their input size and are therefore not considered polynomial.

In computational complexity theory, a unary language or tally language is a formal language where all strings have the form 1^k, where "1" can be any fixed symbol. For example, the language {1, 111, 1111} is unary, as is the language {1^k | k is prime}. The complexity class of all such languages is sometimes called TALLY.
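
As a small illustrative sketch (the function names are invented for this example), membership in the tally language {1^k | k is prime} can be tested in Python as follows:

```python
def is_prime(k: int) -> bool:
    if k < 2:
        return False
    return all(k % d for d in range(2, int(k ** 0.5) + 1))


def in_prime_tally_language(s: str) -> bool:
    # A string belongs to {1^k | k prime} iff it consists only of 1s
    # and its length is a prime number.
    return set(s) <= {"1"} and is_prime(len(s))


assert in_prime_tally_language("111")        # 1^3, and 3 is prime
assert not in_prime_tally_language("1111")   # 4 is not prime
assert not in_prime_tally_language("101")    # not a unary string
```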

In computational complexity theory, a sparse language is a formal language such that the complexity function, counting the number of strings of length n in the language, is bounded by a polynomial function of n. They are used primarily in the study of the relationship of the complexity class NP with other classes. The complexity class of all sparse languages is called SPARSE.

In computing, especially computational geometry, a real RAM is a mathematical model of a computer that can compute with exact real numbers instead of the binary fixed point or floating point numbers used by most actual computers. The real RAM was formulated by Michael Ian Shamos in his 1978 Ph.D. dissertation.

In mathematical logic, the spectrum of a sentence is the set of natural numbers occurring as the size of a finite model in which a given sentence is true. By a result in descriptive complexity, a set of natural numbers is a spectrum if and only if it can be recognized in non-deterministic exponential time.

References

  1. Hodges, Andrew (2009), One to Nine: The Inner Life of Numbers, Anchor Canada, p. 14, ISBN 9780385672665.
  2. Davis, Martin; Sigal, Ron; Weyuker, Elaine J. (1994), Computability, Complexity, and Languages: Fundamentals of Theoretical Computer Science, Computer Science and Scientific Computing (2nd ed.), Academic Press, p. 117, ISBN 9780122063824.
  3. Hext, Jan (1990), Programming Structures: Machines and Programs, vol. 1, Prentice Hall, p. 33, ISBN 9780724809400.
  4. Woodruff, Charles E. (1909), "The Evolution of Modern Numerals from Ancient Tally Marks", American Mathematical Monthly, 16 (8–9): 125–33, doi:10.2307/2970818, JSTOR 2970818.
  5. Hsieh, Hui-Kuang (1981), "Chinese Tally Mark", The American Statistician, 35 (3): 174, doi:10.2307/2683999, JSTOR 2683999.
  6. Lunde, Ken; Miura, Daisuke (January 27, 2016), "Proposal to Encode Five Ideographic Tally Marks", Unicode Consortium (PDF), Proposal L2/16-046.
  7. Sazonov, Vladimir Yu. (1995), "On feasible numbers", Logic and computational complexity (Indianapolis, IN, 1994), Lecture Notes in Comput. Sci., vol. 960, Springer, Berlin, pp. 30–51, doi:10.1007/3-540-60178-3_78, ISBN 978-3-540-60178-4, MR 1449655. See in particular p. 48.
  8. Blaxell, David (1978), "Record linkage by bit pattern matching", in Hogben, David; Fife, Dennis W. (eds.), Computer Science and Statistics--Tenth Annual Symposium on the Interface, NBS Special Publication, vol. 503, U.S. Department of Commerce / National Bureau of Standards, pp. 146–156.
  9. Hopcroft, John E.; Ullman, Jeffrey D. (1979), Introduction to Automata Theory, Languages, and Computation, Addison Wesley, Example 7.7, pp. 158–159, ISBN 978-0-201-02988-8.
  10. Dewdney, A. K. (1989), The New Turing Omnibus: Sixty-Six Excursions in Computer Science, Computer Science Press, p. 209, ISBN 9780805071665.
  11. Rendell, Paul (2015), "5.3 Larger Example TM: Unary Multiplication", Turing Machine Universality of the Game of Life, Emergence, Complexity and Computation, vol. 18, Springer, pp. 83–86, ISBN 9783319198422.
  12. Arora, Sanjeev; Barak, Boaz (2007), "The computational model—and why it doesn't matter" (PDF), Computational Complexity: A Modern Approach (January 2007 draft ed.), Cambridge University Press, §17, pp. 32–33, retrieved May 10, 2017.
  13. Feigenbaum, Joan (Fall 2012), CPSC 468/568 HW1 Solution Set (PDF), Computer Science Department, Yale University, retrieved 2014-10-21. [permanent dead link]
  14. Moore, Cristopher; Mertens, Stephan (2011), The Nature of Computation, Oxford University Press, p. 29, ISBN 9780199233212.
  15. Garey, M. R.; Johnson, D. S. (1978), "'Strong' NP-completeness results: Motivation, examples, and implications", Journal of the ACM, 25 (3): 499–508, doi:10.1145/322077.322090, MR 0478747, S2CID 18371269.
  16. Golomb, S. W. (1966), "Run-length encodings", IEEE Transactions on Information Theory, IT-12 (3): 399–401, doi:10.1109/TIT.1966.1053907.
  17. Magaud, Nicolas; Bertot, Yves (2002), "Changing data structures in type theory: a study of natural numbers", Types for proofs and programs (Durham, 2000), Lecture Notes in Comput. Sci., vol. 2277, Springer, Berlin, pp. 181–196, doi:10.1007/3-540-45842-5_12, ISBN 978-3-540-43287-6, MR 2044538.
  18. Jansen, Jan Martin (2013), "Programming in the λ-calculus: from Church to Scott and back", The Beauty of Functional Code, Lecture Notes in Computer Science, vol. 8106, Springer-Verlag, pp. 168–180, doi:10.1007/978-3-642-40355-2_12, ISBN 978-3-642-40354-5.
  19. "Email, Spam Control, How to get service for departmental email servers".