Randomized algorithm

Last updated

A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic or procedure. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the "average case" over all possible choices of random determined by the random bits; thus either the running time, or the output (or both) are random variables.

Contents

There is a distinction between algorithms that use the random input so that they always terminate with the correct answer, but where the expected running time is finite (Las Vegas algorithms, for example Quicksort [1] ), and algorithms which have a chance of producing an incorrect result (Monte Carlo algorithms, for example the Monte Carlo algorithm for the MFAS problem [2] ) or fail to produce a result either by signaling a failure or failing to terminate. In some cases, probabilistic algorithms are the only practical means of solving a problem. [3]

In common practice, randomized algorithms are approximated using a pseudorandom number generator in place of a true source of random bits; such an implementation may deviate from the expected theoretical behavior and mathematical guarantees which may depend on the existence of an ideal true random number generator.

Motivation

As a motivating example, consider the problem of finding an ‘a’ in an array of n elements.

Input: An array of n≥2 elements, in which half are ‘a’s and the other half are ‘b’s.

Output: Find an ‘a’ in the array.

We give two versions of the algorithm, one Las Vegas algorithm and one Monte Carlo algorithm.

Las Vegas algorithm:

findingA_LV(arrayA,n)beginrepeatRandomlyselectoneelementoutofnelements.until'a'isfoundend

This algorithm succeeds with probability 1. The number of iterations varies and can be arbitrarily large, but the expected number of iterations is

Since it is constant, the expected run time over many calls is . (See Big Theta notation)

Monte Carlo algorithm:

findingA_MC(arrayA,n,k)begini:=0repeatRandomlyselectoneelementoutofnelements.i:=i+1untili=kor'a'isfoundend

If an ‘a’ is found, the algorithm succeeds, else the algorithm fails. After k iterations, the probability of finding an ‘a’ is:

This algorithm does not guarantee success, but the run time is bounded. The number of iterations is always less than or equal to k. Taking k to be constant the run time (expected and absolute) is .

Randomized algorithms are particularly useful when faced with a malicious "adversary" or attacker who deliberately tries to feed a bad input to the algorithm (see worst-case complexity and competitive analysis (online algorithm)) such as in the Prisoner's dilemma. It is for this reason that randomness is ubiquitous in cryptography. In cryptographic applications, pseudo-random numbers cannot be used, since the adversary can predict them, making the algorithm effectively deterministic. Therefore, either a source of truly random numbers or a cryptographically secure pseudo-random number generator is required. Another area in which randomness is inherent is quantum computing.

In the example above, the Las Vegas algorithm always outputs the correct answer, but its running time is a random variable. The Monte Carlo algorithm (related to the Monte Carlo method for simulation) is guaranteed to complete in an amount of time that can be bounded by a function the input size and its parameter k, but allows a small probability of error. Observe that any Las Vegas algorithm can be converted into a Monte Carlo algorithm (via Markov's inequality), by having it output an arbitrary, possibly incorrect answer if it fails to complete within a specified time. Conversely, if an efficient verification procedure exists to check whether an answer is correct, then a Monte Carlo algorithm can be converted into a Las Vegas algorithm by running the Monte Carlo algorithm repeatedly till a correct answer is obtained.

Computational complexity

Computational complexity theory models randomized algorithms as probabilistic Turing machines . Both Las Vegas and Monte Carlo algorithms are considered, and several complexity classes are studied. The most basic randomized complexity class is RP, which is the class of decision problems for which there is an efficient (polynomial time) randomized algorithm (or probabilistic Turing machine) which recognizes NO-instances with absolute certainty and recognizes YES-instances with a probability of at least 1/2. The complement class for RP is co-RP. Problem classes having (possibly nonterminating) algorithms with polynomial time average case running time whose output is always correct are said to be in ZPP.

The class of problems for which both YES and NO-instances are allowed to be identified with some error is called BPP. This class acts as the randomized equivalent of P, i.e. BPP represents the class of efficient randomized algorithms.

Early history

Sorting

Quicksort was discovered by Tony Hoare in 1959, and subsequently published in 1961. [4] In the same year, Hoare published the quickselect algorithm, [5] which finds the median element of a list in linear expected time. It remained open until 1973 whether a deterministic linear-time algorithm existed. [6]

Number theory

In 1917, Henry Cabourn Pocklington introduced a randomized algorithm known as Pocklington's algorithm for efficiently finding square roots modulo prime numbers. [7] In 1970, Elwyn Berlekamp introduced a randomized algorithm for efficiently computing the roots of a polynomial over a finite field. [8] In 1977, Robert M. Solovay and Volker Strassen discovered a polynomial-time randomized primality test (i.e., determining the primality of a number). Soon afterwards Michael O. Rabin demonstrated that the 1976 Miller's primality test could also be turned into a polynomial-time randomized algorithm. At that time, no provably polynomial-time deterministic algorithms for primality testing were known.

Data structures

One of the earliest randomized data structures is the hash table, which was introduced in 1953 by Hans Peter Luhn at IBM. [9] Luhn's hash table used chaining to resolve collisions and was also one of the first applications of linked lists. [9] Subsequently, in 1954, Gene Amdahl, Elaine M. McGraw, Nathaniel Rochester, and Arthur Samuel of IBM Research introduced linear probing, [9] although Andrey Ershov independently had the same idea in 1957. [9] In 1962, Donald Knuth performed the first correct analysis of linear probing, [9] although the memorandum containing his analysis was not published until much later. [10] The first published analysis was due to Konheim and Weiss in 1966. [11]

Early works on hash tables either assumed access to a fully random hash function or assumed that the keys themselves were random. [9] In 1979, Carter and Wegman introduced universal hash functions, [12] which they showed could be used to implement chained hash tables with constant expected time per operation.

Early work on randomized data structures also extended beyond hash tables. In 1970, Burton Howard Bloom introduced an approximate-membership data structure known as the Bloom filter. [13] In 1989, Raimund Seidel and Cecilia R. Aragon introduced a randomized balanced search tree known as the treap. [14] In the same year, William Pugh introduced another randomized search tree known as the skip list. [15]

Implicit uses in combinatorics

Prior to the popularization of randomized algorithms in computer science, Paul Erdős popularized the use of randomized constructions as a mathematical technique for establishing the existence of mathematical objects. This technique has become known as the probabilistic method. [16] Erdős gave his first application of the probabilistic method in 1947, when he used a simple randomized construction to establish the existence of Ramsey graphs. [17] He famously used a more sophisticated randomized algorithm in 1959 to establish the existence of graphs with high girth and chromatic number. [18] [16]

Examples

Quicksort

Quicksort is a familiar, commonly used algorithm in which randomness can be useful. Many deterministic versions of this algorithm require O (n2) time to sort n numbers for some well-defined class of degenerate inputs (such as an already sorted array), with the specific class of inputs that generate this behavior defined by the protocol for pivot selection. However, if the algorithm selects pivot elements uniformly at random, it has a provably high probability of finishing in O(n log n) time regardless of the characteristics of the input.

Randomized incremental constructions in geometry

In computational geometry, a standard technique to build a structure like a convex hull or Delaunay triangulation is to randomly permute the input points and then insert them one by one into the existing structure. The randomization ensures that the expected number of changes to the structure caused by an insertion is small, and so the expected running time of the algorithm can be bounded from above. This technique is known as randomized incremental construction. [19]

Min cut

Input: A graph G(V,E)

Output: A cut partitioning the vertices into L and R, with the minimum number of edges between L and R.

Recall that the contraction of two nodes, u and v, in a (multi-)graph yields a new node u ' with edges that are the union of the edges incident on either u or v, except from any edge(s) connecting u and v. Figure 1 gives an example of contraction of vertex A and B. After contraction, the resulting graph may have parallel edges, but contains no self loops.

Figure 2: Successful run of Karger's algorithm on a 10-vertex graph. The minimum cut has size 3 and is indicated by the vertex colours. Single run of Karger's Mincut algorithm.svg
Figure 2: Successful run of Karger's algorithm on a 10-vertex graph. The minimum cut has size 3 and is indicated by the vertex colours.
Figure 1: Contraction of vertex A and B Contraction vertices.jpg
Figure 1: Contraction of vertex A and B

Karger's [20] basic algorithm:

begin     i = 1     repeatrepeat             Take a random edge (u,v) ∈ E in G             replace u and v with the contraction u'         until only 2 nodes remain         obtain the corresponding cut result Ci         i = i + 1     until i = m     output the minimum cut among C1, C2, ..., Cm. end

In each execution of the outer loop, the algorithm repeats the inner loop until only 2 nodes remain, the corresponding cut is obtained. The run time of one execution is , and n denotes the number of vertices. After m times executions of the outer loop, we output the minimum cut among all the results. The figure 2 gives an example of one execution of the algorithm. After execution, we get a cut of size 3.

Lemma 1  Let k be the min cut size, and let C = {e1, e2, ..., ek} be the min cut. If, during iteration i, no edge eC is selected for contraction, then Ci = C.

Proof

If G is not connected, then G can be partitioned into L and R without any edge between them. So the min cut in a disconnected graph is 0. Now, assume G is connected. Let V=LR be the partition of V induced by C : C = { {u,v} ∈ E : uL,vR} (well-defined since G is connected). Consider an edge {u,v} of C. Initially, u,v are distinct vertices. As long as we pick an edge , u and v do not get merged. Thus, at the end of the algorithm, we have two compound nodes covering the entire graph, one consisting of the vertices of L and the other consisting of the vertices of R. As in figure 2, the size of min cut is 1, and C = {(A,B)}. If we don't select (A,B) for contraction, we can get the min cut.

Lemma 2  If G is a multigraph with p vertices and whose min cut has size k, then G has at least pk/2 edges.

Proof

Because the min cut is k, every vertex v must satisfy degree(v) ≥ k. Therefore, the sum of the degree is at least pk. But it is well known that the sum of vertex degrees equals 2|E|. The lemma follows.

Analysis of algorithm

The probability that the algorithm succeeds is 1  the probability that all attempts fail. By independence, the probability that all attempts fail is

By lemma 1, the probability that Ci = C is the probability that no edge of C is selected during iteration i. Consider the inner loop and let Gj denote the graph after j edge contractions, where j ∈ {0, 1, …, n − 3}. Gj has nj vertices. We use the chain rule of conditional possibilities. The probability that the edge chosen at iteration j is not in C, given that no edge of C has been chosen before, is . Note that Gj still has min cut of size k, so by Lemma 2, it still has at least edges.

Thus, .

So by the chain rule, the probability of finding the min cut C is

Cancellation gives . Thus the probability that the algorithm succeeds is at least . For , this is equivalent to . The algorithm finds the min cut with probability , in time .

Derandomization

Randomness can be viewed as a resource, like space and time. Derandomization is then the process of removing randomness (or using as little of it as possible) [21] [22] . It is not currently known[ as of? ] if all algorithms can be derandomized without significantly increasing their running time [23] . For instance, in computational complexity, it is unknown whether P = BPP [23] , i.e., we do not know whether we can take an arbitrary randomized algorithm that runs in polynomial time with a small error probability and derandomize it to run in polynomial time without using randomness.

There are specific methods that can be employed to derandomize particular randomized algorithms:

Where randomness helps

When the model of computation is restricted to Turing machines, it is currently an open question whether the ability to make random choices allows some problems to be solved in polynomial time that cannot be solved in polynomial time without this ability; this is the question of whether P = BPP. However, in other contexts, there are specific examples of problems where randomization yields strict improvements.

See also

Notes

  1. Hoare, C. A. R. (July 1961). "Algorithm 64: Quicksort". Commun. ACM. 4 (7): 321–. doi:10.1145/366622.366644. ISSN   0001-0782.
  2. Kudelić, Robert (2016-04-01). "Monte-Carlo randomized algorithm for minimal feedback arc set problem". Applied Soft Computing. 41: 235–246. doi:10.1016/j.asoc.2015.12.018.
  3. "In testing primality of very large numbers chosen at random, the chance of stumbling upon a value that fools the Fermat test is less than the chance that cosmic radiation will cause the computer to make an error in carrying out a 'correct' algorithm. Considering an algorithm to be inadequate for the first reason but not for the second illustrates the difference between mathematics and engineering." Hal Abelson and Gerald J. Sussman (1996). Structure and Interpretation of Computer Programs . MIT Press, section 1.2.
  4. Hoare, C. A. R. (July 1961). "Algorithm 64: Quicksort". Communications of the ACM. 4 (7): 321. doi:10.1145/366622.366644. ISSN   0001-0782.
  5. Hoare, C. A. R. (July 1961). "Algorithm 65: find". Communications of the ACM. 4 (7): 321–322. doi:10.1145/366622.366647. ISSN   0001-0782.
  6. Blum, Manuel; Floyd, Robert W.; Pratt, Vaughan; Rivest, Ronald L.; Tarjan, Robert E. (August 1973). "Time bounds for selection". Journal of Computer and System Sciences. 7 (4): 448–461. doi:10.1016/S0022-0000(73)80033-9.
  7. Williams, H. C.; Shallit, J. O. (1994), "Factoring integers before computers", in Gautschi, Walter (ed.), Mathematics of Computation 1943–1993: a half-century of computational mathematics; Papers from the Symposium on Numerical Analysis and the Minisymposium on Computational Number Theory held in Vancouver, British Columbia, August 9–13, 1993, Proceedings of Symposia in Applied Mathematics, vol. 48, Amer. Math. Soc., Providence, RI, pp. 481–531, doi:10.1090/psapm/048/1314885, ISBN   978-0-8218-0291-5, MR   1314885 ; see p. 504, "Perhaps Pocklington also deserves credit as the inventor of the randomized algorithm".
  8. Berlekamp, E. R. (1971). "Factoring polynomials over large finite fields". Proceedings of the second ACM symposium on Symbolic and algebraic manipulation - SYMSAC '71. Los Angeles, California, United States: ACM Press. p. 223. doi:10.1145/800204.806290. ISBN   9781450377867. S2CID   6464612.
  9. 1 2 3 4 5 6 Knuth, Donald E. (1998). The art of computer programming, volume 3: (2nd ed.) sorting and searching. USA: Addison Wesley Longman Publishing Co., Inc. pp. 536–549. ISBN   978-0-201-89685-5.
  10. Knuth, Donald (1963), Notes on "Open" Addressing , archived from the original on 2016-03-03
  11. Konheim, Alan G.; Weiss, Benjamin (November 1966). "An Occupancy Discipline and Applications". SIAM Journal on Applied Mathematics. 14 (6): 1266–1274. doi:10.1137/0114101. ISSN   0036-1399.
  12. Carter, J. Lawrence; Wegman, Mark N. (1979-04-01). "Universal classes of hash functions". Journal of Computer and System Sciences. 18 (2): 143–154. doi: 10.1016/0022-0000(79)90044-8 . ISSN   0022-0000.
  13. Bloom, Burton H. (July 1970). "Space/time trade-offs in hash coding with allowable errors". Communications of the ACM. 13 (7): 422–426. doi: 10.1145/362686.362692 . ISSN   0001-0782. S2CID   7931252.
  14. Aragon, C.R.; Seidel, R.G. (October 1989). "Randomized search trees". 30th Annual Symposium on Foundations of Computer Science. pp. 540–545. doi:10.1109/SFCS.1989.63531. ISBN   0-8186-1982-1.
  15. Pugh, William (April 1989). Concurrent Maintenance of Skip Lists (PS, PDF) (Technical report). Dept. of Computer Science, U. Maryland. CS-TR-2222.
  16. 1 2 Alon, Noga; Spencer, Joel H. (2016). The probabilistic method (Fourth ed.). Hoboken, New Jersey: Wiley. ISBN   978-1-119-06195-3. OCLC   910535517.
  17. P. Erdős: Some remarks on the theory of graphs, Bull. Amer. Math. Soc. 53 (1947), 292--294 MR8,479d; Zentralblatt 32,192.
  18. Erdös, P. (1959). "Graph Theory and Probability". Canadian Journal of Mathematics. 11: 34–38. doi: 10.4153/CJM-1959-003-9 . ISSN   0008-414X. S2CID   122784453.
  19. Seidel R. Backwards Analysis of Randomized Geometric Algorithms.
  20. Karger, David R. (1999). "Random Sampling in Cut, Flow, and Network Design Problems". Mathematics of Operations Research . 24 (2): 383–413. CiteSeerX   10.1.1.215.794 . doi:10.1287/moor.24.2.383.
  21. "6.046J Lecture 22: Derandomization | Design and Analysis of Algorithms | Electrical Engineering and Computer Science". MIT OpenCourseWare. Retrieved 2024-12-27.
  22. Luby, Michael; Wigderson, Avi (July 1995). Pairwise Independence and Derandomization (Report). USA: University of California at Berkeley.
  23. 1 2 "Lecture Notes, Chapter 3. Basic Derandomization Techniques". people.seas.harvard.edu. Retrieved 2024-12-27.
  24. Alippi, Cesare (2014), Intelligence for Embedded Systems, Springer, ISBN   978-3-319-05278-6 .
  25. Kushilevitz, Eyal; Nisan, Noam (2006), Communication Complexity, Cambridge University Press, ISBN   9780521029834 . For the deterministic lower bound see p. 11; for the logarithmic randomized upper bound see pp. 31–32.
  26. Dyer, M.; Frieze, A.; Kannan, R. (1991), "A random polynomial-time algorithm for approximating the volume of convex bodies" (PDF), Journal of the ACM , 38 (1): 1–17, doi:10.1145/102782.102783, S2CID   13268711
  27. Füredi, Z.; Bárány, I. (1986), "Computing the volume is difficult", Proc. 18th ACM Symposium on Theory of Computing (Berkeley, California, May 28–30, 1986) (PDF), New York, NY: ACM, pp. 442–447, CiteSeerX   10.1.1.726.9448 , doi:10.1145/12130.12176, ISBN   0-89791-193-8, S2CID   17867291
  28. Shamir, A. (1992), "IP = PSPACE", Journal of the ACM, 39 (4): 869–877, doi: 10.1145/146585.146609 , S2CID   315182
  29. Cook, Matthew; Soloveichik, David; Winfree, Erik; Bruck, Jehoshua (2009), "Programmability of chemical reaction networks", in Condon, Anne; Harel, David; Kok, Joost N.; Salomaa, Arto; Winfree, Erik (eds.), Algorithmic Bioprocesses (PDF), Natural Computing Series, Springer-Verlag, pp. 543–584, doi:10.1007/978-3-540-88869-7_27, ISBN   978-3-540-88868-0 .

Related Research Articles

In computational complexity theory, a branch of computer science, bounded-error probabilistic polynomial time (BPP) is the class of decision problems solvable by a probabilistic Turing machine in polynomial time with an error probability bounded by 1/3 for all instances. BPP is one of the largest practical classes of problems, meaning most problems of interest in BPP have efficient probabilistic algorithms that can be run quickly on real modern machines. BPP also contains P, the class of problems solvable in polynomial time with a deterministic machine, since a deterministic machine is a special case of a probabilistic machine.

<span class="mw-page-title-main">Interactive proof system</span>

In computational complexity theory, an interactive proof system is an abstract machine that models computation as the exchange of messages between two parties: a prover and a verifier. The parties interact by exchanging messages in order to ascertain whether a given string belongs to a language or not. The prover is assumed to possess unlimited computational resources but cannot be trusted, while the verifier has bounded computation power but is assumed to be always honest. Messages are sent between the verifier and prover until the verifier has an answer to the problem and has "convinced" itself that it is correct.

<span class="mw-page-title-main">Time complexity</span> Estimate of time taken for running an algorithm

In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. Thus, the amount of time taken and the number of elementary operations performed by the algorithm are taken to be related by a constant factor.

<span class="mw-page-title-main">Complexity class</span> Set of problems in computational complexity theory

In computational complexity theory, a complexity class is a set of computational problems "of related resource-based complexity". The two most commonly analyzed resources are time and memory.

In computing, a Monte Carlo algorithm is a randomized algorithm whose output may be incorrect with a certain probability. Two examples of such algorithms are the Karger–Stein algorithm and the Monte Carlo algorithm for minimum feedback arc set.

<span class="mw-page-title-main">PP (complexity)</span> Class of problems in computer science

In complexity theory, PP, or PPT is the class of decision problems solvable by a probabilistic Turing machine in polynomial time, with an error probability of less than 1/2 for all instances. The abbreviation PP refers to probabilistic polynomial time. The complexity class was defined by Gill in 1977.

In computational complexity theory, an Arthur–Merlin protocol, introduced by Babai (1985), is an interactive proof system in which the verifier's coin tosses are constrained to be public. Goldwasser & Sipser (1986) proved that all (formal) languages with interactive proofs of arbitrary length with private coins also have interactive proofs with public coins.

In computational complexity theory, the complexity class NEXPTIME is the set of decision problems that can be solved by a non-deterministic Turing machine using time .

In computational complexity theory, SL is the complexity class of problems log-space reducible to USTCON, which is the problem of determining whether there exists a path between two vertices in an undirected graph, otherwise described as the problem of determining whether two vertices are in the same connected component. This problem is also called the undirected reachability problem. It does not matter whether many-one reducibility or Turing reducibility is used. Although originally described in terms of symmetric Turing machines, that equivalent formulation is very complex, and the reducibility definition is what is used in practice.

Randomized Logarithmic-space (RL), sometimes called RLP, is the complexity class of computational complexity theory problems solvable in logarithmic space and polynomial time with probabilistic Turing machines with one-sided error. It is named in analogy with RP, which is similar but has no logarithmic space restriction.

<span class="mw-page-title-main">Maximal independent set</span> Independent set which is not a subset of any other independent set

In graph theory, a maximal independent set (MIS) or maximal stable set is an independent set that is not a subset of any other independent set. In other words, there is no vertex outside the independent set that may join it because it is maximal with respect to the independent set property.

In theoretical computer science and cryptography, a pseudorandom generator (PRG) for a class of statistical tests is a deterministic procedure that maps a random seed to a longer pseudorandom string such that no statistical test in the class can distinguish between the output of the generator and the uniform distribution. The random seed itself is typically a short binary string drawn from the uniform distribution.

In mathematics, the Schwartz–Zippel lemma is a tool commonly used in probabilistic polynomial identity testing. Identity testing is the problem of determining whether a given multivariate polynomial is the 0-polynomial, the polynomial that ignores all its variables and always returns zero. The lemma states that evaluating a nonzero polynomial on inputs chosen randomly from a large-enough set is likely to find an input that produces a nonzero output.

MAX-3SAT is a problem in the computational complexity subfield of computer science. It generalises the Boolean satisfiability problem (SAT) which is a decision problem considered in complexity theory. It is defined as:

In computational complexity theory, Yao's principle relates the performance of randomized algorithms to deterministic (non-random) algorithms. It states that, for certain classes of algorithms, and certain measures of the performance of the algorithms, the following two quantities are equal:

<span class="mw-page-title-main">Karger's algorithm</span> Randomized algorithm for minimum cuts

In computer science and graph theory, Karger's algorithm is a randomized algorithm to compute a minimum cut of a connected graph. It was invented by David Karger and first published in 1993.

<span class="mw-page-title-main">Maximum cut</span> Problem of finding a maximum cut in a graph

In a graph, a maximum cut is a cut whose size is at least the size of any other cut. That is, it is a partition of the graph's vertices into two complementary sets S and T, such that the number of edges between S and T is as large as possible. Finding such a cut is known as the max-cut problem.

In computer science and graph theory, the term color-coding refers to an algorithmic technique which is useful in the discovery of network motifs. For example, it can be used to detect a simple path of length k in a given graph. The traditional color-coding algorithm is probabilistic, but it can be derandomized without much overhead in the running time.

In mathematics and computer science, the method of conditional probabilities is a systematic method for converting non-constructive probabilistic existence proofs into efficient deterministic algorithms that explicitly construct the desired object.

<span class="mw-page-title-main">Planted clique</span> Complete subgraph added to a random graph

In computational complexity theory, a planted clique or hidden clique in an undirected graph is a clique formed from another graph by selecting a subset of vertices and adding edges between each pair of vertices in the subset. The planted clique problem is the algorithmic problem of distinguishing random graphs from graphs that have a planted clique. This is a variation of the clique problem; it may be solved in quasi-polynomial time but is conjectured not to be solvable in polynomial time for intermediate values of the clique size. The conjecture that no polynomial time solution exists is called the planted clique conjecture; it has been used as a computational hardness assumption.

References