Randomized algorithms as zero-sum games

Last updated

Randomized algorithms are algorithms that employ a degree of randomness as part of their logic. These algorithms can be used to give good average-case results (complexity-wise) to problems which are hard to solve deterministically, or display poor worst-case complexity. An algorithmic game theoretic approach can help explain why in the average case randomized algorithms may work better than deterministic algorithms.

Formalizing the game

Consider a zero-sum game between player A, whose strategies are deterministic algorithms, and player B, who's strategies are inputs for A's algorithms. The cost of a strategy profile is the running time of A's chosen algorithm on B's chosen input. Therefore, player A tries to minimize the cost, and player B tries to maximize it. In the world of pure strategies, for every algorithm that A chooses, B may choose the most costly input – this is the worst-case scenario, and can be found using standard complexity analysis.

But in the real world, inputs are normally not selected by an ‘evil opponent’ – rather, they come from some distribution over inputs. Since this is the case, if we allow the algorithms to also be drawn from some distribution, we may look at the game as one that allows mixed strategies. That is, each player chooses a distribution over its strategies.

Analysis

Incorporating mixed strategies into the game allows us to use von Neumann's minimax theorem:

where R is a distribution over the algorithms, D is a distribution over inputs, A is a single deterministic algorithm, and T(A, D) is the average running time of algorithm a on input D. More specifically:

If we limit the set of algorithms to a specific family (for instance, all deterministic choices for pivots in the quick sort algorithm), choosing an algorithm A from R is equivalent to running a randomized algorithm (for instance, running quick sort and randomly choosing the pivots at each step).

This gives us an insight on Yao's principle, which states that the expected cost of any randomized algorithm for solving a given problem, on the worst-case input for that algorithm, can be no better than the expected cost, for a worst-case random probability distribution on the inputs, of the deterministic algorithm that performs best against that distribution.

Related Research Articles

In computer science, the computational complexity, or simply complexity of an algorithm is the amount of resources required for running it. The computational complexity of a problem is the minimum of the complexities of all possible algorithms for this problem.

Computational complexity theory focuses on classifying computational problems according to their inherent difficulty, and relating these classes to each other. A computational problem is a task solved by a computer. A computation problem is solvable by mechanical application of mathematical steps, such as an algorithm.

Minimax is a decision rule used in artificial intelligence, decision theory, game theory, statistics and philosophy for minimizing the possible loss for a worst case scenario. When dealing with gains, it is referred to as "maximin"—to maximize the minimum gain. Originally formulated for two-player zero-sum game theory, covering both the cases where players take alternate moves and those where they make simultaneous moves, it has also been extended to more complex games and to general decision-making in the presence of uncertainty.

In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order. The most frequently used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the efficiency of other algorithms that require input data to be in sorted lists. Sorting is also often useful for canonicalizing data and for producing human-readable output. More formally, the output of any sorting algorithm must satisfy two conditions:

  1. The output is in nondecreasing order ;
  2. The output is a permutation of the input.

In theoretical computer science, communication complexity studies the amount of communication required to solve a problem when the input to the problem is distributed among two or more parties. The study of communication complexity was first introduced by Andrew Yao in 1979, while studying the problem of computation distributed among several machines. The problem is usually stated as follows: two parties each receive a -bit string and . The goal is for Alice to compute the value of a certain function, , that depends on both and , with the least amount of communication between them.

Time complexity An estimate of time taken for running an algorithm

In computer science, the time complexity is the computational complexity that describes the amount of time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. Thus, the amount of time taken and the number of elementary operations performed by the algorithm are taken to differ by at most a constant factor.

A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the "average case" over all possible choices of random bits. Formally, the algorithm's performance will be a random variable determined by the random bits; thus either the running time, or the output are random variables.

In computing, a Las Vegas algorithm is a randomized algorithm that always gives correct results; that is, it always produces the correct result or it informs about the failure. However, the runtime of a Las Vegas algorithm differs depending on the input. The usual definition of a Las Vegas algorithm includes the restriction that the expected runtime be finite, where the expectation is carried out over the space of random information, or entropy, used in the algorithm. An alternative definition requires that a Las Vegas algorithm always terminates, but may output a symbol not part of the solution space to indicate failure in finding a solution. The nature of Las Vegas algorithms makes them suitable in situations where the number of possible solutions is limited, and where verifying the correctness of a candidate solution is relatively easy while finding a solution is complex.

In computer science, a selection algorithm is an algorithm for finding the kth smallest number in a list or array; such a number is called the kth order statistic. This includes the cases of finding the minimum, maximum, and median elements. There are O(n)-time selection algorithms, and sublinear performance is possible for structured data; in the extreme, O(1) for an array of sorted data. Selection is a subproblem of more complex problems like the nearest neighbor and shortest path problems. Many selection algorithms are derived by generalizing a sorting algorithm, and conversely some sorting algorithms can be derived as repeated application of selection.

In mathematical folklore, the "no free lunch" (NFL) theorem of David Wolpert and William Macready appears in the 1997 "No Free Lunch Theorems for Optimization". Wolpert had previously derived no free lunch theorems for machine learning.

Quicksort A divide and conquer sorting algorithm

Quicksort is an efficient sorting algorithm, serving as a systematic method for placing the elements of a random access file or an array in order. Developed by British computer scientist Tony Hoare in 1959 and published in 1961, it is still a commonly used algorithm for sorting. When implemented well, it can be about two or three times faster than its main competitors, merge sort and heapsort.

In computational complexity theory, Yao's principle states that the expected cost of a randomized algorithm on the worst-case input is no better than the expected cost for a worst-case probability distribution on the inputs of the deterministic algorithm that performs best against that distribution. Thus, to establish a lower bound on the performance of randomized algorithms, it suffices to find an appropriate distribution of difficult inputs, and to prove that no deterministic algorithm can perform well against that distribution. This principle is named after Andrew Yao, who first proposed it.

Competitive analysis is a method invented for analyzing online algorithms, in which the performance of an online algorithm is compared to the performance of an optimal offline algorithm that can view the sequence of requests in advance. An algorithm is competitive if its competitive ratio—the ratio between its performance and the offline algorithm's performance—is bounded. Unlike traditional worst-case analysis, where the performance of an algorithm is measured only for "hard" inputs, competitive analysis requires that an algorithm perform well both on hard and easy inputs, where "hard" and "easy" are defined by the performance of the optimal offline algorithm.

In computational complexity theory, the average-case complexity of an algorithm is the amount of some computational resource used by the algorithm, averaged over all possible inputs. It is frequently contrasted with worst-case complexity which considers the maximal complexity of the algorithm over all possible inputs.

The Price of Anarchy (PoA) is a concept in economics and game theory that measures how the efficiency of a system degrades due to selfish behavior of its agents. It is a general notion that can be extended to diverse systems and notions of efficiency. For example, consider the system of transportation of a city and many agents trying to go from some initial location to a destination. Let efficiency in this case mean the average time for an agent to reach the destination. In the 'centralized' solution, a central authority can tell each agent which path to take in order to minimize the average travel time. In the 'decentralized' version, each agent chooses its own path. The Price of Anarchy measures the ratio between average travel time in the two cases.

Smoothed analysis

In theoretical computer science, smoothed analysis is a way of measuring the complexity of an algorithm. Since its introduction in 2001, smoothed analysis has been used as a basis for considerable research, for problems ranging from mathematical programming, numerical analysis, machine learning, and data mining. It can give a more realistic analysis of the practical performance of the algorithm, such as its running time, than using worst-case or average-case scenarios.

Generic-case complexity is a subfield of computational complexity theory that studies the complexity of computational problems on "most inputs".

In mathematics, an evasive Boolean functionƒ is a Boolean function for which every decision tree algorithm has running time of exactly n. Consequently, every decision tree algorithm that represents the function has, at worst case, a running time of n.

In decision theory and game theory, Wald's maximin model is a non-probabilistic decision-making model according to which decisions are ranked on the basis of their worst-case outcomes – the optimal decision is one with the least worst outcome. It is one of the most important models in robust decision making in general and robust optimization in particular.

In computer science, the median of medians is an approximate (median) selection algorithm, frequently used to supply a good pivot for an exact selection algorithm, mainly the quickselect, that selects the kth largest element of an initially unsorted array. Median of medians finds an approximate median in linear time only, which is limited but an additional overhead for quickselect. When this approximate median is used as an improved pivot, the worst-case complexity of quickselect reduces significantly from quadratic to linear, which is also the asymptotically optimal worst-case complexity of any selection algorithm. In other words, the median of medians is an approximate median-selection algorithm that helps building an asymptotically optimal, exact general selection algorithm, by producing good pivot elements.