Robbins' problem

In probability theory, Robbins' problem of optimal stopping, [1] named after Herbert Robbins, is sometimes referred to as the fourth secretary problem or the problem of minimizing the expected rank with full information. [2]

Let X1, ... , Xn be independent, identically distributed random variables, uniform on [0, 1]. We observe the Xk's sequentially and must stop on exactly one of them. No recall of preceding observations is permitted. What stopping rule minimizes the expected rank of the selected observation, and what is its corresponding value?

The general solution to this full-information expected rank problem is unknown. The major difficulty is that the problem is fully history-dependent: the optimal rule depends at every stage on all preceding values, and not only on simpler sufficient statistics of these. Only bounds are known for the limiting value v as n goes to infinity, namely 1.908 < v < 2.329. These bounds are obtained by studying so-called memoryless strategies, that is, strategies in which the decision to stop on Xk depends only on the value of Xk and not on the history of observations X1, ..., Xk−1. It is known that there is some room to improve the lower bound by further computations for a truncated version of the problem within the class of memoryless strategies. It is still not known how to improve on the upper bound for the limiting value, whatever the strategy. [3] [4] [5]
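These memoryless rules are straightforward to experiment with. The sketch below estimates, by Monte Carlo simulation, the expected rank achieved by one such rule; the particular threshold sequence is an illustrative assumption, not one of the optimized sequences from the cited papers:

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_rank(n, thresholds, trials=10000):
    """Monte Carlo estimate of the expected rank of the observation chosen
    by a memoryless rule: stop at the first k with X_k <= thresholds[k],
    and stop at the last observation if no earlier one qualifies."""
    total = 0.0
    for _ in range(trials):
        x = rng.random(n)
        stop = n - 1  # forced stop at the end
        for k in range(n):
            if x[k] <= thresholds[k]:
                stop = k
                break
        total += 1 + np.sum(x < x[stop])  # rank among all n values (1 = best)
    return total / trials

n = 50
# illustrative thresholds that loosen as fewer observations remain
thresholds = np.array([(k + 1) / n for k in range(n)])
print(expected_rank(n, thresholds))
```

Optimizing over the whole threshold sequence, rather than using this simple linear choice, is what yields the bounds quoted above.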

Another proposed route to progress on the problem is a continuous-time version in which the observations arrive according to a Poisson process of homogeneous rate 1. Under some mild assumptions, the corresponding value function is bounded and Lipschitz continuous, and a differential equation for this value function can be derived. [6] The limiting value of this value function yields the solution of Robbins' problem, and the estimates obtained for it in the continuous-time setting are consistent with the bounds mentioned above.

The advantage of the continuous-time version lies in the fact that the answer can be expressed in terms of the solution of a differential equation, i.e. the answer appears in closed form. However, since the obtained differential equation contains, apart from the "objective function", another (small) unknown function, the approach has so far not yielded a decisive advantage for finding the optimal limiting value.

A simple suboptimal rule, which performs almost as well as the optimal rule within the class of memoryless stopping rules, was proposed by Krieger & Samuel-Cahn. [7] The rule stops with the smallest index i whose relative rank (its rank among the first i observations) falls below a threshold depending on i, n and a given constant c, where n is the total number of items. This rule has added flexibility: a curtailed version of it can be used to select an item with a given probability, the rule can be used to select two or more items, and the problem of selecting a fixed percentage of n is also treated.
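Since the exact stopping condition of the rule is stated in the cited paper rather than here, the sketch below uses a hypothetical stand-in of the same relative-rank flavour: stop at the smallest i for which Ri·n/i (a naive projection of the final rank) falls below the constant c. Both the projection and the choice of c are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)

def rank_rule_expected_rank(n, c, trials=5000):
    """Monte Carlo expected rank for an illustrative relative-rank rule:
    stop at the smallest i whose relative rank R_i (rank of X_i among the
    first i observations) satisfies R_i * n / i <= c.  This threshold form
    is a stand-in, not the exact Krieger-Samuel-Cahn condition."""
    total = 0.0
    for _ in range(trials):
        x = rng.random(n)
        stop = n - 1  # forced stop at the last item
        for i in range(1, n + 1):
            r = 1 + np.sum(x[:i - 1] < x[i - 1])  # relative rank R_i
            if r * n / i <= c:
                stop = i - 1
                break
        total += 1 + np.sum(x < x[stop])  # overall rank of the chosen item
    return total / trials

print(rank_rule_expected_rank(50, 2.0))
```

Unlike the memoryless rules above, this rule observes only relative ranks, never the values themselves, which is what makes curtailed and multi-item variants easy to formulate.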

Importance

One of the motivations for studying Robbins' problem is that a solution of it would also settle all four classical secretary problems. But the major reason is to understand how to cope with full history dependence in a (deceptively easy-looking) problem. At the Ester's Book International Conference in Israel (2006), Robbins' problem was accordingly named one of the four most important problems in the field of optimal stopping and sequential analysis.

History

Herbert Robbins presented the problem described above at the International Conference on Search and Selection in Real Time [note 1] in Amherst, 1990. He concluded his address with the words: "I should like to see this problem solved before I die." Scientists working in the field of optimal stopping have since called this problem Robbins' problem. Unfortunately, Robbins' wish did not come true: he died in 2001.


Chow–Robbins game

Another optimal stopping problem bearing Robbins' name (and not to be confused with Robbins' problem) is the Chow–Robbins game: [8] [9]

Given an infinite sequence of i.i.d. random variables X1, X2, ... with a common distribution, how does one decide when to stop in order to maximize the sample average (X1 + ... + Xτ)/τ, where τ is the stopping time? The probability of eventually stopping must be 1 (that is, one is not allowed to keep sampling and never stop).

For any distribution with finite second moment, there exists an optimal strategy, defined by a sequence of numbers β1, β2, ... . The strategy is to keep sampling until the running sum Sn = X1 + ... + Xn first satisfies Sn ≥ βn. [10] [11]
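A hedged sketch of how such thresholds can be approximated in practice: for the fair-coin case, truncate the game at a horizon N and run backward induction on the value function V_n(s) = max(s/n, (V_{n+1}(s+1) + V_{n+1}(s−1))/2). The truncation and the exact rational arithmetic are implementation choices for this sketch, not part of the cited results:

```python
from fractions import Fraction

def stopping_boundary(N):
    """Backward induction for the fair-coin Chow-Robbins game truncated at
    N tosses, with forced stop V_N(s) = s/N.  Returns, for each n < N, the
    smallest reachable sum s (same parity as n) at which stopping is
    optimal; for n well below N this approximates the threshold beta_n."""
    V = {s: Fraction(s, N) for s in range(-N, N + 1)}
    boundary = {}
    for n in range(N - 1, 0, -1):
        newV = {}
        boundary[n] = None
        for s in range(-n, n + 1):
            cont = (V[s + 1] + V[s - 1]) / 2   # expected value of one more toss
            stop = Fraction(s, n)              # value of stopping now
            newV[s] = max(stop, cont)
            if boundary[n] is None and (s + n) % 2 == 0 and stop >= cont:
                boundary[n] = s
        V = newV
    return boundary

b = stopping_boundary(40)
print({n: b[n] for n in (1, 2, 5, 10, 20)})
```

For n much smaller than N the truncation barely matters, and the computed stopping sums grow on the order of √n, consistent with the asymptotics discussed next.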

Optimal strategy for very large n

If X1 has finite second moment, then after subtracting the mean and dividing by the standard deviation we obtain a distribution with mean zero and variance one. Consequently, it suffices to study the case where the Xi have mean zero and variance one.

With this normalization, βn ≈ α√n for large n, where α ≈ 0.83992 is the solution of the equation α = (1 − α²) ∫0^∞ exp(λα − λ²/2) dλ, [note 2] which can be proved by solving the same problem in continuous time with a Wiener process. In the limit n → ∞, the discrete-time problem becomes the same as the continuous-time problem.

This was proved independently by several authors. [12] [13] [14] [15]

When the game is a fair coin toss game, with heads counting +1 and tails −1, there is a sharper asymptotic result for the stopping boundary, [9] whose correction term involves the Riemann zeta function.

Optimal strategy for small n

When n is small, the asymptotic bound does not apply, and finding the values of βn is much more difficult. Even the simplest case, where the Xi are fair coin tosses, is not fully solved.

For the fair coin toss, a strategy is a binary decision: after n tosses, with k heads and (n − k) tails, so that Sn = 2k − n, should one continue or should one stop? Since the one-dimensional simple random walk is recurrent, starting from any Sn the probability of eventually having more heads than tails is 1. So, if Sn ≤ 0, one should always continue. However, if Sn > 0, it is tricky to decide whether to stop or continue. [16]
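The recurrence argument for continuing at Sn ≤ 0 can be checked empirically. The simulation below estimates the probability that a simple symmetric random walk started at 0 ever becomes positive; the cap on the number of steps is a simulation artifact (the true probability is 1):

```python
import random

random.seed(0)

def prob_reach_positive(trials=1000, max_steps=10000):
    """Estimate the probability that a simple symmetric random walk
    started at 0 ever reaches +1.  By recurrence the true probability
    is 1; max_steps truncates each walk, so the estimate is a slight
    undercount."""
    hits = 0
    for _ in range(trials):
        s = 0
        for _ in range(max_steps):
            s += random.choice((-1, 1))
            if s >= 1:
                hits += 1
                break
    return hits / trials

p = prob_reach_positive()
print(p)
```

Because the first-passage time has infinite mean, the estimate approaches 1 only slowly as max_steps grows, which is why the argument above is stated via recurrence rather than expected waiting time.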

Christensen and Fischer [17] found an exact solution for all n below a computational bound.

Elton [9] found exact solutions for almost all n, together with an almost always optimal decision rule: stop as soon as the running sum Sn reaches an explicit boundary derived from the Catalan triangle.


References

  1. Chow, Y.S.; Moriguti, S.; Robbins, Herbert Ellis; Samuels, Stephen M. (1964). "Optimal Selection Based on Relative Rank". Israel Journal of Mathematics . 2 (2): 81–90. doi: 10.1007/bf02759948 .
  2. Bruss, F. Thomas (2005). "What is known about Robbins' Problem?". Journal of Applied Probability . 42 (1): 108–120. doi: 10.1239/jap/1110381374 . JSTOR   30040773.
  3. Bruss, F. Thomas; Ferguson, Thomas S. (1993). "Minimizing the expected rank with full information". Journal of Applied Probability . 30 (3): 616–626. ISSN   0021-9002. JSTOR   3214770.
  4. Bruss, F.Thomas; Ferguson, S. Thomas (1996). "Half-Prophets and Robbins' Problem of Minimizing the expected rank". Lecture Notes in Statistics (LNS). Athens Conference on Applied Probability and Time Series Analysis. Vol. 114. New York, NY: Springer New York. pp. 1–17. doi: 10.1007/978-1-4612-0749-8_1 . ISBN   978-0-387-94788-4.
  5. Assaf, David; Samuel-Cahn, Ester (1996). "The secretary problem: Minimizing the expected rank with i.i.d. random variables". Advances in Applied Probability . 28 (3): 828–852. doi: 10.2307/1428183 . ISSN   0001-8678. JSTOR   1428183.
  6. Bruss, F. Thomas; Swan, Yvik C. (2009). "A continuous-time approach to Robbins' problem of minimizing the expected rank". Journal of Applied Probability . 46 (1): 1–18. doi: 10.1239/jap/1238592113 .
  7. Krieger, Abba M.; Samuel-Cahn, Ester (2009). "The secretary problem of minimizing the expected rank: a simple suboptimal approach with generalization". Advances in Applied Probability . 41 (4): 1041–1058. doi: 10.1239/aap/1261669585 . JSTOR   27793918.
  8. Chow, Y. S.; Robbins, Herbert (September 1965). "On optimal stopping rules for $S_{n}/n$". Illinois Journal of Mathematics. 9 (3): 444–454. doi: 10.1215/ijm/1256068146 . ISSN   0019-2082.
  9. Elton, John H. (2023). "Exact Solution to the Chow-Robbins Game for almost all n, using the Catalan Triangle". arXiv: 2205.13499 [math].
  10. Dvoretzky, Aryeh. "Existence and properties of certain optimal stopping rules." Proc. Fifth Berkeley Symp. Math. Statist. Prob. Vol. 1. 1967.
  11. Teicher, H.; Wolfowitz, J. (1966-12-01). "Existence of optimal stopping rules for linear and quadratic rewards". Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete. 5 (4): 361–368. doi:10.1007/BF00535366. ISSN   1432-2064.
  12. Simons, Gordon; Yao, Yi-Ching (1989-08-01). "Optimally stopping the sample mean of a Wiener process with an unknown drift". Stochastic Processes and Their Applications. 32 (2): 347–354. doi:10.1016/0304-4149(89)90084-7. ISSN   0304-4149.
  13. Shepp, L. A. (June 1969). "Explicit Solutions to Some Problems of Optimal Stopping". The Annals of Mathematical Statistics. 40 (3): 993–1010. doi: 10.1214/aoms/1177697604 . ISSN   0003-4851.
  14. Taylor, Howard M. (1968). "Optimal Stopping in a Markov Process". The Annals of Mathematical Statistics. 39 (4): 1333–1344. doi: 10.1214/aoms/1177698259 . ISSN   0003-4851. JSTOR   2239702.
  15. Walker, Leroy H. (1969). "Regarding stopping rules for Brownian motion and random walks". Bulletin of the American Mathematical Society. 75 (1): 46–50. doi: 10.1090/S0002-9904-1969-12140-3 . ISSN   0002-9904.
  16. Häggström, Olle; Wästlund, Johan (2013). "Rigorous Computer Analysis of the Chow–Robbins Game". The American Mathematical Monthly. 120 (10): 893. doi:10.4169/amer.math.monthly.120.10.893.
  17. Christensen, Sören; Fischer, Simon (June 2022). "On the Sn/n problem". Journal of Applied Probability. 59 (2): 571–583. doi:10.1017/jpr.2021.73. ISSN   0021-9002.

Footnotes

  1. The Joint Summer Research Conferences in the Mathematical Sciences were held at the University of Massachusetts from June 7 to July 4, 1990. These were sponsored by the AMS, SIAM, and the Institute for Mathematical Statistics (IMS). Topics in 1990 were: Probability models and statistical analysis for ranking data, Inverse scattering on the line, Deformation theory of algebras and quantization with applications to physics, Strategies for sequential search and selection in real time, Schottky problems, and Logic, fields, and subanalytic sets.
  2. The constant α can be computed numerically, for example:

     import numpy as np
     from scipy.integrate import quad
     from scipy.optimize import root

     def f(lambda_, alpha):
         return np.exp(lambda_ * alpha - lambda_**2 / 2)

     def equation(alpha):
         alpha = np.ravel(alpha)[0]  # root passes a length-1 array
         integral, error = quad(f, 0, np.inf, args=(alpha,))
         return integral * (1 - alpha**2) - alpha

     solution = root(equation, 0.83992, tol=1e-15)

     # Print the solution
     if solution.success:
         print(f"Solved α = {solution.x[0]} with a residual of {solution.fun[0]}")
     else:
         print("Solution did not converge")