# Optimal decision

An optimal decision is a decision that leads to at least as good a known or expected outcome as all other available decision options. It is an important concept in decision theory. In order to compare the different decision outcomes, one commonly assigns a utility value to each of them.

If there is uncertainty as to what the outcome will be but knowledge about the distribution of the uncertainty, then under the von Neumann–Morgenstern axioms the optimal decision maximizes the expected utility (a probability-weighted average of utility over all possible outcomes of a decision). Sometimes, the equivalent problem of minimizing the expected value of loss is considered, where loss is the negative of utility. Another equivalent problem is minimizing expected regret.

"Utility" is only an arbitrary term for quantifying the desirability of a particular decision outcome and not necessarily related to "usefulness." For example, it may well be the optimal decision for someone to buy a sports car rather than a station wagon, if the outcome in terms of another criterion (e.g., effect on personal image) is more desirable, even given the higher cost and lack of versatility of the sports car.

The problem of finding the optimal decision is a mathematical optimization problem. In practice, few people verify that their decisions are optimal, but instead use heuristics to make decisions that are "good enough"; that is, they engage in satisficing.

A more formal approach may be used when the decision is important enough to motivate the time it takes to analyze it, or when it is too complex to solve with simpler, intuitive approaches, as when there are many available decision options and a complex decision–outcome relationship.

## Formal mathematical description

Each decision ${\displaystyle d}$ in a set ${\displaystyle D}$ of available decision options will lead to an outcome ${\displaystyle o=f(d)}$. All possible outcomes form the set ${\displaystyle O}$. Assigning a utility ${\displaystyle U_{O}(o)}$ to every outcome, we can define the utility of a particular decision ${\displaystyle d}$ as

${\displaystyle U_{D}(d)\ =\ U_{O}(f(d)).\,}$

We can then define an optimal decision ${\displaystyle d_{\mathrm {opt} }}$ as one that maximizes ${\displaystyle U_{D}(d)}$:

${\displaystyle d_{\mathrm {opt} }=\arg \max \limits _{d\in D}U_{D}(d).\,}$

Solving the problem can thus be divided into three steps:

1. predicting the outcome ${\displaystyle o}$ for every decision ${\displaystyle d;}$
2. assigning a utility ${\displaystyle U_{O}(o)}$ to every outcome ${\displaystyle o;}$
3. finding the decision ${\displaystyle d}$ that maximizes ${\displaystyle U_{D}(d).}$
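The three steps above can be sketched in code. This is a minimal illustration only: the decisions, the outcome function, and the utility values are made-up examples (echoing the sports-car example earlier), not anything specified in the text.

```python
# Sketch of the three-step procedure: predict outcomes, assign utilities,
# maximize. All decisions, outcomes, and utility values are illustrative.
decisions = ["sports car", "station wagon"]  # the set D

def f(d):
    """Step 1: predict the outcome o for every decision d."""
    return {"sports car": "great image, low versatility",
            "station wagon": "plain image, high versatility"}[d]

def U_O(o):
    """Step 2: assign a utility U_O(o) to every outcome o."""
    return {"great image, low versatility": 8,
            "plain image, high versatility": 6}[o]

def U_D(d):
    """Utility of a decision: U_D(d) = U_O(f(d))."""
    return U_O(f(d))

# Step 3: find the decision that maximizes U_D(d).
d_opt = max(decisions, key=U_D)
```

With these illustrative utilities, `d_opt` is `"sports car"`, since its outcome was assigned the higher utility.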

## Under uncertainty in outcome

If it is not possible to predict with certainty what the outcome of a particular decision will be, a probabilistic approach is necessary. In its most general form, it can be expressed as follows:

Given a decision ${\displaystyle d}$, we know the probability distribution for the possible outcomes described by the conditional probability density ${\displaystyle p(o|d)}$. Considering ${\displaystyle U_{D}(d)}$ as a random variable (conditional on ${\displaystyle d}$), we can calculate the expected utility of decision ${\displaystyle d}$ as

${\displaystyle {\text{E}}U_{D}(d)=\int {p(o|d)U_{O}(o)\,do}\,}$ ,

where the integral is taken over the whole set ${\displaystyle O}$ (DeGroot, p. 121).

An optimal decision ${\displaystyle d_{\mathrm {opt} }}$ is then one that maximizes ${\displaystyle {\text{E}}U_{D}(d)}$, just as above:

${\displaystyle d_{\mathrm {opt} }=\arg \max \limits _{d\in D}{\text{E}}U_{D}(d).\,}$
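When the outcome set is discrete, the integral above becomes a probability-weighted sum, which can be computed directly. The decisions, conditional outcome probabilities, and utilities below are illustrative assumptions, not values from the text.

```python
# Discrete analogue of EU_D(d) = ∫ p(o|d) U_O(o) do: a probability-weighted
# sum over outcomes for each decision. All numbers are illustrative.
p = {  # p[d][o] = conditional probability of outcome o given decision d
    "risky": {"big win": 0.4, "nothing": 0.6},
    "safe":  {"small win": 1.0},
}
U = {"big win": 100, "nothing": 0, "small win": 30}  # utilities U_O(o)

def expected_utility(d):
    # EU_D(d) = sum over outcomes of p(o|d) * U_O(o)
    return sum(prob * U[o] for o, prob in p[d].items())

# The optimal decision maximizes expected utility, as above.
d_opt = max(p, key=expected_utility)
```

Here the risky option has expected utility 0.4 × 100 = 40, beating the certain 30 of the safe option, so `d_opt` is `"risky"`; with a lower win probability the ranking would flip.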

An example is the Monty Hall problem.
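For the Monty Hall problem, assigning utility 1 to winning the car and 0 otherwise makes expected utility equal the win probability, so the optimal decision can be found by enumerating the nine equally likely (car position, first pick) configurations:

```python
# Expected-utility analysis of the Monty Hall problem. Decisions: "stay"
# with the first pick, or "switch" to the remaining closed door.
from itertools import product

doors = [0, 1, 2]
wins = {"stay": 0, "switch": 0}
for car, pick in product(doors, doors):  # 9 equally likely cases
    # The host opens a door that is neither the pick nor the car.
    host = next(d for d in doors if d != pick and d != car)
    other = next(d for d in doors if d != pick and d != host)
    wins["stay"] += (pick == car)     # staying wins iff the first pick was right
    wins["switch"] += (other == car)  # switching wins iff it was wrong

# Utility 1 for the car, 0 otherwise, so EU(d) = P(win | d).
eu = {d: wins[d] / 9 for d in wins}
d_opt = max(eu, key=eu.get)  # "switch": EU 2/3 versus 1/3 for "stay"
```

Switching wins in 6 of the 9 configurations, so the optimal decision is to switch.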

## Related Research Articles

Within economics, the concept of utility is used to model worth or value. Its usage has evolved significantly over time. The term was introduced initially as a measure of pleasure or satisfaction within the theory of utilitarianism by moral philosophers such as Jeremy Bentham and John Stuart Mill. The term has been adapted and reapplied within neoclassical economics, which dominates modern economic theory, as a utility function that represents a consumer's preference ordering over a choice set. Utility has thus become a more abstract concept that is not necessarily solely based on the satisfaction or pleasure received.

Satisficing is a decision-making strategy or cognitive heuristic that entails searching through the available alternatives until an acceptability threshold is met. The term satisficing, a portmanteau of satisfy and suffice, was introduced by Herbert A. Simon in 1956, although the concept was first posited in his 1947 book Administrative Behavior. Simon used satisficing to explain the behavior of decision makers under circumstances in which an optimal solution cannot be determined. He maintained that many natural problems are characterized by computational intractability or a lack of information, both of which preclude the use of mathematical optimization procedures. He observed in his Nobel Prize in Economics speech that "decision makers can satisfice either by finding optimum solutions for a simplified world, or by finding satisfactory solutions for a more realistic world. Neither approach, in general, dominates the other, and both have continued to co-exist in the world of management science".

The Viterbi algorithm is a dynamic programming algorithm for finding the most likely sequence of hidden states—called the Viterbi path—that results in a sequence of observed events, especially in the context of Markov information sources and hidden Markov models (HMM).

In mathematical optimization and decision theory, a loss function or cost function is a function that maps an event or values of one or more variables onto a real number intuitively representing some "cost" associated with the event. An optimization problem seeks to minimize a loss function. An objective function is either a loss function or its negative, in which case it is to be maximized.

In economics, game theory, and decision theory, the expected utility hypothesis—concerning people's preferences with regard to choices that have uncertain outcomes (gambles)⁠—states that the subjective value associated with an individual's gamble is the statistical expectation of that individual's valuations of the outcomes of that gamble, where these valuations may differ from the dollar value of those outcomes. Daniel Bernoulli's introduction of the St. Petersburg paradox in 1738 is considered the beginning of the hypothesis. This hypothesis has proven useful to explain some popular choices that seem to contradict the expected value criterion, such as occur in the contexts of gambling and insurance.

In mathematics, a Markov decision process (MDP) is a discrete-time stochastic control process. It provides a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying optimization problems solved via dynamic programming and reinforcement learning. MDPs were known at least as early as the 1950s; a core body of research on Markov decision processes resulted from Ronald Howard's 1960 book, Dynamic Programming and Markov Processes. They are used in many disciplines, including robotics, automatic control, economics and manufacturing. The name of MDPs comes from the Russian mathematician Andrey Markov as they are an extension of Markov chains.

In signal processing, a matched filter is obtained by correlating a known delayed signal, or template, with an unknown signal to detect the presence of the template in the unknown signal. This is equivalent to convolving the unknown signal with a conjugated time-reversed version of the template. The matched filter is the optimal linear filter for maximizing the signal-to-noise ratio (SNR) in the presence of additive stochastic noise.

A Bellman equation, named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming. It writes the "value" of a decision problem at a certain point in time in terms of the payoff from some initial choices and the "value" of the remaining decision problem that results from those initial choices. This breaks a dynamic optimization problem into a sequence of simpler subproblems, as Bellman's “principle of optimality” prescribes.

In Bayesian statistics, a maximum a posteriori probability (MAP) estimate is an estimate of an unknown quantity, that equals the mode of the posterior distribution. The MAP can be used to obtain a point estimate of an unobserved quantity on the basis of empirical data. It is closely related to the method of maximum likelihood (ML) estimation, but employs an augmented optimization objective which incorporates a prior distribution over the quantity one wants to estimate. MAP estimation can therefore be seen as a regularization of ML estimation.

In probability theory and intertemporal portfolio choice, the Kelly criterion, also known as the scientific gambling method, is a formula for bet sizing that leads almost surely to higher wealth compared to any other strategy in the long run. The Kelly bet size is found by maximizing the expected value of the logarithm of wealth, which is equivalent to maximizing the expected geometric growth rate. The Kelly criterion prescribes betting a fixed fraction of current assets, a result that can seem counterintuitive.
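The maximization described above can be checked numerically; the bet terms (win probability p = 0.6, even-money payout b = 1) are illustrative assumptions, and the closed-form fraction f* = p − (1 − p)/b is the standard Kelly result for a simple b-to-1 bet.

```python
# Kelly criterion sketch: maximize E[log wealth] over the bet fraction f
# for a bet paying b-to-1 with win probability p. Numbers are illustrative.
import math

p, b = 0.6, 1.0
fractions = [i / 1000 for i in range(1000)]  # candidate fractions in [0, 1)

def expected_log_wealth(f):
    # Expected log of wealth after betting fraction f of current wealth.
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)

f_best = max(fractions, key=expected_log_wealth)   # grid-search maximizer
f_closed = p - (1 - p) / b                         # closed-form Kelly fraction
```

For these numbers both approaches give a bet of 20% of current wealth, and betting the full bankroll (f near 1) drives expected log wealth to minus infinity, which is why Kelly betting never stakes everything.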

In decision theory, a score function, or scoring rule, measures the accuracy of probabilistic predictions. It is applicable to tasks in which predictions must assign probabilities to a set of mutually exclusive outcomes. The set of possible outcomes can be either binary or categorical in nature, and the probabilities assigned to this set of outcomes must sum to one. A score can be thought of as either a measure of the "calibration" of a set of probabilistic predictions, or as a "cost function" or "loss function".

In decision theory, the expected value of sample information (EVSI) is the expected increase in utility that a decision-maker could obtain from gaining access to a sample of additional observations before making a decision. The additional information obtained from the sample may allow them to make a more informed, and thus better, decision, thus resulting in an increase in expected utility. EVSI attempts to estimate what this improvement would be before seeing actual sample data; hence, EVSI is a form of what is known as preposterior analysis.

Causal decision theory is a mathematical theory intended to determine the set of rational choices in a given situation. It is a school of thought in decision theory. In informal terms, it maintains that the rational choice is that with the best expected causal consequences. This theory is often contrasted with evidential decision theory, which recommends those actions that provide the best expected outcome conditional on one’s best evidence about the world.

In decision theory, the von Neumann–Morgenstern utility theorem shows that, under certain axioms of rational behavior, a decision-maker faced with risky (probabilistic) outcomes of different choices will behave as if he or she is maximizing the expected value of some function defined over the potential outcomes at some specified point in the future. This function is known as the von Neumann–Morgenstern utility function. The theorem is the basis for expected utility theory.

In queueing theory, a discipline within the mathematical theory of probability, the backpressure routing algorithm is a method for directing traffic around a queueing network that achieves maximum network throughput, which is established using concepts of Lyapunov drift. Backpressure routing considers the situation where each job can visit multiple service nodes in the network. It is an extension of max-weight scheduling where each job visits only a single service node.

In decision theory and quantitative policy analysis, the expected value of including uncertainty (EVIU) is the expected difference in the value of a decision based on a probabilistic analysis versus a decision based on an analysis that ignores uncertainty.

Optimal computing budget allocation (OCBA) is an approach to maximize the overall simulation efficiency for finding an optimal decision. The concept was introduced in the mid-1990s by Dr. Chun-Hung Chen. Simply put, OCBA is an approach to simulation that will help determine the number of replications or the simulation time that is needed in order to receive acceptable or best results within a set of given parameters. This is accomplished by using an asymptotic framework to analyze the structure of the optimal allocation. OCBA has also been shown effective in enhancing partition-based random search algorithms for solving deterministic global optimization problems.

In mechanism design, a Vickrey–Clarke–Groves (VCG) mechanism is a generic truthful mechanism for achieving a socially-optimal solution. It is a generalization of a Vickrey–Clarke–Groves auction. A VCG auction performs a specific task: dividing items among people. A VCG mechanism is more general: it can be used to select any outcome out of a set of possible outcomes.

A Bayesian-optimal mechanism (BOM) is a mechanism in which the designer does not know the valuations of the agents for whom the mechanism is designed, but he knows that they are random variables and he knows the probability distribution of these variables.

Dynamic discrete choice (DDC) models, also known as discrete choice models of dynamic programming, model an agent's choices over discrete options that have future implications. Rather than assuming observed choices are the result of static utility maximization, observed choices in DDC models are assumed to result from an agent's maximization of the present value of utility, generalizing the utility theory upon which discrete choice models are based.

## References

• Morris DeGroot. Optimal Statistical Decisions. McGraw-Hill, New York, 1970. ISBN 0-07-016242-5.
• James O. Berger. Statistical Decision Theory and Bayesian Analysis. Second Edition. Springer Series in Statistics, 1980. ISBN 0-387-96098-8.