# Optimal decision


An optimal decision is a decision that leads to at least as good a known or expected outcome as all other available decision options. It is an important concept in decision theory. In order to compare the different decision outcomes, one commonly assigns a utility value to each of them. If there is uncertainty as to what the outcome will be, then under the von Neumann–Morgenstern axioms the optimal decision maximizes the expected utility (a probability-weighted average of utility over all possible outcomes of a decision).

Decision theory is the study of an agent's choices. Decision theory can be broken into two branches: normative decision theory, which analyzes the outcomes of decisions or determines the optimal decisions given constraints and assumptions, and descriptive decision theory, which analyzes how agents actually make the decisions they do.

Within economics, the concept of utility is used to model worth or value. Its usage has evolved significantly over time. The term was introduced initially as a measure of pleasure or satisfaction within the theory of utilitarianism by moral philosophers such as Jeremy Bentham and John Stuart Mill. The term has been adapted and reapplied within neoclassical economics, which dominates modern economic theory, as a utility function that represents a consumer's preference ordering over a choice set. It is devoid of its original interpretation as a measurement of the pleasure or satisfaction obtained by the consumer from that choice.

In economics, game theory, and decision theory, the expected utility hypothesis—concerning people's preferences with regard to choices that have uncertain outcomes (gambles)—states that the subjective value associated with an individual's gamble is the statistical expectation of that individual's valuations of the outcomes of that gamble, where these valuations may differ from the dollar value of those outcomes. Daniel Bernoulli's treatment of the St. Petersburg paradox in 1738 is considered the beginning of the hypothesis. The hypothesis has proven useful in explaining some popular choices that seem to contradict the expected value criterion, such as those that occur in the contexts of gambling and insurance.


Sometimes, the equivalent problem of minimizing the expected value of loss is considered, where loss is (–1) times utility.

In probability theory, the expected value of a random variable is, intuitively, the long-run average value of repetitions of the same experiment it represents. For example, the expected value of rolling a six-sided die is 3.5, because the average of the numbers that come up approaches 3.5 as the number of rolls grows. More formally, the law of large numbers states that the arithmetic mean of the values almost surely converges to the expected value as the number of repetitions approaches infinity. The expected value is also known as the expectation, mathematical expectation, EV, average, mean value, mean, or first moment.
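The die example can be checked directly; the computation below is just the equally weighted average of the six faces:

```python
# Expected value of a fair six-sided die: each face has probability 1/6,
# so the expectation is the plain average of the faces.
faces = [1, 2, 3, 4, 5, 6]
expected_value = sum(faces) / len(faces)
print(expected_value)  # 3.5
```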

In mathematical optimization and decision theory, a loss function or cost function is a function that maps an event or values of one or more variables onto a real number intuitively representing some "cost" associated with the event. An optimization problem seeks to minimize a loss function. An objective function is either a loss function or its negative, in which case it is to be maximized.

"Utility" is only an arbitrary term for quantifying the desirability of a particular decision outcome and not necessarily related to "usefulness." For example, it may well be the optimal decision for someone to buy a sports car rather than a station wagon, if the outcome in terms of another criterion (e.g., effect on personal image) is more desirable, even given the higher cost and lack of versatility of the sports car.

The problem of finding the optimal decision is a mathematical optimization problem. In practice, few people verify that their decisions are optimal, but instead use heuristics to make decisions that are "good enough"; that is, they engage in satisficing.

Mathematical optimization or mathematical programming is the selection of a best element from some set of available alternatives. Optimization problems arise in all quantitative disciplines, from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.

Satisficing is a decision-making strategy or cognitive heuristic that entails searching through the available alternatives until an acceptability threshold is met. The term satisficing, a portmanteau of satisfy and suffice, was introduced by Herbert A. Simon in 1956, although the concept first appeared in his 1947 book Administrative Behavior. Simon used satisficing to explain the behavior of decision makers under circumstances in which an optimal solution cannot be determined. He maintained that many natural problems are characterized by computational intractability or a lack of information, both of which preclude the use of mathematical optimization procedures. He observed in his Nobel Prize in Economics speech that "decision makers can satisfice either by finding optimum solutions for a simplified world, or by finding satisfactory solutions for a more realistic world. Neither approach, in general, dominates the other, and both have continued to co-exist in the world of management science".

A more formal approach may be used when the decision is important enough to motivate the time it takes to analyze it, or when it is too complex to solve with simpler intuitive approaches, as when there are many available decision options and a complex decision–outcome relationship.

## Formal mathematical description

Each decision ${\displaystyle d}$ in a set ${\displaystyle D}$ of available decision options will lead to an outcome ${\displaystyle o=f(d)}$. All possible outcomes form the set ${\displaystyle O}$. Assigning a utility ${\displaystyle U_{O}(o)}$ to every outcome, we can define the utility of a particular decision ${\displaystyle d}$ as

${\displaystyle U_{D}(d)\ =\ U_{O}(f(d)).\,}$

We can then define an optimal decision ${\displaystyle d_{\mathrm {opt} }}$ as one that maximizes ${\displaystyle U_{D}(d)}$:

${\displaystyle d_{\mathrm {opt} }=\arg \max \limits _{d\in D}U_{D}(d).\,}$

Solving the problem can thus be divided into three steps:

1. predicting the outcome ${\displaystyle o}$ for every decision ${\displaystyle d;}$
2. assigning a utility ${\displaystyle U_{O}(o)}$ to every outcome ${\displaystyle o;}$
3. finding the decision ${\displaystyle d}$ that maximizes ${\displaystyle U_{D}(d).}$
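The three steps above can be sketched for a small, fully known decision problem; the decisions, outcomes, and utility values below are illustrative assumptions, echoing the sports-car example:

```python
# Hypothetical decision problem: each decision leads deterministically to one outcome.
decisions = ["sports car", "station wagon"]

# Step 1: predict the outcome o = f(d) for every decision d.
outcome = {
    "sports car": "high image, low versatility",
    "station wagon": "low image, high versatility",
}

# Step 2: assign a utility U_O(o) to every outcome o (assumed values).
utility = {
    "high image, low versatility": 8,
    "low image, high versatility": 6,
}

# Step 3: find the decision d maximizing U_D(d) = U_O(f(d)).
d_opt = max(decisions, key=lambda d: utility[outcome[d]])
print(d_opt)  # sports car
```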

## Under uncertainty in outcome

In case it is not possible to predict with certainty what the outcome of a particular decision will be, a probabilistic approach is necessary. In its most general form, it can be expressed as follows:

Given a decision ${\displaystyle d}$, we know the probability distribution for the possible outcomes described by the conditional probability density ${\displaystyle p(o|d)}$. Considering ${\displaystyle U_{D}(d)}$ as a random variable (conditional on ${\displaystyle d}$), we can calculate the expected utility of decision ${\displaystyle d}$ as

In probability theory and statistics, given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X is the probability distribution of Y when X is known to be a particular value; in some cases the conditional probabilities may be expressed as functions containing the unspecified value x of X as a parameter. When both X and Y are categorical variables, a conditional probability table is typically used to represent the conditional probability. The conditional distribution contrasts with the marginal distribution of a random variable, which is its distribution without reference to the value of the other variable.

In probability and statistics, a random variable, random quantity, aleatory variable, or stochastic variable is described informally as a variable whose values depend on outcomes of a random phenomenon. The formal mathematical treatment of random variables is a topic in probability theory. In that context, a random variable is understood as a measurable function defined on a probability space whose outcomes are typically real numbers.

${\displaystyle {\text{E}}U_{D}(d)=\int p(o|d)\,U_{O}(o)\,do,}$

where the integral is taken over the whole set ${\displaystyle O}$ (DeGroot, p. 121).

An optimal decision ${\displaystyle d_{\mathrm {opt} }}$ is then one that maximizes ${\displaystyle {\text{E}}U_{D}(d)}$, just as above:

${\displaystyle d_{\mathrm {opt} }=\arg \max \limits _{d\in D}{\text{E}}U_{D}(d).\,}$

An example is the Monty Hall problem.
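For the Monty Hall problem, the two decisions are "stay" and "switch". Taking the utility of winning the car as 1 and of losing as 0, the expected utility of each decision equals its win probability, and the standard win probabilities (1/3 for staying, 2/3 for switching) give the optimal decision directly:

```python
from fractions import Fraction

# Standard Monty Hall win probabilities:
# P(win | stay)   = P(initial pick was the car)  = 1/3
# P(win | switch) = P(initial pick was a goat)   = 2/3
p_win = {"stay": Fraction(1, 3), "switch": Fraction(2, 3)}

# With U(win) = 1 and U(lose) = 0, expected utility is the win probability.
expected_utility = {d: p * 1 + (1 - p) * 0 for d, p in p_win.items()}
d_opt = max(expected_utility, key=expected_utility.get)
print(d_opt, expected_utility[d_opt])  # switch 2/3
```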

In psychology, decision-making is regarded as the cognitive process resulting in the selection of a belief or a course of action among several alternative possibilities. Decision-making is the process of identifying and choosing alternatives based on the values, preferences and beliefs of the decision-maker. Every decision-making process produces a final choice, which may or may not prompt action.

Decision-making software is software for computer applications that help individuals and organisations make choices and take decisions, typically by ranking, prioritizing or choosing from a number of options.

Two-alternative forced choice (2AFC) is a method for measuring the subjective experience of a person or animal through their pattern of choices and response times. The subject is presented with two alternative options, only one of which contains the target stimulus, and is forced to choose which one was the correct option. Both options can be presented concurrently or sequentially in two intervals. The term 2AFC is often mistakenly used for describing the more common yes-no task, where a subject is presented with one option only and is forced to choose whether it belongs to one or another category. 2AFC is a method of psychophysics developed by Gustav Theodor Fechner.

## Related Research Articles

Minimax is a decision rule used in artificial intelligence, decision theory, game theory, statistics and philosophy for minimizing the possible loss for a worst case scenario. When dealing with gains, it is referred to as "maximin"—to maximize the minimum gain. Originally formulated for two-player zero-sum game theory, covering both the cases where players take alternate moves and those where they make simultaneous moves, it has also been extended to more complex games and to general decision-making in the presence of uncertainty.
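The minimax rule can be illustrated on a tiny loss table (the decisions, scenarios, and loss values here are made up for illustration):

```python
# Hypothetical losses: rows are our decisions, columns are possible scenarios.
losses = {
    "umbrella":    {"rain": 1, "sun": 2},
    "no umbrella": {"rain": 5, "sun": 0},
}

# Minimax: pick the decision whose worst-case (maximum) loss is smallest.
d_opt = min(losses, key=lambda d: max(losses[d].values()))
print(d_opt)  # umbrella
```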

A Markov decision process (MDP) is a discrete-time stochastic control process. It provides a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying optimization problems solved via dynamic programming and reinforcement learning. MDPs were known at least as early as the 1950s; a core body of research on Markov decision processes resulted from Ronald Howard's 1960 book, Dynamic Programming and Markov Processes. They are used in many disciplines, including robotics, automatic control, economics and manufacturing. The name of MDPs comes from the Russian mathematician Andrey Markov.

In signal processing, a matched filter is obtained by correlating a known delayed signal, or template, with an unknown signal to detect the presence of the template in the unknown signal. This is equivalent to convolving the unknown signal with a conjugated time-reversed version of the template. The matched filter is the optimal linear filter for maximizing the signal-to-noise ratio (SNR) in the presence of additive stochastic noise.

In Bayesian statistics, a maximum a posteriori probability (MAP) estimate is an estimate of an unknown quantity, that equals the mode of the posterior distribution. The MAP can be used to obtain a point estimate of an unobserved quantity on the basis of empirical data. It is closely related to the method of maximum likelihood (ML) estimation, but employs an augmented optimization objective which incorporates a prior distribution over the quantity one wants to estimate. MAP estimation can therefore be seen as a regularization of ML estimation.

In probability theory and intertemporal portfolio choice, the Kelly criterion, Kelly strategy, Kelly formula, or Kelly bet is a formula for bet sizing that leads almost surely to higher wealth compared to any other strategy in the long run. The Kelly bet size is found by maximizing the expected logarithm of wealth which is equivalent to maximizing the expected geometric growth rate.

In computational complexity theory, the maximum satisfiability problem (MAX-SAT) is the problem of determining the maximum number of clauses, of a given Boolean formula in conjunctive normal form, that can be made true by an assignment of truth values to the variables of the formula. It is a generalization of the Boolean satisfiability problem, which asks whether there exists a truth assignment that makes all clauses true.

Stochastic dominance is a partial order between random variables. It is a form of stochastic ordering. The concept arises in decision theory and decision analysis in situations where one gamble can be ranked as superior to another gamble for a broad class of decision-makers. It is based on shared preferences regarding sets of possible outcomes and their associated probabilities. Only limited knowledge of preferences is required for determining dominance. Risk aversion is a factor only in second order stochastic dominance.

In decision theory, a score function, or scoring rule, measures the accuracy of probabilistic predictions. It is applicable to tasks in which predictions must assign probabilities to a set of mutually exclusive outcomes. The set of possible outcomes can be either binary or categorical in nature, and the probabilities assigned to this set of outcomes must sum to one. A score can be thought of as either a measure of the "calibration" of a set of probabilistic predictions, or as a "cost function" or "loss function".

In decision theory, the expected value of sample information (EVSI) is the expected increase in utility that a decision-maker could obtain from gaining access to a sample of additional observations before making a decision. The additional information obtained from the sample may allow them to make a more informed, and thus better, decision, thus resulting in an increase in expected utility. EVSI attempts to estimate what this improvement would be before seeing actual sample data; hence, EVSI is a form of what is known as preposterior analysis.

Causal decision theory is a mathematical theory intended to determine the set of rational choices in a given situation. In informal terms, it maintains that the rational choice is that with the best expected causal consequences. This theory is often contrasted with evidential decision theory, which recommends those actions that provide the best expected outcome conditional on one’s best evidence about the world.

Evidential decision theory is a school of thought within decision theory according to which the best action is the one which, conditional on one's having chosen it, gives one the best expectations for the outcome. It contrasts with causal decision theory, which requires a causal connection between one's actions and the desirable outcome.

In decision theory, the von Neumann–Morgenstern utility theorem shows that, under certain axioms of rational behavior, a decision-maker faced with risky (probabilistic) outcomes of different choices will behave as if he or she is maximizing the expected value of some function defined over the potential outcomes at some specified point in the future. This function is known as the von Neumann–Morgenstern utility function. The theorem is the basis for expected utility theory.

AIXI [ai̯k͡siː] is a theoretical mathematical formalism for artificial general intelligence. It combines Solomonoff induction with sequential decision theory. AIXI was first proposed by Marcus Hutter in 2000 and several results regarding AIXI are proved in Hutter's 2005 book Universal Artificial Intelligence.

In decision theory and quantitative policy analysis, the expected value of including uncertainty (EVIU) is the expected difference in the value of a decision based on a probabilistic analysis versus a decision based on an analysis that ignores uncertainty.

In mechanism design, a Vickrey–Clarke–Groves (VCG) mechanism is a generic truthful mechanism for achieving a socially-optimal solution. It is a generalization of a Vickrey–Clarke–Groves auction. A VCG auction performs a specific task: dividing items among people. A VCG mechanism is more general: it can be used to select any outcome out of a set of possible outcomes.

A random-sampling mechanism (RSM) is a truthful mechanism that uses sampling in order to achieve approximately-optimal gain in prior-free mechanisms and prior-independent mechanisms.

Dynamic discrete choice (DDC) models, also known as discrete choice models of dynamic programming, model an agent's choices over discrete options that have future implications. Rather than assuming observed choices are the result of static utility maximization, observed choices in DDC models are assumed to result from an agent's maximization of the present value of utility, generalizing the utility theory upon which discrete choice models are based.

## References

• Morris DeGroot, Optimal Statistical Decisions. McGraw-Hill, New York, 1970. ISBN 0-07-016242-5.
• James O. Berger, Statistical Decision Theory and Bayesian Analysis. 2nd ed. Springer Series in Statistics, 1980. ISBN 0-387-96098-8.