Credible interval

Last updated

In Bayesian statistics, a credible interval is an interval within which an unobserved parameter value falls with a particular probability. It is an interval in the domain of a posterior probability distribution or a predictive distribution. [1] The generalisation to multivariate problems is the credible region. Credible intervals are analogous to confidence intervals in frequentist statistics, [2] although they differ on a philosophical basis: [3] Bayesian intervals treat their bounds as fixed and the estimated parameter as a random variable, whereas frequentist confidence intervals treat their bounds as random variables and the parameter as a fixed value. Also, Bayesian credible intervals use (and indeed, require) knowledge of the situation-specific prior distribution, while the frequentist confidence intervals do not.

Contents

For example, in an experiment that determines the distribution of possible values of the parameter ${\displaystyle \mu }$, if the subjective probability that ${\displaystyle \mu }$ lies between 35 and 45 is 0.95, then ${\displaystyle 35\leq \mu \leq 45}$ is a 95% credible interval.

Choosing a credible interval

Credible intervals are not unique on a posterior distribution. Methods for defining a suitable credible interval include:

• Choosing the narrowest interval, which for a unimodal distribution will involve choosing those values of highest probability density including the mode (the maximum a posteriori ). This is sometimes called the highest posterior density interval (HPDI).
• Choosing the interval where the probability of being below the interval is as likely as being above it. This interval will include the median. This is sometimes called the equal-tailed interval.
• Assuming that the mean exists, choosing the interval for which the mean is the central point.

It is possible to frame the choice of a credible interval within decision theory and, in that context, an optimal interval will always be a highest probability density set. [4]

Contrasts with confidence interval

A frequentist 95% confidence interval means that with a large number of repeated samples, 95% of such calculated confidence intervals would include the true value of the parameter. In frequentist terms, the parameter is fixed (cannot be considered to have a distribution of possible values) and the confidence interval is random (as it depends on the random sample).

Bayesian credible intervals can be quite different from frequentist confidence intervals for two reasons:

• credible intervals incorporate problem-specific contextual information from the prior distribution whereas confidence intervals are based only on the data;
• credible intervals and confidence intervals treat nuisance parameters in radically different ways.

For the case of a single parameter and data that can be summarised in a single sufficient statistic, it can be shown that the credible interval and the confidence interval will coincide if the unknown parameter is a location parameter (i.e. the forward probability function has the form ${\displaystyle \mathrm {Pr} (x|\mu )=f(x-\mu )}$ ), with a prior that is a uniform flat distribution; [5] and also if the unknown parameter is a scale parameter (i.e. the forward probability function has the form ${\displaystyle \mathrm {Pr} (x|s)=f(x/s)}$ ), with a Jeffreys' prior  ${\displaystyle \mathrm {Pr} (s|I)\;\propto \;1/s}$ [5] — the latter following because taking the logarithm of such a scale parameter turns it into a location parameter with a uniform distribution. But these are distinctly special (albeit important) cases; in general no such equivalence can be made.

Related Research Articles

Statistical inference is the process of using data analysis to deduce properties of an underlying distribution of probability. Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates. It is assumed that the observed data set is sampled from a larger population.

In probability and statistics, Student's t-distribution is any member of a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown. It was developed by William Sealy Gosset under the pseudonym Student.

In statistics, a confidence interval (CI) is a type of estimate computed from the statistics of the observed data. This proposes a range of plausible values for an unknown parameter. The interval has an associated confidence level that the true parameter is in the proposed range. Given observations and a confidence level , a valid confidence interval has a probability of containing the true underlying parameter. The level of confidence can be chosen by the investigator. In general terms, a confidence interval for an unknown parameter is based on sampling the distribution of a corresponding estimator.

In statistical inference, specifically predictive inference, a prediction interval is an estimate of an interval in which a future observation will fall, with a certain probability, given what has already been observed. Prediction intervals are often used in regression analysis.

In statistics, the question of checking whether a coin is fair is one whose importance lies, firstly, in providing a simple problem on which to illustrate basic ideas of statistical inference and, secondly, in providing a simple problem that can be used to compare various competing methods of statistical inference, including decision theory. The practical problem of checking whether a coin is fair might be considered as easily solved by performing a sufficiently large number of trials, but statistics and probability theory can provide guidance on two types of question; specifically those of how many trials to undertake and of the accuracy an estimate of the probability of turning up heads, derived from a given sample of trials.

In Bayesian statistics, a maximum a posteriori probability (MAP) estimate is an estimate of an unknown quantity, that equals the mode of the posterior distribution. The MAP can be used to obtain a point estimate of an unobserved quantity on the basis of empirical data. It is closely related to the method of maximum likelihood (ML) estimation, but employs an augmented optimization objective which incorporates a prior distribution over the quantity one wants to estimate. MAP estimation can therefore be seen as a regularization of ML estimation.

The following is a glossary of terms used in the mathematical sciences statistics and probability.

Lindley's paradox is a counterintuitive situation in statistics in which the Bayesian and frequentist approaches to a hypothesis testing problem give different results for certain choices of the prior distribution. The problem of the disagreement between the two approaches was discussed in Harold Jeffreys' 1939 textbook; it became known as Lindley's paradox after Dennis Lindley called the disagreement a paradox in a 1957 paper.

Fiducial inference is one of a number of different types of statistical inference. These are rules, intended for general application, by which conclusions can be drawn from samples of data. In modern statistical practice, attempts to work with fiducial inference have fallen out of fashion in favour of frequentist inference, Bayesian inference and decision theory. However, fiducial inference is important in the history of statistics since its development led to the parallel development of concepts and tools in theoretical statistics that are widely used. Some current research in statistical methodology is either explicitly linked to fiducial inference or is closely connected to it.

In probability theory, Dirichlet processes are a family of stochastic processes whose realizations are probability distributions. In other words, a Dirichlet process is a probability distribution whose range is itself a set of probability distributions. It is often used in Bayesian inference to describe the prior knowledge about the distribution of random variables—how likely it is that the random variables are distributed according to one or another particular distribution.

Neyman construction is a frequentist method to construct an interval at a confidence level such that if we repeat the experiment many times the interval will contain the true value of some parameter a fraction of the time. It is named after Jerzy Neyman.

In statistics, the 68–95–99.7 rule, also known as the empirical rule, is a shorthand used to remember the percentage of values that lie within a band around the mean in a normal distribution with a width of two, four and six standard deviations, respectively; more precisely, 68.27%, 95.45% and 99.73% of the values lie within one, two and three standard deviations of the mean, respectively.

Frequentist inference is a type of statistical inference that draws conclusions from sample data by emphasizing the frequency or proportion of the data. An alternative name is frequentist statistics. This is the inference framework in which the well-established methodologies of statistical hypothesis testing and confidence intervals are based. Other than frequentistic inference, the main alternative approach to statistical inference is Bayesian inference, while another is fiducial inference.

In statistics, additive smoothing, also called Laplace smoothing, or Lidstone smoothing, is a technique used to smooth categorical data. Given an observation from a multinomial distribution with trials, a "smoothed" version of the data gives the estimator:

Algorithmic inference gathers new developments in the statistical inference methods made feasible by the powerful computing devices widely available to any data analyst. Cornerstones in this field are computational learning theory, granular computing, bioinformatics, and, long ago, structural probability . The main focus is on the algorithms which compute statistics rooting the study of a random phenomenon, along with the amount of data they must feed on to produce reliable results. This shifts the interest of mathematicians from the study of the distribution laws to the functional properties of the statistics, and the interest of computer scientists from the algorithms for processing data to the information they process.

In statistical inference, the concept of a confidence distribution (CD) has often been loosely referred to as a distribution function on the parameter space that can represent confidence intervals of all levels for a parameter of interest. Historically, it has typically been constructed by inverting the upper limits of lower sided confidence intervals of all levels, and it was also commonly associated with a fiducial interpretation, although it is a purely frequentist concept. A confidence distribution is NOT a probability distribution function of the parameter of interest, but may still be a function useful for making inferences.

In Bayesian inference, the Bernstein-von Mises theorem provides the basis for using Bayesian credible sets for confidence statements in parametric models. It states that under some conditions, a posterior distribution converges in the limit of infinite data to a multivariate normal distribution centered at the maximum likelihood estimator with covariance matrix given by , where is the true population parameter and is the Fisher information matrix at the true population parameter value.

In probability theory and statistics, an inverse distribution is the distribution of the reciprocal of a random variable. Inverse distributions arise in particular in the Bayesian context of prior distributions and posterior distributions for scale parameters. In the algebra of random variables, inverse distributions are special cases of the class of ratio distributions, in which the numerator random variable has a degenerate distribution.

In statistics, almost sure hypothesis testing or a.s. hypothesis testing utilizes almost sure convergence in order to determine the validity of a statistical hypothesis with probability one. This is to say that whenever the null hypothesis is true, then an a.s. hypothesis test will fail to reject the null hypothesis w.p. 1 for all sufficiently large samples. Similarly, whenever the alternative hypothesis is true, then an a.s. hypothesis test will reject the null hypothesis with probability one, for all sufficiently large samples. Along similar lines, an a.s. confidence interval eventually contains the parameter of interest with probability 1. Dembo and Peres (1994) proved the existence of almost sure hypothesis tests.

In statistics, suppose that we have been given some data, and we are constructing a statistical model of that data. The relative likelihood compares the relative plausibilities of different candidate models or of different values of a parameter of a single model.

References

1. Edwards, Ward, Lindman, Harold, Savage, Leonard J. (1963) "Bayesian statistical inference in psychological research". Psychological Review, 70, 193-242
2. Lee, P.M. (1997) Bayesian Statistics: An Introduction, Arnold. ISBN   0-340-67785-6
3. O'Hagan, A. (1994) Kendall's Advanced Theory of Statistics, Vol 2B, Bayesian Inference, Section 2.51. Arnold, ISBN   0-340-52922-9
4. Jaynes, E. T. (1976). "Confidence Intervals vs Bayesian Intervals", in Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science, (W. L. Harper and C. A. Hooker, eds.), Dordrecht: D. Reidel, pp. 175 et seq