In probability theory, an experiment or trial (see below) is any procedure that can be infinitely repeated and has a well-defined set of possible outcomes, known as the sample space.[1] An experiment is said to be random if it has more than one possible outcome, and deterministic if it has only one. A random experiment that has exactly two (mutually exclusive) possible outcomes is known as a Bernoulli trial.[2]
When an experiment is conducted, one (and only one) outcome results, although this outcome may be included in any number of events, all of which would be said to have occurred on that trial. After conducting many trials of the same experiment and pooling the results, an experimenter can begin to assess the empirical probabilities of the various outcomes and events that can occur in the experiment and apply the methods of statistical analysis.
Random experiments are often conducted repeatedly, so that the collective results may be subjected to statistical analysis. A fixed number of repetitions of the same experiment can be thought of as a composed experiment, in which case the individual repetitions are called trials. For example, if one were to toss the same coin one hundred times and record each result, each toss would be considered a trial within the experiment composed of all hundred tosses.[3]
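To make the composed experiment concrete, here is a minimal Python sketch (the function name composed_experiment is hypothetical) that simulates one hundred tosses of a fair coin, pools the results, and reports the empirical frequency of each outcome:

```python
import random

def composed_experiment(num_trials=100, seed=0):
    """Simulate num_trials tosses of the same fair coin; each toss is one trial."""
    rng = random.Random(seed)
    return [rng.choice("HT") for _ in range(num_trials)]

tosses = composed_experiment()
# Pool the results and estimate the empirical probability of each outcome.
for face in "HT":
    print(face, tosses.count(face) / len(tosses))
```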
A random experiment is described or modeled by a mathematical construct known as a probability space. A probability space is constructed and defined with a specific kind of experiment or trial in mind.
A mathematical description of an experiment consists of three parts:
1. A sample space, Ω, which is the set of all possible outcomes;
2. A set of events, where each event is a subset of the sample space containing zero or more outcomes;
3. The assignment of probabilities to the events, that is, a function P mapping from events to probabilities.
An outcome is the result of a single execution of the model. Since individual outcomes might be of little practical use, more complicated events are used to characterize groups of outcomes. The collection of all such events is a σ-algebra. Finally, there is a need to specify each event's likelihood of happening; this is done using the probability measure function, P.
Once an experiment is designed and established, each execution of it selects exactly one outcome ω from the sample space Ω. All the events in the σ-algebra that contain the selected outcome ω (recall that each event is a subset of Ω) are said to "have occurred". The probability function P is defined in such a way that, if the experiment were to be repeated an infinite number of times, the relative frequencies of occurrence of each of the events would approach agreement with the values P assigns them.
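For a finite sample space this construct can be written out explicitly. The following sketch (all names are illustrative, and the power set is used as the σ-algebra, which is valid only because Ω is finite) models one throw of a fair die:

```python
from itertools import combinations
from fractions import Fraction

# Sample space for one throw of a fair six-sided die.
omega = frozenset({1, 2, 3, 4, 5, 6})

# For a finite sample space, the power set is a valid sigma-algebra.
def power_set(s):
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

events = power_set(omega)  # 2**6 = 64 events, each a subset of omega

# Probability measure: every outcome is equally likely.
def P(event):
    return Fraction(len(event), len(omega))

even = frozenset({2, 4, 6})        # the event "an even number is thrown"
print(P(even))                      # 1/2
print(P(omega), P(frozenset()))     # 1 and 0, as required of a probability measure
```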
As a simple experiment, we may flip a coin twice. The sample space (where the order of the two flips is relevant) is {(H, T), (T, H), (T, T), (H, H)}, where "H" means "heads" and "T" means "tails". Note that each of (H, T), (T, H), (T, T), and (H, H) is a possible outcome of the experiment. We may define an event that occurs when "heads" comes up in at least one of the two flips. This event contains all of the outcomes except (T, T).
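A short sketch of this example, with hypothetical names, enumerating the sample space and the event "heads occurs in at least one flip":

```python
from itertools import product
from fractions import Fraction

# Sample space of two ordered flips of a fair coin.
omega = set(product("HT", repeat=2))

# Event: "heads occurs in at least one of the two flips".
at_least_one_head = {outcome for outcome in omega if "H" in outcome}

print(sorted(at_least_one_head))                      # every outcome except ('T', 'T')
print(Fraction(len(at_least_one_head), len(omega)))   # 3/4
```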
In probability theory, the sample space of an experiment or random trial is the set of all possible outcomes or results of that experiment. A sample space is usually denoted using set notation, and the possible ordered outcomes, or sample points, are listed as elements in the set. It is common to refer to a sample space by the labels S, Ω, or U. The elements of a sample space may be numbers, words, letters, or symbols. A sample space can also be finite, countably infinite, or uncountably infinite.
In probability theory, an event is a set of outcomes of an experiment to which a probability is assigned. A single outcome may be an element of many different events, and different events in an experiment are usually not equally likely, since they may include very different groups of outcomes. An event consisting of only a single outcome is called an elementary event or an atomic event; that is, it is a singleton set. An event that consists of more than one possible outcome is called a compound event. An event is said to occur if it contains the outcome of the experiment; the probability that an event occurs is therefore the probability that it contains that outcome. An event defines a complementary event, namely the complementary set, and together these define a Bernoulli trial: did the event occur or not?
Probability theory or probability calculus is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set of axioms. Typically these axioms formalise probability in terms of a probability space, which assigns a measure taking values between 0 and 1, termed the probability measure, to a set of outcomes called the sample space. Any specified subset of the sample space is called an event.
In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon in terms of its sample space and the probabilities of events.
A random variable is a mathematical formalization of a quantity or object which depends on random events. The term 'random variable' can be misleading as its mathematical definition is not actually random nor a variable, but rather it is a function from possible outcomes in a sample space to a measurable space, often to the real numbers.
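As an illustration of a random variable being simply a function on the sample space, the sketch below (names hypothetical) maps each outcome of the two-flip experiment above to the number of heads it contains:

```python
from itertools import product

# Sample space of the two-flip experiment.
omega = set(product("HT", repeat=2))

# A random variable is a function from outcomes to (here) the real numbers:
# X maps each outcome to the number of heads it contains.
def X(outcome):
    return outcome.count("H")

for outcome in sorted(omega):
    print(outcome, "->", X(outcome))   # e.g. ('H', 'T') -> 1
```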
In mathematical analysis and in probability theory, a σ-algebra on a set X is a nonempty collection Σ of subsets of X closed under complement, countable unions, and countable intersections. The ordered pair (X, Σ) is called a measurable space.
In probability theory, a probability space or a probability triple is a mathematical construct that provides a formal model of a random process or "experiment". For example, one can define a probability space which models the throwing of a die.
In probability and statistics, a Bernoulli process is a finite or infinite sequence of binary random variables, so it is a discrete-time stochastic process that takes only two values, canonically 0 and 1. The component Bernoulli variables Xi are identically distributed and independent. Prosaically, a Bernoulli process is repeated coin flipping, possibly with an unfair coin. Every variable Xi in the sequence is associated with a Bernoulli trial or experiment. They all have the same Bernoulli distribution. Much of what can be said about the Bernoulli process can also be generalized to more than two outcomes; this generalization is known as the Bernoulli scheme.
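A minimal simulation of a Bernoulli process, assuming a possibly unfair coin with success probability p (the function name is illustrative):

```python
import random

def bernoulli_process(p, n, seed=0):
    """n independent Bernoulli(p) variables: repeated flips of a possibly unfair coin."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

xs = bernoulli_process(p=0.3, n=10_000)
print(sum(xs) / len(xs))   # close to 0.3, the success probability
```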
In the theory of probability and statistics, a Bernoulli trial is a random experiment with exactly two possible outcomes, "success" and "failure", in which the probability of success is the same every time the experiment is conducted. It is named after Jacob Bernoulli, a 17th-century Swiss mathematician, who analyzed them in his Ars Conjectandi (1713).
In probability and statistics, a probability mass function is a function that gives the probability that a discrete random variable is exactly equal to some value. Sometimes it is also known as the discrete probability density function. The probability mass function is often the primary means of defining a discrete probability distribution, and such functions exist for either scalar or multivariate random variables whose domain is discrete.
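For example, the number of heads in n independent fair-coin flips has a binomial distribution, whose probability mass function can be written directly (a sketch; binomial_pmf is a hypothetical helper):

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) where X counts successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# PMF of the number of heads in two fair-coin flips.
for k in range(3):
    print(k, binomial_pmf(k, n=2, p=0.5))    # 0.25, 0.5, 0.25

# A probability mass function sums to 1 over its discrete domain.
print(sum(binomial_pmf(k, 2, 0.5) for k in range(3)))
```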
In probability theory, the law of large numbers (LLN) is a mathematical theorem that states that the average of the results obtained from a large number of independent and identical random samples converges to the true value, if it exists. More formally, the LLN states that given a sample of independent and identically distributed values, the sample mean converges to the true mean.
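A quick empirical illustration of the LLN (a sketch, not a proof): the sample mean of i.i.d. rolls of a fair die approaches the true mean of 3.5 as the number of rolls grows:

```python
import random

rng = random.Random(0)

# Sample means of i.i.d. fair-die rolls for increasing sample sizes.
for n in (10, 100, 1_000, 100_000):
    rolls = [rng.randint(1, 6) for _ in range(n)]
    print(n, sum(rolls) / n)   # drifts toward the true mean, 3.5
```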
In probability theory, Kolmogorov's zero–one law, named in honor of Andrey Nikolaevich Kolmogorov, specifies that a certain type of event, namely a tail event of independent σ-algebras, will either almost surely happen or almost surely not happen; that is, the probability of such an event occurring is zero or one.
In probability theory, an event is said to happen almost surely if it happens with probability 1. In other words, the set of outcomes on which the event does not occur has probability 0, even though the set might not be empty. The concept is analogous to the concept of "almost everywhere" in measure theory. In probability experiments on a finite sample space with a non-zero probability for each outcome, there is no difference between almost surely and surely; however, this distinction becomes important when the sample space is an infinite set, because an infinite set can have non-empty subsets of probability 0.
In the field of information retrieval, divergence from randomness, one of the first models, is a type of probabilistic model. It is used to measure the amount of information carried in documents. It is based on Harter's 2-Poisson indexing model, which hypothesizes that the level of treatment of a document is related to a set of "elite" documents in which certain words occur relatively more frequently than in the rest of the documents. It is not a single model but a framework for weighting terms using probabilistic methods, with a special relationship to term weighting based on the notion of eliteness.
In probability theory, the complement of any event A is the event [not A], i.e. the event that A does not occur. The event A and its complement [not A] are mutually exclusive and exhaustive. Generally, there is only one event B such that A and B are both mutually exclusive and exhaustive; that event is the complement of A. The complement of an event A is usually denoted as A′, Ac, ¬A, or A̅. Given an event, the event and its complementary event define a Bernoulli trial: did the event occur or not?
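In a finite, equally-likely setting the complement is plain set difference, and P(A) + P(not A) = 1. A small sketch (names illustrative):

```python
from fractions import Fraction

omega = frozenset(range(1, 7))    # one throw of a fair die
A = frozenset({1, 2})             # the event "a 1 or a 2 is thrown"
not_A = omega - A                 # the complement [not A]

def P(event):
    return Fraction(len(event), len(omega))

assert A & not_A == frozenset()   # mutually exclusive
assert A | not_A == omega         # exhaustive
print(P(A), P(not_A), P(A) + P(not_A))   # 1/3 2/3 1
```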
This glossary of statistics and probability is a list of definitions of terms and concepts used in the mathematical sciences of statistics and probability, their sub-disciplines, and related fields. For additional related terms, see Glossary of mathematics and Glossary of experimental design.
In probability theory, a random element is a generalization of the concept of random variable to more complicated spaces than the simple real line. The concept was introduced by Maurice Fréchet, who commented that the "development of probability theory and expansion of area of its applications have led to necessity to pass from schemes where (random) outcomes of experiments can be described by number or a finite set of numbers, to schemes where outcomes of experiments represent, for example, vectors, functions, processes, fields, series, transformations, and also sets or collections of sets."
In probability theory and statistics, a sequence of independent Bernoulli trials with probability 1/2 of success on each trial is metaphorically called a fair coin. One for which the probability is not 1/2 is called a biased or unfair coin. In theoretical studies, the assumption that a coin is fair is often made by referring to an ideal coin.
In probability theory and statistics, a collection of random variables is independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent. This property is usually abbreviated as i.i.d., iid, or IID. IID was first defined in statistics and finds application in different fields such as data mining and signal processing.
In probability theory, an outcome is a possible result of an experiment or trial. Each possible outcome of a particular experiment is unique, and different outcomes are mutually exclusive. All of the possible outcomes of an experiment form the elements of a sample space.