Expected mean squares

In statistics, expected mean squares (EMS) are the expected values of certain statistics arising in partitions of sums of squares in the analysis of variance (ANOVA). They can be used for ascertaining which statistic should appear in the denominator in an F-test for testing a null hypothesis that a particular effect is absent.

Definition

When the total corrected sum of squares in an ANOVA is partitioned into several components, each attributed to the effect of a particular predictor variable, each of the sums of squares in that partition is a random variable that has an expected value. That expected value divided by the corresponding number of degrees of freedom is the expected mean square for that predictor variable.
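
As an illustration of this definition (a minimal sketch, not taken from the cited source), the following Python code assumes a balanced one-way random-effects layout $y_{ij} = \mu + a_i + \varepsilon_{ij}$ with $a_i \sim N(0, \sigma^2_a)$ and $\varepsilon_{ij} \sim N(0, \sigma^2_\varepsilon)$; the between-groups mean square then has expected value $\sigma^2_\varepsilon + n\sigma^2_a$, which the Monte Carlo average approximates.

```python
# Minimal Monte Carlo sketch (hypothetical layout, not from the cited source):
# in a balanced one-way random-effects ANOVA, the between-groups mean square
# has expected value sigma_eps^2 + n * sigma_a^2.
import numpy as np

rng = np.random.default_rng(0)
k, n = 6, 10                          # number of groups, observations per group
sigma_a2, sigma_eps2 = 4.0, 1.0       # made-up variance components

ms_between = []
for _ in range(10000):
    a = rng.normal(0.0, np.sqrt(sigma_a2), size=k)                  # random group effects
    y = a[:, None] + rng.normal(0.0, np.sqrt(sigma_eps2), size=(k, n))
    ss_between = n * np.sum((y.mean(axis=1) - y.mean()) ** 2)       # between-groups SS
    ms_between.append(ss_between / (k - 1))                         # divide by its df

print("Monte Carlo average of MS(between):", np.mean(ms_between))
print("theoretical expected mean square  :", sigma_eps2 + n * sigma_a2)
```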

Example

The following example is from Longitudinal Data Analysis by Donald Hedeker and Robert D. Gibbons.[1]

Each of $s$ treatments (one of which may be a placebo) is administered to a sample of (capital) $N_h$ randomly chosen patients, on whom certain measurements $Y_{hij}$ are observed at each of (lower-case) $n$ specified times, for $h = 1, \dots, s$, $i = 1, \dots, N_h$ (thus the numbers of patients receiving different treatments may differ), and $j = 1, \dots, n$. We assume the sets of patients receiving different treatments are disjoint, so patients are nested within treatments and not crossed with treatments. We have

$$ Y_{hij} = \mu + \gamma_h + \tau_j + (\gamma\tau)_{hj} + \pi_{i(h)} + \varepsilon_{hij}, $$

where

- $\mu$ is the grand mean,
- $\gamma_h$ is the main effect of treatment $h$,
- $\tau_j$ is the main effect of time $j$,
- $(\gamma\tau)_{hj}$ is the treatment × time interaction effect,
- $\pi_{i(h)}$ is the random effect of patient $i$ within treatment $h$, assumed independent and normally distributed with mean zero and variance $\sigma^2_\pi$, and
- $\varepsilon_{hij}$ is the error term, assumed independent and normally distributed with mean zero and variance $\sigma^2_\varepsilon$.
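
The sketch below (with made-up parameter values, not taken from the cited source) simulates data from this model under the normality assumptions just stated; the function name simulate and every number in it are purely illustrative.

```python
# Sketch: simulate data from Y_hij = mu + gamma_h + tau_j + (gamma*tau)_hj
#                                    + pi_i(h) + eps_hij.
# All parameter values below are made up purely for illustration.
import numpy as np

def simulate(N_h, n, mu, gamma, tau, gt, sigma_pi, sigma_eps, rng):
    """Return one (N_h[h], n) array of measurements per treatment h."""
    data = []
    for h, N in enumerate(N_h):
        pi = rng.normal(0.0, sigma_pi, size=(N, 1))        # patient effects, nested in h
        eps = rng.normal(0.0, sigma_eps, size=(N, n))      # measurement errors
        data.append(mu + gamma[h] + tau[None, :] + gt[h][None, :] + pi + eps)
    return data

rng = np.random.default_rng(1)
N_h, n = [8, 12, 10], 4                                    # unequal group sizes, n times
data = simulate(N_h, n,
                mu=10.0,
                gamma=np.array([0.0, 0.5, -0.5]),          # treatment effects
                tau=np.array([0.0, 0.2, 0.4, 0.6]),        # time effects
                gt=np.zeros((3, n)),                       # no interaction in this sketch
                sigma_pi=1.5, sigma_eps=1.0, rng=rng)
print([d.shape for d in data])                             # [(8, 4), (12, 4), (10, 4)]
```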

The total corrected sum of squares is

$$ \sum_{h=1}^{s} \sum_{i=1}^{N_h} \sum_{j=1}^{n} \left( Y_{hij} - \bar Y \right)^2, $$

where $\bar Y$ denotes the grand mean of all the observations $Y_{hij}$.

The ANOVA table below partitions the sum of squares (where $N = \sum_{h=1}^{s} N_h$ is the total number of patients):

source of variability | degrees of freedom | sum of squares | mean square | expected mean square
treatment | $s-1$ | $\mathrm{SS}_{\mathrm{Tr}} = n \sum_{h=1}^{s} N_h (\bar Y_{h\cdot\cdot} - \bar Y)^2$ | $\mathrm{MS}_{\mathrm{Tr}} = \mathrm{SS}_{\mathrm{Tr}}/(s-1)$ | $\sigma^2_\varepsilon + n\sigma^2_\pi + Q(\gamma)$
time | $n-1$ | $\mathrm{SS}_{\mathrm{T}} = N \sum_{j=1}^{n} (\bar Y_{\cdot\cdot j} - \bar Y)^2$ | $\mathrm{MS}_{\mathrm{T}} = \mathrm{SS}_{\mathrm{T}}/(n-1)$ | $\sigma^2_\varepsilon + Q(\tau)$
treatment × time | $(s-1)(n-1)$ | $\mathrm{SS}_{\mathrm{Tr\times T}} = \sum_{h=1}^{s} \sum_{j=1}^{n} N_h (\bar Y_{h\cdot j} - \bar Y_{h\cdot\cdot} - \bar Y_{\cdot\cdot j} + \bar Y)^2$ | $\mathrm{MS}_{\mathrm{Tr\times T}} = \mathrm{SS}_{\mathrm{Tr\times T}}/[(s-1)(n-1)]$ | $\sigma^2_\varepsilon + Q(\gamma\tau)$
patients within treatments | $N-s$ | $\mathrm{SS}_{\mathrm{P(Tr)}} = n \sum_{h=1}^{s} \sum_{i=1}^{N_h} (\bar Y_{hi\cdot} - \bar Y_{h\cdot\cdot})^2$ | $\mathrm{MS}_{\mathrm{P(Tr)}} = \mathrm{SS}_{\mathrm{P(Tr)}}/(N-s)$ | $\sigma^2_\varepsilon + n\sigma^2_\pi$
error | $(N-s)(n-1)$ | $\mathrm{SS}_{\mathrm{E}} = \sum_{h=1}^{s} \sum_{i=1}^{N_h} \sum_{j=1}^{n} (Y_{hij} - \bar Y_{hi\cdot} - \bar Y_{h\cdot j} + \bar Y_{h\cdot\cdot})^2$ | $\mathrm{MS}_{\mathrm{E}} = \mathrm{SS}_{\mathrm{E}}/[(N-s)(n-1)]$ | $\sigma^2_\varepsilon$

Here $\bar Y_{h\cdot\cdot}$, $\bar Y_{hi\cdot}$, $\bar Y_{\cdot\cdot j}$ and $\bar Y_{h\cdot j}$ denote averages of $Y_{hij}$ over the subscripts replaced by dots, and each $Q(\cdot)$ is a quadratic form in the indicated fixed effects that equals zero precisely when all of those effects are zero.
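
As a rough numerical check of this decomposition (again a sketch with made-up parameter values, assuming NumPy), the following code computes each sum of squares in the table from simulated data and verifies that the five components add up to the total corrected sum of squares.

```python
# Sketch: compute the sums of squares in the table above from simulated data
# (made-up parameter values) and check that they add up to the total
# corrected sum of squares.
import numpy as np

rng = np.random.default_rng(2)
N_h, n = [8, 12, 10], 4
s, N = len(N_h), sum(N_h)
data = [10.0 + g + np.linspace(0.0, 0.6, n)        # mu + gamma_h + tau_j
        + rng.normal(0.0, 1.5, (Nh, 1))            # pi_i(h)
        + rng.normal(0.0, 1.0, (Nh, n))            # eps_hij
        for g, Nh in zip([0.0, 0.5, -0.5], N_h)]

Y = np.concatenate(data)                           # shape (N, n); one row per patient
treat = np.repeat(np.arange(s), N_h)               # treatment label of each row
grand = Y.mean()                                                       # Ybar
patient_means = Y.mean(axis=1)                                         # Ybar_{hi.}
treat_means = np.array([Y[treat == h].mean() for h in range(s)])       # Ybar_{h..}
time_means = Y.mean(axis=0)                                            # Ybar_{..j}
cell_means = np.array([Y[treat == h].mean(axis=0) for h in range(s)])  # Ybar_{h.j}

ss_treat = n * np.sum(np.array(N_h) * (treat_means - grand) ** 2)
ss_patients = n * np.sum((patient_means - treat_means[treat]) ** 2)
ss_time = N * np.sum((time_means - grand) ** 2)
ss_interaction = np.sum(np.array(N_h)[:, None] *
                        (cell_means - treat_means[:, None] - time_means[None, :] + grand) ** 2)
ss_error = np.sum((Y - patient_means[:, None]
                   - cell_means[treat] + treat_means[treat][:, None]) ** 2)

ss_total = np.sum((Y - grand) ** 2)
parts = ss_treat + ss_patients + ss_time + ss_interaction + ss_error
print(parts, ss_total)   # equal up to floating-point rounding
```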

Use in F-tests

A null hypothesis of interest is that there is no difference between the effects of different treatments, and thus no difference among the treatment means. This may be expressed by saying $\gamma_1 = \gamma_2 = \cdots = \gamma_s = 0$ (with the notation used in the table above), so that $Q(\gamma) = 0$. Under this null hypothesis, the expected mean square for effects of treatments is $\sigma^2_\varepsilon + n\sigma^2_\pi$.

The numerator in the F-statistic for testing this hypothesis is the mean square due to differences among treatments, i.e. it is $\mathrm{MS}_{\mathrm{Tr}} = \mathrm{SS}_{\mathrm{Tr}}/(s-1)$. The denominator, however, is not $\mathrm{MS}_{\mathrm{E}} = \mathrm{SS}_{\mathrm{E}}/[(N-s)(n-1)]$. The reason is that the random variable below, although under the null hypothesis it has an F-distribution, is not observable (it is not a statistic), because its value depends on the unobservable parameters $\sigma^2_\pi$ and $\sigma^2_\varepsilon$:

$$ \frac{\mathrm{MS}_{\mathrm{Tr}} / (\sigma^2_\varepsilon + n\sigma^2_\pi)}{\mathrm{MS}_{\mathrm{E}} / \sigma^2_\varepsilon} . $$

Instead, one uses as the test statistic the following random variable, which is not defined in terms of $\mathrm{SS}_{\mathrm{E}}$:

$$ F = \frac{\mathrm{SS}_{\mathrm{Tr}}/(s-1)}{\mathrm{SS}_{\mathrm{P(Tr)}}/(N-s)} = \frac{\mathrm{MS}_{\mathrm{Tr}}}{\mathrm{MS}_{\mathrm{P(Tr)}}} . $$

Under the null hypothesis the numerator and the denominator are mean squares with the same expectation, $\sigma^2_\varepsilon + n\sigma^2_\pi$, and this ratio has an F-distribution with $s-1$ and $N-s$ degrees of freedom.
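
Continuing the illustrative simulation above (the mean-square values below are placeholders standing in for the quantities computed in the table, and SciPy is assumed to be available), the treatment hypothesis would be tested by comparing $\mathrm{MS}_{\mathrm{Tr}}$ with $\mathrm{MS}_{\mathrm{P(Tr)}}$:

```python
# Sketch: F-test for H0: gamma_1 = ... = gamma_s = 0, using MS_Tr / MS_P(Tr).
# SciPy is assumed to be available; the mean-square values below are
# placeholders, not results computed from real data.
from scipy import stats

def treatment_f_test(ms_treat, ms_patients, s, N):
    """Return the F statistic and its upper-tail p-value on (s-1, N-s) df."""
    f = ms_treat / ms_patients        # both have EMS sigma_eps^2 + n*sigma_pi^2 under H0
    return f, stats.f.sf(f, s - 1, N - s)

print(treatment_f_test(ms_treat=25.0, ms_patients=10.0, s=3, N=30))
```

Dividing by $\mathrm{MS}_{\mathrm{E}}$ instead would compare mean squares whose expectations differ by $n\sigma^2_\pi$ even when the null hypothesis is true, which is exactly the mismatch that the expected-mean-square column is used to diagnose.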

Notes and references

  1. Donald Hedeker and Robert D. Gibbons, Longitudinal Data Analysis, Wiley-Interscience, 2006, pp. 21–24.
