Standardized mean of a contrast variable

In statistics, the standardized mean of a contrast variable (SMCV or SMC) is a parameter for assessing effect size. The SMCV is defined as the mean divided by the standard deviation of a contrast variable. [1] [2] The SMCV was first proposed for one-way ANOVA cases [2] and was later extended to multi-factor ANOVA cases. [3]

Background

Consistent interpretations of the strength of a group comparison, as represented by a contrast, are important. [4] [5]

When only two groups are involved in a comparison, the SMCV is the same as the strictly standardized mean difference (SSMD). SSMD belongs to a popular class of effect-size measures called "standardized mean differences", [6] which includes Cohen's $d$ [7] and Glass's $\delta$. [8]

In ANOVA, a similar parameter for measuring the strength of group comparison is the standardized effect size (SES). [9] One issue with the SES is that its values are incomparable for contrasts with different coefficients: for example, doubling all contrast coefficients doubles the SES even though the comparison itself is unchanged. The SMCV does not have this issue.
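
To see this coefficient-dependence concretely, here is a minimal numerical sketch with made-up numbers and equal group variances; the SES is taken here as the contrast mean divided by the common standard deviation, which is an assumption based on Steiger's formulation for equal-variance contrasts:

```python
import numpy as np

# Made-up means and a common standard deviation (illustration only).
mu = np.array([10.0, 12.0, 17.0])
sigma = 2.0
for k in (1.0, 2.0):                     # rescale the same comparison
    c = k * np.array([-0.5, -0.5, 1.0])  # contrast coefficients, sum to zero
    ses = (c @ mu) / sigma               # assumed SES: contrast mean / sigma
    smcv = (c @ mu) / np.sqrt((c**2).sum() * sigma**2)
    print(k, ses, smcv)                  # SES doubles with k; SMCV does not
```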

Concept

Suppose the random values in $t$ groups, represented by random variables $G_1, G_2, \ldots, G_t$, have means $\mu_1, \mu_2, \ldots, \mu_t$ and variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_t^2$, respectively. A contrast variable $V$ is defined by

$$V = \sum_{i=1}^{t} c_i G_i,$$

where the $c_i$'s are a set of coefficients representing a comparison of interest and satisfy $\sum_{i=1}^{t} c_i = 0$. The SMCV of the contrast variable $V$, denoted by $\lambda$, is defined as [1]

$$\lambda = \frac{\operatorname{E}(V)}{\sqrt{\operatorname{Var}(V)}} = \frac{\sum_{i=1}^{t} c_i \mu_i}{\sqrt{\sum_{i=1}^{t} c_i^2 \sigma_i^2 + 2 \sum_{i<j} c_i c_j \sigma_{ij}}},$$

where $\sigma_{ij}$ is the covariance of $G_i$ and $G_j$. When $G_1, G_2, \ldots, G_t$ are independent,

$$\lambda = \frac{\sum_{i=1}^{t} c_i \mu_i}{\sqrt{\sum_{i=1}^{t} c_i^2 \sigma_i^2}}.$$
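
As a quick numerical illustration of the independent-groups formula, the following sketch computes a population SMCV for a hypothetical three-group contrast; all means, variances, and coefficients are invented for illustration:

```python
import numpy as np

means = np.array([10.0, 12.0, 17.0])   # mu_i (made up)
variances = np.array([4.0, 4.0, 9.0])  # sigma_i^2 (made up)
c = np.array([-0.5, -0.5, 1.0])        # contrast coefficients

assert np.isclose(c.sum(), 0.0), "contrast coefficients must sum to zero"

# Independent groups: lambda = sum(c_i mu_i) / sqrt(sum(c_i^2 sigma_i^2))
smcv = (c @ means) / np.sqrt(c**2 @ variances)
print(smcv)  # 6 / sqrt(11) ≈ 1.809
```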

Classifying rule for the strength of group comparisons

The population value (denoted by $\lambda$) of the SMCV can be used to classify the strength of a comparison represented by a contrast variable, as shown in the following table. [1] [2] This classifying rule has a probabilistic basis due to the link between the SMCV and c+-probability. [1]

| Effect type | Effect subtype | Thresholds for negative SMCV | Thresholds for positive SMCV |
|---|---|---|---|
| Extra large | Extremely strong | λ ≤ −5 | λ ≥ 5 |
| Extra large | Very strong | −5 < λ ≤ −3 | 3 ≤ λ < 5 |
| Extra large | Strong | −3 < λ ≤ −2 | 2 ≤ λ < 3 |
| Extra large | Fairly strong | −2 < λ ≤ −1.645 | 1.645 ≤ λ < 2 |
| Large | Moderate | −1.645 < λ ≤ −1.28 | 1.28 ≤ λ < 1.645 |
| Large | Fairly moderate | −1.28 < λ ≤ −1 | 1 ≤ λ < 1.28 |
| Medium | Fairly weak | −1 < λ ≤ −0.75 | 0.75 ≤ λ < 1 |
| Medium | Weak | −0.75 < λ < −0.5 | 0.5 < λ < 0.75 |
| Medium | Very weak | −0.5 ≤ λ < −0.25 | 0.25 < λ ≤ 0.5 |
| Small | Extremely weak | −0.25 ≤ λ < 0 | 0 < λ ≤ 0.25 |
| No effect | | λ = 0 | λ = 0 |
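
A minimal Python sketch of this classifying rule, assuming the thresholds tabulated above (the rule is symmetric in the sign of λ, so the absolute value is classified):

```python
def classify_smcv(lmbda: float) -> str:
    """Classify comparison strength from a population SMCV value."""
    a = abs(lmbda)
    if a == 0:
        return "no effect"
    if a >= 5:
        return "extra large: extremely strong"
    if a >= 3:
        return "extra large: very strong"
    if a >= 2:
        return "extra large: strong"
    if a >= 1.645:
        return "extra large: fairly strong"
    if a >= 1.28:
        return "large: moderate"
    if a >= 1:
        return "large: fairly moderate"
    if a >= 0.75:
        return "medium: fairly weak"
    if a > 0.5:
        return "medium: weak"
    if a > 0.25:
        return "medium: very weak"
    return "small: extremely weak"

print(classify_smcv(1.809))  # extra large: fairly strong
```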

Statistical estimation and inference

The estimation and inference of the SMCV presented below apply to one-factor experiments. [1] [2] Estimation and inference of the SMCV for multi-factor experiments have also been discussed. [1] [3]

The estimation of the SMCV depends on how samples are obtained in a study. When the groups are correlated, it is usually difficult to estimate the covariance among groups. In such a case, a good strategy is to obtain matched or paired samples (or subjects) and to conduct contrast analysis based on the matched samples. A simple example of matched contrast analysis is the analysis of the paired difference of drug effects after and before taking a drug in the same patients. Alternatively, the samples may be left unmatched, with contrast analysis conducted on the unmatched or unpaired samples. A simple example of unmatched contrast analysis is the comparison of efficacy between a new drug taken by some patients and a standard drug taken by other patients. Methods of estimating the SMCV and c+-probability in matched contrast analysis may differ from those used in unmatched contrast analysis.

Unmatched samples

Consider an independent sample of size $n_i$,

$$X_i = \left( X_{i1}, X_{i2}, \ldots, X_{i n_i} \right),$$

from the $i$th group $G_i$, where the $X_i$'s are independent. Let $\bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}$,

$$s_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} \left( X_{ij} - \bar{X}_i \right)^2,$$

$N = \sum_{i=1}^{t} n_i$ and

$$\mathrm{MSE} = \frac{1}{N - t} \sum_{i=1}^{t} (n_i - 1) s_i^2.$$

When the groups have unequal variances, the maximum likelihood estimate (MLE) and method-of-moment estimate (MM) of the SMCV ($\lambda$) are, respectively, [1] [2]

$$\hat{\lambda}_{\mathrm{MLE}} = \frac{\sum_{i=1}^{t} c_i \bar{X}_i}{\sqrt{\sum_{i=1}^{t} c_i^2 \frac{n_i - 1}{n_i} s_i^2}}$$

and

$$\hat{\lambda}_{\mathrm{MM}} = \frac{\sum_{i=1}^{t} c_i \bar{X}_i}{\sqrt{\sum_{i=1}^{t} c_i^2 s_i^2}}.$$

When the groups have equal variances, under the normality assumption, the uniformly minimum-variance unbiased estimate (UMVUE) of the SMCV ($\lambda$) is [1] [2]

$$\hat{\lambda}_{\mathrm{UMVUE}} = \sqrt{\frac{K}{N - t}} \, \frac{\sum_{i=1}^{t} c_i \bar{X}_i}{\sqrt{\mathrm{MSE} \sum_{i=1}^{t} c_i^2}},$$

where $K = \frac{2 \left( \Gamma\!\left( \frac{N-t}{2} \right) \right)^2}{\left( \Gamma\!\left( \frac{N-t-1}{2} \right) \right)^2}$.

The confidence interval of the SMCV can be made using the following noncentral t-distribution: [1] [2]

$$T = \frac{\sum_{i=1}^{t} c_i \bar{X}_i}{\sqrt{\mathrm{MSE} \sum_{i=1}^{t} c_i^2 / n_i}} \sim \text{noncentral } t\!\left( N - t,\ b \lambda \right),$$

where

$$b = \sqrt{\frac{\sum_{i=1}^{t} c_i^2}{\sum_{i=1}^{t} c_i^2 / n_i}}.$$
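
The following self-contained sketch puts the unmatched-sample formulas to work on made-up data: it computes the MLE, MM, and UMVUE estimates, then numerically inverts the noncentral t relation (using scipy) for a 95% confidence interval on λ. The data, the bracket [-100, 100] for the root search, and the 95% level are all illustrative choices:

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import nct
from scipy.optimize import brentq

# Hypothetical unmatched samples from t = 3 independent groups.
groups = [np.array([9.8, 10.5, 9.1, 10.9]),
          np.array([12.3, 11.6, 12.8, 11.9, 12.4]),
          np.array([16.5, 17.8, 17.1, 16.9])]
c = np.array([-0.5, -0.5, 1.0])          # contrast coefficients, sum to zero

n = np.array([len(g) for g in groups])
xbar = np.array([g.mean() for g in groups])
s2 = np.array([g.var(ddof=1) for g in groups])
N, t = n.sum(), len(groups)
mse = ((n - 1) * s2).sum() / (N - t)     # pooled mean squared error
numer = c @ xbar                         # estimate of E(V)

# Unequal-variance estimates of lambda
mle = numer / np.sqrt(c**2 @ ((n - 1) / n * s2))
mm = numer / np.sqrt(c**2 @ s2)

# Equal-variance UMVUE; K = 2*Gamma((N-t)/2)^2 / Gamma((N-t-1)/2)^2,
# computed on the log scale for numerical stability
K = 2.0 * np.exp(2 * (gammaln((N - t) / 2) - gammaln((N - t - 1) / 2)))
umvue = np.sqrt(K / (N - t)) * numer / np.sqrt(mse * (c**2).sum())

# 95% CI: T ~ noncentral t(N - t, b*lambda); solve for the noncentrality
# parameters at the alpha/2 tails, then rescale by b.
T_obs = numer / np.sqrt(mse * (c**2 / n).sum())
b = np.sqrt((c**2).sum() / (c**2 / n).sum())
alpha = 0.05
lo = brentq(lambda d: nct.cdf(T_obs, N - t, d) - (1 - alpha / 2), -100, 100)
hi = brentq(lambda d: nct.cdf(T_obs, N - t, d) - alpha / 2, -100, 100)
print(mle, mm, umvue, (lo / b, hi / b))
```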

Matched samples

In matched contrast analysis, assume that there are $n$ independent samples $\left( X_{1j}, X_{2j}, \ldots, X_{tj} \right)$ from the $t$ groups (the $G_i$'s), where $i = 1, 2, \ldots, t$ and $j = 1, 2, \ldots, n$. Then the $j$th observed value of the contrast $V = \sum_{i=1}^{t} c_i G_i$ is $v_j = \sum_{i=1}^{t} c_i X_{ij}$.

Let $\bar{V}$ and $s_V^2$ be the sample mean and sample variance of the contrast variable $V$, respectively. Under normality assumptions, the UMVUE of the SMCV is [1]

$$\hat{\lambda}_{\mathrm{UMVUE}} = \sqrt{\frac{K}{n-1}} \, \frac{\bar{V}}{s_V},$$

where

$$K = \frac{2 \left( \Gamma\!\left( \frac{n-1}{2} \right) \right)^2}{\left( \Gamma\!\left( \frac{n-2}{2} \right) \right)^2}.$$

A confidence interval for the SMCV can be made using the following noncentral t-distribution: [1]

$$T = \frac{\sqrt{n}\, \bar{V}}{s_V} \sim \text{noncentral } t\!\left( n - 1,\ \sqrt{n}\, \lambda \right).$$
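
A parallel sketch for matched samples, again on made-up paired data (a hypothetical after/before design): it forms the observed contrast values v_j, computes the UMVUE of the SMCV, and inverts the noncentral t relation for a 95% confidence interval:

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import nct
from scipy.optimize import brentq

# Hypothetical matched data: t = 2 groups measured on the same n = 8
# subjects (after/before a treatment); all numbers are made up.
after  = np.array([5.1, 4.8, 6.0, 5.5, 4.9, 5.7, 5.3, 6.1])
before = np.array([4.2, 4.5, 5.1, 4.6, 4.4, 4.9, 4.7, 5.0])
c = np.array([1.0, -1.0])              # contrast: after minus before

v = c[0] * after + c[1] * before       # observed contrast values v_j
n = len(v)
vbar, s_v = v.mean(), v.std(ddof=1)

# UMVUE of SMCV: sqrt(K/(n-1)) * vbar / s_v with
# K = 2*Gamma((n-1)/2)^2 / Gamma((n-2)/2)^2 (log scale for stability)
K = 2.0 * np.exp(2 * (gammaln((n - 1) / 2) - gammaln((n - 2) / 2)))
lam_hat = np.sqrt(K / (n - 1)) * vbar / s_v

# 95% CI by inverting T = sqrt(n)*vbar/s_v ~ noncentral t(n-1, sqrt(n)*lambda)
T_obs, alpha = np.sqrt(n) * vbar / s_v, 0.05
lo = brentq(lambda d: nct.cdf(T_obs, n - 1, d) - (1 - alpha / 2), -100, 100)
hi = brentq(lambda d: nct.cdf(T_obs, n - 1, d) - alpha / 2, -100, 100)
print(lam_hat, (lo / np.sqrt(n), hi / np.sqrt(n)))
```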


References

  1. Zhang XHD (2011). Optimal High-Throughput Screening: Practical Experimental Design and Data Analysis for Genome-scale RNAi Research. Cambridge University Press. ISBN 978-0-521-73444-8.
  2. Zhang XHD (2009). "A method for effectively comparing gene effects in multiple conditions in RNAi and expression-profiling research". Pharmacogenomics. 10: 345–58. doi:10.2217/14622416.10.3.345. PMID 20397965.
  3. Zhang XHD (2010). "Assessing the size of gene or RNAi effects in multifactor high-throughput experiments". Pharmacogenomics. 11: 199–213. doi:10.2217/PGS.09.136. PMID 20136359.
  4. Rosenthal R, Rosnow RL, Rubin DB (2000). Contrasts and Effect Sizes in Behavioral Research. Cambridge University Press. ISBN 0-521-65980-9.
  5. Huberty CJ (2002). "A history of effect size indices". Educational and Psychological Measurement. 62: 227–40. doi:10.1177/0013164402062002002.
  6. Kirk RE (1996). "Practical significance: A concept whose time has come". Educational and Psychological Measurement. 56: 746–59. doi:10.1177/0013164496056005002.
  7. Cohen J (1962). "The statistical power of abnormal-social psychological research: A review". Journal of Abnormal and Social Psychology. 65: 145–53. doi:10.1037/h0045186. PMID 13880271.
  8. Glass GV (1976). "Primary, secondary, and meta-analysis of research". Educational Researcher. 5: 3–8. doi:10.3102/0013189X005010003.
  9. Steiger JH (2004). "Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis". Psychological Methods. 9: 164–82. doi:10.1037/1082-989x.9.2.164. PMID 15137887.