Concordance correlation coefficient

In statistics, the concordance correlation coefficient measures the agreement between two variables, e.g., to evaluate reproducibility or for inter-rater reliability.

Definition

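The concordance correlation coefficient has the form [1]

$$\rho_c = \frac{2\rho\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2},$$

where $\mu_x$ and $\mu_y$ are the means of the two variables, $\sigma_x^2$ and $\sigma_y^2$ are the corresponding variances, and $\rho$ is the correlation coefficient between the two variables.

This follows from its definition [1] as

$$\rho_c = 1 - \frac{\text{expected orthogonal squared distance from the diagonal } x = y}{\text{expected orthogonal squared distance from the diagonal } x = y \text{ assuming independence}}.$$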
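The step from the definition to the closed form can be sketched as follows. This is a short derivation under the usual assumptions (finite second moments, random variables $X$ and $Y$ with means $\mu_x, \mu_y$, variances $\sigma_x^2, \sigma_y^2$ and correlation $\rho$); the symbol $D$ is introduced here only for brevity. The squared orthogonal distance from a point $(x, y)$ to the diagonal is $(x - y)^2 / 2$, and the factor $1/2$ cancels in the ratio, so it is enough to work with $E[(X - Y)^2]$:

$$
\begin{aligned}
E[(X - Y)^2] &= \left(E[X - Y]\right)^2 + \operatorname{Var}(X - Y) \\
             &= (\mu_x - \mu_y)^2 + \sigma_x^2 + \sigma_y^2 - 2\rho\sigma_x\sigma_y .
\end{aligned}
$$

Under independence $\rho = 0$, which gives the denominator $D = \sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2$. Therefore

$$
\rho_c = 1 - \frac{D - 2\rho\sigma_x\sigma_y}{D} = \frac{2\rho\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2} .
$$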

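When the concordance correlation coefficient is computed on a data set of length $N$ (i.e., $N$ paired data values $(x_n, y_n)$, for $n = 1, \ldots, N$), the form is

$$\hat{\rho}_c = \frac{2 s_{xy}}{s_x^2 + s_y^2 + (\bar{x} - \bar{y})^2},$$

where the mean is computed as

$$\bar{x} = \frac{1}{N} \sum_{n=1}^{N} x_n,$$

and the variance

$$s_x^2 = \frac{1}{N} \sum_{n=1}^{N} (x_n - \bar{x})^2,$$

and the covariance

$$s_{xy} = \frac{1}{N} \sum_{n=1}^{N} (x_n - \bar{x})(y_n - \bar{y}).$$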
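The sample form can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a reference implementation: the function name `concordance_correlation` and the `ddof` parameter are assumptions made for this example (`ddof=0` gives the 1/N normalization, `ddof=1` the 1/(N-1) normalization).

```python
import numpy as np


def concordance_correlation(x, y, ddof=0):
    """Sample concordance correlation coefficient of paired data.

    Hypothetical helper for illustration. ddof=0 divides by N,
    ddof=1 divides by N - 1.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")

    mean_x, mean_y = x.mean(), y.mean()
    var_x = x.var(ddof=ddof)
    var_y = y.var(ddof=ddof)
    # Covariance with the same normalization as the variances.
    cov_xy = np.sum((x - mean_x) * (y - mean_y)) / (x.size - ddof)

    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


if __name__ == "__main__":
    x = [1, 2, 3, 4]
    print(concordance_correlation(x, x))             # perfect agreement
    print(concordance_correlation(x, [2, 4, 6, 8]))  # scale change lowers agreement
    print(concordance_correlation(x, [2, 3, 4, 5]))  # constant offset lowers agreement
```

In the last two calls the Pearson correlation is exactly 1, while the concordance coefficient drops to about 0.4 and about 0.71: a change of scale or a constant offset leaves linear correlation intact but moves the points away from the diagonal.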

Whereas the ordinary (Pearson) correlation coefficient is unaffected by whether the biased or the unbiased estimator of the variance is used, the concordance correlation coefficient is not. In the original article Lin suggested the 1/N normalization, [1] while in another article Nickerson appears to have used 1/(N-1), [2] so the concordance correlation coefficient may be computed slightly differently between implementations.
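The effect of the normalization can be checked numerically. The sketch below is only an illustration on made-up data; the helper names `pearson` and `ccc` and the `ddof` argument are assumptions of this example. The normalization factor cancels in the Pearson ratio, but in the concordance coefficient it rescales the variances and covariance without rescaling the squared mean difference, so the two conventions disagree whenever the means differ.

```python
import numpy as np


def _moments(x, y, ddof):
    """Return means, variances and covariance with divisor N - ddof."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.sum((x - x.mean()) * (y - y.mean())) / (x.size - ddof)
    return x.mean(), y.mean(), x.var(ddof=ddof), y.var(ddof=ddof), cov


def pearson(x, y, ddof):
    """Pearson correlation; the divisor cancels between numerator and denominator."""
    _, _, var_x, var_y, cov = _moments(x, y, ddof)
    return cov / np.sqrt(var_x * var_y)


def ccc(x, y, ddof):
    """Concordance correlation; the divisor does not cancel against the mean term."""
    mean_x, mean_y, var_x, var_y, cov = _moments(x, y, ddof)
    return 2.0 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


# Made-up paired measurements, for illustration only.
x = [1, 2, 3, 4]
y = [2, 3, 5, 6]

for ddof in (0, 1):
    print(f"ddof={ddof}: pearson={pearson(x, y, ddof):.6f}, ccc={ccc(x, y, ddof):.6f}")
```

On this small data set the Pearson value is the same for both settings, while the concordance value changes in the second decimal. The gap shrinks as N grows, because the two divisors differ by a factor of N/(N-1).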

Relation to other measures of correlation

The concordance correlation coefficient is nearly identical to some of the measures called intraclass correlations. Comparisons of the concordance correlation coefficient with an "ordinary" intraclass correlation on different data sets found only small differences between the two, in one case only in the third decimal place. [2] It has also been stated [3] that the ideas behind the concordance correlation coefficient "are quite similar to results already published by Krippendorff [4] in 1970".

In the original article [1] Lin suggested a form for multiple classes (not just two). Over ten years later a correction to this form was issued. [5]

One example of the use of the concordance correlation coefficient is in a comparison of analysis methods for functional magnetic resonance imaging brain scans. [6]

References

  1. Lawrence I-Kuei Lin (March 1989). "A concordance correlation coefficient to evaluate reproducibility". Biometrics. 45 (1): 255–268. doi:10.2307/2532051. JSTOR 2532051. PMID 2720055.
  2. Carol A. E. Nickerson (December 1997). "A Note on 'A Concordance Correlation Coefficient to Evaluate Reproducibility'". Biometrics. 53 (4): 1503–1507. doi:10.2307/2533516. JSTOR 2533516.
  3. Reinhold Müller; Petra Büttner (December 1994). "A critical discussion of intraclass correlation coefficients". Statistics in Medicine. 13 (23–24): 2465–2476. doi:10.1002/sim.4780132310. PMID 7701147.
  4. Klaus Krippendorff (1970). E. F. Borgatta (ed.). "Bivariate Agreement Coefficients for Reliability of Data". Sociological Methodology. 2. San Francisco: Jossey-Bass: 139–150. doi:10.2307/270787. JSTOR 270787.
  5. Lawrence I-Kuei Lin (March 2000). "A Note on the Concordance Correlation Coefficient". Biometrics. 56: 324–325. doi:10.1111/j.0006-341X.2000.00324.x.
  6. N Lange; S C Strother; J R Anderson; F A Nielsen; A P Holmes; T Kolenda; R Savoy; L K Hansen (September 1999). "Plurality and resemblance in fMRI data analysis". NeuroImage. 10 (3 Pt 1): 282–303. CiteSeerX 10.1.1.158.6688. doi:10.1006/NIMG.1999.0472. ISSN 1053-8119. PMID 10458943. Wikidata Q21012624.

A small Excel and VBA implementation by Peter Urbani is also available.