Tschuprow's T

In statistics, Tschuprow's T is a measure of association between two nominal variables, giving a value between 0 and 1 (inclusive). It is closely related to Cramér's V, with which it coincides for square contingency tables. It was published by Alexander Tschuprow (alternative spelling: Chuprov) in 1939.[1]

Definition

For an r × c contingency table with r rows and c columns, let $\pi_{ij}$ be the proportion of the population in cell $(i, j)$ and let

$$\pi_{i+} = \sum_{j=1}^{c} \pi_{ij} \quad \text{and} \quad \pi_{+j} = \sum_{i=1}^{r} \pi_{ij}.$$

Then the mean square contingency is given as

$$\phi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(\pi_{ij} - \pi_{i+}\pi_{+j})^2}{\pi_{i+}\pi_{+j}},$$

and Tschuprow's T as

$$T = \sqrt{\frac{\phi^2}{\sqrt{(r-1)(c-1)}}}.$$
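
As an illustration of the definition (not part of the original article), here is a minimal Python sketch that computes $\phi^2$ and $T$ from a matrix of population proportions; the function name tschuprow_t_population is hypothetical, and the sketch assumes all row and column marginals are positive:

```python
import numpy as np

def tschuprow_t_population(pi):
    """Tschuprow's T for an r x c matrix of population proportions pi_ij.

    Hypothetical helper for illustration; assumes pi sums to 1 and
    all row/column marginals are nonzero.
    """
    pi = np.asarray(pi, dtype=float)
    pi_row = pi.sum(axis=1, keepdims=True)    # row marginals pi_{i+}
    pi_col = pi.sum(axis=0, keepdims=True)    # column marginals pi_{+j}
    indep = pi_row * pi_col                   # cell proportions under independence
    phi2 = ((pi - indep) ** 2 / indep).sum()  # mean square contingency phi^2
    r, c = pi.shape
    return np.sqrt(phi2 / np.sqrt((r - 1) * (c - 1)))
```

For instance, tschuprow_t_population([[0.25, 0.25], [0.25, 0.25]]) returns 0.0, since that table factorizes exactly into its marginals.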


Properties

T equals zero if and only if independence holds in the table, i.e., if and only if $\pi_{ij} = \pi_{i+}\pi_{+j}$ for all $i$ and $j$. T equals one if and only if there is perfect dependence in the table, i.e., if and only if for each $i$ there is only one $j$ such that $\pi_{ij} > 0$, and vice versa. Hence, it can only equal 1 for square tables. In this it differs from Cramér's V, which can be equal to 1 for any rectangular table.
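
As a concrete check of the second property, take the 2 × 2 table with $\pi_{11} = \pi_{22} = 1/2$ and $\pi_{12} = \pi_{21} = 0$: every marginal equals $1/2$, so each of the four cells contributes $(1/4)^2 / (1/4) = 1/4$ to the sum, giving $\phi^2 = 1$ and $T = \sqrt{1 / \sqrt{(2-1)(2-1)}} = 1$.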

Estimation

If we have a multinomial sample of size n, the usual way to estimate T from the data is via the formula

$$\hat{T} = \sqrt{\frac{\displaystyle\sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(p_{ij} - p_{i+}p_{+j})^2}{p_{i+}p_{+j}}}{\sqrt{(r-1)(c-1)}}},$$

where $p_{ij} = n_{ij}/n$ is the proportion of the sample in cell $(i, j)$. This is the empirical value of T. Writing $\chi^2$ for the Pearson chi-squared statistic, this formula can also be written as

$$\hat{T} = \sqrt{\frac{\chi^2/n}{\sqrt{(r-1)(c-1)}}}.$$
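
A short sketch of the second form, assuming SciPy is available (scipy.stats.chi2_contingency supplies the Pearson χ² statistic; the function name tschuprow_t_sample and the example table are hypothetical):

```python
import numpy as np
from scipy.stats import chi2_contingency

def tschuprow_t_sample(counts):
    """Empirical Tschuprow's T from an r x c table of observed counts."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    r, c = counts.shape
    # Pearson chi-squared statistic, without Yates' continuity correction
    chi2 = chi2_contingency(counts, correction=False)[0]
    return np.sqrt((chi2 / n) / np.sqrt((r - 1) * (c - 1)))

# Hypothetical sample with a strong diagonal pattern
table = [[20, 5],
         [4, 21]]
print(tschuprow_t_sample(table))  # approximately 0.64
```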

See also

Other measures of correlation for nominal data:

  - Phi coefficient: a measure of association for two binary variables, introduced by Karl Pearson. A Pearson correlation coefficient estimated for two binary variables equals the phi coefficient, and its square is related to the chi-squared statistic for a 2 × 2 contingency table.
  - Uncertainty coefficient (also called proficiency, entropy coefficient, or Theil's U): a measure of nominal association introduced by Henri Theil, based on the concept of information entropy.

Other related articles:

  - Contingency table: a table in matrix format displaying the multivariate frequency distribution of the variables; the term was first used by Karl Pearson in 1904.

References

  1. Tschuprow, A. A. (1939) Principles of the Mathematical Theory of Correlation; translated by M. Kantorowitsch. W. Hodge & Co.