Transmission disequilibrium test

The transmission disequilibrium test (TDT) was proposed by Spielman, McGinnis and Ewens (1993) [1] as a family-based association test for the presence of genetic linkage between a genetic marker and a trait. It is an application of McNemar's test.

A distinctive feature of the TDT is that it detects genetic linkage only in the presence of genetic association. While genetic association can be caused by population structure, genetic linkage is not affected by it, which makes the TDT robust to the presence of population structure.

The case of trios: one affected child per family

Description of the test

We first describe the TDT in the case where families consist of trios (two parents and one affected child). Our description follows the notation used in Spielman, McGinnis & Ewens (1993). [1]

The TDT measures the over-transmission of an allele from heterozygous parents to affected offspring. The n affected offspring have 2n parents. Each parent can be represented by its transmitted and non-transmitted alleles (each either M1 or M2) at some genetic locus. Summarizing the data in a 2 by 2 table gives:

                      Non-transmitted allele
Transmitted allele    M1        M2        Total
M1                    a         b         a + b
M2                    c         d         c + d
Total                 a + c     b + d     2n
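As an illustration of how such a table can be filled in, the following minimal Python sketch tallies transmitted and non-transmitted alleles from trio genotypes at a biallelic marker. The function names (`transmissions`, `tdt_table`), the tuple encoding of genotypes and the example trios are assumptions made for illustration; they do not come from Spielman, McGinnis & Ewens (1993).

```python
from collections import Counter


def transmissions(father, mother, child):
    """Return one (transmitted, non_transmitted) pair per parent.

    Genotypes are 2-tuples of allele labels, e.g. ("M1", "M2").
    """
    # Try both ways of assigning the child's alleles to father and mother.
    for pat, mat in ((child[0], child[1]), (child[1], child[0])):
        if pat in father and mat in mother:
            f_other = father[1] if father[0] == pat else father[0]
            m_other = mother[1] if mother[0] == mat else mother[0]
            return [(pat, f_other), (mat, m_other)]
    raise ValueError("child genotype is inconsistent with the parents")


def tdt_table(trios):
    """Count (transmitted, non_transmitted) pairs over all parents."""
    counts = Counter()
    for father, mother, child in trios:
        counts.update(transmissions(father, mother, child))
    return counts


# Hypothetical example data: three trios, hence 2n = 6 parents.
example_trios = [
    (("M1", "M2"), ("M2", "M2"), ("M1", "M2")),
    (("M1", "M2"), ("M1", "M2"), ("M2", "M2")),
    (("M1", "M1"), ("M1", "M2"), ("M1", "M1")),
]
table = tdt_table(example_trios)
a = table[("M1", "M1")]  # transmitted M1, non-transmitted M1
b = table[("M1", "M2")]  # transmitted M1, non-transmitted M2
c = table[("M2", "M1")]  # transmitted M2, non-transmitted M1
d = table[("M2", "M2")]  # transmitted M2, non-transmitted M2
```

When both parents and the child are heterozygous, it cannot be determined which parent transmitted which allele, but the pair always contributes one count to b and one to c, so the table is unaffected by the choice made in the loop.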

The derivation of the TDT shows that one should only use the heterozygous parents (total number b + c). The TDT tests whether the proportions b/(b + c) and c/(b + c) are compatible with probabilities (0.5, 0.5). This hypothesis can be tested using a binomial (asymptotically chi-square) test with one degree of freedom:

$$\chi^2 = \frac{(b - c)^2}{b + c}$$
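The test can be sketched in a few lines of Python using only the standard library. The sketch assumes the usual McNemar form of the statistic, (b − c)²/(b + c), and uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x/2)). The function names and the example counts are hypothetical.

```python
import math


def tdt_statistic(b, c):
    """McNemar-type TDT statistic from the heterozygous-parent counts b and c."""
    if b + c == 0:
        raise ValueError("no heterozygous parents: the TDT is undefined")
    return (b - c) ** 2 / (b + c)


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def exact_binomial_p(b, c):
    """Two-sided exact binomial p-value for H0: transmission probability 0.5."""
    n, k = b + c, max(b, c)
    # Under p = 0.5 the distribution is symmetric, so double the upper tail.
    tail = sum(math.comb(n, x) for x in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical counts: 30 heterozygous parents transmit M1, 10 transmit M2.
stat = tdt_statistic(30, 10)      # (30 - 10)^2 / 40 = 10.0
p_asymptotic = chi2_sf_1df(stat)
p_exact = exact_binomial_p(30, 10)
```

For small b + c the exact binomial p-value is generally preferable to the chi-square approximation.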

Outline of the test derivation

A derivation of the test consists of using a population genetics model to obtain the expected proportions for the quantities a, b, c and d in the table above. In particular, one can show that under nearly all disease models the expected proportions of b and c are identical. This result motivates the use of a binomial (asymptotically χ²) test to test whether these proportions are equal.

On the other hand, one can also show that under such models the proportions a, b, c and d are not equal to the products of the corresponding marginal probabilities (a + b and c + d for the rows, a + c and b + d for the columns). In other words, the type of the transmitted allele is not, in general, independent of the type of the non-transmitted allele. A consequence is that a test for homogeneity or independence does not test the appropriate hypothesis, and thus only heterozygous parents are included.

Extension to two affected children per family

Extension of the test

The TDT can be readily extended beyond the case of trios. We keep following the notation of Spielman, McGinnis & Ewens (1993). [1] Consider a total of h heterozygous parents. We use the fact that the transmissions to different children are independent. The information can then be summarized in three categories:

i = number of parents who transmit M1 to both children.
h − i − j = number of parents who transmit M1 to one child and M2 to the other.
j = number of parents who transmit M2 to both children.

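Using the notation of the previous section we have:

$$b = 2i + (h - i - j) = h + i - j, \qquad c = 2j + (h - i - j) = h - i + j$$

leading to the chi-squared test statistic:

$$\chi^2_{tdt} = \frac{(b - c)^2}{b + c} = \frac{2 (i - j)^2}{h}$$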
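The following Python sketch works through this step for families with two affected children. It assumes that each heterozygous parent contributes two transmissions, so that b = 2i + (h − i − j) and c = 2j + (h − i − j), where i and j are the numbers of parents transmitting M1 or M2 to both children. The function name and the counts are hypothetical.

```python
def sibpair_tdt(i, j, h):
    """Return (b, c, chi2) for h heterozygous parents of affected sib pairs.

    i: parents transmitting M1 to both children
    j: parents transmitting M2 to both children
    """
    if h <= 0 or i < 0 or j < 0 or i + j > h:
        raise ValueError("counts must satisfy i, j >= 0 and i + j <= h with h > 0")
    mixed = h - i - j            # parents transmitting M1 once and M2 once
    b = 2 * i + mixed            # total M1 transmissions
    c = 2 * j + mixed            # total M2 transmissions
    chi2 = (b - c) ** 2 / (b + c)
    return b, c, chi2


# Hypothetical counts: 32 heterozygous parents, i = 12, j = 4.
b, c, chi2 = sibpair_tdt(12, 4, 32)
# b = 40, c = 24, chi2 = 16^2 / 64 = 4.0, which equals 2 * (i - j)^2 / h
```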

Relation with another linkage statistic

The comparison with the linkage test proposed by Blackwelder and Elston (1985) [2], which was the more traditional approach at the time the TDT was proposed, is informative. The Blackwelder and Elston approach uses the total number of haplotypes identical by descent (mean haplotype sharing). This measure ignores the allelic state of a marker and simply compares the number of times a parent transmits the same allele to both affected children (i + j) with the number of times a different allele is transmitted (h − i − j). The test statistic is:

$$\chi^2_{link} = \frac{\left[(i + j) - (h - i - j)\right]^2}{h} = \frac{(2i + 2j - h)^2}{h}$$

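Under the null hypothesis of no linkage the expected proportions of (i, h − i − j, j) are (0.25, 0.5, 0.25). One can derive a simple chi-square statistic with 2 degrees of freedom:

$$\chi^2_{total} = \frac{(i - h/4)^2}{h/4} + \frac{(h - i - j - h/2)^2}{h/2} + \frac{(j - h/4)^2}{h/4}$$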

The total statistic (with two degrees of freedom) is therefore the sum of two independent components: one is the traditional linkage measure and the other is the TDT statistic.
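This decomposition can be checked numerically. The Python sketch below assumes the TDT component has the form 2(i − j)²/h, the linkage component has the form (2i + 2j − h)²/h, and the total is the Pearson statistic comparing (i, h − i − j, j) with expected counts (h/4, h/2, h/4). The function name and the counts are hypothetical.

```python
import math


def linkage_components(i, j, h):
    """Return (chi2_tdt, chi2_link, chi2_total) for sib-pair counts i, j, h."""
    mixed = h - i - j
    chi2_tdt = 2 * (i - j) ** 2 / h              # allelic (TDT) component
    chi2_link = (2 * (i + j) - h) ** 2 / h       # same vs. different allele
    observed = (i, mixed, j)
    expected = (h / 4, h / 2, h / 4)             # null proportions 1/4, 1/2, 1/4
    chi2_total = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2_tdt, chi2_link, chi2_total


# Hypothetical counts: h = 32 heterozygous parents, i = 14, j = 6.
tdt, link, total = linkage_components(14, 6, 32)
# tdt = 4.0, link = 2.0, total = 6.0
assert math.isclose(tdt + link, total)
```

The identity holds algebraically for any valid counts, not only for this example.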

Modified version

More recently, Wittkowski and Liu (2002/2004) [3] proposed a modification to the TDT that can be more powerful under some alternatives, although the asymptotic properties under the null hypothesis are equivalent.

The motivating idea for this modification is the fact that, while the transmissions of both alleles from parents to a child are independent, the effects of other filial genetic or environmental covariates on penetrance are the same for both alleles transmitted to the same child. This situation can be important if, for example, the genetic marker is linked to a disease locus with strong selection against heterozygous individuals. This observation suggests shifting the statistical model from a set of independent transmissions to a set of independent children (see Sasieni (1997) [4] for the corresponding problem in case-control association tests). While this observation does not affect the distribution under the null hypothesis of no linkage, it allows a more powerful test to be designed for some disease models.

In this modified TDT test the children are stratified by parental type and the modified test statistic becomes:

where is the number of PQ children from parents with the PQ and QQ types.

Software for computing TDT

Beagle

References

  1. Spielman RS, McGinnis RE, Ewens WJ (Mar 1993). "Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM)". Am J Hum Genet. 52 (3): 506–16. PMC 1682161. PMID 8447318.
  2. Blackwelder WC, Elston RC (1985). "A comparison of sib-pair linkage tests for disease susceptibility loci". Genetic Epidemiology. 2 (1): 85–97. doi:10.1002/gepi.1370020109. PMID 3863778.
  3. Wittkowski KM, Liu X (2002). "A statistically valid alternative to the TDT". Hum. Hered. 54 (3): 157–64. doi:10.1159/000068840. PMID 12626848.
     Ewens WJ, Spielman RS (2004). "The TDT is a statistically valid test: comments on Wittkowski and Liu". Hum. Hered. 58 (1): 59–60, author reply 60–1, discussion 61–2. doi:10.1159/000081458. PMID 15604566.
  4. Sasieni PD (Dec 1997). "From genotypes to genes: doubling the sample size". Biometrics. 53 (4): 1253–61. doi:10.2307/2533494. JSTOR 2533494. PMID 9423247.