Genetic association

Last updated September 23, 2025

Genetic association occurs when one or more genotypes within a population co-occur with a phenotypic trait more often than would be expected by chance.

Studies of genetic association aim to test whether single-locus allele or genotype frequencies or, more generally, multilocus haplotype frequencies differ between two groups of individuals (usually diseased subjects and healthy controls). Genetic association studies are based on the principle that genotypes can be compared "directly", i.e. using the sequences of the actual genomes or exomes obtained by whole genome sequencing or whole exome sequencing. Before 2010, DNA sequencing methods were used.

Description

Genetic association can be between phenotypes (visible characteristics such as flower color or height), between a phenotype and a genetic polymorphism, such as a single nucleotide polymorphism (SNP), or between two genetic polymorphisms. Association between genetic polymorphisms occurs when there is non-random association of their alleles as a result of their proximity on the same chromosome; this is known as genetic linkage.^[citation needed]

Linkage disequilibrium (LD) is a term used in the study of population genetics for the non-random association of alleles at two or more loci, not necessarily on the same chromosome. It is not the same as linkage, which is the phenomenon whereby two or more loci on a chromosome have reduced recombination between them because of their physical proximity to each other. LD describes a situation in which some combinations of alleles or genetic markers occur more or less frequently in a population than would be expected from a random formation of haplotypes from alleles based on their frequencies.^[citation needed]
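As an illustration of the definition above, LD between two biallelic loci is commonly summarized by the coefficient D (observed haplotype frequency minus the frequency expected under random association) and by the squared correlation r². The following is a minimal sketch; the function name and the example frequencies are hypothetical and are not taken from any study.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r_squared) for two biallelic loci.

    p_ab -- frequency of haplotypes carrying both allele A and allele B
    p_a  -- frequency of allele A at the first locus
    p_b  -- frequency of allele B at the second locus
    """
    # Under random association the AB haplotype frequency would be p_a * p_b.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either locus is monomorphic.
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


# Hypothetical frequencies: the AB haplotype is more common than 0.5 * 0.5.
d, r2 = linkage_disequilibrium(p_ab=0.4, p_a=0.5, p_b=0.5)
print(f"D = {d:.3f}, r^2 = {r2:.3f}")
```

A value of D equal to zero corresponds to linkage equilibrium, in which haplotypes form at random from the allele frequencies.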

Genetic association studies are performed to determine whether a genetic variant is associated with a disease or trait: if association is present, a particular allele, genotype or haplotype of a polymorphism or polymorphisms will be seen more often than expected by chance in individuals carrying the trait. Thus, a person carrying one or two copies of a high-risk variant is at increased risk of developing the associated disease or having the associated trait.^[citation needed]

Studies

Case-control designs

Case-control studies are a classical epidemiological tool. They use subjects who already have a disease, trait or other condition and determine whether these subjects have characteristics that differ from those of people who do not have the disease or trait. In genetic case-control studies, the frequency of alleles or genotypes is compared between the cases and controls. The cases will have been diagnosed with the disease under study, or have the trait under test; the controls are either known to be unaffected or have been randomly selected from the population. A difference in the frequency of an allele or genotype of the polymorphism under test between the two groups indicates that the genetic marker may increase risk of the disease or likelihood of the trait, or be in linkage disequilibrium with a polymorphism which does. Haplotypes can also show association with a disease or trait. One of the earliest successes in this field, obtained with a case-control design, was the finding of a single base mutation in the non-coding region of the APOC3 gene (apolipoprotein C3 gene) that was associated with higher risks of hypertriglyceridemia and atherosclerosis.^[1]
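The comparison of allele frequencies between cases and controls can be sketched as a test on a 2×2 table of allele counts. The example below computes an odds ratio and a Pearson chi-square statistic with one degree of freedom; the function name and all counts are hypothetical, and real analyses typically also adjust for covariates and multiple testing.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (odds_ratio, chi2, p_value) for a 2x2 table of allele counts."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    # Pearson chi-square for a 2x2 table (no continuity correction).
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical counts: 100 cases and 100 controls, i.e. 200 alleles per group.
odds_ratio, chi2, p_value = allelic_association(
    case_alt=60, case_ref=140, ctrl_alt=40, ctrl_ref=160
)
print(f"OR = {odds_ratio:.2f}, chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

An odds ratio above 1 with a small p-value suggests that the allele is more frequent in cases, although, as the text notes, the marker may only be in linkage disequilibrium with the causal polymorphism.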

One problem with the case-control design is that genotype and haplotype frequencies vary between ethnic or geographic populations. If the case and control populations are not well matched for ethnicity or geographic origin then false positive association can occur because of the confounding effects of population stratification.
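The confounding effect can be illustrated with two hypothetical subpopulations in which the allele has no association with disease, but which differ in both allele frequency and the proportion of cases sampled. The counts below are invented purely for illustration.

```python
def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds ratio for a 2x2 table of allele counts."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)


# Allele counts as (case_alt, case_ref, ctrl_alt, ctrl_ref).
# Population X: allele frequency 0.8 in both cases and controls, mostly cases.
pop_x = (160, 40, 40, 10)
# Population Y: allele frequency 0.2 in both cases and controls, mostly controls.
pop_y = (10, 40, 40, 160)
# Pooling the two samples while ignoring ancestry.
pooled = tuple(x + y for x, y in zip(pop_x, pop_y))

print("OR within X:", odds_ratio(*pop_x))
print("OR within Y:", odds_ratio(*pop_y))
print("OR pooled:  ", odds_ratio(*pooled))
```

Within each subpopulation the odds ratio is 1, yet the pooled sample shows a strong apparent association, which is the false positive that matching on ethnicity or geographic origin is meant to prevent.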

Family-based designs

Family-based association designs aim to avoid the potential confounding effects of population stratification by using the parents or unaffected siblings as controls for the case (the affected offspring or sibling). Two similar tests are most commonly used: the transmission disequilibrium test (TDT) and the haplotype relative risk (HRR). Both measure association of genetic markers in nuclear families by transmission from parent to offspring. If an allele increases the risk of having a disease, then parents should transmit that allele to affected offspring more often than expected by chance.
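In its basic form, the TDT counts how often heterozygous parents transmit a given allele to an affected child versus how often they do not, and compares the two counts with a McNemar-type chi-square statistic. The sketch below assumes the transmissions have already been counted; the function name and the counts are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """Return (chi2, p_value) for the transmission disequilibrium test.

    transmitted   -- times heterozygous parents passed the allele to an affected child
    untransmitted -- times heterozygous parents passed the other allele instead
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    # Under no association each allele is transmitted with probability 1/2.
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts from 100 informative parental transmissions.
chi2, p_value = tdt(transmitted=60, untransmitted=40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Because each comparison is made within a family, differences in allele frequency between populations do not by themselves inflate the statistic.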

Quantitative trait association

A quantitative trait (see quantitative trait locus) is a measurable trait that shows continuous variation, such as height or weight. Quantitative traits often have a 'normal' distribution in the population. In addition to the case-control design, quantitative trait association can also be performed using an unrelated population sample or family trios in which the quantitative trait is measured in the offspring.
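For an unrelated population sample, one common approach is to regress the trait on the number of copies of an allele each individual carries (0, 1 or 2), assuming an additive effect. The following is a minimal least-squares sketch with hypothetical data; it is one of several possible models, and the source does not prescribe a particular one.

```python
def additive_effect(dosages, trait_values):
    """Least-squares fit of trait = intercept + slope * allele dosage.

    dosages      -- copies of the allele per individual (0, 1 or 2)
    trait_values -- measured quantitative trait per individual
    """
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_y = sum(trait_values) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, trait_values))
    if sxx == 0:
        raise ValueError("all individuals carry the same genotype")
    slope = sxy / sxx  # estimated change in trait per copy of the allele
    intercept = mean_y - slope * mean_g
    return slope, intercept


# Hypothetical heights (cm) for six unrelated individuals.
dosages = [0, 0, 1, 1, 2, 2]
heights = [168, 170, 171, 173, 174, 176]
slope, intercept = additive_effect(dosages, heights)
print(f"per-allele effect = {slope:.1f} cm, baseline = {intercept:.1f} cm")
```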

Evidence

Evidence of the association is based on MNAs that are usually based on studies that include large sample sizes, such as genome-wide association studies. Because there is little empirical data relating study results to study design, bias in genetic association studies is not well understood.^[2]

Reporting outcomes

Heritability ranges from 0 to 1, with 0 meaning non-heritable. Depending on whether the trait is continuous or binary, outcomes are usually reported, respectively, as the proportion of phenotypic variation that can be attributed to genetic variation or as the proportion of variance attributed to genetic variation.^[3]
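As a worked form of this definition, heritability in the broad sense is conventionally written as a ratio of variances. The symbols below are standard notation rather than notation used in this article, and the decomposition assumes that genetic and environmental contributions are independent and do not interact.

```latex
H^2 = \frac{V_G}{V_P}, \qquad V_P = V_G + V_E, \qquad 0 \le H^2 \le 1
```

Here V_G is the variance attributable to genetic variation, V_E the variance attributable to environment, and V_P the total phenotypic variance; for example, V_G = 30 and V_E = 70 would give H² = 0.3.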

References

[1] Rees, A; Shoulders, CC; Galton, DJ; Baralle, FE (1983). "DNA polymorphism adjacent to human apoprotein A-1 gene: relation to hypertriglyceridaemia". Lancet. 321 (8322): 444–446. doi:10.1016/s0140-6736(83)91440-x. PMID 6131168. S2CID 29511911.
[2] Sagoo, Gurdeep S; Little, Julian; Higgins, Julian P. T (2009-03-03). "Systematic Reviews of Genetic Association Studies". PLOS Medicine. 6 (3): e1000028. Public Library of Science (PLoS). doi:10.1371/journal.pmed.1000028. ISSN 1549-1676. PMC 2650724. PMID 19260758.
[3] Dahlqwist, Elisabeth; Magnusson, Patrik K. E.; Pawitan, Yudi; Sjölander, Arvid (2019). "On the relationship between the heritability and the attributable fraction". Human Genetics. 138 (4). Springer Science and Business Media LLC: 425–435. doi:10.1007/s00439-019-02006-8. ISSN 0340-6717. PMC 6483966. PMID 30941497.

External links

A list of computer programs for genetic analysis including genetic association analysis
GWAS Central – a central database of summary-level genetic association findings.

