Genetic association

Genetic association is when one or more genotypes within a population co-occur with a phenotypic trait more often than would be expected by chance occurrence.

Studies of genetic association test whether single-locus allele or genotype frequencies or, more generally, multilocus haplotype frequencies differ between two groups of individuals (usually diseased subjects and healthy controls). Genetic association studies are based on the principle that genotypes can be compared "directly", i.e. using the sequences of the actual genomes or exomes obtained by whole genome sequencing or whole exome sequencing. Before 2010, DNA sequencing methods were used.

Description

Genetic association can occur between phenotypes (visible characteristics such as flower color or height), between a phenotype and a genetic polymorphism such as a single nucleotide polymorphism (SNP), or between two genetic polymorphisms. Association between genetic polymorphisms occurs when there is non-random association of their alleles as a result of their proximity on the same chromosome; this is known as genetic linkage.[citation needed]

Linkage disequilibrium (LD) is a term used in the study of population genetics for the non-random association of alleles at two or more loci, not necessarily on the same chromosome. It is not the same as linkage, which is the phenomenon whereby two or more loci on a chromosome have reduced recombination between them because of their physical proximity to each other. LD describes a situation in which some combinations of alleles or genetic markers occur more or less frequently in a population than would be expected from a random formation of haplotypes from alleles based on their frequencies.[ citation needed ]
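As a minimal sketch of the comparison described above, the following computes two standard LD summaries for a pair of biallelic loci: the coefficient D (observed haplotype frequency minus the frequency expected if alleles combined at random) and the squared correlation r². The function name `ld_stats` and the example frequencies are illustrative assumptions, not taken from any particular study or library.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, r_squared) for two biallelic loci.

    p_ab -- observed frequency of the haplotype carrying allele A and allele B
    p_a  -- frequency of allele A at the first locus
    p_b  -- frequency of allele B at the second locus
    """
    # Under random association the haplotype frequency would be p_a * p_b.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denom


# Hypothetical frequencies: the A-B haplotype is more common than expected.
d, r2 = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
```

A value of D equal to zero corresponds to random association; positive or negative values indicate that the haplotype occurs more or less often than expected.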

Genetic association studies are performed to determine whether a genetic variant is associated with a disease or trait: if association is present, a particular allele, genotype or haplotype of one or more polymorphisms will be seen more often than expected by chance in individuals carrying the trait. Thus, a person carrying one or two copies of a high-risk variant is at increased risk of developing the associated disease or having the associated trait.[citation needed]

Studies

Case-control designs

Case-control studies are a classical epidemiological tool. They take subjects who already have a disease, trait or other condition and determine whether these subjects differ in some characteristic from those who do not have the disease or trait. In genetic case-control studies, the frequency of alleles or genotypes is compared between the cases and controls. The cases have been diagnosed with the disease under study, or have the trait under test; the controls are either known to be unaffected or have been randomly selected from the population. A difference in the frequency of an allele or genotype of the polymorphism under test between the two groups indicates that the genetic marker may increase risk of the disease or likelihood of the trait, or be in linkage disequilibrium with a polymorphism which does. Haplotypes can also show association with a disease or trait. One of the earliest successes in this field, obtained with a case-control design, was the finding of a single base mutation in the non-coding region of the APOC3 gene (apolipoprotein C3 gene) that is associated with higher risks of hypertriglyceridemia and atherosclerosis. [1]
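The allele-frequency comparison between cases and controls can be sketched as a Pearson chi-square test on a 2x2 table of allele counts, together with an odds ratio. This is one common way to run the comparison, not necessarily the method of any study cited here; the function name `allelic_test` and the counts are hypothetical, and the sketch assumes every cell count is nonzero.

```python
import math


def allelic_test(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square statistic (1 df, no continuity correction), p-value and
    odds ratio for a 2x2 table of allele counts in cases and controls."""
    observed = ((case_alt, case_ref), (ctrl_alt, ctrl_ref))
    row_totals = (case_alt + case_ref, ctrl_alt + ctrl_ref)
    col_totals = (case_alt + ctrl_alt, case_ref + ctrl_ref)
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele and disease status were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Hypothetical counts: 100 cases and 100 controls, two alleles each.
chi2, p_value, odds_ratio = allelic_test(60, 140, 40, 160)
```

An odds ratio above 1 with a small p-value would suggest that the counted allele is more frequent in cases, subject to the confounding issues discussed in this section.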

One problem with the case-control design is that genotype and haplotype frequencies vary between ethnic or geographic populations. If the case and control populations are not well matched for ethnicity or geographic origin, false-positive associations can occur because of the confounding effects of population stratification.
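A small numerical sketch can make the confounding concrete. In the made-up counts below, the allele has no association with disease inside either subpopulation (odds ratio 1 in each), yet pooling the two produces a strong apparent association, because allele frequency and the case-to-control ratio both differ between subpopulations. All names and numbers are illustrative assumptions.

```python
def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds ratio for a 2x2 table of allele counts."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)


# Hypothetical allele counts: (case_alt, case_ref, ctrl_alt, ctrl_ref).
strata = {
    "population_A": (80, 20, 16, 4),   # allele common, cases over-sampled
    "population_B": (4, 16, 20, 80),   # allele rare, controls over-sampled
}

# Within each subpopulation the allele is unrelated to disease status.
per_stratum = {name: odds_ratio(*counts) for name, counts in strata.items()}

# Ignoring ancestry and pooling the counts creates a spurious signal.
pooled_counts = tuple(sum(column) for column in zip(*strata.values()))
pooled = odds_ratio(*pooled_counts)
```

Here each per-stratum odds ratio is 1, while the pooled odds ratio is well above 1, which is the false-positive pattern that matching or family-based designs aim to prevent.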

Family-based designs

Family-based association designs aim to avoid the potential confounding effects of population stratification by using the parents or unaffected siblings as controls for the case (the affected offspring or sibling). The two most commonly used tests are the transmission disequilibrium test (TDT) and the haplotype-relative-risk (HRR) method. Both measure association of genetic markers in nuclear families by transmission from parent to offspring. If an allele increases the risk of a disease, that allele is expected to be transmitted from parents to affected offspring more often than expected by chance.
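The transmission idea can be sketched with the usual form of the TDT statistic, which compares how often heterozygous parents transmit a given allele to an affected child with how often they do not. The function name `tdt` and the counts are hypothetical; the statistic is referred to a chi-square distribution with one degree of freedom.

```python
import math


def tdt(transmitted, untransmitted):
    """TDT chi-square statistic (1 df) and p-value.

    transmitted   -- times heterozygous parents passed the allele to an
                     affected offspring
    untransmitted -- times heterozygous parents passed the other allele
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    # Under no association each allele is transmitted half of the time.
    chi2 = (transmitted - untransmitted) ** 2 / total
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts from 100 heterozygous parents.
chi2, p_value = tdt(transmitted=60, untransmitted=40)
```

Because the comparison is made within families, differences in allele frequency between populations do not enter the statistic.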

Quantitative trait association

A quantitative trait (see quantitative trait locus) is a measurable trait that shows continuous variation, such as height or weight. Quantitative traits often have a 'normal' distribution in the population. In addition to the case-control design, quantitative trait association can also be performed using an unrelated population sample or family trios in which the quantitative trait is measured in the offspring.
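For an unrelated population sample, one simple way to test a quantitative trait is to regress the trait value on the number of copies of an allele each individual carries (0, 1 or 2). The sketch below fits that line by ordinary least squares; it assumes an additive effect, and the function name `dosage_regression` and the data are made up for illustration.

```python
def dosage_regression(dosages, trait_values):
    """Least-squares slope and intercept of trait value on allele dosage.

    dosages      -- copies of the allele per individual (0, 1 or 2)
    trait_values -- measured quantitative trait, same order as dosages
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(trait_values) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dosages, trait_values))
    if sxx == 0:
        raise ValueError("genotype does not vary in the sample")
    slope = sxy / sxx               # estimated effect per allele copy
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical heights (cm) for six unrelated individuals.
slope, intercept = dosage_regression([0, 0, 1, 1, 2, 2],
                                     [170, 172, 173, 175, 176, 178])
```

A slope clearly different from zero would indicate association between the allele and the trait; a real analysis would also estimate a standard error and adjust for covariates.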

Evidence

Evidence of association is based on MNAs, which usually draw on studies with large sample sizes such as genome-wide association studies. Because little empirical data links study results to study design, bias in genetic association studies is not well understood. [2]

Reporting outcomes

Heritability ranges from 0 to 1, with 0 meaning non-heritable. How it is reported depends on whether the trait is continuous or binary: for continuous traits it is usually given as the proportion of phenotypic variation that can be attributed to genetic variation, and for binary traits as the proportion of variance attributed to genetic variation. [3]
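In conventional quantitative-genetics notation (the symbols below are an assumption, since no formula is given here), the proportion for a continuous trait can be written as a ratio of variances, where P is the phenotypic value and G the genetic value:

```latex
H^2 = \frac{\operatorname{Var}(G)}{\operatorname{Var}(P)},
\qquad 0 \le H^2 \le 1
```

If phenotypic variance is further assumed to split into independent genetic and environmental parts, Var(P) = Var(G) + Var(E), then a value of 0 means all observed variation is environmental and a value of 1 means all of it is genetic.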

Related Research Articles

The genotype of an organism is its complete set of genetic material. Genotype can also be used to refer to the alleles or variants an individual carries in a particular gene or genetic location. The number of alleles an individual can have in a specific gene depends on the number of copies of each chromosome found in that species, also referred to as ploidy. In diploid species like humans, two full sets of chromosomes are present, meaning each individual has two alleles for any given gene. If both alleles are the same, the genotype is referred to as homozygous. If the alleles are different, the genotype is referred to as heterozygous.

Genetic linkage is the tendency of DNA sequences that are close together on a chromosome to be inherited together during the meiosis phase of sexual reproduction. Two genetic markers that are physically near to each other are unlikely to be separated onto different chromatids during chromosomal crossover, and are therefore said to be more linked than markers that are far apart. In other words, the nearer two genes are on a chromosome, the lower the chance of recombination between them, and the more likely they are to be inherited together. Markers on different chromosomes are perfectly unlinked, although the penetrance of potentially deleterious alleles may be influenced by the presence of other alleles, and these other alleles may be located on other chromosomes than that on which a particular potentially deleterious allele is located.

Single-nucleotide polymorphism: Single nucleotide in genomic DNA at which different sequence alternatives exist

In genetics and bioinformatics, a single-nucleotide polymorphism is a germline substitution of a single nucleotide at a specific position in the genome. Although certain definitions require the substitution to be present in a sufficiently large fraction of the population, many publications do not apply such a frequency threshold.

Haplotype: Group of genes from one parent

A haplotype is a group of alleles in an organism that are inherited together from a single parent.

A quantitative trait locus (QTL) is a locus that correlates with variation of a quantitative trait in the phenotype of a population of organisms. QTLs are mapped by identifying which molecular markers correlate with an observed trait. This is often an early step in identifying the actual genes that cause the trait variation.

In population genetics, linkage disequilibrium (LD) is the non-random association of alleles at different loci in a given population. Loci are said to be in linkage disequilibrium when the frequency of association of their different alleles is higher or lower than expected if the loci were independent and associated randomly.

Genotyping is the process of determining differences in the genetic make-up (genotype) of an individual by examining the individual's DNA sequence using biological assays and comparing it to another individual's sequence or a reference sequence. It reveals the alleles an individual has inherited from their parents. Traditionally genotyping is the use of DNA sequences to define biological populations by use of molecular tools. It does not usually involve defining the genes of an individual.

A molecular marker is a molecule, sampled from some source, that gives information about its source. For example, DNA is a molecular marker that gives information about the organism from which it was taken. For another example, some proteins can be molecular markers of Alzheimer's disease in a person from which they are taken. Molecular markers may be non-biological. Non-biological markers are often used in environmental studies.

A tag SNP is a representative single nucleotide polymorphism (SNP) in a region of the genome with high linkage disequilibrium that represents a group of SNPs called a haplotype. It is possible to identify genetic variation and association to phenotypes without genotyping every SNP in a chromosomal region. This reduces the expense and time of mapping genome areas associated with disease, since it eliminates the need to study every individual SNP. Tag SNPs are useful in whole-genome SNP association studies in which hundreds of thousands of SNPs across the entire genome are genotyped.

Marker assisted selection or marker aided selection (MAS) is an indirect selection process where a trait of interest is selected based on a marker linked to a trait of interest, rather than on the trait itself. This process has been extensively researched and proposed for plant- and animal- breeding.

Genome-wide association study: Study of genetic variants in different individuals

In genomics, a genome-wide association study (GWA study) is an observational study of a genome-wide set of genetic variants in different individuals to see if any variant is associated with a trait. GWA studies typically focus on associations between single-nucleotide polymorphisms (SNPs) and traits like major human diseases, but can equally be applied to any other genetic variants and any other organisms.

1000 Genomes Project: International research effort on genetic variation

The 1000 Genomes Project (1KGP), which ran from January 2008 to 2015, was an international research effort to establish the most detailed catalogue of human genetic variation at the time. Scientists planned to sequence the genomes of at least one thousand anonymous healthy participants from a number of different ethnic groups within the following three years, using newly developed technologies. In 2010, the project finished its pilot phase, which was described in detail in a publication in the journal Nature. In 2012, the sequencing of 1092 genomes was announced in a Nature publication. In 2015, two papers in Nature reported the results and completion of the project, along with opportunities for future research.

Population structure is the presence of a systematic difference in allele frequencies between subpopulations. In a randomly mating population, allele frequencies are expected to be roughly similar between groups. However, mating tends to be non-random to some degree, causing structure to arise. For example, a barrier like a river can separate two groups of the same species and make it difficult for potential mates to cross; if a mutation occurs, over many generations it can spread and become common in one subpopulation while being completely absent in the other.

The dog leukocyte antigen (DLA) is a part of the major histocompatibility complex (MHC) in dogs, encoding genes in the MHC. The DLA and MHC system are interchangeable terms in canines. The MHC plays a critical role in the immune response system and consists of three regions: class I, class II and class III. DLA genes belong to the first two classes, which are involved in the regulation of antigens in the immune system. The class II genes are highly polymorphic, with many different alleles/haplotypes that have been linked to diseases, allergies, and autoimmune conditions such as diabetes, polyarthritis, and hypothyroidism in canines.

In genetics, association mapping, also known as "linkage disequilibrium mapping", is a method of mapping quantitative trait loci (QTLs) that takes advantage of historic linkage disequilibrium to link phenotypes to genotypes, uncovering genetic associations.

Nested association mapping (NAM) is a technique designed by the labs of Edward Buckler, James Holland, and Michael McMullen for identifying and dissecting the genetic architecture of complex traits in corn. Nested association mapping is a specific technique that cannot be performed outside of a specifically designed population such as the Maize NAM population.

The haplotype-relative-risk (HRR) method is a family-based method for determining gene allele association to a disease in the presence of actual genetic linkage. Nuclear families with one affected child are sampled using the parental haplotypes not transmitted as a control. While similar to the genotype relative risk (RR), the HRR provides a solution to the problem of population stratification by only sampling within family trios. The HRR method was first proposed by Rubinstein in 1981 then detailed in 1987 by Rubinstein and Falk and is an important tool in genetic association studies.

Quantitative trait loci mapping, or QTL mapping, is the process of identifying genomic regions that potentially contain genes responsible for important economic, health or environmental characters. Mapping QTLs is an important activity that plant breeders and geneticists routinely use to associate potential causal genes with phenotypes of interest. Family-based QTL mapping is a variant of QTL mapping where multiple families are used.

This glossary of genetics and evolutionary biology is a list of definitions of terms and concepts used in the study of genetics and evolutionary biology, as well as sub-disciplines and related fields, with an emphasis on classical genetics, quantitative genetics, population biology, phylogenetics, speciation, and systematics. Overlapping and related terms can be found in Glossary of cellular and molecular biology, Glossary of ecology, and Glossary of biology.

Complex traits

Complex traits, also known as quantitative traits, are traits that do not behave according to simple Mendelian inheritance laws. More specifically, their inheritance cannot be explained by the genetic segregation of a single gene. Such traits show a continuous range of variation and are influenced by both environmental and genetic factors. Compared to strictly Mendelian traits, complex traits are far more common, and because they can be hugely polygenic, they are studied using statistical techniques such as quantitative genetics and quantitative trait loci (QTL) mapping rather than classical genetics methods. Examples of complex traits include height, circadian rhythms, enzyme kinetics, and many diseases including diabetes and Parkinson's disease. One major goal of genetic research today is to better understand the molecular mechanisms through which genetic variants act to influence complex traits.

References

  1. Rees, A; Shoulders, CC; Galton, DJ; Baralle, FE (1983). "DNA polymorphism adjacent to human apoprotein A-1 gene: relation to hypertriglyceridaemia". Lancet. 321 (8322): 444–446. doi:10.1016/s0140-6736(83)91440-x. PMID 6131168. S2CID 29511911.
  2. Sagoo, Gurdeep S; Little, Julian; Higgins, Julian P. T (2009-03-03). "Systematic Reviews of Genetic Association Studies". PLOS Medicine. 6 (3). Public Library of Science (PLoS): e1000028. doi:10.1371/journal.pmed.1000028. ISSN 1549-1676. PMC 2650724. PMID 19260758.
  3. Dahlqwist, Elisabeth; Magnusson, Patrik K. E.; Pawitan, Yudi; Sjölander, Arvid (2019). "On the relationship between the heritability and the attributable fraction". Human Genetics. 138 (4). Springer Science and Business Media LLC: 425–435. doi:10.1007/s00439-019-02006-8. ISSN 0340-6717. PMC 6483966. PMID 30941497.