A quantitative trait locus (QTL) is a locus (section of DNA) that correlates with variation of a quantitative trait in the phenotype of a population of organisms. [1] QTLs are mapped by identifying which molecular markers (such as SNPs or AFLPs) correlate with an observed trait. This is often an early step in identifying the actual genes that cause the trait variation.
A quantitative trait locus (QTL) is a region of DNA which is associated with a particular phenotypic trait, which varies in degree and which can be attributed to polygenic effects, i.e., the product of two or more genes, and their environment. [2] These QTLs are often found on different chromosomes. The number of QTLs which explain variation in the phenotypic trait indicates the genetic architecture of a trait. It may indicate that plant height is controlled by many genes of small effect, or by a few genes of large effect.[ citation needed ]
Typically, QTLs underlie continuous traits (those traits which vary continuously, e.g. height) as opposed to discrete traits (traits that have two or several character values, e.g. red hair in humans, a recessive trait, or smooth vs. wrinkled peas used by Mendel in his experiments).
Moreover, a single phenotypic trait is usually determined by many genes. Consequently, many QTLs are associated with a single trait. Another use of QTLs is to identify candidate genes underlying a trait. The DNA sequence of any genes in this region can then be compared to a database of DNA for genes whose function is already known, this task being fundamental for marker-assisted crop improvement. [3]
Mendelian inheritance was rediscovered at the beginning of the 20th century. As Mendel's ideas spread, geneticists began to connect Mendel's rules of inheritance of single factors to Darwinian evolution. For early geneticists, it was not immediately clear that the smooth variation in traits like body size (i.e., incomplete dominance) was caused by the inheritance of single genetic factors. Although Darwin himself observed that inbred features of fancy pigeons were inherited in accordance with Mendel's laws (although Darwin did not actually know about Mendel's ideas when he made the observation), it was not obvious that these features selected by fancy pigeon breeders can similarly explain quantitative variation in nature. [4]
An early attempt by William Ernest Castle to unify the laws of Mendelian inheritance with Darwin's theory of speciation invoked the idea that species become distinct from one another as one species or the other acquires a novel Mendelian factor. [5] Castle's conclusion was based on the observation that novel traits that could be studied in the lab and that show Mendelian inheritance patterns reflect a large deviation from the wild type, and Castle believed that acquisition of such features is the basis of "discontinuous variation" that characterizes speciation. [5] Darwin discussed the inheritance of similar mutant features but did not invoke them as a requirement of speciation. [4] Instead Darwin used the emergence of such features in breeding populations as evidence that mutation can occur at random within breeding populations, which is a central premise of his model of selection in nature. [4] Later in his career, Castle would refine his model for speciation to allow for small variation to contribute to speciation over time. He also was able to demonstrate this point by selectively breeding laboratory populations of rats to obtain a hooded phenotype over several generations. [6]
Castle's was perhaps the first attempt made in the scientific literature to direct evolution by artificial selection of a trait with continuous underlying variation, however the practice had previously been widely employed in the development of agriculture to obtain livestock or plants with favorable features from populations that show quantitative variation in traits like body size or grain yield.[ citation needed ]
Castle's work was among the first to attempt to unify the recently rediscovered laws of Mendelian inheritance with Darwin's theory of evolution. Still, it would be almost thirty years until the theoretical framework for evolution of complex traits would be widely formalized. [7] In an early summary of the theory of evolution of continuous variation, Sewall Wright, a graduate student who trained under Castle, summarized contemporary thinking about the genetic basis of quantitative natural variation: "As genetic studies continued, ever smaller differences were found to mendelize, and any character, sufficiently investigated, turned out to be affected by many factors." [7] Wright and others formalized population genetics theory that had been worked out over the preceding 30 years explaining how such traits can be inherited and create stably breeding populations with unique characteristics. Quantitative trait genetics today leverages Wright's observations about the statistical relationship between genotype and phenotype in families and populations to understand how certain genetic features can affect variation in natural and derived populations.[ citation needed ]
Polygenic inheritance refers to inheritance of a phenotypic characteristic (trait) that is attributable to two or more genes and can be measured quantitatively. Multifactorial inheritance refers to polygenic inheritance that also includes interactions with the environment. Unlike monogenic traits, polygenic traits do not follow patterns of Mendelian inheritance (discrete categories). Instead, their phenotypes typically vary along a continuous gradient depicted by a bell curve. [8]
An example of a polygenic trait is human skin color variation. Several genes factor into determining a person's natural skin color, so modifying only one of those genes can change skin color slightly or in some cases, such as for SLC24A5, moderately. Many disorders with genetic components are polygenic, including autism, cancer, diabetes and numerous others. Most phenotypic characteristics are the result of the interaction of multiple genes.[ citation needed ]
Multifactorially inherited diseases are said to constitute the majority of genetic disorders affecting humans which will result in hospitalization or special care of some kind. [9] [10]
Traits controlled both by the environment and by genetic factors are called multifactorial. Usually, multifactorial traits outside of illness result in what we see as continuous characteristics in organisms, especially human organisms such as: height, [9] skin color, and body mass. [11] All of these phenotypes are complicated by a great deal of give-and-take between genes and environmental effects. [9] The continuous distribution of traits such as height and skin color described above, reflects the action of genes that do not manifest typical patterns of dominance and recessiveness. Instead the contributions of each involved locus are thought to be additive. Writers have distinguished this kind of inheritance as polygenic, or quantitative inheritance. [12]
Thus, due to the nature of polygenic traits, inheritance will not follow the same pattern as a simple monohybrid or dihybrid cross. [10] Polygenic inheritance can be explained as Mendelian inheritance at many loci, [9] resulting in a trait which is normally-distributed. If n is the number of involved loci, then the coefficients of the binomial expansion of (a + b)2n will give the frequency of distribution of all n allele combinations. For sufficiently high values of n, this binomial distribution will begin to resemble a normal distribution. From this viewpoint, a disease state will become apparent at one of the tails of the distribution, past some threshold value. Disease states of increasing severity will be expected the further one goes past the threshold and away from the mean. [12]
A mutation resulting in a disease state is often recessive, so both alleles must be mutant in order for the disease to be expressed phenotypically. A disease or syndrome may also be the result of the expression of mutant alleles at more than one locus. When more than one gene is involved, with or without the presence of environmental triggers, we say that the disease is the result of multifactorial inheritance.[ citation needed ]
The more genes involved in the cross, the more the distribution of the genotypes will resemble a normal, or Gaussian distribution. [9] This shows that multifactorial inheritance is polygenic, and genetic frequencies can be predicted by way of a polyhybrid Mendelian cross. Phenotypic frequencies are a different matter, especially if they are complicated by environmental factors.[ citation needed ]
The paradigm of polygenic inheritance as being used to define multifactorial disease has encountered much disagreement. Turnpenny (2004) discusses how simple polygenic inheritance cannot explain some diseases such as the onset of Type I diabetes mellitus, and that in cases such as these, not all genes are thought to make an equal contribution. [12]
The assumption of polygenic inheritance is that all involved loci make an equal contribution to the symptoms of the disease. This should result in a normal (Gaussian) distribution of genotypes. When it does not, the idea of polygenetic inheritance cannot be supported for that illness.[ citation needed ]
The above are well-known examples of diseases having both genetic and environmental components. Other examples involve atopic diseases such as eczema or dermatitis, [9] spina bifida (open spine), and anencephaly (open skull). [13]
While schizophrenia is widely believed to be multifactorially genetic by biopsychiatrists, no characteristic genetic markers have been determined with any certainty.[ citation needed ]
If it is shown that the brothers and sisters of the patient have the disease, then there is a strong chance that the disease is genetic[ citation needed ] and that the patient will also be a genetic carrier. This is not quite enough as it also needs to be proven that the pattern of inheritance is non-Mendelian. This would require studying dozens, even hundreds of different family pedigrees before a conclusion of multifactorial inheritance is drawn. This often takes several years.[ citation needed ]
If multifactorial inheritance is indeed the case, then the chance of the patient contracting the disease is reduced only if cousins and more distant relatives have the disease. [13] While multifactorially-inherited diseases tend to run in families, inheritance will not follow the same pattern as a simple monohybrid or dihybrid cross. [10]
If a genetic cause is suspected and little else is known about the illness, then it remains to be seen exactly how many genes are involved in the phenotypic expression of the disease. Once that is determined, the question must be answered: if two people have the required genes, why are there differences in expression between them? Generally, what makes the two individuals different are likely to be environmental factors. Due to the involved nature of genetic investigations needed to determine such inheritance patterns, this is not usually the first avenue of investigation one would choose to determine etiology.[ citation needed ]
For organisms whose genomes are known, one might now try to exclude genes in the identified region whose function is known with some certainty not to be connected with the trait in question. If the genome is not available, it may be an option to sequence the identified region and determine the putative functions of genes by their similarity to genes with known function, usually in other genomes. This can be done using BLAST, an online tool that allows users to enter a primary sequence and search for similar sequences within the BLAST database of genes from various organisms. It is often not the actual gene underlying the phenotypic trait, but rather a region of DNA that is closely linked with the gene [14]
Another interest of statistical geneticists using QTL mapping is to determine the complexity of the genetic architecture underlying a phenotypic trait. For example, they may be interested in knowing whether a phenotype is shaped by many independent loci, or by a few loci, and do those loci interact. This can provide information on how the phenotype may be evolving. [15]
In a recent development, classical QTL analyses were combined with gene expression profiling i.e. by DNA microarrays. Such expression QTLs (eQTLs) describe cis- and trans-controlling elements for the expression of often disease-associated genes. [16] Observed epistatic effects have been found beneficial to identify the gene responsible by a cross-validation of genes within the interacting loci with metabolic pathway- and scientific literature databases.[ citation needed ]
The simplest method for QTL mapping is analysis of variance (ANOVA, sometimes called "marker regression") at the marker loci. In this method, in a backcross, one may calculate a t-statistic to compare the averages of the two marker genotype groups. For other types of crosses (such as the intercross), where there are more than two possible genotypes, one uses a more general form of ANOVA, which provides a so-called F-statistic. The ANOVA approach for QTL mapping has three important weaknesses. First, we do not receive separate estimates of QTL location and QTL effect. QTL location is indicated only by looking at which markers give the greatest differences between genotype group averages, and the apparent QTL effect at a marker will be smaller than the true QTL effect as a result of recombination between the marker and the QTL. Second, we must discard individuals whose genotypes are missing at the marker. Third, when the markers are widely spaced, the QTL may be quite far from all markers, and so the power for QTL detection will decrease.[ citation needed ]
Lander and Botstein developed interval mapping, which overcomes the three disadvantages of analysis of variance at marker loci. [17] Interval mapping is currently the most popular approach for QTL mapping in experimental crosses. The method makes use of a genetic map of the typed markers, and, like analysis of variance, assumes the presence of a single QTL. In interval mapping, each locus is considered one at a time and the logarithm of the odds ratio (LOD score) is calculated for the model that the given locus is a true QTL. The odds ratio is related to the Pearson correlation coefficient between the phenotype and the marker genotype for each individual in the experimental cross. [18]
The term 'interval mapping' is used for estimating the position of a QTL within two markers (often indicated as 'marker-bracket'). Interval mapping is originally based on the maximum likelihood but there are also very good approximations possible with simple regression.[ citation needed ]
The principle for QTL mapping is: 1) The likelihood can be calculated for a given set of parameters (particularly QTL effect and QTL position) given the observed data on phenotypes and marker genotypes. 2) The estimates for the parameters are those where the likelihood is highest. 3) A significance threshold can be established by permutation testing. [19]
Conventional methods for the detection of quantitative trait loci (QTLs) are based on a comparison of single QTL models with a model assuming no QTL. For instance in the "interval mapping" method [20] the likelihood for a single putative QTL is assessed at each location on the genome. However, QTLs located elsewhere on the genome can have an interfering effect. As a consequence, the power of detection may be compromised, and the estimates of locations and effects of QTLs may be biased (Lander and Botstein 1989; Knapp 1991). Even nonexisting so-called "ghost" QTLs may appear (Haley and Knott 1992; Martinez and Curnow 1992). Therefore, multiple QTLs could be mapped more efficiently and more accurately by using multiple QTL models. [21] One popular approach to handle QTL mapping where multiple QTL contribute to a trait is to iteratively scan the genome and add known QTL to the regression model as QTLs are identified. This method, termed composite interval mapping determine both the location and effects size of QTL more accurately than single-QTL approaches, especially in small mapping populations where the effect of correlation between genotypes in the mapping population may be problematic.[ citation needed ]
In this method, one performs interval mapping using a subset of marker loci as covariates. These markers serve as proxies for other QTLs to increase the resolution of interval mapping, by accounting for linked QTLs and reducing the residual variation. The key problem with CIM concerns the choice of suitable marker loci to serve as covariates; once these have been chosen, CIM turns the model selection problem into a single-dimensional scan. The choice of marker covariates has not been solved, however. Not surprisingly, the appropriate markers are those closest to the true QTLs, and so if one could find these, the QTL mapping problem would be complete anyway.
Inclusive composite interval mapping (ICIM) has also been proposed as a potential method for QTL mapping. [22]
Family-based QTL mapping, or Family-pedigree based mapping (Linkage and association mapping), involves multiple families instead of a single family. Family-based QTL mapping has been the only way for mapping of genes where experimental crosses are difficult to make. However, due to some advantages, now plant geneticists are attempting to incorporate some of the methods pioneered in human genetics. [23] Using family-pedigree based approach has been discussed (Bink et al. 2008). Family-based linkage and association has been successfully implemented (Rosyara et al. 2009) [24]
An allele, or allelomorph, is a variant of the sequence of nucleotides at a particular location, or locus, on a DNA molecule.
In genetics, dominance is the phenomenon of one variant (allele) of a gene on a chromosome masking or overriding the effect of a different variant of the same gene on the other copy of the chromosome. The first variant is termed dominant and the second is called recessive. This state of having two different variants of the same gene on each chromosome is originally caused by a mutation in one of the genes, either new or inherited. The terms autosomal dominant or autosomal recessive are used to describe gene variants on non-sex chromosomes (autosomes) and their associated traits, while those on sex chromosomes (allosomes) are termed X-linked dominant, X-linked recessive or Y-linked; these have an inheritance and presentation pattern that depends on the sex of both the parent and the child. Since there is only one copy of the Y chromosome, Y-linked traits cannot be dominant or recessive. Additionally, there are other forms of dominance, such as incomplete dominance, in which a gene variant has a partial effect compared to when it is present on both chromosomes, and co-dominance, in which different variants on each chromosome both show their associated traits.
Non-Mendelian inheritance is any pattern in which traits do not segregate in accordance with Mendel's laws. These laws describe the inheritance of traits linked to single genes on chromosomes in the nucleus. In Mendelian inheritance, each parent contributes one of two possible alleles for a trait. If the genotypes of both parents in a genetic cross are known, Mendel's laws can be used to determine the distribution of phenotypes expected for the population of offspring. There are several situations in which the proportions of phenotypes observed in the progeny do not match the predicted values.
Genetic architecture is the underlying genetic basis of a phenotypic trait and its variational properties. Phenotypic variation for quantitative traits is, at the most basic level, the result of the segregation of alleles at quantitative trait loci (QTL). Environmental factors and other external influences can also play a role in phenotypic variation. Genetic architecture is a broad term that can be described for any given individual based on information regarding gene and allele number, the distribution of allelic and mutational effects, and patterns of pleiotropy, dominance, and epistasis.
A polygene is a member of a group of non-epistatic genes that interact additively to influence a phenotypic trait, thus contributing to multiple-gene inheritance, a type of non-Mendelian inheritance, as opposed to single-gene inheritance, which is the core notion of Mendelian inheritance. The term "monozygous" is usually used to refer to a hypothetical gene as it is often difficult to distinguish the effect of an individual gene from the effects of other genes and the environment on a particular phenotype. Advances in statistical methodology and high throughput sequencing are, however, allowing researchers to locate candidate genes for the trait. In the case that such a gene is identified, it is referred to as a quantitative trait locus (QTL). These genes are generally pleiotropic as well. The genes that contribute to type 2 diabetes are thought to be mostly polygenes. In July 2016, scientists reported identifying a set of 355 genes from the last universal common ancestor (LUCA) of all organisms living on Earth.
Genetic association is when one or more genotypes within a population co-occur with a phenotypic trait more often than would be expected by chance occurrence.
Marker assisted selection or marker aided selection (MAS) is an indirect selection process where a trait of interest is selected based on a marker linked to a trait of interest, rather than on the trait itself. This process has been extensively researched and proposed for plant- and animal- breeding.
In genetics, association mapping, also known as "linkage disequilibrium mapping", is a method of mapping quantitative trait loci (QTLs) that takes advantage of historic linkage disequilibrium to link phenotypes to genotypes, uncovering genetic associations.
Nested association mapping (NAM) is a technique designed by the labs of Edward Buckler, James Holland, and Michael McMullen for identifying and dissecting the genetic architecture of complex traits in corn. It is important to note that nested association mapping is a specific technique that cannot be performed outside of a specifically designed population such as the Maize NAM population, the details of which are described below.
In statistical genetics, inclusive composite interval mapping (ICIM) has been proposed as an approach to QTL mapping for populations derived from bi-parental crosses. QTL mapping is based on genetic linkage map and phenotypic data to attempt to locate individual genetic factors on chromosomes and to estimate their genetic effects.
GeneNetwork is a combined database and open-source bioinformatics data analysis software resource for systems genetics. This resource is used to study gene regulatory networks that link DNA sequence differences to corresponding differences in gene and protein expression and to variation in traits such as health and disease risk. Data sets in GeneNetwork are typically made up of large collections of genotypes and phenotypes from groups of individuals, including humans, strains of mice and rats, and organisms as diverse as Drosophila melanogaster, Arabidopsis thaliana, and barley. The inclusion of genotypes makes it practical to carry out web-based gene mapping to discover those regions of genomes that contribute to differences among individuals in mRNA, protein, and metabolite levels, as well as differences in cell function, anatomy, physiology, and behavior.
A recombinant inbred strain or recombinant inbred line (RIL) is an organism with chromosomes that incorporate an essentially permanent set of recombination events between chromosomes inherited from two or more inbred strains. F1 and F2 generations are produced by intercrossing the inbred strains; pairs of the F2 progeny are then mated to establish inbred strains through long-term inbreeding.
Quantitative trait loci mapping or QTL mapping is the process of identifying genomic regions that potentially contain genes responsible for important economic, health or environmental characters. Mapping QTLs is an important activity that plant breeders and geneticists routinely use to associate potential causal genes with phenotypes of interest. Family-based QTL mapping is a variant of QTL mapping where multiple-families are used.
Molecular breeding is the application of molecular biology tools, often in plant breeding and animal breeding. In the broad sense, molecular breeding can be defined as the use of genetic manipulation performed at the level of DNA to improve traits of interest in plants and animals, and it may also include genetic engineering or gene manipulation, molecular marker-assisted selection, and genomic selection. More often, however, molecular breeding implies molecular marker-assisted breeding (MAB) and is defined as the application of molecular biotechnologies, specifically molecular markers, in combination with linkage maps and genomics, to alter and improve plant or animal traits on the basis of genotypic assays.
Oligogenic inheritance describes a trait that is influenced by a few genes. Oligogenic inheritance represents an intermediate between monogenic inheritance in which a trait is determined by a single causative gene, and polygenic inheritance, in which a trait is influenced by many genes and often environmental factors.
Mendelian traits behave according to the model of monogenic or simple gene inheritance in which one gene corresponds to one trait. Discrete traits with simple Mendelian inheritance patterns are relatively rare in nature, and many of the clearest examples in humans cause disorders. Discrete traits found in humans are common examples for teaching genetics.
A human disease modifier gene is a modifier gene that alters expression of a human gene at another locus that in turn causes a genetic disease. Whereas medical genetics has tended to distinguish between monogenic traits, governed by simple, Mendelian inheritance, and quantitative traits, with cumulative, multifactorial causes, increasing evidence suggests that human diseases exist on a continuous spectrum between the two.
This glossary of genetics and evolutionary biology is a list of definitions of terms and concepts used in the study of genetics and evolutionary biology, as well as sub-disciplines and related fields, with an emphasis on classical genetics, quantitative genetics, population biology, phylogenetics, speciation, and systematics. It has been designed as a companion to Glossary of cellular and molecular biology, which contains many overlapping and related terms; other related glossaries include Glossary of biology and Glossary of ecology.
Complex traits are phenotypes that are controlled by two or more genes and do not follow Mendel's Law of Dominance. They may have a range of expression which is typically continuous. Both environmental and genetic factors often impact the variation in expression. Human height is a continuous trait meaning that there is a wide range of heights. There are an estimated 50 genes that affect the height of a human. Environmental factors, like nutrition, also play a role in a human's height. Other examples of complex traits include: crop yield, plant color, and many diseases including diabetes and Parkinson's disease. One major goal of genetic research today is to better understand the molecular mechanisms through which genetic variants act to influence complex traits. Complex traits are also known as polygenic traits and multigenic traits.
Multifactorial diseases are not confined to any specific pattern of single gene inheritance and are likely to be caused when multiple genes come together along with the effects of environmental factors.
{{cite journal}}
: |last13=
has generic name (help) S2CID 195367115.{{cite web}}
: CS1 maint: archived copy as title (link)Euphytica 2008, 161:85–96.