In statistical genetics, inclusive composite interval mapping (ICIM) has been proposed as an approach to QTL (quantitative trait locus) mapping for populations derived from bi-parental crosses. QTL mapping is based on a genetic linkage map and phenotypic data; it attempts to locate individual genetic factors on chromosomes and to estimate their genetic effects.
Two genetic assumptions are used in ICIM: (1) the genotypic value of an individual is the sum of the effects of all genes affecting the trait of interest; and (2) linked QTL are separated by at least one blank marker interval. Under these two assumptions, it was proved that the additive effect of a QTL located in a marker interval can be completely absorbed by the regression coefficients of the two flanking markers, while the QTL dominance effect causes marker dominance effects, as well as additive by additive and dominance by dominance interactions between the two flanking markers. By including two multiplication variables between flanking markers, the additive and dominance effects of one QTL can be completely absorbed. As a consequence, an inclusive linear model of phenotype regressed on all genetic markers (and marker multiplications) can be used to fit the positions and the additive (and dominance) effects of all QTL in the genome. [1] [2] [3] A two-step strategy was adopted in ICIM for additive and dominance QTL mapping. In the first step, stepwise regression is applied to identify the most significant marker variables in the linear model. In the second step, one-dimensional scanning (interval mapping) is conducted to detect QTL and estimate their additive and dominance effects, based on the phenotypic values adjusted by the regression model from the first step.
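The two-step strategy can be illustrated with a minimal Python sketch. It is not the published ICIM implementation: it assumes a simulated doubled haploid population with markers coded -1/+1, additive effects only, a forward-selection rule with a hypothetical F-to-enter threshold (`f_enter`), and a scan that regresses the adjusted phenotype on the two flanking markers of each interval instead of performing maximum-likelihood interval mapping at positions inside the interval. All function names are illustrative.

```python
import numpy as np

def simulate_dh(n, m, r, qtl_marker, effect, rng):
    """Simulate n doubled haploid lines with m markers (coded -1/+1) on one chromosome."""
    geno = np.empty((n, m))
    geno[:, 0] = rng.choice([-1.0, 1.0], size=n)
    for j in range(1, m):
        recomb = rng.random(n) < r  # recombination between adjacent markers
        geno[:, j] = np.where(recomb, -geno[:, j - 1], geno[:, j - 1])
    # Assumption: a single additive QTL sitting exactly on one marker.
    pheno = effect * geno[:, qtl_marker] + rng.normal(size=n)
    return geno, pheno

def fit(geno, cols, y):
    """Least-squares fit of y on an intercept plus the marker columns in cols."""
    idx = np.asarray(cols, dtype=int)
    design = np.column_stack([np.ones(len(y)), geno[:, idx]])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid), beta

def forward_stepwise(geno, pheno, f_enter=10.0):
    """Step 1: add markers one at a time while the partial F statistic exceeds f_enter."""
    n, m = geno.shape
    selected = []
    rss_cur, _ = fit(geno, selected, pheno)
    while len(selected) < m:
        candidates = [j for j in range(m) if j not in selected]
        rss_by_marker = {j: fit(geno, selected + [j], pheno)[0] for j in candidates}
        best = min(rss_by_marker, key=rss_by_marker.get)
        rss_new = rss_by_marker[best]
        df_resid = n - len(selected) - 2
        f_stat = (rss_cur - rss_new) / (rss_new / df_resid)
        if f_stat < f_enter:
            break
        selected.append(best)
        rss_cur = rss_new
    return selected

def scan_intervals(geno, pheno, selected):
    """Step 2: for each interval, remove the effects of selected markers that do
    not flank it, then compute a LOD score for the two flanking markers."""
    n, m = geno.shape
    _, beta = fit(geno, selected, pheno)
    coef = dict(zip(selected, beta[1:]))
    lod = np.zeros(m - 1)
    for k in range(m - 1):
        background = [j for j in selected if j not in (k, k + 1)]
        effects = np.array([coef[j] for j in background])
        adjusted = pheno - geno[:, np.asarray(background, dtype=int)] @ effects
        rss0, _ = fit(geno, [], adjusted)
        rss1, _ = fit(geno, [k, k + 1], adjusted)
        lod[k] = 0.5 * n * np.log10(rss0 / rss1)
    return lod

rng = np.random.default_rng(0)
geno, pheno = simulate_dh(n=300, m=20, r=0.05, qtl_marker=10, effect=1.0, rng=rng)
selected = forward_stepwise(geno, pheno)
lod = scan_intervals(geno, pheno, selected)
print("selected markers:", selected)
print("interval with highest LOD:", int(np.argmax(lod)))
```

In this sketch the LOD profile should peak at one of the two intervals flanking the simulated QTL marker, because the effect of that marker is retained only when the scanned interval touches it and is removed from the adjusted phenotype otherwise.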
The asymptotic properties of ICIM in additive QTL mapping were also studied through computer simulations. The test statistic, the LOD score, increases linearly with population size. The larger the QTL effect, the faster the corresponding LOD score increases. When the population size is greater than 200, the position estimate of ICIM for QTL explaining more than 5% of the phenotypic variance is unbiased. For smaller population sizes, QTL tend to be located towards the center of the chromosome. When the population size is greater than 200, the effect estimate of ICIM for QTL explaining more than 5% of the phenotypic variance is also unbiased. For smaller population sizes, the QTL effect was always overestimated.
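The linear growth of the LOD score can be made plausible with a short sketch. It assumes a single-marker regression with normally distributed errors, for which the LOD score equals (n/2) log10(RSS0/RSS1), and it is an approximation rather than the simulation study described above. The function names `lod_score` and `expected_lod` are illustrative.

```python
import numpy as np

def lod_score(x, y):
    """LOD of the regression y ~ x against the intercept-only model,
    assuming normal errors: LOD = (n / 2) * log10(RSS0 / RSS1)."""
    n = len(y)
    rss0 = np.sum((y - y.mean()) ** 2)
    slope, intercept = np.polyfit(x, y, 1)
    rss1 = np.sum((y - (intercept + slope * x)) ** 2)
    return 0.5 * n * np.log10(rss0 / rss1)

def expected_lod(n, pve):
    """Approximate LOD when the QTL explains a proportion pve of the
    phenotypic variance, using RSS1 / RSS0 ~ 1 - pve."""
    return -0.5 * n * np.log10(1.0 - pve)

# Doubling the population size doubles the approximate LOD score,
# and a larger effect gives a steeper increase with n.
for n in (100, 200, 400):
    print(n, round(expected_lod(n, 0.05), 2), round(expected_lod(n, 0.10), 2))
```

Under this approximation the slope of the LOD score with respect to n is -0.5 log10(1 - pve), which is why QTL with larger effects reach a given LOD threshold at smaller population sizes.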
Under the same assumptions as in additive and dominance QTL mapping with ICIM, an additive by additive epistatic effect between two interacting QTL can be completely absorbed by the four marker interaction variables between the two pairs of flanking markers [5]. That is to say, the coefficients of the four marker interactions of two pairs of flanking markers contain the genetic information of the additive by additive epistasis between the two marker intervals. [4] As a consequence, a linear model of phenotype regressed on both markers and marker multiplications can fit the positions and effects of all QTL and their digenic interactions. As in the additive QTL mapping of ICIM, a two-step strategy was adopted in additive by additive epistasis mapping. In the first step, stepwise regression is applied to identify the most significant markers and marker interactions. In the second step, two-dimensional scanning is conducted to detect additive by additive QTL and estimate their genetic effects, based on the phenotypic values adjusted by the regression model from the first step.
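The linear model described above can be written out as follows. This is a sketch under assumed notation, none of which appears in the text itself: $y_i$ is the phenotype of individual $i$, $x_{ij}$ is its genotype indicator at marker $j$ of $m$ markers, $b_j$ and $b_{jk}$ are regression coefficients, and $e_i$ is the residual. The second expression shows one plausible form of the adjusted phenotype used when scanning the pair of intervals $(j, j+1)$ and $(k, k+1)$: every estimated term is subtracted except those involving the four flanking markers and their four pairwise interactions.

```latex
% Inclusive model with marker and marker-by-marker terms (assumed notation)
y_i = b_0 + \sum_{j=1}^{m} b_j x_{ij}
          + \sum_{j<k} b_{jk}\, x_{ij} x_{ik} + e_i

% Adjusted phenotype for the two-dimensional scan at intervals (j, j+1) and (k, k+1)
\Delta y_i = y_i
  - \sum_{r \notin \{j,\, j+1,\, k,\, k+1\}} \hat{b}_r x_{ir}
  - \sum_{\substack{r<s \\ (r,s) \notin \{j,\, j+1\} \times \{k,\, k+1\}}}
      \hat{b}_{rs}\, x_{ir} x_{is}
```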
In a barley doubled haploid population, [5] for example, nine additive QTL affecting kernel weight were identified on five of the seven chromosomes, together explaining 81% of the phenotypic variance. In this population, additive effects explained most of the phenotypic variance, approximating the estimated broad-sense heritability, which indicates that most of the genetic variance was caused by additive QTL.
ICIM has also been used successfully to map conserved salt tolerance QTL in wild and cultivated soybeans, [6] tiller angle QTL [7] and grain length QTL [8] in rice, and, in wheat, QTL for flour and noodle color components and yellow pigment content [9] and for adult-plant resistance to stripe rust. [10] Some of these detected QTL have been fine mapped.
Bi-parental populations are mostly used in QTL linkage mapping. QTL that do not segregate between the two parents cannot be detected. To find most, if not all, genes controlling a trait of interest, multiple parents have to be used. Complex cross populations have been proposed in recent years for this purpose. These crosses allow a more powerful dissection of the genetic basis of quantitative traits in more relevant genetic backgrounds. ICIM was extended to map QTL in the maize Nested Association Mapping (NAM) design [11] [12] recently proposed by the Buckler laboratory at Cornell University. The QTL detection efficiency of ICIM in this design was investigated through extensive simulations. In the actual maize NAM population, ICIM detected a total of 52 additive QTL affecting silk flowering time. These QTL explained 79% of the phenotypic variance in this population.
There is software that implements ICIM additive and epistasis mapping. Its functions are: (1) implementation of mapping methods including single marker analysis, interval mapping, ICIM for additive and dominance effects, ICIM for digenic epistasis, selective phenotyping, etc.; (2) QTL linkage analysis for more than twenty mapping populations derived from bi-parental crosses, including backcross, doubled haploid, recombinant inbred lines, etc.; (3) power analysis for simulated populations under user-defined genetic models; and (4) QTL mapping for non-idealized chromosome segment substitution lines. [13]
Heritability is a statistic used in the fields of breeding and genetics that estimates the degree of variation in a phenotypic trait in a population that is due to genetic variation between individuals in that population. It measures how much of the variation of a trait can be attributed to variation of genetic factors, as opposed to variation of environmental factors. The concept of heritability can be expressed in the form of the following question: "What is the proportion of the variation in a given trait within a population that is not explained by the environment or random chance?"
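The question can be stated as a ratio of variance components. The following uses the standard quantitative-genetics notation, which is an assumption here because the text above does not define symbols: $V_P$ is the phenotypic variance, $V_G$ the genetic variance, $V_E$ the environmental variance, and $V_A$ the additive part of $V_G$. The decomposition shown assumes no genotype-by-environment covariance or interaction.

```latex
% Variance decomposition (assuming no G x E covariance or interaction)
V_P = V_G + V_E

% Broad-sense and narrow-sense heritability
H^2 = \frac{V_G}{V_P}, \qquad h^2 = \frac{V_A}{V_P}
```

When detected additive QTL explain a share of $V_P$ close to $H^2$, as in the barley example earlier in the article, $V_A$ accounts for most of $V_G$.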
A quantitative trait locus (QTL) is a locus that correlates with variation of a quantitative trait in the phenotype of a population of organisms. QTLs are mapped by identifying which molecular markers correlate with an observed trait. This is often an early step in identifying and sequencing the actual genes that cause the trait variation.
Genetic architecture is the underlying genetic basis of a phenotypic trait and its variational properties. Phenotypic variation for quantitative traits is, at the most basic level, the result of the segregation of alleles at quantitative trait loci (QTL). Environmental factors and other external influences can also play a role in phenotypic variation. Genetic architecture is a broad term that can be described for any given individual based on information regarding gene and allele number, the distribution of allelic and mutational effects, and patterns of pleiotropy, dominance, and epistasis.
A "polygene" is a member of a group of non-epistatic genes that interact additively to influence a phenotypic trait, a pattern also called "multiple gene inheritance". The term "polygene" is usually used to refer to a hypothetical gene, as it is often difficult to distinguish the effect of an individual gene from the effects of other genes and the environment on a particular phenotype. Advances in statistical methodology and high throughput sequencing are, however, allowing researchers to locate candidate genes for the trait. In the case that such a gene is identified, it is referred to as a quantitative trait locus (QTL). These genes are generally pleiotropic as well. The genes that contribute to type 2 diabetes are thought to be mostly polygenes. In July 2016, scientists reported identifying a set of 355 genes from the last universal common ancestor (LUCA) of all organisms living on Earth.
Gene mapping describes the methods used to identify the locus of a gene and the distances between genes. Gene mapping can also describe the distances between different sites within a gene.
Genetic association is when one or more genotypes within a population co-occur with a phenotypic trait more often than would be expected by chance occurrence.
Marker assisted selection or marker aided selection (MAS) is an indirect selection process where a trait of interest is selected based on a marker linked to a trait of interest, rather than on the trait itself. This process has been extensively researched and proposed for plant and animal breeding.
A doubled haploid (DH) is a genotype formed when haploid cells undergo chromosome doubling. Artificial production of doubled haploids is important in plant breeding.
Expression quantitative trait loci (eQTLs) are genomic loci that explain variation in expression levels of mRNAs.
In genetics, association mapping, also known as "linkage disequilibrium mapping", is a method of mapping quantitative trait loci (QTLs) that takes advantage of historic linkage disequilibrium to link phenotypes to genotypes, uncovering genetic associations.
Nested association mapping (NAM) is a technique designed by the labs of Edward Buckler, James Holland, and Michael McMullen for identifying and dissecting the genetic architecture of complex traits in corn. Nested association mapping is a specific technique that cannot be performed outside of a specifically designed population such as the Maize NAM population.
Quantitative trait loci mapping or QTL mapping is the process of identifying genomic regions that potentially contain genes responsible for important economic, health or environmental characters. Mapping QTLs is an important activity that plant breeders and geneticists routinely use to associate potential causal genes with phenotypes of interest. Family-based QTL mapping is a variant of QTL mapping where multiple families are used.
Molecular breeding is the application of molecular biology tools, often in plant breeding and animal breeding. In a broad sense, molecular breeding can be defined as the use of genetic manipulation performed at the DNA level to improve traits of interest in plants and animals, and it may also include genetic engineering or gene manipulation, molecular marker-assisted selection, and genomic selection. More often, however, molecular breeding implies molecular marker-assisted breeding (MAB) and is defined as the application of molecular biotechnologies, specifically molecular markers, in combination with linkage maps and genomics, to alter and improve plant or animal traits on the basis of genotypic assays.
Linkage based QTL mapping is a variant of QTL mapping.
A sequence related amplified polymorphism (SRAP) is a molecular technique, developed by G. Li and C. F. Quiros in 2001, for detecting genetic variation in the open reading frames (ORFs) of genomes of plants and related organisms.
Genetic variance is a concept outlined by the English biologist and statistician Ronald Fisher in his fundamental theorem of natural selection. The theorem, set out in his 1930 book The Genetical Theory of Natural Selection, postulates that the rate of change of biological fitness can be calculated from the genetic variance of the fitness itself. Fisher tried to give a statistical formula for how the change of fitness in a population can be attributed to changes in allele frequency. Fisher made no restrictive assumptions in his formula concerning fitness parameters, mate choices or the number of alleles and loci involved.
Genome-wide complex trait analysis (GCTA), also known as genome-based restricted maximum likelihood (GREML), is a statistical method for variance component estimation in genetics which quantifies the total narrow-sense (additive) contribution to a trait's heritability of a particular subset of genetic variants. This is done by directly quantifying the chance genetic similarity of unrelated individuals and comparing it to their measured similarity on a trait; if two unrelated individuals are relatively similar genetically and also have similar trait measurements, then the measured genetics are likely to causally influence that trait, and the correlation can to some degree tell how much. This can be illustrated by plotting the squared pairwise trait differences between individuals against their estimated degree of relatedness. The GCTA framework can be applied in a variety of settings. For example, it can be used to examine changes in heritability over aging and development. It can also be extended to analyse bivariate genetic correlations between traits. There is an ongoing debate about whether GCTA generates reliable or stable estimates of heritability when used on current SNP data. The method is based on the outdated and false dichotomy of genes versus the environment. It also suffers from serious methodological weaknesses, such as susceptibility to population stratification.
A human disease modifier gene is a modifier gene that alters expression of a human gene at another locus that in turn causes a genetic disease. Whereas medical genetics has tended to distinguish between monogenic traits, governed by simple, Mendelian inheritance, and quantitative traits, with cumulative, multifactorial causes, increasing evidence suggests that human diseases exist on a continuous spectrum between the two.
Epistasis is a phenomenon in genetics in which the effect of a gene mutation is dependent on the presence or absence of mutations in one or more other genes, respectively termed modifier genes. In other words, the effect of the mutation is dependent on the genetic background in which it appears. Epistatic mutations therefore have different effects on their own than when they occur together. Originally, the term epistasis specifically meant that the effect of a gene variant is masked by that of a different gene.
Complex traits, also known as quantitative traits, are traits that do not behave according to simple Mendelian inheritance laws. More specifically, their inheritance cannot be explained by the genetic segregation of a single gene. Such traits show a continuous range of variation and are influenced by both environmental and genetic factors. Compared to strictly Mendelian traits, complex traits are far more common, and because they can be hugely polygenic, they are studied using statistical techniques such as QTL mapping rather than classical genetics methods. Examples of complex traits include height, circadian rhythms, enzyme kinetics, and many diseases including diabetes and Parkinson's disease. One major goal of genetic research today is to better understand the molecular mechanisms through which genetic variants act to influence complex traits.