Inclusive composite interval mapping

In statistical genetics, inclusive composite interval mapping (ICIM) has been proposed as an approach to QTL (quantitative trait locus) mapping for populations derived from bi-parental crosses. QTL mapping uses a genetic linkage map together with phenotypic data to locate individual genetic factors on chromosomes and to estimate their genetic effects.

Additive and dominance QTL mapping

ICIM rests on two genetic assumptions: (1) the genotypic value of an individual is the sum of the effects of all genes affecting the trait of interest; and (2) linked QTL are separated by at least one blank marker interval. Under these assumptions, it has been shown that the additive effect of a QTL located in a marker interval is completely absorbed by the regression coefficients of the two flanking markers, whereas the dominance effect of the QTL gives rise to marker dominance effects as well as additive by additive and dominance by dominance interactions between the two flanking markers. Including the two multiplication variables between the flanking markers therefore allows both the additive and the dominance effects of a QTL to be completely absorbed. As a consequence, an inclusive linear model that regresses the phenotype on all genetic markers (and marker multiplications) can be used to fit the positions and the additive (and dominance) effects of all QTL in the genome. [1] [2] [3]

ICIM adopts a two-step strategy for additive and dominance QTL mapping. In the first step, stepwise regression identifies the most significant marker variables in the linear model. In the second step, a one-dimensional scan (interval mapping) detects QTL and estimates their additive and dominance effects, using the phenotypic values adjusted by the regression model from the first step.
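The two-step strategy can be illustrated with a minimal Python sketch for the additive case. It is not the published implementation: the function names, the forward-only selection rule, the entry threshold `lod_in`, the -1/+1 marker coding and the simulated data are all assumptions made for illustration. In particular, the second step here scores each marker interval by regressing the adjusted phenotype on its two flanking markers, which is a simplification of interval mapping at positions inside the interval.

```python
import numpy as np

def lod(rss_null, rss_alt, n):
    """LOD score comparing two nested normal linear models on n individuals."""
    return 0.5 * n * np.log10(rss_null / rss_alt)

def rss(X, y):
    """Least-squares fit of y on X plus an intercept; returns (RSS, coefficients)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid), beta

def forward_select(X, y, lod_in=3.0):
    """Step 1 (simplified): add markers one at a time while the best gain exceeds lod_in."""
    n, m = X.shape
    selected = []
    while True:
        base, _ = rss(X[:, np.array(selected, dtype=int)], y)
        best, best_lod = None, lod_in
        for j in range(m):
            if j in selected:
                continue
            r_j, _ = rss(X[:, np.array(selected + [j], dtype=int)], y)
            gain = lod(base, r_j, n)
            if gain > best_lod:
                best, best_lod = j, gain
        if best is None:
            return selected
        selected.append(best)

def icim_like_scan(X, y, lod_in=3.0):
    """Step 2 (simplified): score every marker interval on the adjusted phenotype."""
    n, m = X.shape
    selected = forward_select(X, y, lod_in)
    _, beta = rss(X[:, np.array(selected, dtype=int)], y)
    coef = dict(zip(selected, beta[1:]))  # marker index -> regression coefficient
    scores = np.zeros(m - 1)
    for k in range(m - 1):
        # Remove the effects of selected markers other than the two flanking ones.
        y_adj = y - sum(b * X[:, j] for j, b in coef.items() if j not in (k, k + 1))
        null, _ = rss(X[:, :0], y_adj)          # intercept only
        alt, _ = rss(X[:, [k, k + 1]], y_adj)   # two flanking markers
        scores[k] = lod(null, alt, n)
    return selected, scores

# Hypothetical doubled-haploid-like data: one chromosome, 20 markers,
# recombination fraction 0.1 between neighbours, one QTL at marker 7.
rng = np.random.default_rng(0)
n, m, r = 300, 20, 0.1
X = np.empty((n, m))
X[:, 0] = rng.choice([-1.0, 1.0], size=n)
for j in range(1, m):
    flip = np.where(rng.random(n) < r, -1.0, 1.0)
    X[:, j] = X[:, j - 1] * flip
y = 1.0 * X[:, 7] + rng.normal(size=n)

selected, scores = icim_like_scan(X, y)
print("selected markers:", selected)
print("best interval starts at marker", int(np.argmax(scores)))
```

Because the adjusted phenotype keeps only the contribution of the two markers flanking the interval under test, intervals far from the simulated QTL should score close to zero, while the intervals adjacent to marker 7 should stand out.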

Genetic and statistical properties in additive QTL mapping

The asymptotic properties of ICIM in additive QTL mapping have also been studied through computer simulations. The LOD score, which serves as the test statistic, increases linearly with population size, and the larger the QTL effect, the faster the corresponding LOD score increases. When the population size is greater than 200, the position estimate of ICIM is unbiased for QTL explaining more than 5% of the phenotypic variance. In smaller populations, QTL tend to be located towards the center of the chromosome. Likewise, when the population size is greater than 200, the effect estimate of ICIM is unbiased for QTL explaining more than 5% of the phenotypic variance. In smaller populations, the QTL effect was always overestimated.
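The linear growth of the LOD score with population size can be made concrete with a small sketch. It assumes a single-QTL normal regression model in which the ratio of residual sums of squares is close to one minus the proportion of phenotypic variance explained, so that the LOD score is roughly -(n/2)·log10(1 - PVE). The function name `expected_lod` is hypothetical, and the formula is a large-sample approximation rather than the simulation procedure used in the ICIM studies.

```python
import math

def expected_lod(n, pve):
    """Approximate LOD for one QTL explaining a fraction `pve` of the
    phenotypic variance in a population of n individuals."""
    if not 0.0 <= pve < 1.0:
        raise ValueError("pve must lie in [0, 1)")
    return -0.5 * n * math.log10(1.0 - pve)

for n in (100, 200, 400):
    print(n, round(expected_lod(n, 0.05), 3), round(expected_lod(n, 0.10), 3))
```

Under this approximation, doubling the population size doubles the LOD score, and a QTL with a larger effect has a steeper slope.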

Digenic epistasis mapping

Under the same assumptions as in additive and dominance QTL mapping, an additive by additive epistatic effect between two interacting QTL can be completely absorbed by the four marker interaction variables formed between the two pairs of flanking markers. [4] In other words, the coefficients of these four marker interactions carry the genetic information on additive by additive epistasis between the two marker intervals. As a consequence, a linear model that regresses the phenotype on both markers and marker multiplications can fit the positions and effects of all QTL and their digenic interactions. As in additive QTL mapping, ICIM uses a two-step strategy for additive by additive epistasis mapping. In the first step, stepwise regression identifies the most significant markers and marker interactions. In the second step, a two-dimensional scan detects additive by additive epistatic QTL and estimates their genetic effects, using the phenotypic values adjusted by the regression model from the first step.
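As an illustration of the four marker interaction variables, the following sketch builds them for two marker intervals from a genotype matrix. The function name `epistasis_variables`, the -1/+1 genotype coding and the convention that interval `j` is flanked by markers `j` and `j + 1` are assumptions made for this example, not part of the published method description.

```python
import numpy as np

def epistasis_variables(X, j, k):
    """Return the four products between the flanking markers of interval j
    (markers j, j+1) and interval k (markers k, k+1).

    Columns are ordered (j, k), (j, k+1), (j+1, k), (j+1, k+1).
    """
    X = np.asarray(X, dtype=float)
    m = X.shape[1]
    if not (0 <= j < m - 1 and 0 <= k < m - 1):
        raise IndexError("interval index out of range")
    a = X[:, [j, j + 1]]  # flanking markers of the first interval
    b = X[:, [k, k + 1]]  # flanking markers of the second interval
    return (a[:, :, None] * b[:, None, :]).reshape(len(X), 4)

# Two individuals, four markers coded -1/+1 (made-up genotypes).
G = np.array([[1, -1, 1, 1],
              [-1, -1, 1, -1]])
print(epistasis_variables(G, 0, 2))
```

In a regression of the phenotype on markers and these product columns, the four coefficients together would carry the additive by additive signal for the pair of intervals.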

Applications in real mapping populations

In a barley doubled haploid population, [5] for example, nine additive QTL affecting kernel weight were identified on five of the seven chromosomes, together explaining 81% of the phenotypic variance. In this population, additive effects explained most of the phenotypic variance, a proportion close to the estimated broad-sense heritability, which indicates that most of the genetic variance was caused by additive QTL.

ICIM has also been used successfully to map conserved salt tolerance QTL in wild and cultivated soybeans, [6] tiller angle QTL [7] and grain length QTL [8] in rice, and QTL for flour and noodle color components and yellow pigment content [9] and for adult-plant resistance to stripe rust [10] in wheat, among others. Some of the detected QTL have since been fine mapped.

Joint QTL mapping in multiple families or populations

Bi-parental populations are the most commonly used populations in QTL linkage mapping, but QTL that do not segregate between the two parents cannot be detected. To find most, if not all, genes controlling a trait of interest, multiple parents have to be used, and complex cross populations have been proposed in recent years for this purpose. These crosses allow the genetic basis of quantitative traits to be studied with greater power and in more relevant genetic backgrounds. ICIM has been extended to QTL mapping in the maize Nested Association Mapping (NAM) design [11] [12] proposed by the Buckler laboratory at Cornell University. The QTL detection efficiency of ICIM in this design was investigated through extensive simulations. In the actual maize NAM population, ICIM detected a total of 52 additive QTL affecting silk flowering time. Together these QTL explained 79% of the phenotypic variance in this population.

Software for QTL mapping

Software is available that implements ICIM for additive and epistasis mapping. Its functions include: (1) mapping methods such as single marker analysis, interval mapping, ICIM for additive and dominance effects, ICIM for digenic epistasis, and selective phenotyping; (2) QTL linkage analysis for more than twenty types of mapping populations derived from bi-parental crosses, including backcross, doubled haploid, and recombinant inbred line populations; (3) power analysis for simulated populations under user-defined genetic models; and (4) QTL mapping for non-idealized chromosome segment substitution lines. [13]

References

  1. Li, H., G. Ye and J. Wang (2007). "A Modified Algorithm for the Improvement of Composite Interval Mapping". Genetics. 175 (1): 361–374. doi:10.1534/genetics.106.066811. PMC 1775001. PMID 17110476.
  2. Wang J. (2009). "Inclusive composite interval mapping of quantitative trait genes". Acta Agron. Sin. 35: 239–245.
  3. Zhang, L., H. Li, Z. Li, and J. Wang (2008). "Interactions Between Markers Can Be Caused by the Dominance Effect of Quantitative Trait Loci". Genetics. 180 (2): 1177–1190. doi:10.1534/genetics.108.092122. PMC 2567366. PMID 18780741.
  4. Li, H., Z. Li and J. Wang (2008). "Inclusive composite interval mapping (ICIM) for digenic epistasis of quantitative traits in biparental populations". Theor. Appl. Genet. 116 (2): 243–260. doi:10.1007/s00122-007-0663-5. PMID 17985112.
  5. Tinker, N. A., D. E. Mather, B. G. Rossnagel, K. J. Kasha, A. Kleinhofs, P. M. Hayes, D. E. Falk, T. Ferguson, L. P. Shugar, W. G. Legge, R. B. Irvine, T. M. Choo, K. G. Briggs, S. E. Ullrich, J. D. Franckowiak, T. K. Blake, R. J. Graf, S. M. Dofing, M. A. Saghai Maroof, G. J. Scoles, D. Hoffman, L. S. Dahleen, A. Kilian, F. Chen, R. M. Biyashev, D. A. Kudrna, and B. J. Steffenson (1996). "Regions of the genome that affect agronomic performance in two-row barley" (PDF). Crop Science. 36 (4): 1053–1062. doi:10.2135/cropsci1996.0011183X003600040040x. Archived from the original (PDF) on 2011-07-03. Retrieved 2010-04-07.
  6. Hamwieh, A.; D. Xu (2008). "Conserved salt tolerance quantitative trait locus (QTL) in wild and cultivated soybeans". Breeding Science. 58 (4): 355–359. doi:10.1270/jsbbs.58.355.
  7. Chen, P., L. Jiang, C. Yu, W. Zhang, J. Wang, and J. Wan (2008). "The identification and mapping of a tiller angle QTL on rice chromosome 9". Crop Science. 48 (5): 1799–1806. doi:10.2135/cropsci2007.12.0702. Archived from the original on 2009-02-08. Retrieved 2010-04-26.
  8. Wan, X., J. Wan, L. Jiang, J. Wang, H. Zhai, J. Weng, H. Wang, C. Lei, J. Wang, X. Zhang, Z. Cheng, X. Guo (2006). "QTL analysis for rice grain length and fine mapping of an identified QTL with stable and major effects". Theoretical and Applied Genetics. 112 (7): 1258–1270. doi:10.1007/s00122-006-0227-0.
  9. Zhang, Y., Y. Wu, Y. Xiao, Z. He, Y. Zhang, J. Yan, Y. Zhang, X. Xia, and C. Ma (2009). "QTL mapping for flour and noodle colour components and yellow pigment content in common wheat". Euphytica. 165 (3): 435–444. doi:10.1007/s10681-008-9744-z.
  10. Lu, Y., C. Lan, S. Liang, X. Zhou, D. Liu, G. Zhou, Q. Lu, J. Jing, M. Wang, X. Xia, and Z. He (2009). "QTL mapping for adult-plant resistance to stripe rust in Italian common wheat cultivars Libellula and Strampelli". Theoretical and Applied Genetics. 119 (8): 1349–1359. doi:10.1007/s00122-009-1139-6. PMID 19756474.
  11. Michael D. McMullen; Stephen Kresovich; Hector Sanchez Villeda; Peter Bradbury; Huihui Li; Qi Sun; Sherry Flint-Garcia; Jeffry Thornsberry; Charlotte Acharya; Christopher Bottoms; Patrick Brown; Chris Browne; Magen Eller; Kate Guill; Carlos Harjes; Dallas Kroon; Nick Lepak; Sharon E. Mitchell; Brooke Peterson; Gael Pressoir; Susan Romero; Marco Oropeza Rosas; Stella Salvo; Heather Yates; Mark Hanson; Elizabeth Jones; Stephen Smith; Jeffrey C. Glaubitz; Major Goodman; Doreen Ware; James B. Holland; Edward S. Buckler (2009). "Genetic Properties of the Maize Nested Association Mapping Population". Science. 325 (5941): 737–740. doi:10.1126/science.1174320. PMID 19661427.
  12. Edward S. Buckler; James B. Holland; Peter J. Bradbury; Charlotte B. Acharya; Patrick J. Brown; Chris Browne; Elhan Ersoz; Sherry Flint-Garcia; Arturo Garcia; Jeffrey C. Glaubitz; Major M. Goodman; Carlos Harjes; Kate Guill; Dallas E. Kroon; Sara Larsson; Nicholas K. Lepak; Huihui Li; Sharon E. Mitchell; Gael Pressoir; Jason A. Peiffer; Marco Oropeza Rosas; Torbert R. Rocheford; M. Cinta Romay; Susan Romero; Stella Salvo; Hector Sanchez Villeda; H. Sofia da Silva; Qi Sun; Feng Tian; Narasimham Upadyayula; Doreen Ware; Heather Yates; Jianming Yu; Zhiwu Zhang; Stephen Kresovich; Michael D. McMullen (2009). "The Genetic Architecture of Maize Flowering Time". Science. 325 (5941): 714–718. doi:10.1126/science.1174276. PMID 19661422.
  13. Wang J; X. Wan; J. Crossa; J. Crouch; J. Weng; H. Zhai; J. Wan (2006). "QTL mapping of grain length in rice (Oryza sativa L.) using chromosome segment substitution lines". Genet. Res. 88 (2): 93–104. doi:10.1017/S0016672306008408. PMID 17125584.