Genetic map function

In genetics, mapping functions are used to model the relationship between the map distance between markers (measured in map units or centimorgans) and the recombination frequency between them. One use is that they allow genetic distances, which typically cannot be estimated directly, to be obtained from recombination fractions, which typically can. [1]

The simplest mapping function is the Morgan Mapping Function, eponymously devised by Thomas Hunt Morgan. Other well-known mapping functions include the Haldane Mapping Function introduced by J. B. S. Haldane in 1919, [2] and the Kosambi Mapping Function introduced by Damodar Dharmananda Kosambi in 1944. [3] [4] Few mapping functions are used in practice other than Haldane and Kosambi. [5] The main difference between them is in how crossover interference is incorporated. [6]

Morgan Mapping Function

Where d is the map distance, expressed in Morgans (1 Morgan = 100 map units), and r is the recombination frequency expressed as a fraction, the Morgan Mapping Function states that:

$$r = d$$

This assumes that at most one crossover occurs in an interval between two loci, and that the probability of this crossover is proportional to the map length of the interval.

The equation only holds when $0 \le d \le 0.5$ as, otherwise, recombination frequency would exceed 50%. Therefore, the function cannot approximate recombination frequencies beyond short distances. [4]

Haldane Mapping Function

Overview

Two properties of the Haldane Mapping Function are that it limits recombination frequency up to, but not beyond, 50%, and that it gives an approximately linear relationship between recombination frequency and map distance up to recombination frequencies of about 10%. [7] It also assumes that crossovers occur at random positions and independently of one another. This means that no crossover interference takes place; [5] the assumption allows Haldane to model the mapping function using a Poisson distribution. [4]
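The link between the Poisson assumption and the 50% ceiling can be checked numerically. The sketch below assumes one common reading of the model: the number of crossovers between two loci on a gamete is Poisson-distributed with mean d (in Morgans), and the gamete is recombinant when that number is odd. The function names and the truncation limit `max_k` are illustrative choices, not part of the source.

```python
import math

def prob_odd_crossovers(d, max_k=60):
    """P(odd number of crossovers) for a Poisson count with mean d (Morgans).

    The infinite sum is truncated at max_k terms (an illustrative cutoff,
    ample for the small d values used in linkage mapping).
    """
    total = 0.0
    term = math.exp(-d)            # Poisson probability of 0 crossovers
    for k in range(1, max_k + 1):
        term *= d / k              # Poisson probability of k crossovers
        if k % 2 == 1:             # only odd counts give a recombinant
            total += term
    return total

def haldane_closed_form(d):
    """Closed form of the same sum: r = (1 - exp(-2d)) / 2."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))

for d in (0.05, 0.5, 2.0):
    print(d, prob_odd_crossovers(d), haldane_closed_form(d))
```

The two columns agree to floating-point precision, and both stay below 0.5 however large d becomes.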

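Definitions

Here r is the recombination frequency expressed as a fraction, and d is the map distance in Morgans (1 Morgan = 100 map units), which under this model equals the mean number of crossovers per chromatid in the interval.

Formula

$$r = \frac{1}{2}\left(1 - e^{-2d}\right)$$

Inverse

$$d = -\frac{1}{2}\ln(1 - 2r)$$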
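As a minimal sketch, the Haldane function and its inverse can be written as follows, assuming d is given in Morgans and r as a fraction. The names `haldane_r` and `haldane_d` are hypothetical helpers, not taken from any particular library.

```python
import math

def haldane_r(d):
    """Recombination fraction from map distance d (Morgans), Haldane model."""
    if d < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * d))

def haldane_d(r):
    """Map distance (Morgans) from recombination fraction r, Haldane model."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * r)

# Convert an observed recombination fraction to centimorgans.
print(round(100 * haldane_d(0.20), 1), "cM")
```

An observed recombination fraction of 0.20 corresponds to roughly 25.5 cM under this model, more than the 20 cM a linear reading would give, because some double crossovers go undetected.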

Kosambi Mapping Function

Overview

The Kosambi mapping function was introduced to account for the effect of crossover interference on recombination frequency. It introduces a parameter C, representing the coefficient of coincidence, and sets it equal to 2r. For loci which are strongly linked, interference is strong; as linkage weakens, interference decreases towards zero. [5] Interference declines according to the linear function i = 1 - 2r. [8]
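The role of C = 2r can be made explicit with a short derivation. This is a sketch of one standard route, assuming d is measured in Morgans and that the recombination fractions of two adjacent intervals combine as $r_{12} = r_1 + r_2 - 2C r_1 r_2$; letting the second interval shrink to an infinitesimal length gives a differential equation for r.

```latex
\begin{align}
\frac{\mathrm{d}r}{\mathrm{d}d} &= 1 - 2Cr
  && \text{(adding an infinitesimal interval)} \\
\frac{\mathrm{d}r}{\mathrm{d}d} &= 1 - 4r^{2}
  && \text{(Kosambi: } C = 2r\text{)} \\
d &= \int_{0}^{r} \frac{\mathrm{d}\rho}{1 - 4\rho^{2}}
   = \frac{1}{2}\tanh^{-1}(2r)
   = \frac{1}{4}\ln\!\left(\frac{1 + 2r}{1 - 2r}\right)
\end{align}
```

Setting C = 1 (no interference) in the first line instead yields $r = \frac{1}{2}\left(1 - e^{-2d}\right)$, the Haldane function, so the two functions differ only in the assumed coincidence.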

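Formula

With d in Morgans and r expressed as a fraction:

$$r = \frac{1}{2}\tanh(2d) = \frac{1}{2}\,\frac{e^{4d} - 1}{e^{4d} + 1}$$

Inverse

$$d = \frac{1}{2}\tanh^{-1}(2r) = \frac{1}{4}\ln\left(\frac{1 + 2r}{1 - 2r}\right)$$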
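The Kosambi function and its inverse can be sketched in the same way, again assuming d in Morgans and r as a fraction. The names `kosambi_r` and `kosambi_d` are hypothetical helpers.

```python
import math

def kosambi_r(d):
    """Recombination fraction from map distance d (Morgans), Kosambi model."""
    if d < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(2.0 * d)

def kosambi_d(r):
    """Map distance (Morgans) from recombination fraction r, Kosambi model."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    return 0.25 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

# Convert an observed recombination fraction to centimorgans.
print(round(100 * kosambi_d(0.20), 1), "cM")
```

For a recombination fraction of 0.20 this gives roughly 21.2 cM, closer to the linear value than the Haldane estimate because interference makes double crossovers in short intervals less likely.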

Comparison and application

Below 10% recombination frequency, there is little mathematical difference between the mapping functions, and the relationship between map distance and recombination frequency is approximately linear (that is, 1 map unit = 1% recombination frequency). [8] When genome-wide SNP sampling and mapping data are available, the difference between the functions is negligible outside of regions of high recombination, such as recombination hotspots or ends of chromosomes. [6]
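The near-agreement at short distances can be illustrated by evaluating the three functions side by side. The sketch below assumes the standard closed forms (Morgan r = d, Haldane r = (1 - e^{-2d})/2, Kosambi r = tanh(2d)/2) with d in Morgans; the helper names and the chosen distances are illustrative.

```python
import math

def morgan_r(d):
    """Morgan model: r = d, meaningful only for 0 <= d <= 0.5."""
    return d

def haldane_r(d):
    """Haldane model: no crossover interference."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))

def kosambi_r(d):
    """Kosambi model: interference declining with distance."""
    return 0.5 * math.tanh(2.0 * d)

def compare(distances_cm):
    """Return (cM, Morgan r, Haldane r, Kosambi r) for each distance."""
    rows = []
    for cm in distances_cm:
        d = cm / 100.0             # centimorgans to Morgans
        rows.append((cm, morgan_r(d), haldane_r(d), kosambi_r(d)))
    return rows

for cm, m, h, k in compare([1, 5, 10, 20, 50]):
    print(f"{cm:>3} cM  Morgan {m:.4f}  Haldane {h:.4f}  Kosambi {k:.4f}")
```

Up to 10 cM the three values differ by less than one percentage point; by 50 cM they have clearly diverged, with Haldane giving the lowest recombination fraction and Morgan the highest.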

While many mapping functions now exist, [9] [10] [11] in practice functions other than Haldane and Kosambi are rarely used. [5] More specifically, the Haldane function is preferred when the distance between markers is relatively small, whereas the Kosambi function is preferred when distances between markers are larger and crossovers need to be accounted for. [12]

References

  1. Broman, Karl W.; Sen, Saunak (2009). A guide to QTL mapping with R/qtl. Statistics for biology and health. Dordrecht: Springer. p. 14. ISBN 978-0-387-92124-2. OCLC 669122118.
  2. Haldane, J.B.S. (1919). "The combination of linkage values, and the calculation of distances between the loci of linked factors". Journal of Genetics. 8 (29): 299–309.
  3. Kosambi, D. D. (1943). "The Estimation of Map Distances from Recombination Values". Annals of Eugenics. 12 (1): 172–175. doi:10.1111/j.1469-1809.1943.tb02321.x. ISSN 2050-1420.
  4. Wu, Rongling; Ma, Chang-Xing; Casella, George (2007). Statistical genetics of quantitative traits: linkage, maps, and QTL. New York: Springer. p. 65. ISBN 978-0-387-20334-8. OCLC 141385359.
  5. Ruvinsky, Anatoly; Graves, Jennifer A. Marshall, eds. (2005). Mammalian genomics. Wallingford, Oxfordshire, UK; Cambridge, MA, USA: CABI Pub. p. 15. ISBN 978-0-85199-910-4.
  6. Peñalba, Joshua V.; Wolf, Jochen B. W. (2020). "From molecules to populations: appreciating and estimating recombination rate variation". Nature Reviews Genetics. 21 (8): 476–492. doi:10.1038/s41576-020-0240-1. ISSN 1471-0064.
  7. "mapping function". Oxford Reference. doi:10.1093/oi/authority.20110803100132641 (inactive 2024-04-30). Retrieved 2024-04-29.
  8. Hartl, Daniel L.; Jones, Elizabeth W. (2005). Genetics: analysis of genes and genomes (7th ed.). Sudbury, Mass.: Jones and Bartlett. p. 168. ISBN 978-0-7637-1511-3.
  9. Crow, J F (1990). "Mapping functions". Genetics. 125 (4): 669–671. doi:10.1093/genetics/125.4.669. ISSN 1943-2631. PMC 1204092. PMID 2204577.
  10. Felsenstein, Joseph (1979). "A Mathematically Tractable Family of Genetic Mapping Functions with Different Amounts of Interference". Genetics. 91 (4): 769–775. doi:10.1093/genetics/91.4.769. PMC 1216865. PMID 17248911.
  11. Pascoe, L.; Morton, N.E. (1987). "The use of map functions in multipoint mapping". American Journal of Human Genetics. 40 (2): 174–183. PMC 1684067. PMID 3565379.
  12. Aluru, Srinivas, ed. (2006). Handbook of computational molecular biology. CRC Press. pp. 17-10–17-11. ISBN 978-1-58488-406-4.
