A supergene is a chromosomal region encompassing multiple neighboring genes that are inherited together because of close genetic linkage, i.e. much less recombination than would normally be expected. [1] This mode of inheritance can be due to genomic rearrangements between supergene variants.
A supergene region can contain a few functionally related genes that clearly contribute to a shared phenotype. [2]
Supergenes have cis-effects due to multiple loci (which may be within a gene, or within a single gene's regulatory region), and tight linkage. They are classically polymorphic, whereby different supergene variants code for different phenotypes.
Classic supergenes include many sex chromosomes, the Primula heterostyly locus, which controls "pin" and "thrum" types, and the locus controlling Batesian mimetic polymorphism in Papilio memnon butterflies. Recently discovered supergenes are responsible for complex phenotypes including color-morphs in the white-throated sparrow. [3] [4] [5]
Primula supergene. Pin and thrum morphs of Primula have effects on genetic compatibility (pin style x thrum pollen, or thrum style x pin pollen matings are successful, while pin x pin, and thrum x thrum matings are rarely successful due to pollen-style incompatibility), and have different style length, anther height in the corolla tube, pollen size, and papilla size on the stigma. Each of these effects is controlled by a different locus in the same supergene, but recombinants are occasionally found with traits combining those of "pin" and "thrum" morphs.
The earliest use of the term "supergene" may be in an article by A. Ernst (1936) in the journal Archiv der Julius Klaus-Stiftung für Vererbungsforschung, Sozialanthropologie und Rassenhygiene. [6]
Classically, supergenes were hypothesized to have evolved from less tightly-linked genes coming together via chromosomal rearrangement or reduced crossing over, due to selection for particular multilocus phenotypes. For instance, in Batesian mimicry supergenes in species such as Papilio memnon, genes are required to affect hind-wing, fore-wing, and body colour, and also the presence or absence of long projections (the "tails" of swallowtail butterflies).
The case for the accumulative origin of supergenes was originally based on the work of Nabours on polymorphism for colour and pattern in grouse locusts (Tetrigidae). In Acridium arenosum the colour-patterns are controlled by thirteen genes on the same chromosome, which reassort (recombine) fairly easily. They also occur in Apotettix eurycephalus where they form two tightly linked groups, between which there is 7% crossing-over. Furthermore, in Paratettix texanus there appears to be complete suppression of crossing-over among 24 out of 25 of the colour-pattern genes, which can be distinguished by comparing their effects with those found in other species. Analysis of Nabours's data by Darlington & Mather concluded that the genes responsible for the morphs of Paratettix texanus have been gradually aggregated into a group which acts as a single switch-mechanism. [7] [8] [9] This explanation was accepted by E.B. Ford and incorporated into his accounts of ecological genetics. [10] [11]
This process might involve suppression of crossing-over, translocation of chromosome fragments and possibly occasional cistron duplication. That crossing-over can be suppressed by selection has been known for many years; Detlefsen and Roberts were able to reduce recombination between the loci for white eyes (w) and miniature wings (m) in Drosophila melanogaster from the normal 36% to 6% in one line and 0.6% in another. [12] [13]
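To give a feel for what such a reduction in recombination means for keeping allele combinations together, the following minimal sketch uses the standard neutral result that, under random mating and without selection, linkage disequilibrium D between two loci shrinks by a factor of (1 - r) each generation, where r is the recombination fraction. This simple model is an assumption made for illustration and is not part of the Detlefsen and Roberts experiments; the function names and the starting value of D are hypothetical.

```python
import math


def ld_after(generations, r, d0=0.25):
    """D remaining after `generations` of random mating.

    Assumes no selection, drift, or mutation: D_t = D_0 * (1 - r)**t.
    d0 = 0.25 is an arbitrary illustrative starting value.
    """
    return d0 * (1.0 - r) ** generations


def half_life(r):
    """Number of generations for D to fall to half its starting value."""
    return math.log(0.5) / math.log(1.0 - r)


if __name__ == "__main__":
    # Recombination fractions quoted in the text: 36%, 6% and 0.6%.
    for r in (0.36, 0.06, 0.006):
        print(f"r = {r:<5}  half-life of D = {half_life(r):6.1f} generations, "
              f"D after 10 generations = {ld_after(10, r):.4f}")
```

Under these assumptions the association between alleles at the two loci halves in roughly 1.6 generations at r = 0.36, about 11 generations at r = 0.06, and about 115 generations at r = 0.006, which shows how strongly tighter linkage preserves particular allele combinations.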
Debate has tended to centre on whether the component genes of a supergene could have started on separate chromosomes and been brought together by subsequent reorganization, or whether they must have started on the same chromosome. Many scientists today believe the latter, because some linkage disequilibrium is initially needed to select for tighter linkage, and linkage disequilibrium requires both pre-existing polymorphisms and some other process, such as natural selection, favouring particular gene combinations. [14] If genes are weakly linked, it is probable that the rarer advantageous haplotype dies out, leading to the loss of polymorphism at the other locus.
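As a concrete illustration of the quantity this argument depends on, the sketch below computes the usual two-locus measures of linkage disequilibrium from haplotype frequencies: D = p(AB) - p(A)p(B) and its normalized form r². The allele labels, function name, and example frequencies are hypothetical and chosen only to show the calculation.

```python
def linkage_disequilibrium(freqs):
    """Return (D, r_squared) for two biallelic loci.

    `freqs` maps the four haplotypes "AB", "Ab", "aB", "ab" to their
    population frequencies, which must sum to 1.
    """
    total = sum(freqs[h] for h in ("AB", "Ab", "aB", "ab"))
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    p_a = freqs["AB"] + freqs["Ab"]  # frequency of allele A at locus 1
    p_b = freqs["AB"] + freqs["aB"]  # frequency of allele B at locus 2

    d = freqs["AB"] - p_a * p_b      # deviation from random association
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


if __name__ == "__main__":
    # Hypothetical population in which AB and ab haplotypes are in excess.
    associated = {"AB": 0.4, "Ab": 0.1, "aB": 0.1, "ab": 0.4}
    # Hypothetical population in which alleles are associated at random.
    random_assoc = {"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}
    print(linkage_disequilibrium(associated))
    print(linkage_disequilibrium(random_assoc))
```

A value of D near zero means the alleles at the two loci are associated at random, in which case there are no favoured combinations for tighter linkage to protect. If one locus is monomorphic, D is necessarily zero, which is why polymorphism at both loci must already exist.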
Most researchers, following J.R.G. Turner, therefore argue that supergenes arose in situ through selection for correlated and epistatic traits, selection that was possible only because suitable loci happened to exist closely linked to the original variant. [15] Turner calls this a "sieve" explanation, and it might be called the "Turner sieve" hypothesis. [16] Maynard Smith agreed with this view in his authoritative textbook. [17] Nevertheless, the question is not definitively settled. The problem is connected to an even larger question, the evolution of evolvability.
Genomic rearrangements such as inversions can suppress recombination.
Suppressed recombination leads to the accumulation of repetitive elements (including degenerative expansion) in early supergene evolution [Ref Papaya, Fire ant], and to changes in gene expression [ref Fire ant, anther smut].
Gene complexes, in contrast, are simply tightly linked groups of genes, often created via gene duplication (sometimes called tandem duplication if the duplicates remain side-by-side). Here, each gene has similar though slightly diverged function. For example, the human major histocompatibility complex (MHC) region is a complex of tightly linked genes all acting in the immune system, but has no claim to be a supergene, even though the component genes very likely have epistatic effects and are in strong disequilibrium due in part to selection.
Berdan EL, Flatt T, Kozak GM, Lotterhos KE, Wielstra B. 2022 Genomic architecture of supergenes: connecting form and function. Phil. Trans. R. Soc. B 377: 20210192