Functional divergence

Functional divergence is the process by which genes, after gene duplication, shift in function from an ancestral function. Functional divergence can result in either subfunctionalization, where a paralog specializes in one of several ancestral functions, or neofunctionalization, where a totally new functional capability evolves. It is thought that this process of gene duplication and functional divergence is a major originator of molecular novelty and has produced the many large protein families that exist today. [1] [2]

Functional divergence is just one possible outcome of gene duplication events. Other fates include nonfunctionalization, where one of the paralogs acquires deleterious mutations and becomes a pseudogene, and superfunctionalization (reinforcement), [3] where both paralogs maintain the original function. While gene, chromosome, or whole genome duplication events are considered the canonical sources of functional divergence of paralogs, orthologs (genes descended from speciation events) can also undergo functional divergence. [4] [5] [6] [7] Horizontal gene transfer can also result in multiple copies of a gene in a genome, providing the opportunity for functional divergence.
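The fates described above can be illustrated with a small sketch that treats each gene copy as a set of function labels. This is a deliberately simplified toy model, not an established classification method: the function name `classify_duplicate_fate`, the label strings, and the decision rules are all assumptions made for illustration.

```python
def classify_duplicate_fate(ancestral, copy_a, copy_b):
    """Classify the fate of a duplicated gene from sets of function labels.

    Hypothetical toy model: each argument is an iterable of labels naming
    the functions performed by the ancestral gene and by each paralog.
    """
    ancestral, copy_a, copy_b = set(ancestral), set(copy_a), set(copy_b)

    # A copy that performs no function at all is treated as a pseudogene.
    if not copy_a or not copy_b:
        return "nonfunctionalization"

    # Any function that the ancestor lacked counts as a new capability.
    if (copy_a | copy_b) - ancestral:
        return "neofunctionalization"

    # Both copies keep the complete ancestral repertoire.
    if copy_a == ancestral and copy_b == ancestral:
        return "superfunctionalization"

    # The copies differ, yet together still cover every ancestral function.
    if copy_a != copy_b and (copy_a | copy_b) == ancestral:
        return "subfunctionalization"

    return "unclassified"


if __name__ == "__main__":
    ancestor = {"function 1", "function 2"}
    print(classify_duplicate_fate(ancestor, {"function 1"}, {"function 2"}))
```

Real analyses infer function from sequence, structure, or expression data rather than from predefined labels, so the sets here stand in for whatever evidence is available.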

Many well-known protein families are the result of this process, such as the ancient gene duplication event that led to the divergence of hemoglobin and myoglobin, the more recent duplication events that led to the various subunit expansions (alpha and beta) of vertebrate hemoglobins, [8] or the expansion of G-protein alpha subunits. [9]

Related Research Articles

Molecular evolution is the process of change in the sequence composition of cellular molecules such as DNA, RNA, and proteins across generations. The field of molecular evolution uses principles of evolutionary biology and population genetics to explain patterns in these changes. Major topics in molecular evolution concern the rates and impacts of single nucleotide changes, neutral evolution vs. natural selection, origins of new genes, the genetic nature of complex traits, the genetic basis of speciation, the evolution of development, and ways that evolutionary forces influence genomic and phenotypic changes.

Gene duplication is a major mechanism through which new genetic material is generated during molecular evolution. It can be defined as any duplication of a region of DNA that contains a gene. Gene duplications can arise as products of several types of errors in DNA replication and repair machinery as well as through fortuitous capture by selfish genetic elements. Common sources of gene duplications include ectopic recombination, retrotransposition, aneuploidy, polyploidy, and replication slippage.

Gene family: Set of several similar genes

A gene family is a set of several similar genes, formed by duplication of a single original gene, and generally with similar biochemical functions. One such family comprises the genes for human hemoglobin subunits; the ten genes are in two clusters on different chromosomes, called the α-globin and β-globin loci. These two gene clusters are thought to have arisen as a result of a precursor gene being duplicated approximately 500 million years ago.

Sequence homology: Shared ancestry between DNA, RNA or protein sequences

Sequence homology is the biological homology between DNA, RNA, or protein sequences, defined in terms of shared ancestry in the evolutionary history of life. Two segments of DNA can have shared ancestry because of three phenomena: either a speciation event (orthologs), or a duplication event (paralogs), or else a horizontal gene transfer event (xenologs).

Gene: Sequence of DNA or RNA that codes for an RNA or protein product

In biology, the word gene can have several different meanings. The Mendelian gene is a basic unit of heredity and the molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and noncoding genes.

Neutral mutations are changes in DNA sequence that are neither beneficial nor detrimental to the ability of an organism to survive and reproduce. In population genetics, mutations in which natural selection does not affect the spread of the mutation in a species are termed neutral mutations. Neutral mutations that are inheritable and not linked to any genes under selection will be lost or will replace all other alleles of the gene. That loss or fixation of the gene proceeds based on random sampling known as genetic drift. A neutral mutation that is in linkage disequilibrium with other alleles that are under selection may proceed to loss or fixation via genetic hitchhiking and/or background selection.
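The eventual loss or fixation of a neutral allele under random sampling can be illustrated with a minimal Wright-Fisher-style simulation. This is a sketch under simplifying assumptions that are not part of the text above: a haploid population of constant size, non-overlapping generations, and no selection or linkage. The function name and parameters are illustrative.

```python
import random


def simulate_neutral_allele(pop_size, start_count, rng, max_generations=100_000):
    """Follow a neutral allele until it is lost or fixed by drift alone.

    Returns a tuple (outcome, generations), where outcome is "lost",
    "fixed", or "segregating" if max_generations is reached first.
    """
    count = start_count
    for generation in range(max_generations):
        if count == 0:
            return "lost", generation
        if count == pop_size:
            return "fixed", generation
        # Each individual in the next generation inherits the allele with
        # probability equal to its current frequency (pure random sampling).
        freq = count / pop_size
        count = sum(1 for _ in range(pop_size) if rng.random() < freq)
    return "segregating", max_generations


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so that a run can be repeated
    outcome, generations = simulate_neutral_allele(50, 5, rng)
    print(outcome, generations)
```

Because no copy of the allele has a reproductive advantage, the outcome depends only on chance; repeating the run with different seeds gives a mixture of losses and fixations.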

Orphan genes, ORFans, or taxonomically restricted genes (TRGs) are genes that lack a detectable homologue outside of a given species or lineage. Most genes have known homologues. Two genes are homologous when they share an evolutionary history, and the study of groups of homologous genes allows for an understanding of their evolutionary history and divergence. Common mechanisms that have been uncovered as sources for new genes through studies of homologues include gene duplication, exon shuffling, gene fusion and fission, etc. Studying the origins of a gene becomes more difficult when there is no evident homologue. The discovery that about 10% or more of the genes of the average microbial species are orphan genes raises questions about the evolutionary origins of different species as well as how to study and uncover the evolutionary origins of orphan genes.

Genomic phylostratigraphy is a novel genetic statistical method developed to date the origin of specific genes by looking at their homologs across species. It was first developed at the Ruđer Bošković Institute in Zagreb, Croatia. The system links genes to their founder gene, allowing us to then determine their age. This could help us better understand many evolutionary processes, such as patterns of gene birth throughout evolution or how the age of a transcriptome changes throughout embryonic development. Bioinformatic tools like GenEra have been developed to calculate relative gene ages based on genomic phylostratigraphy.
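The core idea of dating a gene by its homologs can be sketched as follows: a gene is assigned to the oldest taxonomic level (phylostratum) in which a homolog is detected. This is a minimal illustration that assumes homology detection has already been done; it does not reproduce the interface of GenEra or any other tool, and the lineage names and function name are hypothetical.

```python
def assign_phylostratum(lineage, strata_with_homologs):
    """Return (rank, stratum) for the oldest stratum containing a homolog.

    lineage: strata ordered from oldest (rank 1) to the focal species.
    strata_with_homologs: strata in which a homolog was detected.
    """
    for rank, stratum in enumerate(lineage, start=1):
        if stratum in strata_with_homologs:
            return rank, stratum
    # No homolog detected outside the focal species: the gene is assigned
    # to the youngest stratum, as an orphan gene would be.
    return len(lineage), lineage[-1]


if __name__ == "__main__":
    # Hypothetical, heavily abbreviated lineage for a human gene.
    lineage = ["cellular organisms", "Eukaryota", "Metazoa",
               "Chordata", "Mammalia", "Homo sapiens"]
    print(assign_phylostratum(lineage, {"Metazoa", "Mammalia"}))
```

In practice the result depends strongly on how sensitively homologs can be detected, so the assigned age is best read as a lower bound on when the founder gene arose.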

Hemoglobin subunit zeta: Mammalian protein found in Homo sapiens

Hemoglobin subunit zeta is a protein that in humans is encoded by the HBZ gene.

NBPF3: Protein-coding gene in the species Homo sapiens

Neuroblastoma breakpoint family, member 3, also known as NBPF3, is a human gene of the neuroblastoma breakpoint family, which resides on chromosome 1 of the human genome. NBPF3 is located at 1p36.12, immediately upstream of genes ALPL and RAP1GAP.

Gene redundancy

Gene redundancy is the existence of multiple genes in the genome of an organism that perform the same function. Gene redundancy can result from gene duplication. Such duplication events are responsible for many sets of paralogous genes. When an individual gene in such a set is disrupted by mutation or targeted knockout, there can be little effect on phenotype as a result of gene redundancy, whereas the effect is large for the knockout of a gene with only one copy. Gene knockout is a method utilized in some studies aiming to characterize the maintenance and fitness effects of functional overlap.

RNA polymerase IV is an enzyme that synthesizes small interfering RNA (siRNA) in plants, which silences gene expression. RNAP IV belongs to the RNA polymerases, a family of enzymes that catalyze transcription by synthesizing RNA from DNA templates. Discovered via phylogenetic studies of land plants, genes of RNAP IV are thought to have resulted from multistep evolution processes that occurred in RNA Polymerase II phylogenies. Such an evolutionary pathway is supported by the fact that RNAP IV is composed of 12 protein subunits that are either similar or identical to those of RNA polymerase II, and is specific to plant genomes. Via its synthesis of siRNA, RNAP IV is involved in regulation of heterochromatin formation in a process known as RNA-directed DNA methylation (RdDM).

Subfunctionalization

Subfunctionalization was proposed by Stoltzfus (1999) and Force et al. (1999) as one of the possible outcomes of functional divergence that occurs after a gene duplication event, in which pairs of genes that originate from duplication, or paralogs, take on separate functions. Subfunctionalization is a neutral mutation process of constructive neutral evolution, meaning that no new adaptations are formed. During the process of gene duplication, paralogs simply undergo a division of labor by retaining different parts (subfunctions) of their original ancestral function. This partitioning event occurs because of segmental gene silencing, leading to the formation of paralogs that are no longer duplicates, because each gene retains only a single function. The ancestral gene was capable of performing both functions, whereas each descendant duplicate gene can now perform only one of the original ancestral functions.

Evolution by gene duplication is an event by which a gene or part of a gene can have two identical copies that cannot be distinguished from each other. This phenomenon is understood to be an important source of novelty in evolution, providing for an expanded repertoire of molecular activities. The underlying mutational event of duplication may be a conventional gene duplication mutation within a chromosome, or a larger-scale event involving whole chromosomes (aneuploidy) or whole genomes (polyploidy). A classic view, owing to Susumu Ohno and known as the Ohno model, explains how duplication creates redundancy: the redundant copy accumulates beneficial mutations, which provides fuel for innovation. Knowledge of evolution by gene duplication has advanced more rapidly in the past 15 years due to new genomic data, more powerful computational methods of comparative inference, and new evolutionary models.

OrthoDB

OrthoDB presents a catalog of orthologous protein-coding genes across vertebrates, arthropods, fungi, plants, and bacteria. Orthology refers to the last common ancestor of the species under consideration, and thus OrthoDB explicitly delineates orthologs at each major radiation along the species phylogeny. The database of orthologs presents available protein descriptors, together with Gene Ontology and InterPro attributes, which serve to provide general descriptive annotations of the orthologous groups, and facilitate comprehensive orthology database querying. OrthoDB also provides computed evolutionary traits of orthologs, such as gene duplicability and loss profiles, divergence rates, sibling groups, and gene intron-exon architectures.

Neofunctionalization

Neofunctionalization, one of the possible outcomes of functional divergence, occurs when one gene copy, or paralog, takes on a totally new function after a gene duplication event. Neofunctionalization is an adaptive mutation process, meaning one of the gene copies must mutate to develop a function that was not present in the ancestral gene. In other words, one of the duplicates retains its original function, while the other accumulates molecular changes such that, in time, it can perform a different task.

Infologs

Infologs are independently designed synthetic genes derived from one or a few genes where substitutions are systematically incorporated to maximize information. Infologs are designed for perfect diversity distribution to maximize search efficiency.

De novo gene birth: Evolution of novel genes from non-genic DNA sequence

De novo gene birth is the process by which new genes evolve from DNA sequences that were ancestrally non-genic. De novo genes represent a subset of novel genes, and may be protein-coding or instead act as RNA genes. The processes that govern de novo gene birth are not well understood, although several models exist that describe possible mechanisms by which de novo gene birth may occur.

TEDDM1: Protein-coding gene in the species Homo sapiens

Transmembrane epididymal protein 1 is a transmembrane protein encoded by the TEDDM1 gene. TEDDM1 is also commonly known as TMEM45C and encodes a protein of 273 amino acids that contains six alpha-helical transmembrane regions. The protein contains a 118-amino-acid family of unknown function. While the exact function of TEDDM1 is not understood, it is predicted to be an integral component of the plasma membrane.


MROH9: Mammalian gene

Maestro heat-like repeat-containing protein family member 9 (MROH9) is a protein which in humans is encoded by the MROH9 gene. The word ‘maestro’ itself is an acronym, standing for male-specific transcription in the developing reproductive organs (MRO). MRO genes belong to the MROH family, which includes MROH9.

References

  1. Gu, X (Jul 2003). "Functional divergence in protein (family) sequence evolution". Genetica. Contemporary Issues in Genetics and Evolution. 118 (2–3): 133–41. doi:10.1007/978-94-010-0229-5_4. ISBN 978-94-010-3982-6. PMID 12868604.
  2. Fay, JC; Wu, CI (2003). "Sequence divergence, functional constraint, and selection in protein evolution". Annu Rev Genom Hum Genet. 4: 213–35. doi:10.1146/annurev.genom.4.020303.162528. PMID 14527302.
  3. Dvornyk, V; Vinogradova, ON; Nevo, E (2002). "Long-term microclimatic stress causes rapid adaptive radiation of kaiABC clock gene family in a cyanobacterium, Nostoc linckia, from 'Evolution Canyons' I and II, Israel". Proc Natl Acad Sci USA. 99 (4): 2082–2087. Bibcode:2002PNAS...99.2082D. doi:10.1073/pnas.261699498. PMC 123721. PMID 11842226.
  4. Studer, RA; Robinson-Rechavi, M (2009). "How confident can we be that orthologs are similar, but paralogs differ?". Trends in Genetics. 25 (5): 210–6. doi:10.1016/j.tig.2009.03.004. PMID 19368988.
  5. Studer, RA; Robinson-Rechavi, M (2010). "Large-scale analysis of orthologs and paralogs under covarion-like and constant-but-different models of amino acid evolution". Molecular Biology and Evolution. 27 (11): 2618–2627. doi:10.1093/molbev/msq149. PMC 2955734. PMID 20551039.
  6. Gharib, WH; Robinson-Rechavi, M (2011). "When orthologs diverge between human and mouse". Briefings in Bioinformatics. 12 (5): 436–441. doi:10.1093/bib/bbr031. PMC 3178054. PMID 21677033.
  7. Nehrt, NL; Clark, WT; Radivojac, P; Hahn, MW (2011). "Testing the ortholog conjecture with comparative functional genomic data from mammals". PLOS Computational Biology. 7 (6): e1002073. Bibcode:2011PLSCB...7E2073N. doi:10.1371/journal.pcbi.1002073. PMC 3111532. PMID 21695233.
  8. Storz, Jay F.; Hoffmann, Federico G.; Opazo, Juan C.; Moriyama, Hideaki (March 2008). "Adaptive Functional Divergence Among Triplicated α-Globin Genes in Rodents". Genetics. 178 (3): 1623–1638. doi:10.1534/genetics.107.080903. PMC 2278084. PMID 18245844.
  9. Zheng, Y; Xu, D; Gu, X (2007). "Functional divergence after gene duplication and sequence-structure relationship: a case study of G-protein alpha subunits". Journal of Experimental Zoology Part B: Molecular and Developmental Evolution. 308 (1): 85–96. doi:10.1002/jez.b.21140. PMID 17094082.