Gene cluster

Last updated November 01, 2024

A gene cluster is a group of two or more genes found within an organism's DNA that encode similar polypeptides or proteins, which collectively share a generalized function, and that are often located within a few thousand base pairs of each other. The size of gene clusters can vary significantly, from a few genes to several hundred genes.^[1] Portions of the DNA sequence of each gene within a gene cluster are identical; however, the protein encoded by each gene is distinct from the proteins encoded by the other genes within the cluster. Gene clusters often result from expansions of a single gene caused by repeated duplication events, and may be observed near one another on the same chromosome or on different but homologous chromosomes. An example is the Hox gene cluster, which is made up of eight genes and is part of the homeobox gene family.
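The proximity part of this definition can be illustrated with a minimal Python sketch. It assumes hypothetical gene names and coordinates on a single chromosome and an illustrative gap threshold of 5,000 base pairs; it checks only physical distance, not the sequence similarity or shared function that the definition also requires.

```python
from typing import List, Tuple

# (name, start, end) in base pairs on one chromosome; all values are hypothetical.
Gene = Tuple[str, int, int]

def group_into_clusters(genes: List[Gene], max_gap_bp: int = 5000) -> List[List[str]]:
    """Group genes whose neighbours lie within max_gap_bp of each other."""
    clusters: List[List[str]] = []
    prev_end = None
    for name, start, end in sorted(genes, key=lambda g: g[1]):
        if prev_end is not None and start - prev_end <= max_gap_bp:
            clusters[-1].append(name)   # close enough: extend the current group
        else:
            clusters.append([name])     # too far away: start a new group
        prev_end = end if prev_end is None else max(prev_end, end)
    # By the definition above, a cluster contains two or more genes.
    return [c for c in clusters if len(c) >= 2]

genes = [
    ("geneA", 1_000, 2_500),
    ("geneB", 4_000, 5_200),
    ("geneC", 7_000, 8_100),
    ("geneD", 90_000, 91_000),  # isolated gene, far from the others
]
clusters = group_into_clusters(genes)
print(clusters)
```

Here geneA, geneB, and geneC fall into one group because each gap is under the threshold, while the isolated geneD is dropped.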

Formation

Historically, four models have been proposed for the formation and persistence of gene clusters.

Gene duplication and divergence

This model has been generally accepted since the mid-1970s. It postulates that gene clusters were formed as a result of gene duplication and divergence.^[2] These gene clusters include the Hox gene cluster, the human β-globin gene cluster, and four clustered human growth hormone (hGH)/chorionic somatomammotropin genes.^[3]

Conserved gene clusters, such as Hox and the human β-globin gene cluster, may be formed as a result of the process of gene duplication and divergence. A gene is duplicated during cell division, so that its descendants have two end-to-end copies of the gene where the ancestor had one copy, each initially coding for the same protein or otherwise having the same function. In the course of subsequent evolution, the copies diverge, so that the products they code for have different but related functions, with the genes still being adjacent on the chromosome.^[4] Ohno theorized that the origin of new genes during evolution was dependent on gene duplication. If only a single copy of a gene existed in the genome of a species, the protein encoded by this gene would be essential to the species' survival. Because there was only a single copy, the gene could not undergo mutations that might result in new genes; however, gene duplication allows mutations to accumulate in the duplicated copy, which would ultimately give rise to new genes over the course of evolution.^[5]

Mutations in the duplicated copy were tolerated because the original copy retained the genetic information for the essential gene's function. Gene clusters are thought to confer a selective advantage, with natural selection acting to keep the genes together.^[1]^[6] Over a short span of time, the new genetic information in the duplicated copy of the essential gene would not provide a practical advantage; however, over a long evolutionary time period, the duplicated copy may undergo additional and drastic mutations, so that its protein serves a different role from that of the original essential gene.^[5] Over such a period, the two similar genes would diverge until the proteins of each gene were unique in their functions. Hox gene clusters of various sizes are found among several phyla.
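The duplication-and-divergence idea can be sketched as a toy Python example. The sequence, the mutation positions, and the helper names (`tandem_duplicate`, `mutate`, `identity`) are all hypothetical and chosen only for illustration; real divergence involves selection, insertions, deletions, and far longer sequences.

```python
def tandem_duplicate(chromosome, index):
    """Return a new gene list with a copy of chromosome[index] inserted right after it."""
    return chromosome[: index + 1] + [chromosome[index]] + chromosome[index + 1 :]

def mutate(sequence, substitutions):
    """Apply point substitutions given as {position: new_base}."""
    bases = list(sequence)
    for pos, base in substitutions.items():
        bases[pos] = base
    return "".join(bases)

def identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

ancestral = "ATGGCCAAGTTGACCTGA"            # hypothetical 18-base gene
chromosome = tandem_duplicate([ancestral], 0)  # two adjacent, identical copies

# The original copy is left intact (it still provides the essential function);
# only the duplicate accumulates substitutions.
chromosome[1] = mutate(chromosome[1], {4: "A", 10: "C", 13: "G"})

print(chromosome)
print(identity(chromosome[0], chromosome[1]))  # 15 of 18 positions still match
```

The two copies remain adjacent and partly identical in sequence, which matches the description of clustered genes sharing portions of their DNA sequence while encoding distinct products.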

Hox cluster

When gene duplication occurs to produce a gene cluster, one or multiple genes may be duplicated at once. In the case of the Hox genes, a shared ancestral ProtoHox cluster was duplicated, resulting in the Hox cluster as well as the ParaHox cluster, an evolutionary sister complex of the Hox cluster.^[7] The exact number of genes contained in the duplicated ProtoHox cluster is unknown; however, models exist suggesting that the duplicated ProtoHox cluster originally contained four, three, or two genes.^[8]

In the case where a gene cluster is duplicated, some genes may be lost. Loss of genes depends on the number of genes originally in the gene cluster. In the four-gene model, the ProtoHox cluster contained four genes, and its duplication resulted in two twin clusters: the Hox cluster and the ParaHox cluster.^[7] As its name indicates, in the two-gene model the Hox cluster and the ParaHox cluster arose from a ProtoHox cluster that contained only two genes. The three-gene model was originally proposed in conjunction with the four-gene model;^[8] however, rather than the Hox cluster and the ParaHox cluster resulting from a cluster containing three genes, the Hox cluster and ParaHox cluster were a result of single-gene tandem duplication, which produces identical genes adjacent to one another on the same chromosome.^[7] This was independent of duplication of the ancestral ProtoHox cluster.

Cis vs. trans duplication

Gene duplication may occur via cis-duplication or trans-duplication. Cis-duplication, or intrachromosomal duplication, entails the duplication of genes within the same chromosome, whereas trans-duplication, or interchromosomal duplication, consists of duplicating genes on neighboring but separate chromosomes.^[7] The formations of the Hox cluster and of the ParaHox cluster were results of intrachromosomal duplication, although they were initially thought to be interchromosomal.^[8]

Fisher Model

The Fisher Model was proposed in 1930 by Ronald Fisher. Under the Fisher Model, gene clusters are a result of two alleles working well with one another. In other words, gene clusters may exhibit co-adaptation.^[3] The Fisher Model was considered unlikely and later dismissed as an explanation for gene cluster formation.^[2]^[3]

Coregulation Model

Under the coregulation model, genes are organized into clusters, each consisting of a single promoter and a cluster of coding sequences, which are therefore co-regulated and show coordinated gene expression.^[3] Coordinated gene expression was once considered to be the most common mechanism driving the formation of gene clusters.^[1] However, coregulation, and thus coordinated gene expression, cannot drive the formation of gene clusters.^[3]

Molarity Model

The Molarity Model considers the constraints of cell size. Transcribing and translating genes together is beneficial to the cell;^[9] thus, the formation of clustered genes generates a high local concentration of cytoplasmic protein products. Spatial segregation of protein products has been observed in bacteria; however, the Molarity Model does not consider co-transcription or distribution of genes found within an operon.^[2]

Gene clusters vs. tandem arrays

Figure (Gene-duplication-notext.png): Tandem duplication is the process in which one gene is duplicated and the resulting copy is found adjacent to the original gene. Tandemly arrayed genes are formed as a result of tandem duplications.

Repeated genes can occur in two major patterns: gene clusters and tandem arrays, formerly called tandemly arrayed genes. Although similar, gene clusters and tandemly arrayed genes may be distinguished from one another.

Gene clusters

The genes in a gene cluster are found close to one another on the same chromosome. They are dispersed randomly; however, they are normally within, at most, a few thousand bases of each other. The distance between each gene in the gene cluster can vary. The DNA found between each repeated gene in the gene cluster is non-conserved.^[10] Portions of the DNA sequence are identical among the genes contained in a gene cluster.^[5] Gene conversion is the only method by which gene clusters may become homogenized. Although the size of a gene cluster may vary, it rarely comprises more than 50 genes, making clusters stable in number. Gene clusters change over long evolutionary time periods, which does not result in genetic complexity.^[10]

Tandem arrays

Tandem arrays are groups of genes with the same or similar function that are repeated consecutively without space between each gene. The genes are organized in the same orientation.^[10] Unlike gene clusters, tandemly arrayed genes consist of consecutive, identical repeats, separated only by a nontranscribed spacer region.^[11] While the genes contained in a gene cluster encode similar proteins, tandemly arrayed genes encode identical proteins or functional RNAs. Unequal recombination changes the number of repeats by placing duplicated genes next to the original gene. Unlike gene clusters, tandemly arrayed genes change rapidly in response to the needs of the environment, causing an increase in genetic complexity.^[11]
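How unequal recombination changes the number of repeats can be shown with a small Python sketch. It models each tandem array as a list of identical repeat units and assumes breakpoints fall between whole repeats; the function name `unequal_crossover` and the array sizes are hypothetical.

```python
def unequal_crossover(array_a, array_b, break_a, break_b):
    """Recombine two tandem arrays at misaligned breakpoints.

    break_a and break_b give the number of whole repeats to the left of the
    breakpoint on each chromatid. When they differ, the arrays are misaligned
    and the two products end up with different repeat counts.
    """
    recombinant_1 = array_a[:break_a] + array_b[break_b:]
    recombinant_2 = array_b[:break_b] + array_a[break_a:]
    return recombinant_1, recombinant_2

parent_a = ["rRNA"] * 5
parent_b = ["rRNA"] * 5

# Misalignment by two repeats: one product gains two copies, the other loses two.
longer, shorter = unequal_crossover(parent_a, parent_b, break_a=4, break_b=2)
print(len(longer), len(shorter))
```

The total number of repeats across the two products is unchanged; one array expands at the expense of the other.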

Gene conversion allows tandemly arrayed genes to become homogenized, or identical.^[11] Gene conversion may be allelic or ectopic. Allelic gene conversion occurs when one allele of a gene is converted to the other allele as a result of mismatched base pairing during meiotic homologous recombination.^[12] Ectopic gene conversion occurs when one homologous DNA sequence is replaced by another. Ectopic gene conversion is the driving force for concerted evolution of gene families.^[13]
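The homogenizing effect of gene conversion can be sketched in Python as a one-directional copy of a sequence tract from a donor into an acceptor. The sequences, tract coordinates, and function names are hypothetical, and the sketch ignores the molecular mechanism (heteroduplex formation and mismatch repair) entirely.

```python
def gene_conversion(donor, acceptor, start, end):
    """Overwrite acceptor[start:end] with the donor's tract; the donor is unchanged."""
    if len(donor) != len(acceptor):
        raise ValueError("this sketch assumes aligned, equal-length sequences")
    return acceptor[:start] + donor[start:end] + acceptor[end:]

def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

donor = "ATGGCCAAGTTG"     # hypothetical repeat unit
acceptor = "ATGGACAAGCTG"  # diverged copy, differs at positions 4 and 9

before = mismatches(donor, acceptor)
partly_converted = gene_conversion(donor, acceptor, 3, 6)              # tract covers position 4 only
fully_converted = gene_conversion(donor, acceptor, 0, len(acceptor))   # whole copy replaced

print(before, mismatches(donor, partly_converted), mismatches(donor, fully_converted))
```

Each conversion event removes the differences inside the copied tract, so repeated events drive the copies toward identical sequences.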

Tandemly arrayed genes are essential for maintaining large gene families, such as the ribosomal RNA (rRNA) genes. In the eukaryotic genome, the rRNA genes are tandemly arrayed. Tandemly repeated rRNA genes are essential for maintaining an adequate supply of the RNA transcript. A single RNA gene may not be able to provide a sufficient amount of RNA. In this situation, tandem repeats of the gene allow a sufficient amount of RNA to be provided. For example, human embryonic cells contain 5-10 million ribosomes and double in number within 24 hours. In order to provide a substantive number of ribosomes, multiple RNA polymerases must consecutively transcribe multiple rRNA genes.^[11]
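The scale of this demand can be made concrete with a short Python calculation. It uses only the figures quoted above (5-10 million ribosomes, doubling in 24 hours) and assumes, for simplicity, that production is spread evenly over the day.

```python
SECONDS_PER_HOUR = 3600

def ribosomes_per_second(new_ribosomes, hours=24.0):
    """Average assembly rate needed to make new_ribosomes within the given time."""
    return new_ribosomes / (hours * SECONDS_PER_HOUR)

# Doubling means producing as many new ribosomes as the cell already has.
low = ribosomes_per_second(5_000_000)
high = ribosomes_per_second(10_000_000)

print(f"{low:.1f} to {high:.1f} new ribosomes per second")
```

Under these assumptions the cell needs roughly 58 to 116 new ribosomes, and therefore that many copies of each rRNA, every second, which is why a single gene copy would not suffice.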

References

[1] Yi G, Sze SH, Thon MR (May 2007). "Identifying clusters of functionally related genes in genomes". Bioinformatics. 23 (9): 1053–60. doi:10.1093/bioinformatics/btl673. PMID 17237058.
[2] Lawrence J (December 1999). "Selfish operons: the evolutionary impact of gene clustering in prokaryotes and eukaryotes" (PDF). Current Opinion in Genetics & Development. 9 (6): 642–8. doi:10.1016/s0959-437x(99)00025-8. PMID 10607610. Archived from the original (PDF) on 2010-05-28.
[3] Lawrence JG, Roth JR (August 1996). "Selfish operons: horizontal transfer may drive the evolution of gene clusters". Genetics. 143 (4): 1843–60. doi:10.1093/genetics/143.4.1843. PMC 1207444. PMID 8844169.
[4] Ohno S (1970). Evolution by gene duplication. Springer-Verlag. ISBN 978-0-04-575015-3.
[5] Klug W, Cummings M, Spencer C, Palladino M (2009). "Chromosome Mutations: Variation in chromosome number and arrangement". In Wilbur B (ed.). Concepts of Genetics (9 ed.). San Francisco, CA: Pearson Benjamin Cummings. pp. 213–214. ISBN 978-0-321-54098-0.
[6] Overbeek R, Fonstein M, D'Souza M, Pusch GD, Maltsev N (March 1999). "The use of gene clusters to infer functional coupling". Proceedings of the National Academy of Sciences of the United States of America. 96 (6): 2896–901. Bibcode:1999PNAS...96.2896O. doi:10.1073/pnas.96.6.2896. PMC 15866. PMID 10077608.
[7] Garcia-Fernàndez J (February 2005). "Hox, ParaHox, ProtoHox: facts and guesses". Heredity. 94 (2): 145–52. doi:10.1038/sj.hdy.6800621. PMID 15578045.
[8] Garcia-Fernàndez J (December 2005). "The genesis and evolution of homeobox gene clusters". Nature Reviews. Genetics. 6 (12): 881–92. doi:10.1038/nrg1723. PMID 16341069. S2CID 42823485.
[9] Gómez MJ, Cases I, Valencia A (2004). "Gene order in Prokaryotes: conservation and implications". In Vicente M, Tamames J, Valencia A, Mingorance J (eds.). Molecules in Time and Space: Bacterial Shape, Division, and Phylogeny. New York: Kluwer Academic/Plenum Publishers. pp. 221–224. doi:10.1007/0-306-48579-6_11. ISBN 978-0-306-48578-7.
[10] Graham GJ (July 1995). "Tandem genes and clustered genes". Journal of Theoretical Biology. 175 (1): 71–87. Bibcode:1995JThBi.175...71G. doi:10.1006/jtbi.1995.0122. PMID 7564393.
[11] Lodish H, Berk A, Kaiser C, Krieger M, Bretscher A, Ploegh H, Amon A, Scott M (2013). "Genes, Genomics, and Chromosomes". Molecular Cell Biology (7th ed.). New York: W.H. Freeman Company. pp. 227–230. ISBN 978-1-4292-3413-9.
[12] Galtier N, Piganeau G, Mouchiroud D, Duret L (October 2001). "GC-content evolution in mammalian genomes: the biased gene conversion hypothesis". Genetics. 159 (2): 907–11. doi:10.1093/genetics/159.2.907. PMC 1461818. PMID 11693127.
[13] Duret L, Galtier N (2009). "Biased gene conversion and the evolution of mammalian genomic landscapes". Annual Review of Genomics and Human Genetics. 10: 285–311. doi:10.1146/annurev-genom-082908-150001. PMID 19630562. S2CID 9126286.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
