DNA transposons are DNA sequences, sometimes referred to as "jumping genes", that can move and integrate into different locations within the genome. [1] They are class II transposable elements (TEs) that move through a DNA intermediate, as opposed to class I TEs, retrotransposons, which move through an RNA intermediate. [2] DNA transposons can move in the DNA of an organism via a single- or double-stranded DNA intermediate. [3] DNA transposons have been found in both prokaryotic and eukaryotic organisms. They can make up a significant portion of an organism's genome, particularly in eukaryotes. In prokaryotes, TEs can facilitate the horizontal transfer of antibiotic resistance genes or other genes associated with virulence. After replicating and propagating in a host, all transposon copies become inactivated and are lost unless the transposon passes to a new genome, starting a new life cycle through horizontal transfer. [4] DNA transposons do not insert themselves into the genome at random, but rather show a preference for specific sites.
With regard to movement, DNA transposons can be categorized as autonomous or nonautonomous. [5] Autonomous elements can move on their own, while nonautonomous elements require the transposase encoded by another transposable element in order to move. DNA transposons fall into three main classes according to their mechanism of movement: "cut and paste," [6] "rolling circle" (Helitrons), [7] and "self-synthesizing" (Polintons). [8] These distinct mechanisms allow them to move around the genome of an organism. Since DNA transposons cannot synthesize DNA, they replicate using the host replication machinery. The three main classes are further broken down into 23 different superfamilies characterized by their structure, sequence, and mechanism of action. [9]
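The dependence of nonautonomous elements on a transposase supplied by another element can be sketched in a few lines of Python. This is only an illustrative model: the `Element` class, its `family` label, and the rule that a transposase acts only on elements of the same family are simplifying assumptions introduced here, not part of any standard classification tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical record for one transposon copy in a genome."""
    name: str
    family: str                # assumed label grouping compatible elements
    encodes_transposase: bool  # True for an autonomous element


def can_transpose(element: Element, genome: list[Element]) -> bool:
    """Return True if a functional transposase is available to the element.

    An autonomous element supplies its own transposase. A nonautonomous
    element is mobile only if some other element of the same (assumed)
    family in the genome encodes one.
    """
    if element.encodes_transposase:
        return True
    return any(
        other.family == element.family and other.encodes_transposase
        for other in genome
    )


# Example modeled loosely on the maize Ac/Ds pair described in this article.
ac = Element("Ac", "Ac/Ds", encodes_transposase=True)
ds = Element("Ds", "Ac/Ds", encodes_transposase=False)

print(can_transpose(ds, [ds]))      # Ds alone has no transposase source
print(can_transpose(ds, [ds, ac]))  # Ac supplies the transposase
```

The model deliberately ignores silencing and mutation of the transposase gene, which also determine whether an element actually moves.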
DNA transposons can alter gene expression. When they insert into active coding sequences, they can disrupt normal protein function and cause mutations. Class II TEs make up about 3% of the human genome. There are no active DNA transposons in the human genome today, so the elements found there are called fossils.
Traditionally, DNA transposons move around in the genome by a cut and paste method. The system requires a transposase enzyme that catalyzes the removal of the DNA from its current location in the genome and its insertion at a new location. Transposition requires three DNA sites: one at each end of the transposon, called terminal inverted repeats, and one at the target site. The transposase binds to the terminal inverted repeats of the transposon and mediates synapsis of the transposon ends. The transposase then disconnects the element from the flanking DNA of the original donor site and mediates the joining reaction that links the transposon to the new insertion site. The addition of the new DNA at the target site leaves short gaps on either side of the inserted segment. [10] Host systems repair these gaps, producing the target site duplications (TSDs) that are characteristic of transposition. In many reactions, the transposon is completely excised from the donor site in what is called a "cut and paste" [11] transposition and inserted into the target DNA to form a simple insertion. Occasionally, genetic material not originally in the transposable element is copied and moved as well.
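The cut and paste steps described above can be illustrated with a minimal string-based sketch in Python. It is a toy model under stated assumptions: the function names, the example sequences, and the 2 bp duplication length are all hypothetical, the donor site is assumed to be repaired perfectly after excision, and the staggered cut plus gap repair is approximated by simply copying the target bases to both sides of the insertion.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_terminal_inverted_repeats(element: str, tir_len: int) -> bool:
    """True if the last tir_len bases mirror the first tir_len bases
    as their reverse complement, i.e. the ends are inverted repeats."""
    return element[-tir_len:] == reverse_complement(element[:tir_len])


def cut_and_paste(genome: str, element: str, target: int, tsd_len: int) -> str:
    """Excise the first copy of `element` and reinsert it at `target`.

    `target` is a position in the genome after excision. The tsd_len
    bases starting there stand in for the staggered cut: after the host
    fills the gaps they flank the insertion on both sides as a target
    site duplication (TSD).
    """
    start = genome.find(element)
    if start < 0:
        raise ValueError("element not found in genome")
    # Excision: remove the element, joining the flanking donor DNA.
    donor = genome[:start] + genome[start + len(element):]
    if not 0 <= target <= len(donor) - tsd_len:
        raise ValueError("target site out of range")
    tsd = donor[target:target + tsd_len]
    # Insertion: the element ends up flanked by two copies of the TSD.
    return donor[:target] + tsd + element + tsd + donor[target + tsd_len:]


# Hypothetical element with 5 bp terminal inverted repeats (CAGTG ... CACTG).
element = "CAGTGAAACCCTTTCACTG"
genome = "GGATCC" + element + "TTAGCATACGGA"

print(has_terminal_inverted_repeats(element, 5))
print(cut_and_paste(genome, element, target=9, tsd_len=2))
```

Because the element is moved rather than copied, the resulting genome is longer than the original only by the length of the duplicated target site.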
Helitrons are also a group of eukaryotic class II TEs. Helitrons do not follow the classical "cut and paste" mechanism. Instead, they are hypothesized to move around the genome via a rolling-circle-like mechanism. In this process, an enzyme nicks one strand of a circular DNA molecule, which separates the DNA into two single strands. The initiation protein then remains attached to the 5' phosphate on the nicked strand, exposing the 3' hydroxyl of the complementary strand. This allows a polymerase enzyme to begin replication on the un-nicked strand. Eventually the entire strand is replicated, at which point the newly synthesized DNA dissociates and is replicated in parallel with the original template strand. [12] Helitrons encode an unknown protein that is thought to have HUH endonuclease function as well as 5' to 3' helicase activity. This enzyme would make a single-stranded cut in the DNA, which explains the lack of target site duplications in Helitrons. Helitrons were also the first class of transposable elements to be discovered computationally and marked a paradigm shift in the way that whole genomes were studied. [13]
Polintons are also a group of eukaryotic class II TEs. Among the most complex known DNA transposons in eukaryotes, they are found in the genomes of protists, fungi, and animals, such as the entamoeba, soybean rust, and chicken, respectively. They contain genes with homology to viral proteins, such as polymerase and retroviral integrase, and are often found in eukaryotic genomes. However, they have no known protein functionally similar to the viral capsid or envelope proteins. They share many structural characteristics with linear plasmids, bacteriophages and adenoviruses, which replicate using protein-primed DNA polymerases. Polintons have been proposed to undergo a similar self-synthesis using their own polymerase. Polintons are 15–20 kb long and encode up to 10 individual proteins. For replication, they utilize a protein-primed DNA polymerase B, retroviral integrase, cysteine protease, and ATPase. First, during host genome replication, a single-stranded extra-chromosomal Polinton element is excised from the host DNA using the integrase, forming a racket-like structure. Second, the Polinton undergoes replication using the DNA polymerase B, with initiation started by a terminal protein, which may also be encoded in some linear plasmids. Once the double-stranded Polinton is generated, the integrase serves to insert it into the host genome. Polintons exhibit high variability between different species and may be tightly regulated, resulting in a low frequency in many genomes. [14]
As of the most recent update in 2023, 31 superfamilies of DNA transposons were recognized and annotated in Repbase, a database of repetitive DNA elements maintained by the Genetic Information Research Institute. [15] [16]
DNA transposons, like all transposons, can have a substantial impact on gene expression. A sequence of DNA may insert itself into a previously functional gene and create a mutation. Transposons can affect the genome in three distinct ways: (1) alteration of function, (2) chromosomal rearrangement, and (3) acting as a source of novel genetic material. [17] Because DNA transposons sometimes carry parts of genomic sequences with them, exon shuffling may occur. Exon shuffling is the creation of novel gene products through the new placement of two previously unrelated exons by transposition. [18] Because of their ability to alter gene expression, transposons have become an important target of research in genetic engineering.
Barbara McClintock first discovered and described DNA transposons in Zea mays [19] during the 1940s, an achievement that would earn her the Nobel Prize in 1983. She described the Ac/Ds system, in which the Ac unit (activator) was autonomous but the Ds genomic unit required the presence of the activator in order to move. This TE is one of the most visually obvious, as it caused individual maize kernels to change color from yellow to brown or spotted.
The Mariner/Tc1 transposon, found in many animals but studied in Drosophila, was first described by Jacobson and Hartl. [20] Mariner is well known for being able to excise and insert horizontally into a new organism. [21] Thousands of copies of the TE have been found interspersed in the genomes of humans and other animals.
The Hobo transposons in Drosophila have been extensively studied due to their ability to cause gonadal dysgenesis. [22] The insertion and subsequent expression of hobo-like sequences results in the loss of germ cells in the gonads of developing flies.
Bacterial transposons are especially good at facilitating horizontal gene transfer between microbes. Transposition facilitates the transfer and accumulation of antibiotic resistance genes. In bacteria, transposable elements can easily jump between the chromosomal genome and plasmids. In a 1982 study by Devaud et al., a multi-drug resistant strain of Acinetobacter was isolated and examined. Evidence pointed to the transfer of a plasmid into the bacterium, where the resistance genes were transposed into the chromosomal genome. [23]
Transposons may promote genetic diversity in many organisms. DNA transposons can drive the evolution of genomes by promoting the relocation of sections of DNA sequence, which can alter gene regulatory regions and phenotypes. [24] Transposons were discovered by Barbara McClintock, who noticed that these elements could change the color of the maize plants she was studying, providing readily visible evidence of one outcome of transposon movement. [25] Another example is the Tol2 DNA transposon in medaka fish, which is said to be responsible for their variety of pigmentation patterns. [26] These examples show that transposons can greatly influence the process of evolution by rapidly inducing changes in the genome.
All DNA transposons are inactive in the human genome. [27] Inactivated, or silenced, transposons do not result in a phenotypic outcome and do not move around in the genome. Some are inactive because they have mutations that affect their ability to move between chromosomes, while others are capable of moving but remain inactive due to epigenetic defenses, such as DNA methylation and chromatin remodeling. For example, chemical modifications of DNA can constrict certain areas of the genome such that transcription enzymes are unable to reach them. RNAi, specifically siRNA and miRNA silencing, is a naturally occurring mechanism that, in addition to regulating eukaryotic gene expression, prevents transcription of DNA transposons. Another mode of inactivation is overproduction inhibition: when transposase exceeds a threshold concentration, transposon activity decreases. [28] Because transposase can form inactive or less active monomers that decrease overall transposition activity, transposase production also falls as the copy number of those less active elements increases in the host genome.
Horizontal transfer refers to the movement of DNA information between cells of different organisms. Horizontal transfer can involve the movement of TEs from one organism into the genome of another. The insertion itself allows the TE to become an activated gene in the new host. Horizontal transfer is used by DNA transposons to prevent inactivation and complete loss of the transposon. This inactivation is termed vertical inactivation, meaning that the DNA transposon is inactive and remains as a fossil. This type of transfer is not the most common, but has been seen in the case of the wheat virulence protein ToxA, which was transferred between the different fungal pathogens Parastagonospora nodorum, Pyrenophora tritici-repentis, and Bipolaris sorokiniana. [29] Other examples include transfer between marine crustaceans, insects of different orders, and organisms of different phyla, such as humans and nematodes. [30]
Eukaryotic genomes differ in TE content. A recent study of the different superfamilies of TEs revealed striking similarities between the groups. It has been hypothesized that many of them are represented in two or more eukaryotic supergroups. This means that the divergence of the transposon superfamilies could even predate the divergence of the eukaryotic supergroups. [31]
V(D)J recombination, although not carried out by a DNA TE, is remarkably similar to transposition. V(D)J recombination is the process by which the large variation in antibody binding sites is created. In this mechanism, DNA is recombined in order to create genetic diversity. [32] Because of this, it has been hypothesized that the proteins involved, particularly Rag1 and Rag2, [33] are derived from transposable elements. [34]
There is evidence suggesting that at least 40 human DNA transposon families were active during mammalian radiation and early primate lineage. Then, there was a pause in transpositional activity during the later portion of primate radiation, with a complete halt in transposon movement in an anthropoid primate ancestor. There is no evidence of any transposable element younger than about 37 million years. [35]