A transposase is any of a class of enzymes capable of binding to the end of a transposon and catalysing its movement to another part of a genome, typically by a cut-and-paste mechanism or a replicative mechanism, in a process known as transposition. The word "transposase" was first coined by the individuals who cloned the enzyme required for transposition of the Tn3 transposon. [1] The existence of transposons was postulated in the late 1940s by Barbara McClintock, who was studying the inheritance of maize, but the actual molecular basis for transposition was described by later groups. McClintock discovered that some segments of chromosomes changed their position, jumping between different loci or from one chromosome to another. The repositioning of these transposons (which coded for color) allowed other genes for pigment to be expressed. [2] Transposition in maize causes changes in color; however, in other organisms, such as bacteria, it can cause antibiotic resistance. [2] Transposition is also important in creating genetic diversity within species and generating adaptability to changing living conditions. [3]
Transposases are classified under EC number EC 2.7.7. Genes encoding transposases are widespread in the genomes of most organisms and are the most abundant genes known. [4] During the course of human evolution, as much as 40% of the human genome has moved around via methods such as transposition of transposons. [2]
Transposase Tn5 dimerisation domain | |
---|---|
Identifiers | |
Symbol | Dimer_Tnp_Tn5 |
Pfam | PF02281 |
InterPro | IPR003201 |
SCOP2 | 1b7e / SCOPe / SUPFAM |
Transposase (Tnp) Tn5 is a member of the RNase H-like superfamily of proteins, which includes retroviral integrases. Tn5 can be found in Shewanella and Escherichia bacteria. [5] The transposon codes for resistance to kanamycin and other aminoglycoside antibiotics. [3] [6]
Tn5 and other transposases are notably inactive. Because DNA transposition events are inherently mutagenic, the low activity of transposases is necessary to reduce the risk of causing a fatal mutation in the host, which would also eliminate the transposable element. One reason Tn5 is so inactive is that its N- and C-termini lie relatively close to one another and tend to inhibit each other. This was elucidated by the characterization of several mutations that resulted in hyperactive forms of the transposase. One such mutation, L372P, affects amino acid 372 of the Tn5 transposase. This residue is normally a leucine in the middle of an alpha helix. When the leucine is replaced with a proline, the alpha helix is broken, introducing a conformational change in the C-terminal domain that separates it from the N-terminal domain enough to promote higher activity of the protein. [3] The transposition of a transposon often needs only three components: the transposon, the transposase enzyme, and the target DNA into which the transposon is inserted. [3] This is the case with Tn5, which moves by a cut-and-paste mechanism. [3]
Tn5 and most other transposases contain a DDE motif, which is the active site that catalyzes the movement of the transposon. Aspartate-97, aspartate-188, and glutamate-326 make up the active site, which is a triad of acidic residues. [7] The DDE motif is thought to coordinate divalent metal ions, most often magnesium and manganese, which are important in the catalytic reaction. [7] Because the wild-type transposase has very low activity, the DDE region is mutated so that the transposase becomes hyperactive and catalyzes the movement of the transposon. [7] The glutamate is changed to an aspartate and the two aspartates to glutamates. [7] This mutation makes the study of Tn5 possible, but some steps in the catalytic process are lost as a result. [3]
The movement of the transposon proceeds through several steps: Tnp binding, synapsis (the creation of a synaptic complex), cleavage, target capture, and strand transfer. The transposase first binds to the DNA strand and forms a clamp over the transposon end of the DNA, which is inserted into the active site. Once the transposase has bound the transposon, it forms a synaptic complex in which two transposases are bound in a cis/trans relationship with the transposon. [3]
In cleavage, the magnesium ions activate the oxygen of water molecules for nucleophilic attack. [6] This allows the water molecules to nick the 3' strands at both ends, after which a hairpin forms and separates the transposon from the donor DNA. [3] Next, the transposase moves the transposon to a suitable location. Not much is known about target capture, although there is a sequence bias that has not yet been determined. [3] After target capture, the transposase attacks the two strands of the target DNA at positions nine base pairs apart, resulting in the integration of the transposon into the target DNA. [3]
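The net outcome of the cut-and-paste steps above can be sketched on plain strings. This is a toy model, not a description of the enzyme: the function name `cut_and_paste`, the marker `[Tn]`, and the example sequences are all hypothetical, the donor break is simply closed, and host repair of the staggered nine-base-pair cut is assumed to duplicate those nine bases on either side of the inserted element.

```python
def cut_and_paste(donor, start, end, target, site, stagger=9):
    """Move donor[start:end] into `target` at position `site`.

    Assumption: the staggered cut of `stagger` bases is filled in by
    host repair, so target[site:site + stagger] ends up on both sides
    of the inserted element (a target-site duplication).
    """
    if not (0 <= start < end <= len(donor)):
        raise ValueError("transposon coordinates fall outside the donor")
    if not (0 <= site <= len(target) - stagger):
        raise ValueError("staggered cut does not fit inside the target")

    transposon = donor[start:end]
    # Simplification: the excised donor is rejoined; in a cell the donor
    # is left with a double-strand break.
    donor_after = donor[:start] + donor[end:]

    duplicated = target[site:site + stagger]
    target_after = (
        target[:site] + duplicated + transposon + duplicated
        + target[site + stagger:]
    )
    return donor_after, target_after


# Hypothetical sequences: "[Tn]" stands in for the whole transposon.
donor = "aaaa[Tn]cccc"
target = "gggATCGATCGAttt"
donor_after, target_after = cut_and_paste(donor, 4, 8, target, 3)
print(donor_after)   # aaaacccc
print(target_after)  # gggATCGATCGA[Tn]ATCGATCGAttt
```

The element leaves the donor entirely, and the nine bases at the insertion point appear once on each side of it in the target.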
As mentioned before, some steps of the process are lost because of the mutations of the DDE motif, for example when the experiment is performed in vitro and SDS heat treatment denatures the transposase. However, it is still uncertain what happens to the transposase in vivo. [3]
The study of transposase Tn5 is of general importance because of its similarities to the integrases of HIV-1 and other retroviruses. By studying Tn5, much can also be discovered about other transposases and their activities. [3]
Since 2010, Tn5 has been used in genome sequencing to fragment the DNA and append sequencing adaptors in a single enzymatic reaction, [8] reducing the time and input requirements relative to traditional next-generation sequencing library preparation. The Tn5-based strategy can simplify the library preparation protocol significantly and can even be incorporated into direct colony PCR for large numbers of bacterial isolates with no obvious coverage bias. [8] The main disadvantages are less control over fragment size than with enzymatic or mechanical fragmentation, and a bias toward high G-C content. [8] This means of library preparation is also used in the ATAC-seq technique.
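The idea of fragmenting and tagging in one step can be illustrated with a toy simulation. Everything here is an assumption made for illustration: the function `tagment`, the placeholder adaptor strings `[A]` and `[B]`, and the uniformly random cut positions. Real reactions attach adaptors in varying combinations and show the sequence bias noted above, neither of which is modelled.

```python
import random


def tagment(dna, n_cuts, adaptor_a="[A]", adaptor_b="[B]", seed=0):
    """Cut `dna` at random positions and tag every fragment in one pass.

    Toy model: cut positions are uniform and every fragment gets the
    same adaptor pair, which is a simplification of a real reaction.
    """
    rng = random.Random(seed)
    # Only internal positions can be cut, and each at most once.
    n_cuts = max(0, min(n_cuts, len(dna) - 1))
    cuts = sorted(rng.sample(range(1, len(dna)), n_cuts))
    bounds = [0] + cuts + [len(dna)]
    fragments = [dna[a:b] for a, b in zip(bounds, bounds[1:])]
    return [adaptor_a + frag + adaptor_b for frag in fragments]


dna = "ACGT" * 10
library = tagment(dna, n_cuts=4)
for molecule in library:
    print(len(molecule) - 6, molecule)  # insert length, tagged fragment
```

Because the cut positions are random, the insert lengths vary from run to run (change `seed` to see this), which mirrors the limited control over fragment size mentioned above.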
The Sleeping Beauty (SB) transposase is the recombinase that drives the Sleeping Beauty transposon system. [9] SB transposase belongs to the DD[E/D] family of transposases, which in turn belongs to a large superfamily of polynucleotidyl transferases that includes RNase H, RuvC Holliday resolvase, RAG proteins, and retroviral integrases. [10] [11] The SB system is used primarily in vertebrate animals for gene transfer, [12] including gene therapy, [13] [14] and gene discovery. [15] [16] The engineered SB100X is an enzyme that directs high levels of transposon integration. [17] [18]
The Tn7 transposon is a mobile genetic element found in many prokaryotes such as Escherichia coli (E. coli), and was first discovered as a DNA sequence in bacterial chromosomes and naturally occurring plasmids that encoded resistance to the antibiotics trimethoprim and streptomycin. [19] [20] Classified as a transposable element (transposon), the sequence can duplicate and move itself within a genome by using a self-encoded recombinase enzyme called a transposase, resulting in effects such as creating or reversing mutations and changing genome size. The Tn7 transposon has developed two mechanisms to promote its propagation among prokaryotes. [21] In the first pathway, like many other bacterial transposons, Tn7 transposes at low frequency and inserts into many different sites with little to no site selectivity. Through this pathway, Tn7 is preferentially directed into conjugable plasmids, which can be replicated and distributed between bacteria. However, Tn7 is unique in that it also transposes at high frequency into a single specific site in bacterial chromosomes called attTn7. [22] This specific sequence is an essential and highly conserved gene found in many strains of bacteria. The recombination is not deleterious to the host bacterium, because Tn7 transposes downstream of the gene after recognizing it, which propagates the transposon safely without killing the host. This sophisticated target-site selection pathway appears to have evolved to promote coexistence between the transposon and its host, as well as Tn7's successful transmission into future generations of bacteria. [21]
The Tn7 transposon is 14 kb long and codes for five enzymes. [21] The ends of the DNA sequence consist of two segments that the Tn7 transposase interacts with during recombination. The left segment (Tn7-L) is 150 bp long and the right segment (Tn7-R) is 90 bp long. Both ends of the transposon contain a series of 22 bp binding sites that the Tn7 transposase recognizes and binds to. Within the transposon are five discrete genes encoding the proteins that make up the transposition machinery. In addition, the transposon contains an integron, a DNA segment containing several cassettes of genes encoding antibiotic resistance. [21]
The Tn7 transposon codes for five proteins: TnsA, TnsB, TnsC, TnsD, and TnsE. [21] TnsA and TnsB together form the Tn7 transposase enzyme TnsAB. The enzyme specifically recognizes and binds to the ends of the transposon's DNA sequence and excises it by introducing double-stranded DNA breaks at each end. The excised sequence is then inserted into another target DNA site. Much like other characterized transposons, the mechanism for Tn7 transposition involves cleavage of the 3' ends from the donor DNA, which is carried out by the TnsB protein of the TnsAB transposase. However, Tn7 is also uniquely cleaved near the 5' ends, about 5 bp from the 5' end towards the Tn7 transposon, by the TnsA protein of TnsAB. After the insertion of the transposon into the target DNA site, the 3' ends are covalently linked to the target DNA, but 5 bp gaps are still present at the 5' ends. Repair of these gaps leads to a 5 bp duplication at the target site. The TnsC protein interacts with the transposase enzyme and the target DNA to promote the excision and insertion processes. The ability of TnsC to activate the transposase depends on its interaction with a target DNA along with the appropriate targeting protein, TnsD or TnsE. The TnsD and TnsE proteins are alternative target selectors and DNA-binding activators that promote excision and insertion of Tn7. Their ability to interact with a particular target DNA is key to the target-site selection of Tn7. The proteins TnsA, TnsB, and TnsC thus form the core machinery of Tn7: TnsA and TnsB form the transposase, while TnsC regulates the transposase's activity, communicating between the transposase and TnsD or TnsE. When the TnsE protein interacts with the TnsABC core machinery, Tn7 preferentially directs insertions into conjugable plasmids.
When the TnsD protein interacts with TnsABC, Tn7 preferentially directs insertions downstream into a single essential and highly conserved site in the bacterial chromosome. This site, attTn7, is specifically recognized by TnsD. [21]
A transposable element is a nucleic acid sequence in DNA that can change its position within a genome, sometimes creating or reversing mutations and altering the cell's genetic identity and genome size. Transposition often results in duplication of the same genetic material. In the human genome, L1 and Alu elements are two examples of transposable elements. Barbara McClintock's discovery of transposable elements earned her a Nobel Prize in 1983. Transposable elements are becoming increasingly relevant to personalized medicine and are gaining attention in data analytics, given the difficulty of analysis in very high-dimensional spaces.
Retrotransposons are a type of transposon that copy and paste themselves into different genomic locations through an RNA transposition intermediate, which is converted back into DNA by reverse transcription.
P elements are transposable elements that were discovered in Drosophila as the causative agents of a genetic syndrome called hybrid dysgenesis. The transposon is responsible for the P trait of the P element and is found only in wild flies. P elements are also found in many other eukaryotes.
An insertion element (also called an insertion sequence, IS) is a short DNA sequence that acts as a simple transposable element. Insertion sequences have two major characteristics: they are small relative to other transposable elements, and they code only for proteins implicated in the transposition activity. These proteins are usually the transposase, which catalyses the enzymatic reaction allowing the IS to move, and one regulatory protein, which either stimulates or inhibits the transposition activity. The coding region in an insertion sequence is usually flanked by inverted repeats. For example, the well-known IS911 is flanked by two 36 bp inverted repeat extremities, and its coding region has two partially overlapping genes, orfA and orfAB, coding for the transposase (OrfAB) and a regulatory protein (OrfA). A particular insertion sequence may be named according to the form ISn, where n is a number; this is not the only naming scheme used, however. Although insertion sequences are usually discussed in the context of prokaryotic genomes, certain eukaryotic DNA sequences belonging to the family of Tc1/mariner transposable elements may be considered to be insertion sequences.
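What "flanked by inverted repeats" means can be checked mechanically: one end of the element reads as the reverse complement of the other. The sketch below measures the longest perfect terminal inverted repeat of a sequence. The helper names and the example element are hypothetical (this is not the IS911 sequence), and natural inverted repeats are often imperfect, which this check does not tolerate.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(element, max_len=50):
    """Length of the longest perfect inverted repeat shared by both ends.

    The first n bases of the reverse complement are the reverse
    complement of the last n bases, so a plain prefix comparison
    compares the left end with the right end.
    """
    rc = reverse_complement(element)
    limit = min(max_len, len(element) // 2)
    n = 0
    while n < limit and element[n] == rc[n]:
        n += 1
    return n


# Hypothetical element: an 8-base left end, a short stand-in coding
# region, and the reverse complement of the left end as the right end.
left_end = "GGTTCAAC"
coding = "ATGAAACCC"
element = left_end + coding + reverse_complement(left_end)
print(element)
print(terminal_inverted_repeat_length(element))  # 8
```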
Tn10 is a transposable element, which is a sequence of DNA that is capable of mediating its own movement from one position in the DNA of the host organism to another. There are a number of different transposition mechanisms in nature, but Tn10 uses the non-replicative cut-and-paste mechanism. The transposase protein recognizes the ends of the element and cuts it from the original locus. The protein-DNA complex then diffuses away from the donor site until random collisions bring it into contact with a new target site, where it is integrated. To accomplish this reaction, the 50 kDa transposase protein must break four DNA strands to free the transposon from the donor site and perform two strand exchange reactions to integrate the element at the target site. This leaves two strands unjoined at the target site, which the host DNA repair proteins then join. Target site selection is essentially random, but there is a preference for the sequence 5'-GCTNAGC-3'. The 6-9 base pairs that flank the sequence also influence selection of the insertion site.
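Candidate sites matching the preferred sequence 5'-GCTNAGC-3' can be located with a regular expression, treating N as any base. This is a minimal sketch: `find_sites` and the example sequence are hypothetical, and a match only marks a preferred sequence, not a predicted insertion, since the flanking bases also influence site choice. Because GCTNAGC reads the same on the complementary strand, scanning one strand is enough.

```python
import re

# Only the symbols needed for this consensus; N matches any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_sites(seq, consensus="GCTNAGC"):
    """Return 0-based start positions of every match to `consensus`."""
    pattern = "".join(IUPAC[base] for base in consensus)
    # A lookahead consumes no characters, so overlapping matches
    # are reported as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", seq.upper())]


# Hypothetical sequence with two matching sites and one near miss.
seq = "ttGCTAAGCaaGCTGAGCttGCTAGC"
print(find_sites(seq))  # [2, 11]
```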
Mobile genetic elements (MGEs), sometimes called selfish genetic elements, are a type of genetic material that can move around within a genome or be transferred from one species or replicon to another. MGEs are found in all organisms. In humans, approximately 50% of the genome is thought to consist of MGEs. MGEs play a distinct role in evolution. Gene duplication events can happen through the action of MGEs. MGEs can also cause mutations in protein-coding regions, which alter protein function, and can rearrange genes in the host genome, generating variation. These mechanisms can increase fitness by providing new or additional functions. One example in an evolutionary context is that virulence factors and antibiotic resistance genes carried by MGEs can be transferred to neighboring bacteria. However, MGEs can also decrease fitness by introducing disease-causing alleles or mutations. The set of MGEs in an organism is called a mobilome, which is composed of a large number of plasmids, transposons and viruses.
A composite transposon is similar in function to simple transposons and insertion sequence (IS) elements in that it has protein coding DNA segments flanked by inverted, repeated sequences that can be recognized by transposase enzymes. A composite transposon, however, is flanked by two separate IS elements which may or may not be exact replicas. Instead of each IS element moving separately, the entire length of DNA spanning from one IS element to the other is transposed as one complete unit. Composite transposons will also often carry one or more genes conferring antibiotic resistance.
The Tn3 transposon is a 4957 base pair mobile genetic element, found in prokaryotes. It encodes three proteins:
In the fields of bioinformatics and computational biology, genome survey sequences (GSS) are nucleotide sequences similar to expressed sequence tags (ESTs); the only difference is that most of them are genomic in origin, rather than derived from mRNA.
Transposon mutagenesis, or transposition mutagenesis, is a biological process that allows genes to be transferred to a host organism's chromosome, interrupting or modifying the function of an extant gene on the chromosome and causing mutation. Transposon mutagenesis is much more effective than chemical mutagenesis, with a higher mutation frequency and a lower chance of killing the organism. Other advantages include being able to induce single hit mutations, being able to incorporate selectable markers in strain construction, and being able to recover genes after mutagenesis. Disadvantages include the low frequency of transposition in living systems, and the inaccuracy of most transposition systems.
A knockout rat is a genetically engineered rat with a single gene turned off through a targeted mutation used for academic and pharmaceutical research. Knockout rats can mimic human diseases and are important tools for studying gene function and for drug discovery and development. The production of knockout rats was not economically or technically feasible until 2008.
Helitrons are one of the three groups of eukaryotic class 2 transposable elements (TEs) so far described. They are the eukaryotic rolling-circle transposable elements which are hypothesized to transpose by a rolling circle replication mechanism via a single-stranded DNA intermediate. They were first discovered in plants and in the nematode Caenorhabditis elegans, and now they have been identified in a diverse range of species, from protists to mammals. Helitrons make up a substantial fraction of many genomes where non-autonomous elements frequently outnumber the putative autonomous partner. Helitrons seem to have a major role in the evolution of host genomes. They frequently capture diverse host genes, some of which can evolve into novel host genes or become essential for Helitron transposition.
Transposons are semi-parasitic DNA sequences which can replicate and spread through the host's genome. They can be harnessed as a genetic tool for analysis of gene and protein function. The use of transposons is well-developed in Drosophila and in Thale cress and bacteria such as Escherichia coli.
A conserved non-coding sequence (CNS) is a DNA sequence of noncoding DNA that is evolutionarily conserved. These sequences are of interest for their potential to regulate gene production.
The Sleeping Beauty transposon system is a synthetic DNA transposon designed to introduce precisely defined DNA sequences into the chromosomes of vertebrate animals in order to introduce new traits and to discover new genes and their functions. It is a Tc1/mariner-type system, with the transposase resurrected from multiple inactive fish sequences.
The PiggyBac (PB) transposon is a mobile genetic element that efficiently transposes between vectors and chromosomes via a "cut and paste" mechanism. During transposition, the PB transposase recognizes transposon-specific inverted terminal repeat sequences (ITRs) located on both ends of the transposon vector, efficiently moves the contents from the original sites, and integrates them into TTAA chromosomal sites. The powerful activity of the PiggyBac transposon system enables genes of interest between the two ITRs in the PB vector to be easily mobilized into target genomes. The TTAA-specific transposon piggyBac is rapidly becoming a highly useful transposon for genetic engineering of a wide variety of species, particularly insects. It was discovered in 1989 by Malcolm Fraser at the University of Notre Dame.
The Ac/Ds transposable controlling elements were the first transposable element system recognized in maize. The Ac (Activator) element is autonomous, whereas the Ds (Dissociation) element requires an Activator element to transpose. Ac was initially discovered as enabling a Ds element to break chromosomes. Both Ac and Ds can also insert into genes, causing mutants that may revert to normal on excision of the element. The phenotypic consequences of Ac/Ds transposition include mosaic colors in the kernels and leaves of maize.
Transposition is the process by which a specific genetic sequence, known as a transposon, is moved from one location of the genome to another. Simple, or conservative, transposition is a non-replicative mode of transposition. That is, in conservative transposition the transposon is completely removed from the genome and reintegrated into a new, non-homologous locus, and the same genetic sequence is conserved throughout the entire process. The site at which the transposon is reintegrated into the genome is called the target site. A target site can be in the same chromosome as the transposon or in a different chromosome. Conservative transposition uses the "cut-and-paste" mechanism driven by the catalytic activity of the enzyme transposase. Transposase acts like DNA scissors; it is an enzyme that cuts through double-stranded DNA to remove the transposon, then transfers and pastes it into a target site.
DNA transposons are DNA sequences, sometimes referred to as "jumping genes", that can move and integrate into different locations within the genome. They are class II transposable elements (TEs) that move through a DNA intermediate, as opposed to class I TEs, retrotransposons, that move through an RNA intermediate. DNA transposons can move in the DNA of an organism via a single- or double-stranded DNA intermediate. DNA transposons have been found in both prokaryotic and eukaryotic organisms. They can make up a significant portion of an organism's genome, particularly in eukaryotes. In prokaryotes, TEs can facilitate the horizontal transfer of antibiotic resistance or other genes associated with virulence. After replicating and propagating in a host, all transposon copies become inactivated and are lost unless the transposon passes to another genome and starts a new life cycle through horizontal transfer. DNA transposons do not insert themselves randomly into the genome, but rather show a preference for specific sites.
Tc1/mariner is a class and superfamily of interspersed-repeat DNA transposons. The elements of this class are found in all animals, including humans. They can also be found in protists and bacteria.