Group II intron

Group II catalytic intron, D5
Figure: Full secondary structure of a group II intron (IntronGroupII.jpg)
Identifiers
Symbol: Intron_gpII
Rfam: RF00029
Other data
PDB structures: PDBe 6cih
Extra information: Entry contains splicing domain V (D5) and some consensus 3' to it.
Group II catalytic intron, D1-D4
Identifiers
Symbol: group-II-D1D4
Rfam: CL00102
Other data
PDB structures: PDBe 4fb0
Extra information: Entry contains D1-D4, parts 5' to D5.

Group II introns are a large class of self-catalytic ribozymes and mobile genetic elements found within the genes of all three domains of life. Ribozyme activity (e.g., self-splicing) can occur under high-salt conditions in vitro. However, assistance from proteins is required for in vivo splicing. [1] In contrast to group I introns, intron excision occurs in the absence of GTP and involves the formation of a lariat, with an A-residue branchpoint strongly resembling that found in lariats formed during splicing of nuclear pre-mRNA. It is hypothesized that pre-mRNA splicing (see spliceosome) may have evolved from group II introns, because of the similar catalytic mechanism and the structural similarity of the group II domain V substructure to the extended U6/U2 snRNA structure. [2] [3] Finally, their ability to insert site-specifically into DNA has been exploited as a tool for biotechnology. [4] For example, group II introns can be modified to make site-specific genome insertions and deliver cargo DNA such as reporter genes or lox sites. [5]


Structure and catalysis

Figure: The Domain V substructure that is shared between group II introns and U6 spliceosomal RNA (Domain-V-secondary-structure.svg).

The secondary structure of group II introns is characterized by six typical stem-loop structures, also called domains I to VI (DI to DVI, or D1 to D6). The domains radiate from a central core that brings the 5' and 3' splice junctions into close proximity. The proximal helices of the six domains are connected by a few nucleotides in the central region (linker or joiner sequences). Because of its enormous size, domain I is divided further into subdomains a, b, c, and d. Group II introns are further divided into subgroups IIA, IIB and IIC on the basis of sequence differences, the varying distance of the bulged adenosine in domain VI (the prospective branch point forming the lariat) from the 3' splice site, and the inclusion or omission of structural elements such as a coordination loop in domain I, which is present in IIB and IIC introns but not IIA. [1] Group II introns also form a very complex RNA tertiary structure.

Group II introns possess very few conserved nucleotides, and the nucleotides important for catalytic function are spread over the complete intron structure. The few strictly conserved primary sequences are the consensus at the 5' and 3' splice sites (...↓GUGYG&... and ...AY↓..., with the Y representing a pyrimidine), some of the nucleotides of the central core (joiner sequences), a relatively high number of nucleotides of DV and some short sequence stretches of DI. The unpaired adenosine in DVI (marked by an asterisk in the figure and located 7 or 8 nt away from the 3' splice site) is also conserved and plays a central role in the splicing process. The 2' hydroxyl of the bulged adenosine attacks the 5' splice site, followed by nucleophilic attack on the 3' splice site by the 3' OH of the upstream exon. This results in a branched intron lariat connected by a 2'-5' phosphodiester linkage at the DVI adenosine.
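The boundary consensus and branch-point position described above can be illustrated with a short Python sketch. It is a toy check under stated assumptions, not a validated annotation tool: the function name `check_intron` is hypothetical, the input is assumed to be the intron alone (from the first to the last intron nucleotide), and the "7 or 8 nt" distance is counted from the 3' end so that offset 1 is the last intron nucleotide. Real introns deviate from the consensus, and the branch adenosine must also be unpaired within DVI, which this sketch does not test.

```python
import re

# Consensus motifs from the text; Y (pyrimidine) is C or U.
FIVE_PRIME = re.compile(r"^GUG[CU]G")   # intron begins with GUGYG
THREE_PRIME = re.compile(r"A[CU]$")     # intron ends with AY

def check_intron(intron: str) -> dict:
    """Compare an intron sequence with the boundary consensus (hypothetical helper)."""
    # Accept DNA or RNA input by normalising to upper-case RNA.
    seq = intron.upper().replace("T", "U")
    # Assumed convention: offset d means the d-th nucleotide counted from the 3' end.
    offsets = [d for d in (7, 8) if len(seq) >= d and seq[-d] == "A"]
    return {
        "five_prime_ok": bool(FIVE_PRIME.match(seq)),
        "three_prime_ok": bool(THREE_PRIME.search(seq)),
        "branch_A_offsets": offsets,
    }

if __name__ == "__main__":
    # Made-up 17-nt sequence used only to exercise the function.
    print(check_intron("GUGCGCCCCAGGGGGAU"))
```

A sequence-only check like this can flag candidates, but assigning an intron to group II in practice relies on the conserved secondary structure as well.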

Protein machinery is required for splicing in vivo. Long-range intron-intron and intron-exon interactions are important for splice site positioning, as are a number of tertiary contacts between motifs, including kissing-loop and tetraloop-receptor interactions. In 2005, A. De Lencastre et al. found that during splicing of group II introns, all reactants are preorganized before splicing is initiated. The branch site, both exons, the catalytically essential regions of DV and J2/3, and ε−ε' are in close proximity before the first step of splicing occurs. In addition to the bulge and AGC triad regions of DV, the J2/3 linker region, the ε−ε' nucleotides and the coordination loop in DI are crucial for the architecture and function of the active site. [6]

The first crystal structure of a group II intron was solved in 2008 for the Oceanobacillus iheyensis group IIC catalytic intron, and was followed by that of the Pylaiella littoralis (P.li.LSUI2) group IIB intron in 2014. Attempts have been made to model the tertiary structure of other group II introns, such as the ai5γ group IIB intron, using a combination of programs for homology mapping onto known structures and de novo modeling of previously unresolved regions. [7] Group IIC introns are characterized by a CGC catalytic triad, whereas group IIA and group IIB introns have an AGC catalytic triad, which more closely resembles the catalytic triad of the spliceosome. Group IIC introns are also believed to be smaller, more reactive and more ancient. The first step of splicing in group IIC introns is carried out by water, which yields a linear intron instead of a lariat. [8]
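The relationship between the catalytic triad and the subgroups stated above can be written as a small lookup. This Python sketch is only a restatement of that one sentence; the names `TRIAD_TO_SUBGROUPS` and `subgroups_for_triad` are hypothetical, and a triad alone is not sufficient to classify a real intron, since subgroup assignment also depends on the structural features described earlier.

```python
# Mapping taken from the text: CGC in group IIC, AGC in groups IIA and IIB.
TRIAD_TO_SUBGROUPS = {
    "AGC": ("IIA", "IIB"),
    "CGC": ("IIC",),
}

def subgroups_for_triad(triad: str) -> tuple:
    """Return the subgroups compatible with a DV catalytic triad (hypothetical helper)."""
    # Normalise case and accept DNA-style input.
    key = triad.upper().replace("T", "U")
    # An empty tuple means the triad is not one of the two described in the text.
    return TRIAD_TO_SUBGROUPS.get(key, ())

if __name__ == "__main__":
    print(subgroups_for_triad("CGC"))
    print(subgroups_for_triad("AGC"))
```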

Distribution and phylogeny

Domain X
Identifiers
Symbol: Domain_X
Pfam: PF01348
Pfam clan: CL0359
InterPro: IPR024937

Group II intron, maturase-specific
Identifiers
Symbol: GIIM
Pfam: PF08388
Pfam clan: CL0359
InterPro: IPR013597

Group II introns are found in rRNA, tRNA, and mRNA of organelles (chloroplasts and mitochondria) in fungi, plants, and protists, and also in mRNA in bacteria. The first intron to be identified as distinct from group I was the ai5γ group IIB intron, which was isolated in 1986 from a pre-mRNA transcript of the oxi3 mitochondrial gene of Saccharomyces cerevisiae. [9]

A subset of group II introns encode essential splicing proteins, known as intron-encoded proteins or IEPs, in intronic ORFs. As a result, these introns can be up to 3 kb long. Splicing occurs in almost identical fashion to nuclear pre-mRNA splicing: both proceed through two transesterification steps, and both use magnesium ions to stabilize the leaving group in each step. This has led some to theorize a phylogenetic link between group II introns and the nuclear spliceosome. Further evidence for this link includes structural similarity between the U2/U6 junction of spliceosomal RNA and domain V of group II introns, which contains the catalytic AGC triad and much of the heart of the active site, as well as parity between conserved 5' and 3' end sequences. [10]
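The two-step pathway can be summarised as bookkeeping on strings. The following Python sketch only tracks which pieces end up joined; it models no chemistry, metal ions or folding. The names `SplicingProducts` and `splice`, the default `branch_offset` of 8 (counted from the intron's 3' end) and the `hydrolytic` flag (for the water-initiated first step that the article describes for group IIC introns) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SplicingProducts:
    ligated_exons: str            # 5' exon joined to 3' exon
    intron: str                   # excised intron sequence
    branch_index: Optional[int]   # 0-based index of the branch A; None for a linear intron

def splice(exon5: str, intron: str, exon3: str,
           branch_offset: int = 8, hydrolytic: bool = False) -> SplicingProducts:
    """Toy model of group II splicing (hypothetical helper, strings only)."""
    if hydrolytic:
        # Step 1 by water: the 5' splice site is cleaved and the intron stays linear.
        branch_index = None
    else:
        # Step 1 by branching: the 2' OH of the bulged A attacks the 5' splice site,
        # so the intron's 5' end becomes linked to that adenosine (lariat).
        branch_index = len(intron) - branch_offset
        if branch_index < 0 or intron[branch_index] != "A":
            raise ValueError("no adenosine at the assumed branch position")
    # Step 2: the 3' OH of the upstream exon attacks the 3' splice site,
    # ligating the exons and releasing the intron.
    return SplicingProducts(exon5 + exon3, intron, branch_index)

if __name__ == "__main__":
    # Made-up sequences used only to exercise the function.
    print(splice("AAGG", "GUGCGCCCCAGGGGGAU", "CCUU"))
```

In both branches the ligated exons are identical; only the form of the released intron differs, which is why the flag changes `branch_index` and nothing else.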

Many of these IEPs, including LtrA, share a reverse transcriptase (RT) domain and a "Domain X". [11] Maturase K (MatK), found in plant chloroplasts, is a protein somewhat similar to these intron-encoded proteins. It is required for in vivo splicing of group II introns, and can be encoded in chloroplast introns or in the nuclear genome. Its RT domain is no longer intact. [11]

Protein domain

Group II IEPs share a related conserved domain, known as either "Domain X" in organelles or "GIIM" in bacteria, that is not found in other retroelements. [12] [13] Domain X is essential for splicing in yeast mitochondria. [14] This domain may be responsible for recognizing and binding to intron RNA [13] or DNA. [15]


References

  1. Lambowitz AM, Zimmerly S (August 2011). "Group II introns: mobile ribozymes that invade DNA". Cold Spring Harbor Perspectives in Biology. 3 (8): a003616. doi:10.1101/cshperspect.a003616. PMC 3140690. PMID 20463000.
  2. Seetharaman M, Eldho NV, Padgett RA, Dayie KT (February 2006). "Structure of a self-splicing group II intron catalytic effector domain 5: parallels with spliceosomal U6 RNA". RNA. 12 (2): 235–47. doi:10.1261/rna.2237806. PMC 1370903. PMID 16428604.
  3. Valadkhan S (May–Jun 2010). "Role of the snRNAs in spliceosomal active site". RNA Biology. 7 (3): 345–53. doi:10.4161/rna.7.3.12089. PMID 20458185.
  4. Karberg M, Guo H, Zhong J, Coon R, Perutka J, Lambowitz AM (December 2001). "Group II introns as controllable gene targeting vectors for genetic manipulation of bacteria". Nat Biotechnol. 19 (12): 1162–7. doi:10.1038/nbt1201-1162. PMID 11731786. S2CID 18669663.
  5. Cerisy T, Rostain W, Chhun A, Boutard M, Salanoubat M, Tolonen AC (December 2019). "A Targetron-Recombinase System for Large-Scale Genome Engineering of Clostridia". mSphere. 4 (6): e00710-19. doi:10.1128/mSphere.00710-19. PMC 6908422. PMID 31826971.
  6. de Lencastre A, Hamill S, Pyle AM (July 2005). "A single active-site region for a group II intron". Nature Structural & Molecular Biology. 12 (7): 626–7. doi:10.1038/nsmb957. PMID 15980867. S2CID 27639877.
  7. Somarowthu S, Legiewicz M, Keating KS, Pyle AM (February 2014). "Visualizing the ai5γ group IIB intron". Nucleic Acids Research. 42 (3): 1947–58. doi:10.1093/nar/gkt1051. PMC 3919574. PMID 24203709.
  8. Keating KS, Toor N, Perlman PS, Pyle AM (January 2010). "A structural analysis of the group II intron active site and implications for the spliceosome". RNA. 16 (1): 1–9. doi:10.1261/rna.1791310. PMC 2802019. PMID 19948765.
  9. Peebles CL, Perlman PS, Mecklenburg KL, Petrillo ML, Tabor JH, Jarrell KA, Cheng HL (January 1986). "A self-splicing RNA excises an intron lariat". Cell. 44 (2): 213–23. doi:10.1016/0092-8674(86)90755-5. PMID 3510741. S2CID 42307152.
  10. Gordon PM, Sontheimer EJ, Piccirilli JA (February 2000). "Metal ion catalysis during the exon-ligation step of nuclear pre-mRNA splicing: extending the parallels between the spliceosome and group II introns". RNA. 6 (2): 199–205. doi:10.1017/S1355838200992069. PMC 1369906. PMID 10688359.
  11. Ahlert D, Piepenburg K, Kudla J, Bock R (July 2006). "Evolutionary origin of a plant mitochondrial group II intron from a reverse transcriptase/maturase-encoding ancestor". Journal of Plant Research. 119 (4): 363–71. Bibcode:2006JPlR..119..363A. doi:10.1007/s10265-006-0284-0. PMID 16763758. S2CID 8277547.
  12. Mohr G, Perlman PS, Lambowitz AM (November 1993). "Evolutionary relationships among group II intron-encoded proteins and identification of a conserved domain that may be related to maturase function". Nucleic Acids Research. 21 (22): 4991–7. doi:10.1093/nar/21.22.4991. PMC 310608. PMID 8255751.
  13. Centrón D, Roy PH (May 2002). "Presence of a group II intron in a multiresistant Serratia marcescens strain that harbors three integrons and a novel gene fusion". Antimicrobial Agents and Chemotherapy. 46 (5): 1402–9. doi:10.1128/AAC.46.5.1402-1409.2002. PMC 127176. PMID 11959575.
  14. Moran JV, Mecklenburg KL, Sass P, Belcher SM, Mahnke D, Lewin A, Perlman P (June 1994). "Splicing defective mutants of the COXI gene of yeast mitochondrial DNA: initial definition of the maturase domain of the group II intron aI2". Nucleic Acids Research. 22 (11): 2057–64. doi:10.1093/nar/22.11.2057. PMC 308121. PMID 8029012.
  15. Guo H, Zimmerly S, Perlman PS, Lambowitz AM (November 1997). "Group II intron endonucleases use both RNA and protein subunits for recognition of specific sequences in double-stranded DNA". The EMBO Journal. 16 (22): 6835–48. doi:10.1093/emboj/16.22.6835. PMC 1170287. PMID 9362497.
