A spliceosome is a large ribonucleoprotein (RNP) complex found primarily within the nucleus of eukaryotic cells. The spliceosome is assembled from small nuclear RNAs (snRNA) and numerous proteins. Small nuclear RNA (snRNA) molecules bind to specific proteins to form a small nuclear ribonucleoprotein complex (snRNP, pronounced "snurps"), which in turn combines with other snRNPs to form a large ribonucleoprotein complex called a spliceosome. The spliceosome removes introns from a transcribed pre-mRNA, a type of primary transcript. This process is generally referred to as splicing. [1] An analogy is a film editor, who selectively cuts out irrelevant or incorrect material (equivalent to the introns) from the initial film and sends the cleaned-up version to the director for the final cut.[ citation needed ]
However, sometimes the RNA within the intron acts as a ribozyme, splicing itself without the use of a spliceosome or protein enzymes.[ citation needed ]
In 1977, work by the Sharp and Roberts labs revealed that genes of higher organisms are "split", or present in several distinct segments along the DNA molecule. [2] [3] The coding regions of the gene are separated by non-coding DNA that is not involved in protein expression. The split gene structure was found when adenoviral mRNAs were hybridized to endonuclease cleavage fragments of single-stranded viral DNA. [2] It was observed that the mRNAs of the mRNA-DNA hybrids contained 5' and 3' tails of non-hydrogen-bonded regions. When larger fragments of viral DNA were used, forked structures of looped-out DNA were observed when hybridized to the viral mRNAs. It was realised that the looped-out regions, the introns, are excised from the precursor mRNAs in a process Sharp named "splicing". The split gene structure was subsequently found to be common to most eukaryotic genes. Phillip Sharp and Richard J. Roberts were awarded the 1993 Nobel Prize in Physiology or Medicine for the discovery of introns and the splicing process.
Each spliceosome is composed of five small nuclear RNAs (snRNA) and a range of associated protein factors. When these small RNAs are combined with the protein factors, they make RNA-protein complexes called snRNPs (small nuclear ribonucleoproteins, pronounced "snurps"). The snRNAs that make up the major spliceosome are named U1, U2, U4, U5, and U6, so-called because they are rich in uridine, and participate in several RNA-RNA and RNA-protein interactions. [1]
The assembly of the spliceosome occurs on each pre-mRNA (also known as heterogeneous nuclear RNA, hnRNA) at each exon:intron junction. The pre-mRNA introns contain specific sequence elements that are recognized and utilized during spliceosome assembly. These include the 5' end splice site, the branch point sequence, the polypyrimidine tract, and the 3' end splice site. The spliceosome catalyzes the removal of introns and the ligation of the flanking exons.[ citation needed ]
Introns typically have a GU nucleotide sequence at the 5' end splice site, and an AG at the 3' end splice site. The 3' splice site can be further defined by a variable length of polypyrimidines, called the polypyrimidine tract (PPT), which serves the dual function of recruiting factors to the 3' splice site and possibly recruiting factors to the branch point sequence (BPS). The BPS contains the conserved adenosine required for the first step of splicing.[ citation needed ]
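The sequence signals described above can be illustrated with a minimal Python sketch. It only checks for the GU and AG dinucleotides at the intron ends and measures the longest run of pyrimidines (C or U); the function names are hypothetical, and accepting DNA-style input by converting T to U is a convenience assumption. Real splice-site recognition depends on much more context than this.

```python
import re


def normalize_rna(seq: str) -> str:
    """Upper-case the sequence and convert DNA-style T to U (assumption)."""
    return seq.upper().replace("T", "U")


def has_canonical_splice_sites(intron: str) -> bool:
    """True if the intron starts with GU (5' end) and ends with AG (3' end)."""
    seq = normalize_rna(intron)
    return len(seq) >= 4 and seq.startswith("GU") and seq.endswith("AG")


def longest_pyrimidine_run(intron: str) -> int:
    """Length of the longest uninterrupted stretch of C/U in the sequence."""
    runs = re.findall(r"[CU]+", normalize_rna(intron))
    return max((len(run) for run in runs), default=0)


# Toy intron used only for illustration, not a real gene sequence.
toy_intron = "GUAAGUUUUUCCCUAG"
print(has_canonical_splice_sites(toy_intron))
print(longest_pyrimidine_run(toy_intron))
```

The pyrimidine run is computed over the whole intron for simplicity; restricting the search to a window upstream of the 3' splice site would be closer to how the tract is described above.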
Many proteins exhibit a zinc-binding motif, which underscores the importance of zinc in the splicing mechanism. [4] [5] [6] The first molecular-resolution reconstruction of U4/U6.U5 triple small nuclear ribonucleoprotein (tri-snRNP) complex was reported in 2016. [7]
Cryo-EM has been applied extensively by Shi et al. to elucidate the near-/atomic structure of spliceosome in both yeast [9] and humans. [10] The molecular framework of spliceosome at near-atomic-resolution demonstrates Spp42 component of U5 snRNP forms a central scaffold and anchors the catalytic center in yeast. The atomic structure of the human spliceosome illustrates the step II component Slu7 adopts an extended structure, poised for selection of the 3'-splice site. All five metals (assigned as Mg2+) in the yeast complex are preserved in the human complex.[ citation needed ]
Alternative splicing (the re-combination of different exons) is a major source of genetic diversity in eukaryotes. Splice variants have been used to account for the relatively small number of protein coding genes in the human genome, currently estimated at around 20,000. One particular Drosophila gene, Dscam, has been speculated to be alternatively spliced into 38,000 different mRNAs, assuming all of its exons can splice independently of each other. [11]
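The figure of roughly 38,000 Dscam mRNAs is a product of independent choices. The sketch below assumes the commonly cited counts of mutually exclusive alternatives for the four variable exon clusters (12, 48, 33 and 2 for exons 4, 6, 9 and 17); these counts are not given in the text above, and the names used are illustrative.

```python
from math import prod

# Assumed numbers of mutually exclusive alternatives per variable exon cluster.
DSCAM_EXON_ALTERNATIVES = {
    "exon 4": 12,
    "exon 6": 48,
    "exon 9": 33,
    "exon 17": 2,
}


def count_isoforms(alternatives: dict[str, int]) -> int:
    """Number of distinct mRNAs if one alternative per cluster is chosen independently."""
    return prod(alternatives.values())


print(count_isoforms(DSCAM_EXON_ALTERNATIVES))  # 12 * 48 * 33 * 2 = 38016
```

The result is an upper bound: it holds only under the stated assumption that every cluster is spliced independently of the others.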
Pre-mRNA splicing factors were originally found to be concentrated in nuclear bodies known as nuclear speckles. [12] It was originally postulated that nuclear speckles are either sites of mRNA splicing or storage sites of mRNA splicing factors. It is now understood that nuclear speckles help concentrate splicing factors near genes that are physically located close to them. Genes located farther from speckles can still be transcribed and spliced, but their splicing is less efficient compared to those closer to speckles. [13] RNA splicing is a biochemical reaction, and like all biochemical reactions, its rate depends on the concentration of enzymes and substrates. In this case, the enzymes are the spliceosomes, and the substrates are the pre-mRNAs. By varying the concentration of spliceosomes and pre-mRNAs based on their proximity to nuclear speckles, cells could potentially regulate the efficiency of splicing. [13]
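The concentration argument can be made concrete with a deliberately simplified rate law. The sketch below assumes a second-order mass-action form, rate = k × [spliceosome] × [pre-mRNA]; this form, the rate constant and the concentration values are illustrative assumptions, not measured kinetics of splicing.

```python
def splicing_rate(k: float, spliceosome_conc: float, pre_mrna_conc: float) -> float:
    """Toy mass-action rate: proportional to both enzyme and substrate concentration."""
    if min(k, spliceosome_conc, pre_mrna_conc) < 0:
        raise ValueError("rate constant and concentrations must be non-negative")
    return k * spliceosome_conc * pre_mrna_conc


# Hypothetical values: a gene near a speckle sees a higher local
# spliceosome concentration than a gene far from one.
near_speckle = splicing_rate(k=0.5, spliceosome_conc=4.0, pre_mrna_conc=3.0)
far_from_speckle = splicing_rate(k=0.5, spliceosome_conc=2.0, pre_mrna_conc=3.0)
print(near_speckle / far_from_speckle)
```

Under this toy model, doubling the local spliceosome concentration doubles the rate, which is the qualitative point made above about proximity to speckles.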
The model for formation of the spliceosome active site involves an ordered, stepwise assembly of discrete snRNP particles on the pre-mRNA substrate. The first recognition of pre-mRNAs involves U1 snRNP binding to the 5' end splice site of the pre-mRNA and other non-snRNP associated factors to form the commitment complex, or early (E) complex in mammals. [14] [15] The commitment complex is an ATP-independent complex that commits the pre-mRNA to the splicing pathway. [16] U2 snRNP is recruited to the branch region through interactions with the E complex component U2AF (U2 snRNP auxiliary factor) and possibly U1 snRNP. In an ATP-dependent reaction, U2 snRNP becomes tightly associated with the branch point sequence (BPS) to form complex A. A duplex formed between U2 snRNP and the pre-mRNA branch region bulges out the branch adenosine specifying it as the nucleophile for the first transesterification. [17]
The presence of a pseudouridine residue in U2 snRNA, nearly opposite the branch site, results in an altered conformation of the RNA-RNA duplex upon U2 snRNP binding. Specifically, the altered structure of the duplex induced by the pseudouridine places the 2' OH of the bulged adenosine in a favorable position for the first step of splicing. [18] The U4/U6.U5 tri-snRNP (see Figure 1) is recruited to the assembling spliceosome to form complex B, and following several rearrangements, complex C is activated for catalysis. [19] [20] It is unclear how the tri-snRNP is recruited to complex A, but this process may be mediated through protein-protein interactions and/or base-pairing interactions between U2 snRNA and U6 snRNA.[ citation needed ]
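The stepwise assembly described above (E, then A, then B, then C) can be summarised as an ordered data structure. This is a minimal sketch for illustration; the class and function names are hypothetical, and the notes only restate what the text says about each complex.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AssemblyStage:
    name: str       # complex name used in the text
    joined_by: str  # component whose arrival defines the stage
    note: str       # short summary taken from the description above


PATHWAY = (
    AssemblyStage("E", "U1 snRNP", "binds the 5' splice site; ATP-independent commitment complex"),
    AssemblyStage("A", "U2 snRNP", "binds the branch point sequence in an ATP-dependent reaction"),
    AssemblyStage("B", "tri-snRNP", "U4, U5 and U6 are recruited together"),
    AssemblyStage("C", "rearrangements", "activated for catalysis after several rearrangements"),
)


def next_stage(name: str) -> Optional[str]:
    """Return the complex that follows `name`, or None after the last one."""
    names = [stage.name for stage in PATHWAY]
    index = names.index(name)  # raises ValueError for an unknown complex
    return names[index + 1] if index + 1 < len(names) else None


for stage in PATHWAY:
    print(f"Complex {stage.name}: {stage.joined_by} ({stage.note})")
```

A tuple of frozen records keeps the order explicit and prevents accidental mutation; it does not attempt to model the rearrangements between stages.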
The U5 snRNP interacts with sequences at the 5' and 3' splice sites via the invariant loop of U5 snRNA [21] and U5 protein components interact with the 3' splice site region. [22]
Upon recruitment of the tri-snRNP, several RNA-RNA rearrangements precede the first catalytic step, and further rearrangements occur in the catalytically active spliceosome. Several of the RNA-RNA interactions are mutually exclusive; however, it is not known what triggers these interactions, nor the order of these rearrangements. The first rearrangement is probably the displacement of U1 snRNP from the 5' splice site and formation of a U6 snRNA interaction. It is known that U1 snRNP is only weakly associated with fully formed spliceosomes, [23] and U1 snRNP is inhibitory to the formation of a U6-5' splice site interaction on a model substrate oligonucleotide containing a short 5' exon and 5' splice site. [24] Binding of U2 snRNP to the branch point sequence (BPS) is one example of an RNA-RNA interaction displacing a protein-RNA interaction. Upon recruitment of U2 snRNP, the branch binding protein SF1 in the commitment complex is displaced, since binding of U2 snRNA and binding of SF1 at this site are mutually exclusive.[ citation needed ]
Within the U2 snRNA, there are other mutually exclusive rearrangements that occur between competing conformations. For example, in the active form, stem loop IIa is favored; in the inactive form a mutually exclusive interaction between the loop and a downstream sequence predominates. [20] It is unclear how U4 is displaced from U6 snRNA, although RNA helicases have been implicated in spliceosome assembly and may function to unwind U4/U6 and promote the formation of a U2/U6 snRNA interaction. The interactions of U4/U6 stem loops I and II dissociate, the freed stem loop II region of U6 folds on itself to form an intramolecular stem loop, and U4 is no longer required in further spliceosome assembly. The freed stem loop I region of U6 base pairs with U2 snRNA, forming the U2/U6 helix I. However, the helix I structure is mutually exclusive with the 3' half of an internal 5' stem loop region of U2 snRNA.[ citation needed ]
Some eukaryotes have a second spliceosome, the so-called minor spliceosome. [25] A group of less abundant snRNAs, U11, U12, U4atac, and U6atac, together with U5, are subunits of the minor spliceosome that splices a rare class of pre-mRNA introns, denoted U12-type. The minor spliceosome is located in the nucleus like its major counterpart, [26] though there are exceptions in some specialised cells including anucleate platelets [27] and the dendroplasm (dendrite cytoplasm) of neuronal cells. [28]
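The snRNA composition of the two spliceosomes can be compared with a short Python sketch. The two sets restate the snRNAs named in this article; the pairing of each minor snRNA with a major counterpart is the commonly described functional analogy and is included here as an assumption, as are the variable names.

```python
# snRNAs of the major (U2-dependent) and minor (U12-dependent) spliceosomes.
MAJOR_SNRNAS = {"U1", "U2", "U4", "U5", "U6"}
MINOR_SNRNAS = {"U11", "U12", "U4atac", "U5", "U6atac"}

# Assumed functional counterparts (major -> minor); U5 is used by both.
FUNCTIONAL_ANALOGS = {
    "U1": "U11",
    "U2": "U12",
    "U4": "U4atac",
    "U6": "U6atac",
}

shared = MAJOR_SNRNAS & MINOR_SNRNAS
minor_specific = MINOR_SNRNAS - MAJOR_SNRNAS

print("Shared:", sorted(shared))
print("Minor-specific:", sorted(minor_specific))
```

Set intersection and difference make the single shared component, U5, explicit, and every minor-specific snRNA appears as the counterpart of exactly one major snRNA in the mapping.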
RNA splicing is a process in molecular biology where a newly-made precursor messenger RNA (pre-mRNA) transcript is transformed into a mature messenger RNA (mRNA). It works by removing all the introns and splicing back together exons. For nuclear-encoded genes, splicing occurs in the nucleus either during or immediately after transcription. For those eukaryotic genes that contain introns, splicing is usually needed to create an mRNA molecule that can be translated into protein. For many eukaryotic introns, splicing occurs in a series of reactions which are catalyzed by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs). There exist self-splicing introns, that is, ribozymes that can catalyze their own excision from their parent RNA molecule. The process of transcription, splicing and translation is called gene expression, the central dogma of molecular biology.
SR proteins are a conserved family of proteins involved in RNA splicing. SR proteins are named because they contain a protein domain with long repeats of serine and arginine amino acid residues, whose standard abbreviations are "S" and "R" respectively. SR proteins are ~200-600 amino acids in length and composed of two domains, the RNA recognition motif (RRM) region and the RS domain. SR proteins are more commonly found in the nucleus than the cytoplasm, but several SR proteins are known to shuttle between the nucleus and the cytoplasm.
snRNPs, or small nuclear ribonucleoproteins, are RNA-protein complexes that combine with unmodified pre-mRNA and various other proteins to form a spliceosome, a large RNA-protein molecular complex upon which splicing of pre-mRNA occurs. The action of snRNPs is essential to the removal of introns from pre-mRNA, a critical aspect of post-transcriptional modification of RNA, occurring only in the nucleus of eukaryotic cells. Additionally, U7 snRNP is not involved in splicing at all, as U7 snRNP is responsible for processing the 3′ stem-loop of histone pre-mRNA.
Small nuclear RNA (snRNA) is a class of small RNA molecules that are found within the splicing speckles and Cajal bodies of the cell nucleus in eukaryotic cells. The length of an average snRNA is approximately 150 nucleotides. They are transcribed by either RNA polymerase II or RNA polymerase III. Their primary function is in the processing of pre-messenger RNA (hnRNA) in the nucleus. They have also been shown to aid in the regulation of transcription factors or RNA polymerase II, and maintaining the telomeres.
The minor spliceosome is a ribonucleoprotein complex that catalyses the removal (splicing) of an atypical class of spliceosomal introns (U12-type) from messenger RNAs in some clades of eukaryotes. This process is called noncanonical splicing, as opposed to U2-dependent canonical splicing. U12-type introns represent less than 1% of all introns in human cells. However, they are found in genes performing essential cellular functions.
In molecular biology, LSm proteins are a family of RNA-binding proteins found in virtually every cellular organism. LSm is a contraction of 'like Sm', because the first identified members of the LSm protein family were the Sm proteins. LSm proteins are defined by a characteristic three-dimensional structure and their assembly into rings of six or seven individual LSm protein molecules, and play a large number of various roles in mRNA processing and regulation.
The U11 snRNA is an important non-coding RNA in the minor spliceosome protein complex, which activates the alternative splicing mechanism. The minor spliceosome is associated with similar protein components as the major spliceosome. It uses U11 snRNA to recognize the 5' splice site while U12 snRNA binds to the branchpoint to recognize the 3' splice site.
U1 spliceosomal RNA is the small nuclear RNA (snRNA) component of U1 snRNP, an RNA-protein complex that combines with other snRNPs, unmodified pre-mRNA, and various other proteins to assemble a spliceosome, a large RNA-protein molecular complex upon which splicing of pre-mRNA occurs. Splicing, or the removal of introns, is a major aspect of post-transcriptional modification, and takes place only in the nucleus of eukaryotes.
U2 spliceosomal snRNAs are a species of small nuclear RNA (snRNA) molecules found in the major spliceosomal (Sm) machinery of virtually all eukaryotic organisms. In vivo, U2 snRNA along with its associated polypeptides assemble to produce the U2 small nuclear ribonucleoprotein (snRNP), an essential component of the major spliceosomal complex. The major spliceosomal-splicing pathway is occasionally referred to as U2 dependent, based on a class of Sm intron—found in mRNA primary transcripts—that are recognized exclusively by the U2 snRNP during early stages of spliceosomal assembly. In addition to U2 dependent intron recognition, U2 snRNA has been theorized to serve a catalytic role in the chemistry of pre-RNA splicing as well. Similar to ribosomal RNAs (rRNAs), Sm snRNAs must mediate both RNA:RNA and RNA:protein contacts and hence have evolved specialized, highly conserved, primary and secondary structural elements to facilitate these types of interactions.
The U4 small nuclear ribonucleic acid (U4 snRNA) is a non-coding RNA component of the major or U2-dependent spliceosome, a eukaryotic molecular machine involved in the splicing of pre-messenger RNA (pre-mRNA). It forms a duplex with U6, and with each splicing round, it is displaced from the U6 snRNA in an ATP-dependent manner, allowing U6 to re-fold and create the active site for splicing catalysis. A recycling process involving protein Brr2 releases U4 from U6, while protein Prp24 re-anneals U4 and U6. The crystal structure of a 5′ stem-loop of U4 in complex with a binding protein has been solved.
U5 snRNA is a small nuclear RNA (snRNA) that participates in RNA splicing as a component of the spliceosome. It forms the U5 snRNP by associating with several proteins including Prp8 - the largest and most conserved protein in the spliceosome, Brr2 - a helicase required for spliceosome activation, Snu114, and the 7 Sm proteins. U5 snRNA forms a coaxially-stacked series of helices that project into the active site of the spliceosome. Loop 1, which caps this series of helices, forms 4-5 base pairs with the 5'-exon during the two chemical reactions of splicing. This interaction appears to be especially important during step two of splicing, exon ligation.
U6 snRNA is the non-coding small nuclear RNA (snRNA) component of U6 snRNP, an RNA-protein complex that combines with other snRNPs, unmodified pre-mRNA, and various other proteins to assemble a spliceosome, a large RNA-protein molecular complex that catalyzes the excision of introns from pre-mRNA. Splicing, or the removal of introns, is a major aspect of post-transcriptional modification and takes place only in the nucleus of eukaryotes.
Splicing factor U2AF 65 kDa subunit is a protein that in humans is encoded by the U2AF2 gene.
Pre-mRNA-processing factor 6 is a protein that in humans is encoded by the PRPF6 gene.
U4/U6 small nuclear ribonucleoprotein Prp4 is a protein that in humans is encoded by the PRPF4 gene. The removal of introns from nuclear pre-mRNAs occurs on complexes called spliceosomes, which are made up of 4 small nuclear ribonucleoprotein (snRNP) particles and an undefined number of transiently associated splicing factors. PRPF4 is 1 of several proteins that associate with U4 and U6 snRNPs.[supplied by OMIM]
Peptidyl-prolyl cis-trans isomerase H is an enzyme that in humans is encoded by the PPIH gene.
Prp24 is a protein that takes part in pre-messenger RNA splicing and aids the binding of U6 snRNA to U4 snRNA during the formation of spliceosomes. Found in eukaryotes from yeast and other fungi to humans, Prp24 was initially discovered to be an important element of RNA splicing in 1989. In 1991, mutations in Prp24 were found to suppress mutations in U4 that resulted in cold-sensitive strains of yeast, indicating its involvement in the reformation of the U4/U6 duplex after the catalytic steps of splicing.
Prp8 refers to both the Prp8 protein and Prp8 gene. Prp8's name originates from its involvement in pre-mRNA processing. The Prp8 protein is a large, highly conserved, and unique protein that resides in the catalytic core of the spliceosome and has been found to have a central role in molecular rearrangements that occur there. Prp8 protein is a major central component of the catalytic core in the spliceosome, and the spliceosome is responsible for splicing of precursor mRNA that contains introns and exons. Unexpressed introns are removed by the spliceosome complex in order to create a more concise mRNA transcript. Splicing is just one of many different post-transcriptional modifications that mRNA must undergo before translation. Prp8 has also been hypothesized to be a cofactor in RNA catalysis.
Christine Guthrie (1945-2022) was an American yeast geneticist and American Cancer Society Research Professor of Genetics at University of California San Francisco. She showed that yeast have small nuclear RNAs (snRNAs) involved in splicing pre-messenger RNA into messenger RNA in eukaryotic cells. Guthrie cloned and sequenced the genes for yeast snRNA and established the role of base pairing between the snRNAs and their target sequences at each step in the removal of an intron. She also identified proteins that formed part of the spliceosome complex with the snRNAs. Elected to the National Academy of Sciences in 1993, Guthrie edited Guide to Yeast Genetics and Molecular Biology, an influential methods series for many years.
Kiyoshi Nagai was a Japanese structural biologist at the MRC Laboratory of Molecular Biology Cambridge, UK. He was known for his work on the mechanism of RNA splicing and structures of the spliceosome.