Splice site mutation

A splice site mutation is a genetic mutation that inserts, deletes or changes a number of nucleotides at the specific site where splicing takes place during the processing of precursor messenger RNA into mature messenger RNA. Splice site consensus sequences that drive exon recognition are located at the very termini of introns. [1] Loss of a splice site results in one or more introns remaining in the mature mRNA and may lead to the production of abnormal proteins. When a splice site mutation occurs, the mRNA transcript carries intronic sequence that normally would not be included: introns are normally removed, while exons are retained and expressed.

The mutation must occur at the specific site at which intron splicing occurs: within the non-coding sequence of a gene, directly next to an exon. The mutation can be an insertion, a deletion or a substitution. Splicing itself is directed by defined sequences, known as splice-donor and splice-acceptor sequences, which flank each exon. Mutations in these sequences may lead to retention of large segments of intronic sequence in the mRNA, or to entire exons being spliced out of the mRNA. These changes can result in production of a nonfunctional protein. [2] Each intron is bounded by splice sites that separate it from the neighboring exons. The donor site and the acceptor site signal to the spliceosome where the cuts should be made. These recognition sites are essential in the processing of mRNA. The average vertebrate gene consists of multiple small exons (average size, 137 nucleotides) separated by introns that are considerably larger. [1]
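
The role of the donor and acceptor sequences can be illustrated with a minimal sketch. It assumes the widely used GT-AG rule (most introns begin with GT and end with AG on the DNA sense strand), which is not stated in the text above. The function name and the sequences are hypothetical and exist only for illustration.

```python
# Assumption: the GT-AG rule for the DNA sense strand.
CANONICAL_DONOR = "GT"     # first two bases of the intron (5' splice site)
CANONICAL_ACCEPTOR = "AG"  # last two bases of the intron (3' splice site)


def check_intron_termini(intron: str) -> dict:
    """Report whether an intron's terminal dinucleotides look canonical."""
    intron = intron.upper()
    if len(intron) < 4:
        raise ValueError("intron too short to have distinct donor and acceptor sites")
    donor, acceptor = intron[:2], intron[-2:]
    return {
        "donor": donor,
        "acceptor": acceptor,
        "donor_ok": donor == CANONICAL_DONOR,
        "acceptor_ok": acceptor == CANONICAL_ACCEPTOR,
    }


# Made-up intron sequences, not taken from any real gene.
wild_type = "GTAAGTCTTTCCTTTTCAG"
donor_mutant = "ATAAGTCTTTCCTTTTCAG"  # G>A at the first intron base

wt_report = check_intron_termini(wild_type)
mut_report = check_intron_termini(donor_mutant)
print(wt_report)
print(mut_report)
```

A check like this only flags the two terminal dinucleotides. Real splice site recognition depends on a longer consensus and on additional regulatory elements, so a "canonical" result here does not guarantee normal splicing.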

A visual representation of a splice site mutation instance

Background

In 1993, Richard J. Roberts and Phillip Allen Sharp received the Nobel Prize in Physiology or Medicine for their discovery of "split genes". [4] Using adenovirus as a model, they discovered splicing: pre-mRNA is processed into mRNA by the removal of introns from the RNA. These two scientists discovered the existence of splice sites, thereby changing the face of genomics research. Their work also showed that messenger RNA can be spliced in different ways, which opens the possibility that a mutation alters the splicing outcome.

Technology

Today, a range of tools exists for locating and analyzing splice sites. The Human Splicing Finder is an online database that draws on Human Genome Project data. It identifies thousands of mutations relevant to medicine and health and provides research information on splice site mutations. The tool searches for pre-mRNA splicing errors, calculates potential splice sites using algorithms, and cross-references several other online genomic databases, such as the Ensembl genome browser. [5]

Role in Disease

Because of the sensitive location of splice sites, mutations in the acceptor or donor regions can be detrimental to an individual. Many different diseases stem from anomalies within splice sites.

Cancer

A study of the role of splice site mutations in cancer found that a splice site mutation was common to a set of women with breast and ovarian cancer. According to the findings, these women carried the same mutation: an intronic single base-pair substitution destroys an acceptor site, activating a cryptic splice site and leading to a 59 base-pair insertion and chain termination. The four families with both breast and ovarian cancer had chain termination mutations in the N-terminal half of the BRCA1 protein. [6]
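
The link between a 59 base-pair insertion and chain termination comes down to reading-frame arithmetic: 59 is not a multiple of three, so every codon downstream of the insertion is read out of frame, which commonly exposes a premature stop codon. The sketch below shows the arithmetic and a toy demonstration. The helper names and the short sequence are hypothetical, and the two-base insert is only a stand-in for a 59-base insert, which shifts the frame by the same amount.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def frame_effect(inserted_bp: int) -> str:
    """Classify an insertion in a coding sequence by its effect on the reading frame."""
    if inserted_bp < 0:
        raise ValueError("inserted_bp must be non-negative")
    remainder = inserted_bp % 3
    if remainder == 0:
        return "in-frame"
    return f"frameshift (+{remainder})"


def first_stop_codon(cds: str) -> Optional[int]:
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


print(frame_effect(59))  # 59 % 3 == 2, so the frame shifts by two bases
print(frame_effect(60))  # a multiple of three keeps the frame

# Toy coding sequence (made up): ATG GCT GAC TTG AAA CGT, no stop codon in frame.
normal_cds = "ATGGCTGACTTGAAACGT"
# Insert two bases after the second codon to mimic a +2 frameshift.
shifted_cds = normal_cds[:6] + "CC" + normal_cds[6:]

print(first_stop_codon(normal_cds))
print(first_stop_codon(shifted_cds))
```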

Splice-site mutations are recurrently found in key lymphoma genes [7] such as BCL7A [8] or CD79B [7] as a result of aberrant somatic hypermutation, because the sequence targeted by activation-induced cytidine deaminase (AID) overlaps with the splice-site sequences. [9]

Dementia

According to a study by Hutton et al., missense mutations and 5' splice-site mutations in the gene encoding the tau protein were associated with an inherited dementia known as FTDP-17. The splice-site mutations all destabilize a potential stem–loop structure that is most likely involved in regulating the alternative splicing of exon 10 of the tau gene, located on chromosome 17. Consequently, the 5' splice site is used more often and a larger proportion of tau transcripts include exon 10. This shift raises the proportion of tau containing four microtubule-binding repeats, which is consistent with the neuropathology described in several families with FTDP-17. [10]

Epilepsy

Some types of epilepsy may be caused by a splice site mutation. In addition to a stop codon mutation, a 3' splice site mutation was found in the gene encoding cystatin B in patients with progressive myoclonus epilepsy. [11] This combination of mutations was not found in unaffected individuals. By comparing sequences with and without the splice site mutation, investigators determined that a G-to-C transversion occurs at the last position of the first intron of the cystatin B gene. Individuals with progressive myoclonus epilepsy carry a mutated form of this gene, which results in decreased output of mature mRNA and, in turn, decreased protein expression.
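
A G-to-C change is a transversion because it swaps a purine for a pyrimidine, and at the last position of an intron it alters the final base of the acceptor site. The following minimal sketch illustrates both points. The intron sequence is made up and is not the real cystatin B intron, the function names are hypothetical, and the assumption that the unmutated intron ends in AG is the common GT-AG convention rather than something stated above.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref: str, alt: str) -> str:
    """Classify a single-base DNA substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    # Purine<->purine or pyrimidine<->pyrimidine is a transition; otherwise a transversion.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def substitute_last_base(intron: str, alt: str) -> str:
    """Return the intron with its final base replaced by `alt`."""
    return intron[:-1] + alt.upper()


# Made-up intron ending in the assumed canonical acceptor dinucleotide AG.
intron_1 = "GTGAGTCCTTTCTCTCTTCCAG"
mutant_intron_1 = substitute_last_base(intron_1, "C")
change = substitution_class(intron_1[-1], mutant_intron_1[-1])

print(intron_1[-2:], "->", mutant_intron_1[-2:], change)
```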

A study has also linked a form of childhood absence epilepsy (CAE) with febrile seizures to a splice site mutation in the sixth intron of the GABRG2 gene. This splice site mutation was found to cause a nonfunctional GABRG2 subunit in affected individuals. [12] According to this study, a point mutation at the splice-donor site of intron 6 was responsible: it yields a nonfunctional protein product and therefore a nonfunctional subunit.

Hematological Disorders

Several genetic diseases may be the result of splice site mutations. For example, mutations that cause incorrect splicing of β-globin mRNA are responsible for some cases of β-thalassemia. Another example is thrombotic thrombocytopenic purpura (TTP), which is caused by a deficiency of ADAMTS-13; a splice site mutation in the ADAMTS-13 gene can therefore cause TTP. It is estimated that 15% of all point mutations causing human genetic diseases occur within a splice site. [13]

Parathyroid Deficiency

When a splice site mutation occurs in intron 2 of the gene encoding parathyroid hormone, a parathyroid hormone deficiency can result. In one study, a G-to-C substitution in the splice site of intron 2 caused exon skipping in the messenger RNA transcript. The skipped exon contains the initiation codon for parathyroid hormone. [14] This failure of initiation causes the deficiency.
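
The effect of skipping an exon that carries the initiation codon can be sketched with a toy three-exon gene. The sequences and the `splice` helper are hypothetical and do not represent the real parathyroid hormone gene. Sequences use the DNA alphabet (T instead of U) for simplicity, and searching for any ATG is a deliberate simplification of how translation start sites are chosen.

```python
from typing import Collection, Sequence


def splice(exons: Sequence[str], skipped: Collection[int] = ()) -> str:
    """Join exons in order, leaving out any whose 0-based index is in `skipped`."""
    return "".join(exon for i, exon in enumerate(exons) if i not in skipped)


# Made-up exons; only the middle exon (index 1) contains an ATG start codon.
exons = ["GCCTGACCC", "CTCATGAAGGCT", "GTTAAGTAA"]

normal_mrna = splice(exons)
skipped_mrna = splice(exons, skipped={1})

print("normal: ", normal_mrna, "has ATG:", "ATG" in normal_mrna)
print("skipped:", skipped_mrna, "has ATG:", "ATG" in skipped_mrna)
```

In this toy case the transcript that lacks the middle exon has no start codon at all, so translation of the intended protein cannot begin.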

Analysis

Genomic and sequence data have been compiled for the model organism Drosophila melanogaster. A prediction model exists to which researchers can upload their own sequence data and use a splice site prediction database to learn where splice sites could be located. The Berkeley Drosophila Genome Project can be used for this research, as well as for annotating high-quality euchromatic sequence. The splice site predictor can be a useful tool for researchers studying human disease in this model organism.

Splice site mutations can be analyzed using information theory. [15]
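
One way to picture an information-theoretic analysis is to score a site in bits against a set of aligned known sites and compare the score before and after a mutation. The sketch below is a simplified illustration under stated assumptions, not the method of the cited study: each base at each position gets a weight of 2 + log2(frequency), a pseudocount replaces the small-sample correction, and the training sites are a made-up toy alignment.

```python
import math
from collections import Counter

BASES = "ACGT"


def weight_matrix(aligned_sites, pseudocount=0.5):
    """Build per-position weights of 2 + log2(frequency), in bits."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")
    total = len(aligned_sites) + pseudocount * len(BASES)
    matrix = []
    for pos in range(length):
        counts = Counter(site[pos] for site in aligned_sites)
        matrix.append(
            {b: 2 + math.log2((counts[b] + pseudocount) / total) for b in BASES}
        )
    return matrix


def site_information(site, matrix):
    """Sum the per-position weights for one candidate site (in bits)."""
    if len(site) != len(matrix):
        raise ValueError("site length does not match the matrix")
    return sum(matrix[pos][base] for pos, base in enumerate(site))


# Toy alignment of donor-like sites (3 exon bases followed by 6 intron bases).
training_sites = [
    "CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA",
    "TAGGTAAGT", "CAGGTGAGC", "GAGGTAAGT",
]
matrix = weight_matrix(training_sites)

reference = "CAGGTAAGT"
variant = "CAGATAAGT"  # G>A at the first intron base (index 3)

delta = site_information(reference, matrix) - site_information(variant, matrix)
print(f"reference: {site_information(reference, matrix):.2f} bits")
print(f"variant:   {site_information(variant, matrix):.2f} bits")
print(f"loss:      {delta:.2f} bits")
```

A large drop in bits for the variant suggests that the mutated sequence resembles known sites much less than the reference does. With only six toy sequences the absolute numbers carry no biological meaning.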

References

  1. Berget SM (February 1995). "Exon recognition in vertebrate splicing". The Journal of Biological Chemistry. 270 (6): 2411–2414. doi:10.1074/jbc.270.6.2411. PMID 7852296.
  2. Understanding Cancer Genomics: Splice Site Mutations. National Cancer Institute. Archived from the original on 2020-02-03. Retrieved 2014-11-16.
  3. Understanding Cancer Genomics. National Cancer Institute.
  4. "Physiology or Medicine 1993 - Press Release". www.nobelprize.org. Retrieved 2017-10-07.
  5. "The Human Splicing Finder".
  6. Friedman LS, Ostermeyer EA, Szabo CI, Dowd P, Lynch ED, Rowell SE, et al. (December 1994). "Confirmation of BRCA1 by analysis of germline mutations linked to breast and ovarian cancer in ten families". Nature Genetics. 8 (4): 399–404. doi:10.1038/ng1294-399. PMID 7894493. S2CID 2863113.
  7. Andrades A, Álvarez-Pérez JC, Patiño-Mercau JR, Cuadros M, Baliñas-Gavira C, Medina PP (April 2022). "Recurrent splice site mutations affect key diffuse large B-cell lymphoma genes". Blood. 139 (15): 2406–2410. doi:10.1182/blood.2021011708. PMID 34986231.
  8. Baliñas-Gavira C, Rodríguez MI, Andrades A, Cuadros M, Álvarez-Pérez JC, Álvarez-Prado ÁF, et al. (October 2020). "Frequent mutations in the amino-terminal domain of BCL7A impair its tumor suppressor role in DLBCL". Leukemia. 34 (10): 2722–2735. doi:10.1038/s41375-020-0919-5. PMID 32576963. S2CID 219989509.
  9. Benitez-Cantos MS, Cano C, Cuadros M, Medina PP (February 2024). "Activation-induced cytidine deaminase causes recurrent splicing mutations in diffuse large B-cell lymphoma". Molecular Cancer. 23 (1): 42. doi:10.1186/s12943-024-01960-w. PMC 10893679. PMID 38402205.
  10. Hutton M, Lendon CL, Rizzu P, Baker M, Froelich S, Houlden H, et al. (June 1998). "Association of missense and 5'-splice-site mutations in tau with the inherited dementia FTDP-17". Nature. 393 (6686): 702–705. Bibcode:1998Natur.393..702H. doi:10.1038/31508. PMID 9641683. S2CID 205001265.
  11. Pennacchio LA, Lehesjoki AE, Stone NE, Willour VL, Virtaneva K, Miao J, et al. (March 1996). "Mutations in the gene encoding cystatin B in progressive myoclonus epilepsy (EPM1)". Science. 271 (5256): 1731–1734. Bibcode:1996Sci...271.1731P. doi:10.1126/science.271.5256.1731. JSTOR 2890839. PMID 8596935. S2CID 84361089.
  12. Kananura C, Haug K, Sander T, Runge U, Gu W, Hallmann K, et al. (July 2002). "A splice-site mutation in GABRG2 associated with childhood absence epilepsy and febrile convulsions". Archives of Neurology. 59 (7): 1137–1141. doi:10.1001/archneur.59.7.1137. PMID 12117362.
  13. Carvalho GA, Weiss RE, Refetoff S (October 1998). "Complete thyroxine-binding globulin (TBG) deficiency produced by a mutation in acceptor splice site causing frameshift and early termination of translation (TBG-Kankakee)". The Journal of Clinical Endocrinology and Metabolism. 83 (10): 3604–3608. doi:10.1210/jcem.83.10.5208. PMID 9768672.
  14. Parkinson DB, Thakker RV (May 1992). "A donor splice site mutation in the parathyroid hormone gene is associated with autosomal recessive hypoparathyroidism". Nature Genetics. 1 (2): 149–152. doi:10.1038/ng0592-149. PMID 1302009. S2CID 24032313.
  15. Rogan PK, Faux BM, Schneider TD (1998). "Information analysis of human splice site mutations". Human Mutation. 12 (3): 153–171. doi:10.1002/(SICI)1098-1004(1998)12:3<153::AID-HUMU3>3.0.CO;2-I. PMID 9711873.