The TREX (TRanscription-EXport) complex is a conserved eukaryotic multi-protein complex that couples mRNA transcription and nuclear export. [1] The TREX complex travels across transcribed genes with RNA polymerase II. [2] TREX binds mRNA and recruits transport proteins NXF1 and NXT1 (yeast Mex67 and Mtr2), which shuttle the mRNA out of the nucleus. [3] [4] [5] [6] The TREX complex plays an important role in genome stability and neurodegenerative diseases. [7]
During transcription elongation, the THO complex follows RNA polymerase II and interacts with transcription factors along the entire transcribed region. [1] The carboxy-terminal domain (CTD) of RNA polymerase II then recruits 3'-end processing and transcription termination factors, which load the DEAD-box RNA helicase UAP56 and the RNA export adapter ALYREF. This forms the complete TREX complex. At the end of transcription, after the 3' end of the mRNA is formed and the mRNA is released from the transcription site, the mRNA is transferred from UAP56 to ALYREF. UAP56 then dissociates, allowing the heterodimeric export receptor NXF1-NXT1 to bind; the receptor recognizes the mRNA indirectly through ALYREF. Further rearrangements of the mRNA result in the release of ALYREF. Finally, the NXF1-NXT1 dimer facilitates transport of the mRNA to the cytoplasm through direct interaction with the nuclear pore complex.[ citation needed ]
The human THO complex comprises six subunits: THOC1, THOC2, THOC3, THOC5, THOC6, and THOC7. Four of them have counterparts in Saccharomyces cerevisiae: THOC1 (yeast Hpr1), THOC2 (yeast Tho2), THOC3 (yeast Tex1), and THOC7 (yeast Mft1). [7] [9] THOC1 was the first protein identified in the THO complex. THOC2 is the largest subunit of TREX and acts as a scaffold for the formation of the complex. The C-terminal domain of THOC2 directly interacts with nucleic acids. Variants of THOC2 have been associated with syndromic intellectual disabilities, with features including seizures, tremors, and speech delays. [10] [11] THOC3 and THOC6 both contain WD40 repeat motifs that allow interaction with other THO proteins. [12] THOC5 and THOC7 bind tightly and form a dimer through their coiled-coil domains (CCDs). Four THO complexes form a tetramer, and each THO complex binds one UAP56 protein through THOC2 and THOC1.[ citation needed ]
DDX39B, also known as U2AF65-associated protein 56 (UAP56; Sub2 in yeast), is a DEAD-box ATPase that is essential for pre-mRNA splicing [13] and is also a key component of the TREX complex. Its paralog DDX39 (DDX39A) is very similar to UAP56, sharing 90% of the amino acid sequence. [1] UAP56 travels along genes with the THO complex, where it interacts with the sugar-phosphate backbone of the mRNA. [14] UAP56 recruits ALYREF, an RNA export adaptor, to the spliced or intronless mRNA. [15] [6] After transfer of the mRNA from UAP56 to ALYREF, UAP56 dissociates from the complex, allowing the export factor NXF1 to bind ALYREF at the same site. [16] [4]
In mammalian cells, DDX39A, a paralog of DDX39B, is partially functionally redundant with it. Knockdown of both paralogs is required to block mRNA export, [21] [22] but depletion of each paralog affects different forms of mRNAs. [23] DDX39B has been shown to associate with THO and ALYREF, and DDX39A with CIP29 and ALYREF. [24]
ALYREF (Yra1 in yeast) is an essential RNA export adapter involved in the export of both spliced and intronless mRNAs. [25] The N- and C-termini of ALYREF both contain UAP56-binding motifs (UBMs), which are necessary for its interaction with UAP56. [21] ALYREF also contains an RNA recognition motif (RRM) that binds RNA weakly [26] and is flanked by two arginine-rich RNA-binding sites. ALYREF alone cannot bind mRNA effectively and requires interaction with UAP56 to bind the mRNA in the TREX complex (see Figure 3). The arginine-rich sites are also necessary for the interaction of ALYREF with the export receptor NXF1, which stimulates the transfer of the mRNA from ALYREF to NXF1. [5] Like UAP56, ALYREF dissociates prior to nuclear export of the mRNA. [16] [27] The unstructured and flexible nature of ALYREF indicates that it may play a key role in packaging the mRNA and proteins into a messenger ribonucleoprotein (mRNP) for nuclear export. [28]
UIF, identified through homology to the UAP56-binding domain of ALYREF, is functionally redundant with ALYREF. Knockdown of ALYREF in mammalian cells results in strong upregulation of UIF. UIF can associate with the other TREX complex components in an RNA-independent manner. [21] UIF is speculated to associate with alternative TREX complexes in place of ALYREF, perhaps acting on certain types of mRNAs.[ citation needed ]
Originally identified as an RNA-binding protein involved in cell cycle regulation, [29] CHTOP contains two UBMs like those in ALYREF and UIF, and is thought to function in a similar manner to ALYREF. CHTOP has also been shown to stimulate UAP56 ATPase activity. [30] CHTOP is speculated to associate with alternative TREX complexes in place of UAP56, perhaps acting on specific types of mRNAs.[ citation needed ]
SARNP/CIP29 (yeast Tho1), identified alongside yeast Tho2, [31] forms a trimeric complex with UAP56 and ALYREF, [32] and has been shown to preferentially associate with DDX39a. [24] SARNP stimulates UAP56 ATPase activity. [30] [33]
| | Yeast | Drosophila | Mammals |
|---|---|---|---|
| THO components | Hpr1 | Thoc1 | Thoc1 (hHpr1) |
| | Tho2 | Thoc2 | Thoc2 |
| | Thp2 | | |
| | Mft1 | Thoc7 | Thoc7 |
| | | Thoc5 | Thoc5 (FMIP) |
| | | Thoc6 | Thoc6 |
| | Tex1 | Thoc3 | Thoc3 (hTEX1) |
| DEAD-box type helicase | Sub2 | Uap56 | Uap56 |
| | | | DDX39 |
| Adaptor mRNA binding protein | Yra1 | Aly | ALYREF |
NXF1 (Mex67p in yeast), also known as nuclear RNA export factor 1, is a multi-domain protein composed of a conserved N-terminal RNA recognition motif, four leucine-rich repeat motifs, a central NTF2-like domain, and a C-terminal ubiquitin-associated domain that mediates interactions with nucleoporins. The NTF2-like domain forms heterodimers with NTF2-related export protein 1 (NXT1). The heterodimer binds mRNAs processed by the TREX complex and assists the TREX complex in the nuclear export process. [34] [35]
NXT1 (Mtr2p in yeast), also known as p15, shuttles between the nucleus and the cytoplasm and acts as an active nuclear transport protein. NXT1 binds specifically to Ran-GTP and localizes to the nuclear pore complex in mammalian cells. It also forms heterodimers with NXF1 and stabilizes it. The heterodimer binds mRNAs processed by the TREX complex and assists the TREX complex in the nuclear export process. [36]
NCBP1 and NCBP3 are both part of the cap-binding complex. The two proteins interact with each other and with the TREX complex to facilitate mRNA export from the nucleus to the cytoplasm. NCBP3 also interacts with exon junction complex proteins involved in mRNA splicing and stability. [37]
The TREX complex is a conserved protein complex that couples transcription to mRNA export and is linked to genome stability and several disorders.
The TREX complex plays an important role in genome stability. Newly formed RNA strands can hybridize with the single-stranded template DNA sequence during transcription, leading to an R-loop. [7] The R-loop makes the opposing DNA strand more susceptible to cleavage, which can cause DNA damage in cells. [7] The TREX complex associates with the RNA polymerase and newly formed RNA, sequestering the RNA and, therefore, preventing its hybridization to the DNA strand, improving genome stability. [7]
The TREX complex is associated with several neurodegenerative and neurodevelopmental disorders. These disorders are caused by mutations in genes encoding TREX components or in other genes. [7]
Several mutations in the THOC2 gene, which encodes a subunit of the THO complex, are associated with disease. For example, missense mutations in this gene (nucleotide changes that result in a different amino acid being encoded) and translocations on the X chromosome are associated with intellectual disabilities. [7] [38]
The THOC6 gene, which encodes a subunit of the THO complex, plays a role in the development of the brain and other organs. Mutations in this gene lead to mislocalization of the protein to the cytoplasm, disrupting a process that is essential for neural and organ development. [7] A homozygous mutation in this gene can lead not only to intellectual disability, but also to cardiac defects and brain malformation. [7]
Mutations in other genes can also depend indirectly on the TREX complex to cause disease, including familial amyotrophic lateral sclerosis (ALS). ALS is a rare neurodegenerative disease that leads to the death of motor neurons in the brain, resulting in the loss of voluntary movement. [39] In the familial form of the disease, a GGGGCC repeat in an intron of the C9ORF72 gene is expanded; the pre-mRNA containing the expansion is exported to the cytoplasm and forms RNA foci. [7] ALYREF binds to the repeat expansion, and excess recruitment of ALYREF promotes export of the repeat-containing RNA. [7] A mutation that disrupts its activity suppresses neurodegeneration, and is enhanced by CHTOP and NXF1. [7]