Poly(A)-binding protein (PAB or PABP) [1] is an RNA-binding protein that binds the poly(A) tail of mRNA, which is 200-250 nucleotides long, and recruits eukaryotic initiation factor 4G (eIF4G) to it. [2] The poly(A) tail is located at the 3' end of mRNA and was discovered by Mary Edmonds, [3] who also characterized the poly(A) polymerase enzyme that generates the poly(A) tail. [4] The binding protein also acts on mRNA precursors, helping polyadenylate polymerase add the poly(A) tail to the pre-mRNA before translation. [5] The nuclear isoform selectively binds to around 50 nucleotides and stimulates the activity of polyadenylate polymerase by increasing its affinity for RNA. Poly(A)-binding protein also participates in other stages of mRNA metabolism, including nonsense-mediated decay and nucleocytoplasmic trafficking. It may also protect the tail from degradation and regulate mRNA production. Without these two proteins working in tandem, the poly(A) tail would not be added and the RNA would degrade quickly. [6]
Cytosolic poly(A)-binding protein (PABPC) is made up of four RNA recognition motifs (RRMs) and a C-terminal region known as the PABC domain. The RRM is the most common motif for RNA recognition and is usually made up of 90-100 amino acids. Solution NMR and X-ray crystallography studies have shown that RRMs are globular domains, each composed of a four-stranded antiparallel β-sheet backed by two α-helices. The two central β-strands of each RRM, connected by a short linker, form a trough-like surface that is thought to be responsible for binding poly(A) oligonucleotides. The polyadenylate RNA adopts an extended conformation running the length of the molecular trough. Adenine recognition is primarily mediated by contacts with conserved residues found in the RNP motifs of the two RRMs. [7] In vitro studies have shown binding affinities on the order of 2-7 nM for poly(A), while affinities for poly(U), poly(G), and poly(C) were reportedly lower or undetectable in comparison. This shows that the poly(A)-binding protein is specific for poly(A) oligonucleotides. [8] Since the two central β-strands are used for poly(A) oligonucleotide binding, the other face of the protein is free for protein-protein interactions.
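To give a feel for what an affinity of 2-7 nM means, the fraction of poly(A) sites occupied at a given protein concentration can be estimated with a simple one-site equilibrium model. The following is only an illustrative sketch: it assumes free protein is in excess and ignores any cooperativity between RRMs, and the function name `fraction_bound` and the 10 nM protein concentration are hypothetical choices rather than values from the cited studies.

```python
def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Fraction of RNA sites occupied in a one-site model: [P] / (Kd + [P])."""
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_nM / (kd_nM + protein_nM)


# Compare the two ends of the reported 2-7 nM range at an assumed 10 nM protein.
for kd in (2.0, 7.0):
    occupancy = fraction_bound(10.0, kd)
    print(f"Kd = {kd} nM -> fraction bound at 10 nM protein: {occupancy:.2f}")
```

In this model, half of the sites are occupied when the protein concentration equals the dissociation constant, so a lower Kd means tighter binding at any given concentration.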
The PABC domain is approximately 75 amino acids long and consists of 4 or 5 α-helices depending on the organism: human PABCs have 5, while yeast has been observed to have 4. This domain does not contact RNA; instead, it recognizes 15-residue sequences that are part of the PABP interaction motif (PAM-2) found on proteins such as eukaryotic translation termination factor (eRF3) and PABP-interacting proteins 1 and 2 (PAIP1, PAIP2).
The structure of the human poly(A)-binding protein found in the nucleus (PABPN1) has yet to be well determined, but it has been shown to contain a single RRM domain and an arginine-rich carboxy-terminal domain. It is thought to be structurally and functionally different from the poly(A)-binding proteins found in the cytosol.
The expression of mammalian poly(A)-binding protein is regulated at the translational level by a feedback mechanism: the mRNA encoding PABP contains an A-rich sequence in its 5' UTR that binds poly(A)-binding protein. This leads to autoregulatory repression of PABP translation.
The cytosolic isoform of eukaryotic poly(A)-binding protein binds to the initiation factor eIF4G via its C-terminal domain. eIF4G is a component of the eIF4F complex, which also contains eIF4E, the initiation factor that binds the cap at the 5' end of mRNA. This binding forms the characteristic loop structure of eukaryotic protein synthesis. Poly(A)-binding proteins in the cytosol compete for the eIF4G binding sites. This interaction enhances both the affinity of eIF4E for the cap structure and the affinity of PABP1 for poly(A), effectively locking proteins onto both ends of the mRNA. As a result, this association may in part underlie the ability of PABP1 to promote recruitment of the small ribosomal (40S) subunit, which is aided by the interaction between eIF4G and eIF3. Poly(A)-binding protein has also been shown to interact with a termination factor (eRF3). The eRF3/PABP1 interaction may promote recycling of terminating ribosomes from the 3' end to the 5' end, facilitating multiple rounds of initiation on an mRNA. Alternatively, it may link translation to mRNA decay, as eRF3 appears to interfere with the ability of PABP1 to multimerise on poly(A), potentially leading to PABP1 dissociation, deadenylation and, ultimately, turnover. [9]
Rotavirus RNA-binding protein NSP3 interacts with eIF4GI and evicts the poly(A)-binding protein from eIF4F. NSP3A, by taking the place of PABP on eIF4GI, is responsible for the shut-off of cellular protein synthesis. [10] Rotavirus mRNAs terminate in a 3' GACC motif that is recognized by the viral protein NSP3. This is the location where NSP3 competes with poly(A)-binding protein for eIF4G binding.
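The distinction between the two kinds of 3' end can be illustrated with a short classifier. This is a minimal sketch with assumptions: the function name `three_prime_end_type` is hypothetical, the sequences are toy examples rather than real transcripts, and the 20-nucleotide cutoff for calling a poly(A) tail is an arbitrary illustrative threshold, not a value from the text.

```python
def three_prime_end_type(mrna: str, min_poly_a: int = 20) -> str:
    """Classify an mRNA by its 3' end: 'GACC', 'poly(A)', or 'other'."""
    # Normalise case and accept DNA-style input by converting T to U.
    seq = mrna.upper().replace("T", "U")
    if seq.endswith("GACC"):
        return "GACC"  # rotavirus-style 3' terminal motif
    # Count the uninterrupted run of A residues at the 3' end.
    tail_len = len(seq) - len(seq.rstrip("A"))
    if tail_len >= min_poly_a:  # assumed cutoff, for illustration only
        return "poly(A)"
    return "other"


# Toy sequences, not real transcripts.
print(three_prime_end_type("AUGGCUUGUGACC"))
print(three_prime_end_type("AUGGCU" + "A" * 30))
```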
Once rotavirus infection occurs, viral GACC-tailed mRNAs are translated while translation of poly(A)-tailed mRNAs is severely impaired. In infected cells, both the induction of translation (GACC-tailed mRNA) and its reduction (poly(A)-tailed mRNA) are large in magnitude, and both depend on the rotavirus strain. These data suggest that NSP3 is a translational surrogate of the PABP-poly(A) complex; therefore, it cannot by itself be responsible for inhibiting the translation of host poly(A)-tailed mRNAs upon rotavirus infection. [11]
PABP-C1 evicted from eIF4G by NSP3 accumulates in the nucleus of rotavirus-infected cells. This eviction process requires rotavirus NSP3, eIF4G, and RoXaN. Modeling of the NSP3-RoXaN complex shows that mutations in NSP3 can disrupt this complex without compromising the interaction of NSP3 with eIF4G. The nuclear localization of PABP-C1 depends on the capacity of NSP3 to interact with eIF4G and also requires the interaction of NSP3 with a specific region in RoXaN, the leucine- and aspartic acid-rich (LD) domain. RoXaN is thus identified as a cellular partner of NSP3 involved in the nucleocytoplasmic localization of PABP-C1. [12]
Oculopharyngeal muscular dystrophy (OPMD) is a genetic condition that appears in adulthood, often after the age of 40. The disorder usually leads to weakened facial muscles, often presenting as progressive eyelid drooping and swallowing difficulties, and to proximal limb muscle weakness, such as weak leg and hip muscles. People with this disorder are often hindered to the point that they need a cane to walk. [13] OPMD has been reported in approximately 29 countries, and the number affected varies widely by population. The disease can be inherited as an autosomal dominant or recessive trait. [14]
Mutations of poly(A)-binding protein nuclear 1 (PABPN1) can cause OPMD. What distinguishes PABPN1 from the other genes with disease-causing expanded polyalanine tracts is that it does not encode a transcription factor. Instead, PABPN1 is involved in the polyadenylation of mRNA precursors. [15]
The PABPN1 mutations that cause this disorder give the protein an extended polyalanine tract (12-17 alanines long, compared with the normal 10). The extra alanines cause PABPN1 to aggregate and form clumps within muscle cells because the protein cannot be broken down. These clumps are believed to disrupt the normal function of muscle cells, which eventually leads to cell death. This progressive loss of muscle cells most likely causes the muscle weakness seen in patients with OPMD. It is still not known why this disorder affects only certain muscles, such as those of the upper leg and hip. Recent studies of OPMD in Drosophila have shown that the muscle degeneration may not be due solely to the expanded polyalanine tract; it may also depend on the RNA-binding domain and its function in binding. [16]
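The tract lengths quoted above can be checked programmatically by finding the longest run of alanine (A) residues in a protein sequence. The sketch below is illustrative only: the function names are hypothetical, the sequences are made-up toy strings rather than the real PABPN1 sequence, and the classification simply restates the 10 versus 12-17 figures given in the text.

```python
import re


def longest_alanine_tract(protein: str) -> int:
    """Return the length of the longest uninterrupted run of 'A' residues."""
    runs = re.findall(r"A+", protein.upper())
    return max((len(run) for run in runs), default=0)


def classify_tract(length: int) -> str:
    """Label a tract length using only the figures quoted in the text."""
    if length == 10:
        return "normal length"
    if 12 <= length <= 17:
        return "expanded (range associated with OPMD)"
    return "outside the ranges given in the text"


# Toy sequences for illustration; these are NOT real PABPN1 sequences.
toy_normal = "M" + "A" * 10 + "GAAGGRGS"
toy_expanded = "M" + "A" * 13 + "GAAGGRGS"

for name, seq in (("toy_normal", toy_normal), ("toy_expanded", toy_expanded)):
    n = longest_alanine_tract(seq)
    print(f"{name}: longest tract = {n} -> {classify_tract(n)}")
```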
As of November 2015, considerable effort has been devoted to research on OPMD and how it might be treated. Myoblast transplantation has been suggested and is in clinical trials in France. Myoblasts are taken from normal muscle and placed into the pharyngeal muscles, where they develop and help form new muscle cells. Compounds, both existing and newly developed, have also been tested to see whether they might combat OPMD and its symptoms. Trehalose, a disaccharide, reduced aggregate formation and delayed pathology in a mouse model of OPMD. Doxycycline similarly delayed toxicity in mouse models of OPMD, most likely by stopping aggregate formation and reducing apoptosis. Many other compounds and methods are currently being researched, and some have shown success in clinical trials, leading to optimism about treating this disease. [17]
Multiple human genes encode different protein isoforms and paralogs of PABP, including PABPN1, PABPC1, PABPC3, PABPC4, and PABPC5. [18]
In molecular genetics, the three prime untranslated region (3′-UTR) is the section of messenger RNA (mRNA) that immediately follows the translation termination codon. The 3′-UTR often contains regulatory regions that post-transcriptionally influence gene expression.
A polyribosome is a group of ribosomes bound to an mRNA molecule like “beads” on a “thread”. It consists of a complex of an mRNA molecule and two or more ribosomes that act to translate mRNA instructions into polypeptides. Originally coined "ergosomes" in 1963, they were further characterized by Jonathan Warner, Paul M. Knopf, and Alex Rich.
Polyadenylation is the addition of a poly(A) tail to an RNA transcript, typically a messenger RNA (mRNA). The poly(A) tail consists of multiple adenosine monophosphates; in other words, it is a stretch of RNA that has only adenine bases. In eukaryotes, polyadenylation is part of the process that produces mature mRNA for translation. In many bacteria, the poly(A) tail promotes degradation of the mRNA. It, therefore, forms part of the larger process of gene expression.
RNA-binding proteins are proteins that bind to the double or single stranded RNA in cells and participate in forming ribonucleoprotein complexes. RBPs contain various structural motifs, such as RNA recognition motif (RRM), dsRNA binding domain, zinc finger and others. They are cytoplasmic and nuclear proteins. However, since most mature RNA is exported from the nucleus relatively quickly, most RBPs in the nucleus exist as complexes of protein and pre-mRNA called heterogeneous ribonucleoprotein particles (hnRNPs). RBPs have crucial roles in various cellular processes such as: cellular function, transport and localization. They especially play a major role in post-transcriptional control of RNAs, such as: splicing, polyadenylation, mRNA stabilization, mRNA localization and translation. Eukaryotic cells express diverse RBPs with unique RNA-binding activity and protein–protein interaction. According to the Eukaryotic RBP Database (EuRBPDB), there are 2961 genes encoding RBPs in humans. During evolution, the diversity of RBPs greatly increased with the increase in the number of introns. Diversity enabled eukaryotic cells to utilize RNA exons in various arrangements, giving rise to a unique RNP (ribonucleoprotein) for each RNA. Although RBPs have a crucial role in post-transcriptional regulation in gene expression, relatively few RBPs have been studied systematically.
Eukaryotic translation is the biological process by which messenger RNA is translated into proteins in eukaryotes. It consists of four phases: initiation, elongation, termination, and recycling.
The cytoplasmic polyadenylation element (CPE) is a sequence element found in the 3' untranslated region of messenger RNA. While several sequence elements are known to regulate cytoplasmic polyadenylation, CPE is the best characterized. The most common CPE sequence is UUUUAU, though there are other variations. Binding of CPE binding protein to this region promotes the extension of the existing polyadenine tail and, in general, activation of the mRNA for protein translation. This elongation occurs after the mRNA has been exported from the nucleus to the cytoplasm. A longer poly(A) tail attracts more cytoplasmic polyadenine binding proteins (PABPs) which interact with several other cytoplasmic proteins that encourage the mRNA and the ribosome to associate. The lengthening of the poly(A) tail thus has a role in increasing translational efficiency of the mRNA. The polyadenine tails are extended from approximately 40 bases to 150 bases.
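Locating candidate CPEs in a 3' UTR amounts to a simple motif search. The following is a minimal sketch, assuming the UUUUAU sequence quoted above as the default motif; the function name `find_cpe_sites` and the example UTR are hypothetical, and a sequence match alone does not show that an element is functional.

```python
import re


def find_cpe_sites(utr: str, motif: str = "UUUUAU") -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    # Normalise case and accept DNA-style input by converting T to U.
    seq = utr.upper().replace("T", "U")
    # A lookahead lets the regex report overlapping occurrences as well.
    pattern = re.compile(f"(?={re.escape(motif.upper())})")
    return [match.start() for match in pattern.finditer(seq)]


# Toy 3' UTR fragment for illustration, not a real transcript.
example_utr = "GGCUUUUAUGGAAUAAA"
print(find_cpe_sites(example_utr))
```

Other CPE variants can be searched for by passing a different `motif` argument.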
Rotavirus translation, the process of translating viral mRNA into proteins, differs from the translation of most cellular mRNAs. Unlike the vast majority of cellular proteins, rotavirus proteins are translated from capped but nonpolyadenylated mRNAs. The viral nonstructural protein NSP3 specifically binds the 3'-end consensus sequence of viral mRNAs and interacts with the eukaryotic translation initiation factor eIF4G. The rotavirus replication cycle occurs entirely in the cytoplasm. Upon virus entry, the viral transcriptase synthesizes capped but nonpolyadenylated mRNA. The viral mRNAs bear 5' and 3' untranslated regions (UTRs) of variable length and are flanked by two different sequences common to all genes.
RoXaN, also known as ZC3H7B, is a protein that in humans is encoded by the ZC3H7B gene. It contains tetratricopeptide repeat and leucine-aspartate repeat domains as well as zinc finger domains. This protein also interacts with the rotavirus non-structural protein NSP3.
NSP1 (NS53), the product of rotavirus gene 5, is a nonstructural RNA-binding protein that contains a cysteine-rich region and is a component of early replication intermediates. RNA-folding predictions suggest that this region of the NSP1 mRNA can interact with itself, producing a stem-loop structure similar to that found near the 5'-terminus of the NSP1 mRNA.
Rotavirus protein NSP3 (NS34) is bound to the 3' end consensus sequence of viral mRNAs in infected cells.
Polyadenylate-binding protein 1 is a protein that in humans is encoded by the PABPC1 gene. The protein PABP1 binds mRNA and facilitates a variety of functions such as transport into and out of the nucleus, degradation, translation, and stability. There are two separate PABP1 proteins, one which is located in the nucleus (PABPN1) and the other which is found in the cytoplasm (PABPC1). The location of PABP1 affects the role of that protein and its function with RNA.
Eukaryotic translation initiation factor 4 gamma 1 is a protein that in humans is encoded by the EIF4G1 gene.
Polyadenylate-binding protein 2 (PABP-2), also known as polyadenylate-binding nuclear protein 1 (PABPN1), is a protein that in humans is encoded by the PABPN1 gene. PABPN1 is a member of a larger family of poly(A)-binding proteins in the human genome.
Eukaryotic initiation factor 4A-I is a 46 kDa cytosolic protein that, in humans, is encoded by the EIF4A1 gene, which is located on chromosome 17. It is the most prevalent member of the eIF4A family of ATP-dependent RNA helicases, and plays a critical role in the initiation of cap-dependent eukaryotic protein translation as a component of the eIF4F translation initiation complex. eIF4A1 unwinds the secondary structure of RNA within the 5'-UTR of mRNA, a critical step necessary for the recruitment of the 43S preinitiation complex, and thus the translation of protein in eukaryotes. It was first characterized in 1982 by Grifo et al., who purified it from rabbit reticulocyte lysate.
Polyadenylate-binding protein 4 (PABPC4) is a protein that in humans is encoded by the PABPC4 gene.
Polyadenylate-binding protein-interacting protein 2 is a protein that in humans is encoded by the PAIP2 gene.
Polyadenylate-binding protein-interacting protein 1 is a protein that in humans is encoded by the PAIP1 gene.
Eukaryotic peptide chain release factor GTP-binding subunit ERF3B is an enzyme that in humans is encoded by the GSPT2 gene.
Polyadenylate-binding protein 3 is a protein that in humans is encoded by the PABPC3 gene. PABPC3 is a member of a larger family of poly(A)-binding proteins in the human genome.
The RNA recognition motif (RNP-1) is a putative RNA-binding domain of about 90 amino acids that is known to bind single-stranded RNAs. It is found in many eukaryotic proteins.