RNA recognition motif

Typical architecture of an RRM domain, with a four-stranded antiparallel beta-sheet, stacked on two alpha helices
RNA recognition motif (a.k.a. RRM, RBD, or RNP domain)
Identifiers
Symbol RRM_1
Pfam PF00076
Pfam clan CL0221
ECOD 304.9.1
InterPro IPR000504
PROSITE PDOC00030
SCOP2 1sxl / SCOPe / SUPFAM
Available protein structures:
PDB 1cvj F:101-170 1x5t A:102-174 2cpz A:403-474

1u6f A:45-116 1fxl A:48-119 1g2e A:48-119 1d8z A:41-112 1fnx H:41-112 3sxl A:127-198 1b7f B:127-198 2sxl :127-198 1x4e A:58-117 1x5s A:8-79 2cqc A:120-191 2cqd A:13-68 2cqb A:8-79 2cq3 A:123-192 2err A:119-188 2cpj A:152-153 2cqi A:11-80 1x5u A:15-86 2cq0 A:241-312 1d9a A:127-198 1sxl :213-279 1x5o A:143-207 1x5p A:264-327 1x4g A:207-272 1x4a A:18-86 1wf2 A:18-82 1wf1 A:23-87 2f9j B:21-89 2f9d B:21-89 1p1t A:18-89 2u2f A:261-332 1p27 B:75-146 1hl6 C:75-146 1rk8 A:75-146 1oo0 B:75-146 2cq4 A:168-238 1rkj A:396-462 1fjc A:396-462 1fje B:396-462 1wi8 A:98-168 2cqh A:5-71 2mss A:111-181 2mst A:111-181 1uaw A:22-92 1hd0 A:99-169 1hd1 A:99-169 2up1 A:16-86 1u1k A:16-86 1pgz A:16-86 1u1p A:16-86 1u1o A:16-86 1up1 :16-86 1u1r A:16-86 1po6 A:16-86 1u1l A:16-86 1u1q A:16-86 1u1n A:16-86 1ha1 :16-86 1u1m A:16-86 1l3k A:16-86 1x4b A:23-93 1wtb A:184-254 1iqt A:184-254 1x0f A:184-254 2cqg A:106-175 1wf0 A:193-236 2cpf A:724-798 2cph A:826-899 1x4h A:327-404 1h6k Y:42-113 1n52 B:42-113 1h2t Z:42-113 1h2v Z:42-113 1n54 B:42-113 1h2u X:42-113 1no8 A:107-177 2cpx A:311-382 1dz5 A:12-84 1m5k C:12-84 1m5p F:12-84 1m5v F:12-84 1cx0 A:12-84 1aud A:12-84 1vc0 A:12-84 1m5o C:12-84 1vc5 A:12-84 1vbz A:12-84 1vbx A:12-84 1nu4 B:12-84 1sjf A:12-84 1fht :12-84 3utr D:12-84 1u6b A:12-84 1zzn A:12-84 1vc6 A:12-84 1sj4 P:12-84 1drz A:12-84 1sj3 P:12-84 1vby A:12-84 1oia B:12-84 1vc7 A:12-84 1a9n D:24-81 1x4c A:123-187 1wg4 A:114-178 1u2f A:151-226 2cpe A:363-442 1wg1 A:71-135 1bny A:26-87 2u1a :210-277 2cpi A:130-188 1jmt A:91-142 1opi A:400-461 1o0p A:400-461 1qm9 A:456-524 2adc A:456-524 2evz A:456-524 2adb A:186-253 1sjr A:186-253 1fj7 A:310-379 1wex A:127-177 1zh5 A:113-182 1yty A:113-182 1s79 A:113-182 1wg5 A:113-183 1wez A:291-359 1wel A:432-502

2cqp A:928-999 2cpy A:546-616

The RNA recognition motif (RRM), also known as RNP-1, is a putative RNA-binding domain of about 90 amino acids that is known to bind single-stranded RNAs. It is found in many eukaryotic proteins. [1] [2] [3]

The largest group of single-stranded RNA-binding proteins is the eukaryotic RNA recognition motif (RRM) family, whose members contain an eight-amino-acid RNP-1 consensus sequence. [4] [5]
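As a rough illustration of what scanning for an eight-residue consensus looks like, the sketch below searches a protein sequence for RNP-1-like octamers. The pattern `[RK]-G-[FY]-[GA]-[FY]-[VIL]-X-[FY]` is an assumption taken from the general RRM literature, not from this article, and the function name `find_rnp1` and the toy sequence are hypothetical. A match is only a hint; real domain assignment relies on profile methods such as the Pfam model listed above.

```python
import re

# ASSUMPTION: RNP-1 octamer consensus as commonly quoted in the RRM literature,
# [RK]-G-[FY]-[GA]-[FY]-[VIL]-X-[FY]; it is not given in this article.
# The lookahead lets overlapping candidates be reported as well.
RNP1_PATTERN = re.compile(r"(?=([RK]G[FY][GA][FY][VIL].[FY]))")


def find_rnp1(sequence):
    """Return (1-based start position, octamer) for every RNP-1-like match."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in RNP1_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    toy = "MSTAAAKGFGFVTFEEEE"
    print(find_rnp1(toy))
```

A strict regular expression like this misses divergent RRMs, so it is best treated as a quick filter rather than a classifier.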

RRM proteins have a variety of RNA-binding preferences and functions, and include heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in the regulation of alternative splicing (SR, U2AF2, Sxl), protein components of small nuclear ribonucleoproteins (U1 and U2 snRNPs), and proteins that regulate RNA stability and translation (PABP, La, Hu). [2] [3] [5] The heterodimeric splicing factor U2 snRNP auxiliary factor (U2AF) appears to have two RRM-like domains with specialised features for protein recognition. [6] The motif also appears in a few single-stranded DNA-binding proteins.

The typical RRM consists of four anti-parallel beta-strands and two alpha-helices arranged in a beta-alpha-beta-beta-alpha-beta fold with side chains that stack with RNA bases. A third helix is present during RNA binding in some cases. [7] The RRM is reviewed in a number of publications. [8] [9] [10]
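The fold order described above can be checked against a per-residue secondary-structure string. The sketch below assumes a simplified three-letter alphabet (`E` for strand, `H` for helix, anything else for coil), similar to reduced DSSP output; the function names and the example string are hypothetical, and the check looks only at element order, not at sheet pairing or geometry.

```python
from itertools import groupby

# Core RRM topology from the text: beta-alpha-beta-beta-alpha-beta.
RRM_CORE = "EHEEHE"


def element_order(ss):
    """Collapse a per-residue string into its ordered elements.

    Runs of identical letters become one element; coil letters are dropped,
    so 'EEECCHHH' becomes 'EH'.
    """
    return "".join(key for key, _ in groupby(ss) if key in "EH")


def has_rrm_core(ss):
    """True if the element order contains the beta-alpha-beta-beta-alpha-beta core."""
    return RRM_CORE in element_order(ss)


if __name__ == "__main__":
    # Hypothetical secondary-structure string, not taken from a real structure.
    example = "CEEEECCHHHHHHCCEEEECCEEEECCHHHHHCCEEEEC"
    print(element_order(example), has_rrm_core(example))
```

Using a substring test rather than strict equality means a domain with an extra C-terminal helix (order `EHEEHEH`) still passes, which fits the remark that a third helix is present in some cases.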

Human proteins containing this domain

A2BP1; ACF; BOLL; BRUNOL4; BRUNOL5; BRUNOL6; CCBL2; CGI-96; CIRBP; CNOT4; CPEB2; CPEB3; CPEB4; CPSF7; CSTF2; CSTF2T; CUGBP1; CUGBP2; D10S102; DAZ1; DAZ2; DAZ3; DAZ4; DAZAP1; DAZL; DNAJC17; DND1; EIF3S4; EIF3S9; EIF4B; EIF4H; ELAVL1; ELAVL2; ELAVL3; ELAVL4; ENOX1; ENOX2; EWSR1; FUS; FUSIP1; G3BP; G3BP1; G3BP2; GRSF1; HNRNPL; HNRPA0; HNRPA1; HNRPA2B1; HNRPA3; HNRPAB; HNRPC; HNRPCL1; HNRPD; HNRPDL; HNRPF; HNRPH1; HNRPH2; HNRPH3; HNRPL; HNRPLL; HNRPM; HNRPR; HRNBP1; HSU53209; HTATSF1; IGF2BP1; IGF2BP2; IGF2BP3; LARP7; MKI67IP; MSI1; MSI2; MSSP-2; MTHFSD; MYEF2; NCBP2; NCL; NOL8; NONO; P14; PABPC1; PABPC1L; PABPC3; PABPC4; PABPC5; PABPN1; POLDIP3; PPARGC1; PPARGC1A; PPARGC1B; PPIE; PPIL4; PPRC1; PSPC1; PTBP1; PTBP2; PUF60; RALY; RALYL; RAVER1; RAVER2; RBM10; RBM11; RBM12; RBM12B; RBM14; RBM15; RBM15B; RBM16; RBM17; RBM18; RBM19; RBM22; RBM23; RBM24; RBM25; RBM26; RBM27; RBM28; RBM3; RBM32B; RBM33; RBM34; RBM35A; RBM35B; RBM38; RBM39; RBM4; RBM41; RBM42; RBM44; RBM45; RBM46; RBM47; RBM4B; RBM5; RBM7; RBM8A; RBM9; RBMS1; RBMS2; RBMS3; RBMX; RBMX2; RBMXL2; RBMY1A1; RBMY1B; RBMY1E; RBMY1F; RBMY2FP; RBPMS; RBPMS2; RDBP; RNPC3; RNPC4; RNPS1; ROD1; SAFB; SAFB2; SART3; SETD1A; SF3B6; SF3B4; SFPQ; SFRS1; SFRS10; SFRS11; SFRS12; SFRS15; SRSF2; SFRS2B; SFRS3; SFRS4; SFRS5; SFRS6; SFRS7; SFRS9; SLIRP; SLTM; SNRP70; SNRPA; SNRPB2; SPEN; SR140; SRRP35; SSB; SYNCRIP; TAF15; TARDBP; THOC4; TIA1; TIAL1; TNRC4; TNRC6C; TRA2A; TRSPAP1; TUT1; U1SNRNPBP; U2AF1; U2AF2; UHMK1; ZCRB1; ZNF638; ZRSR1; ZRSR2;

Related Research Articles

<span class="mw-page-title-main">RNA splicing</span> Process in molecular biology

RNA splicing is a process in molecular biology where a newly-made precursor messenger RNA (pre-mRNA) transcript is transformed into a mature messenger RNA (mRNA). It works by removing all the introns and splicing back together exons. For nuclear-encoded genes, splicing occurs in the nucleus either during or immediately after transcription. For those eukaryotic genes that contain introns, splicing is usually needed to create an mRNA molecule that can be translated into protein. For many eukaryotic introns, splicing occurs in a series of reactions which are catalyzed by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs). There exist self-splicing introns, that is, ribozymes that can catalyze their own excision from their parent RNA molecule. The process of transcription, splicing and translation is called gene expression, the central dogma of molecular biology.

<span class="mw-page-title-main">Spliceosome</span> Molecular machine that removes intron RNA from the primary transcript

A spliceosome is a large ribonucleoprotein (RNP) complex found primarily within the nucleus of eukaryotic cells. The spliceosome is assembled from small nuclear RNAs (snRNA) and numerous proteins. Small nuclear RNA (snRNA) molecules bind to specific proteins to form a small nuclear ribonucleoprotein complex, which in turn combines with other snRNPs to form a large ribonucleoprotein complex called a spliceosome. The spliceosome removes introns from a transcribed pre-mRNA, a type of primary transcript. This process is generally referred to as splicing. An analogy is a film editor, who selectively cuts out irrelevant or incorrect material from the initial film and sends the cleaned-up version to the director for the final cut.

<span class="mw-page-title-main">SR protein</span>

SR proteins are a conserved family of proteins involved in RNA splicing. SR proteins are named because they contain a protein domain with long repeats of serine and arginine amino acid residues, whose standard abbreviations are "S" and "R" respectively. SR proteins are ~200-600 amino acids in length and composed of two domains, the RNA recognition motif (RRM) region and the RS domain. SR proteins are more commonly found in the nucleus than the cytoplasm, but several SR proteins are known to shuttle between the nucleus and the cytoplasm.

<span class="mw-page-title-main">Nucleoprotein</span> Type of protein

Nucleoproteins are proteins conjugated with nucleic acids. Typical nucleoproteins include ribosomes, nucleosomes and viral nucleocapsid proteins.

RNA-binding proteins (RBPs) are proteins that bind double- or single-stranded RNA in cells and participate in forming ribonucleoprotein complexes. RBPs contain various structural motifs, such as the RNA recognition motif (RRM), the dsRNA-binding domain, zinc fingers and others. They are cytoplasmic and nuclear proteins. However, since most mature RNA is exported from the nucleus relatively quickly, most RBPs in the nucleus exist as complexes of protein and pre-mRNA called heterogeneous ribonucleoprotein particles (hnRNPs). RBPs have crucial roles in various cellular processes, such as cellular function, transport and localization. They play an especially major role in post-transcriptional control of RNAs, including splicing, polyadenylation, mRNA stabilization, mRNA localization and translation. Eukaryotic cells express diverse RBPs with unique RNA-binding activity and protein–protein interaction. According to the Eukaryotic RBP Database (EuRBPDB), there are 2961 genes encoding RBPs in humans. During evolution, the diversity of RBPs greatly increased with the increase in the number of introns. Diversity enabled eukaryotic cells to utilize RNA exons in various arrangements, giving rise to a unique RNP (ribonucleoprotein) for each RNA. Although RBPs have a crucial role in post-transcriptional regulation of gene expression, relatively few RBPs have been studied systematically. It has now become clear that RNA–RBP interactions play important roles in many biological processes among organisms.

Small nuclear RNA (snRNA) is a class of small RNA molecules that are found within the splicing speckles and Cajal bodies of the cell nucleus in eukaryotic cells. The length of an average snRNA is approximately 150 nucleotides. They are transcribed by either RNA polymerase II or RNA polymerase III. Their primary function is in the processing of pre-messenger RNA (hnRNA) in the nucleus. They have also been shown to aid in the regulation of transcription factors or RNA polymerase II, and maintaining the telomeres.

<span class="mw-page-title-main">LSm</span> Family of RNA-binding proteins

In molecular biology, LSm proteins are a family of RNA-binding proteins found in virtually every cellular organism. LSm is a contraction of 'like Sm', because the first identified members of the LSm protein family were the Sm proteins. LSm proteins are defined by a characteristic three-dimensional structure and their assembly into rings of six or seven individual LSm protein molecules, and play a large number of various roles in mRNA processing and regulation.

<span class="mw-page-title-main">U11 spliceosomal RNA</span> Non-coding RNA involved in alternative splicing

The U11 snRNA is an important non-coding RNA in the minor spliceosome protein complex, which activates the alternative splicing mechanism. The minor spliceosome is associated with similar protein components as the major spliceosome. It uses U11 snRNA to recognize the 5' splice site while U12 snRNA binds to the branchpoint to recognize the 3' splice site.

<span class="mw-page-title-main">U1 spliceosomal RNA</span>

U1 spliceosomal RNA is the small nuclear RNA (snRNA) component of U1 snRNP, an RNA-protein complex that combines with other snRNPs, unmodified pre-mRNA, and various other proteins to assemble a spliceosome, a large RNA-protein molecular complex upon which splicing of pre-mRNA occurs. Splicing, or the removal of introns, is a major aspect of post-transcriptional modification, and takes place only in the nucleus of eukaryotes.

<span class="mw-page-title-main">U4 spliceosomal RNA</span> Non-coding RNA component of the spliceosome

The U4 small nuclear ribonucleic acid (U4 snRNA) is a non-coding RNA component of the major or U2-dependent spliceosome, a eukaryotic molecular machine involved in the splicing of pre-messenger RNA (pre-mRNA). It forms a duplex with U6, and with each splicing round, it is displaced from the U6 snRNA in an ATP-dependent manner, allowing U6 to re-fold and create the active site for splicing catalysis. A recycling process involving protein Brr2 releases U4 from U6, while protein Prp24 re-anneals U4 and U6. The crystal structure of a 5′ stem-loop of U4 in complex with a binding protein has been solved.

<span class="mw-page-title-main">Fibrillarin</span> Protein-coding gene in the species Homo sapiens

rRNA 2'-O-methyltransferase fibrillarin is an enzyme that in humans is encoded by the FBL gene.

<span class="mw-page-title-main">U2AF2</span> Protein-coding gene in the species Homo sapiens

Splicing factor U2AF 65 kDa subunit is a protein that in humans is encoded by the U2AF2 gene.

snRNP70 Protein-coding gene in the species Homo sapiens

snRNP70, also known as U1 small nuclear ribonucleoprotein 70 kDa, is a protein that in humans is encoded by the SNRNP70 gene. snRNP70 is a small nuclear ribonucleoprotein that associates with U1 spliceosomal RNA, forming the U1 snRNP, a core component of the spliceosome. The U1-70K protein and other components of the spliceosome complex form detergent-insoluble aggregates in both sporadic and familial human cases of Alzheimer's disease. U1-70K co-localizes with Tau in neurofibrillary tangles in Alzheimer's disease.

<span class="mw-page-title-main">DEAD box</span> Family of proteins

DEAD box proteins are involved in an assortment of metabolic processes that typically involve RNAs, but in some cases also other nucleic acids. They are highly conserved in nine motifs and can be found in most prokaryotes and eukaryotes, but not all. Many organisms, including humans, contain DEAD-box (SF2) helicases, which are involved in RNA metabolism.

<span class="mw-page-title-main">SNRPB</span> Protein-coding gene in the species Homo sapiens

Small nuclear ribonucleoprotein-associated proteins B and B' is a protein that in humans is encoded by the SNRPB gene.

<span class="mw-page-title-main">Small nuclear ribonucleoprotein polypeptide A</span> Protein-coding gene in the species Homo sapiens

U1 small nuclear ribonucleoprotein A is a protein that in humans is encoded by the SNRPA gene.

<span class="mw-page-title-main">Small nuclear ribonucleoprotein polypeptide C</span> Protein-coding gene in the species Homo sapiens

U1 small nuclear ribonucleoprotein C is a protein that in humans is encoded by the SNRPC gene.

<span class="mw-page-title-main">Prp24</span>

Prp24 is a protein that takes part in pre-messenger RNA splicing and aids the binding of U6 snRNA to U4 snRNA during the formation of spliceosomes. Found in eukaryotes from yeast and other fungi to humans, Prp24 was initially discovered to be an important element of RNA splicing in 1989. Mutations in Prp24 were later discovered, in 1991, to suppress mutations in U4 that resulted in cold-sensitive strains of yeast, indicating its involvement in the reformation of the U4/U6 duplex after the catalytic steps of splicing.

<span class="mw-page-title-main">La domain</span>

In molecular biology, the La domain is a conserved protein domain. Human 60 kDa SS-A/Ro ribonucleoproteins (RNPs) are composed of one of the four small Y RNAs and at least two proteins, Ro60 and La. The La protein is a 47 kDa polypeptide that frequently acts as an autoantigen in systemic lupus erythematosus and Sjögren syndrome. In the nucleus, La acts as an RNA polymerase III transcription factor, while in the cytoplasm, La acts as a translation factor. In the nucleus, La binds to the 3'UTR of nascent RNAP III transcripts to assist in folding and maturation. In the cytoplasm, La recognises specific classes of mRNAs that contain a 5'-terminal oligopyrimidine (5'TOP) motif known to control protein synthesis. The specific recognition is mediated by the N-terminal domain of La, which comprises a La motif and an RNA recognition motif (RRM). The La motif adopts an alpha/beta fold that comprises a winged-helix motif.

<span class="mw-page-title-main">Kiyoshi Nagai</span> Japanese structural biologist (1949–2019)

Kiyoshi Nagai was a Japanese structural biologist at the MRC Laboratory of Molecular Biology Cambridge, UK. He was known for his work on the mechanism of RNA splicing and structures of the spliceosome.

References

  1. Swanson MS, Dreyfuss G, Pinol-Roma S (1988). "Heterogeneous nuclear ribonucleoprotein particles and the pathway of mRNA formation". Trends Biochem. Sci. 13 (3): 86–91. doi:10.1016/0968-0004(88)90046-1. PMID 3072706.
  2. Keene JD, Chambers JC, Kenan D, Martin BJ (1988). "Genomic structure and amino acid sequence domains of the human La autoantigen". J. Biol. Chem. 263 (34): 18043–51. doi:10.1016/S0021-9258(19)81321-2. PMID 3192525.
  3. Davis RW, Sachs AB, Kornberg RD (1987). "A single domain of yeast poly(A)-binding protein is necessary and sufficient for RNA binding and cell viability". Mol. Cell. Biol. 7 (9): 3268–76. doi:10.1128/mcb.7.9.3268. PMC 367964. PMID 3313012.
  4. Bandziulis RJ, Swanson MS, Dreyfuss G (1989). "RNA-binding proteins as developmental regulators". Genes Dev. 3 (4): 431–437. doi:10.1101/gad.3.4.431. PMID 2470643.
  5. Keene JD, Query CC, Bentley RC (1989). "A common RNA recognition motif identified within a defined U1 RNA binding domain of the 70K U1 snRNP protein". Cell. 57 (1): 89–101. doi:10.1016/0092-8674(89)90175-X. PMID 2467746. S2CID 22127152.
  6. Green MR, Kielkopf CL, Lucke S (2004). "U2AF homology motifs: protein recognition in the RRM world". Genes Dev. 18 (13): 1513–1526. doi:10.1101/gad.1206204. PMC 2043112. PMID 15231733.
  7. Kumar S, Birney E, Krainer AR (1993). "Analysis of the RNA-recognition motif and RS and RGG domains: conservation in metazoan pre-mRNA splicing factors". Nucleic Acids Res. 21 (25): 5803–5816. doi:10.1093/nar/21.25.5803. PMC 310458. PMID 8290338.
  8. Keene JD, Kenan DJ, Query CC (1991). "RNA recognition: towards identifying determinants of specificity". Trends Biochem. Sci. 16 (6): 214–20. doi:10.1016/0968-0004(91)90088-d. PMID 1716386.
  9. Allain FH, Dominguez C, Maris C (2005). "The RNA recognition motif, a plastic RNA-binding platform to regulate post-transcriptional gene expression". FEBS J. 272 (9): 2118–31. doi:10.1111/j.1742-4658.2005.04653.x. PMID 15853797. S2CID 46680279.
  10. Teplova M, Yuan YR, Patel DJ, Malinina L, Teplov A, Phan AT, Ilin S (2006). "Structural basis for recognition and sequestration of UUU(OH) 3' termini of nascent RNA polymerase III transcripts by La, a rheumatic disease autoantigen". Mol. Cell. 21 (1): 75–85. doi:10.1016/j.molcel.2005.10.027. PMC 4689297. PMID 16387655.
This article incorporates text from the public domain Pfam and InterPro: IPR000504