DNA-directed RNA polymerase | |
---|---|
Identifiers | |
EC no. | 2.7.7.6 |
CAS no. | 9014-24-8 |
Databases | |
IntEnz | IntEnz view |
BRENDA | BRENDA entry |
ExPASy | NiceZyme view |
KEGG | KEGG entry |
MetaCyc | metabolic pathway |
PRIAM | profile |
PDB structures | RCSB PDB PDBe PDBsum |
Gene Ontology | AmiGO / QuickGO |
In molecular biology, RNA polymerase (abbreviated RNAP or RNApol), or more specifically DNA-directed/dependent RNA polymerase (DdRP), is an enzyme that catalyzes the chemical reactions that synthesize RNA from a DNA template.
Using its intrinsic helicase activity, RNAP locally opens the double-stranded DNA so that one strand of the exposed nucleotides can be used as a template for the synthesis of RNA, a process called transcription. A transcription factor and its associated transcription mediator complex must be attached to a DNA binding site called a promoter region before RNAP can initiate the DNA unwinding at that position. RNAP not only initiates RNA transcription; it also guides the nucleotides into position, facilitates attachment and elongation, and has intrinsic proofreading, replacement, and termination-recognition capabilities. In eukaryotes, RNAP can build chains as long as 2.4 million nucleotides.
RNAP produces RNA that, functionally, is either for protein coding, i.e. messenger RNA (mRNA); or non-coding (so-called "RNA genes"). Examples of four functional types of RNA genes are:
RNA polymerase is essential to life, and is found in all living organisms and many viruses. Depending on the organism, an RNA polymerase can be a protein complex (multi-subunit RNAP) or consist of only one subunit (single-subunit RNAP, ssRNAP), each representing an independent lineage. The former is found in bacteria, archaea, and eukaryotes alike, sharing a similar core structure and mechanism. [1] The latter is found in phages as well as eukaryotic chloroplasts and mitochondria, and is related to modern DNA polymerases. [2] Eukaryotic and archaeal RNAPs have more subunits than bacterial ones do, and are controlled differently.
Bacteria and archaea only have one RNA polymerase. Eukaryotes have multiple types of nuclear RNAP, each responsible for synthesis of a distinct subset of RNA:
The 2006 Nobel Prize in Chemistry was awarded to Roger D. Kornberg for creating detailed molecular images of RNA polymerase during various stages of the transcription process. [3] [4]
In most prokaryotes, a single RNA polymerase species transcribes all types of RNA. RNA polymerase "core" from E. coli consists of five subunits: two alpha (α) subunits of 36 kDa, a beta (β) subunit of 150 kDa, a beta prime subunit (β′) of 155 kDa, and a small omega (ω) subunit. A sigma (σ) factor binds to the core, forming the holoenzyme. After transcription starts, the factor can unbind and let the core enzyme proceed with its work. [5] [6] The core RNA polymerase complex forms a "crab claw" or "clamp-jaw" structure with an internal channel running along the full length. [7] Eukaryotic and archaeal RNA polymerases have a similar core structure and work in a similar manner, although they have many extra subunits. [8]
All RNAPs contain metal cofactors, in particular zinc and magnesium cations which aid in the transcription process. [9] [10]
Control of the process of gene transcription affects patterns of gene expression and, thereby, allows a cell to adapt to a changing environment, perform specialized roles within an organism, and maintain basic metabolic processes necessary for survival. Therefore, it is hardly surprising that the activity of RNAP is complex and highly regulated. In Escherichia coli bacteria, more than 100 transcription factors that modify the activity of RNAP have been identified. [11]
RNAP can initiate transcription at specific DNA sequences known as promoters. It then produces an RNA chain, which is complementary to the template DNA strand. The process of adding nucleotides to the RNA strand is known as elongation; in eukaryotes, RNAP can build chains as long as 2.4 million nucleotides (the full length of the dystrophin gene). RNAP will preferentially release its RNA transcript at specific DNA sequences encoded at the end of genes, which are known as terminators.
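The complementarity between the template strand and the RNA product can be illustrated with a minimal Python sketch. The function name `transcribe` and the example sequence are hypothetical; the only biology assumed is standard Watson-Crick pairing, with uracil replacing thymine in RNA and the template read 3′→5′ while the RNA grows 5′→3′.

```python
# Watson-Crick pairing between a DNA template base and the RNA base added opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (written 5'->3') complementary to a DNA template strand.

    The template is given in the 3'->5' direction, i.e. in the order in which
    the polymerase reads it, so no reversal is needed.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


# Hypothetical 9-nt template used purely for illustration.
print(transcribe("TACGGCATT"))
```

This toy model covers only base selection during elongation; promoter recognition, proofreading, and termination are not represented.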
Products of RNAP include:
RNAP accomplishes de novo synthesis. It is able to do this because specific interactions with the initiating nucleotide hold RNAP rigidly in place, facilitating chemical attack on the incoming nucleotide. Such specific interactions explain why RNAP prefers to start transcripts with ATP (followed by GTP, UTP, and then CTP). In contrast to DNA polymerase, RNAP includes helicase activity, so no separate enzyme is needed to unwind DNA.
RNA polymerase binding in bacteria involves the sigma factor recognizing the core promoter region containing the −35 and −10 elements (located before the beginning of sequence to be transcribed) and also, at some promoters, the α subunit C-terminal domain recognizing promoter upstream elements. [12] There are multiple interchangeable sigma factors, each of which recognizes a distinct set of promoters. For example, in E. coli, σ70 is expressed under normal conditions and recognizes promoters for genes required under normal conditions ("housekeeping genes"), while σ32 recognizes promoters for genes required at high temperatures ("heat-shock genes"). In archaea and eukaryotes, the functions of the bacterial general transcription factor sigma are performed by multiple general transcription factors that work together. The RNA polymerase-promoter closed complex is usually referred to as the "transcription preinitiation complex." [13] [14]
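As a rough illustration of how a sigma factor's two recognition elements constrain where a promoter can lie, the sketch below scans a DNA string for a −35-like and a −10-like hexamer separated by a fixed-length spacer. The consensus hexamers (TTGACA and TATAAT) and the 16 to 18 bp spacer are the commonly cited textbook values for E. coli σ70 and are not stated in the text above; the function names, mismatch threshold, and scoring are assumptions for illustration, not a validated promoter predictor.

```python
# Commonly cited sigma-70 consensus hexamers (an assumption, not taken from the text above).
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, spacers=(16, 17, 18), max_mismatches=2):
    """Return (pos_35, pos_10, spacer, mismatches) for each promoter-like site.

    Positions are 0-based indices of the first base of each hexamer on the
    non-template strand, written 5'->3'.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(MINUS_35) + 1):
        box35 = seq[i:i + 6]
        for spacer in spacers:
            j = i + 6 + spacer
            box10 = seq[j:j + 6]
            if len(box10) < 6:
                continue  # the -10 hexamer would run off the end of the sequence
            total = mismatches(box35, MINUS_35) + mismatches(box10, MINUS_10)
            if total <= max_mismatches:
                hits.append((i, j, spacer, total))
    return hits


# Hypothetical test sequence: perfect hexamers separated by a 17-bp spacer.
example = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GGGG"
print(find_promoters(example))
```

Swapping in a different pair of hexamers would model, very crudely, how an alternative sigma factor such as σ32 directs the same core enzyme to a different set of promoters.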
After binding to the DNA, the RNA polymerase switches from a closed complex to an open complex. This change involves the separation of the DNA strands to form an unwound section of DNA of approximately 13 bp, referred to as the "transcription bubble". Supercoiling plays an important part in polymerase activity because of the unwinding and rewinding of DNA. Because regions of DNA in front of RNAP are unwound, there are compensatory positive supercoils. Regions behind RNAP are rewound and negative supercoils are present. [14]
RNA polymerase then starts to synthesize the initial DNA-RNA heteroduplex, with ribonucleotides base-paired to the template DNA strand according to Watson-Crick base-pairing interactions. As noted above, RNA polymerase makes contacts with the promoter region. However, these stabilizing contacts inhibit the enzyme's ability to access DNA further downstream and thus the synthesis of the full-length product. In order to continue RNA synthesis, RNA polymerase must escape the promoter. It must maintain promoter contacts while unwinding more downstream DNA for synthesis, "scrunching" more downstream DNA into the initiation complex. [15] During the promoter escape transition, RNA polymerase is considered a "stressed intermediate." Thermodynamically, the stress accumulates from the DNA-unwinding and DNA-compaction activities. Once the DNA-RNA heteroduplex is long enough (~10 bp), RNA polymerase releases its upstream contacts and effectively achieves the promoter escape transition into the elongation phase. The heteroduplex at the active center stabilizes the elongation complex.
However, promoter escape is not the only outcome. RNA polymerase can also relieve the stress by releasing its downstream contacts, arresting transcription. The paused transcribing complex has two options: (1) release the nascent transcript and begin anew at the promoter or (2) reestablish a new 3′-OH on the nascent transcript at the active site via RNA polymerase's catalytic activity and recommence DNA scrunching to achieve promoter escape. Abortive initiation, the unproductive cycling of RNA polymerase before the promoter escape transition, results in short RNA fragments of around 9 nt in a process known as abortive transcription. The extent of abortive initiation depends on the presence of transcription factors and the strength of the promoter contacts. [16]
The 17-bp transcriptional complex has an 8-bp DNA-RNA hybrid, that is, 8 base-pairs involve the RNA transcript bound to the DNA template strand. [17] As transcription progresses, ribonucleotides are added to the 3′ end of the RNA transcript and the RNAP complex moves along the DNA. The characteristic elongation rates in prokaryotes and eukaryotes are about 10–100 nts/sec. [18]
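Combining this elongation rate with the 2.4-million-nucleotide transcript length mentioned earlier gives a sense of scale. The following back-of-the-envelope Python sketch assumes a constant rate with no pausing, backtracking, or co-transcriptional processing, so it is only an order-of-magnitude estimate; the function name is hypothetical.

```python
def transcription_time_hours(length_nt: int, rate_nt_per_s: float) -> float:
    """Time to synthesize a transcript at a constant elongation rate, in hours."""
    return length_nt / rate_nt_per_s / 3600


# 2.4 million nt at the two ends of the 10-100 nt/s range quoted above.
for rate in (10, 100):
    hours = transcription_time_hours(2_400_000, rate)
    print(f"{rate} nt/s -> {hours:.1f} h")
```

Under these assumptions the longest transcripts take somewhere between several hours and a few days to complete.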
Aspartyl (asp) residues in the RNAP will hold on to Mg²⁺ ions, which will, in turn, coordinate the phosphates of the ribonucleotides. The first Mg²⁺ will hold on to the α-phosphate of the NTP to be added. This allows the nucleophilic attack of the 3′-OH from the RNA transcript, adding another NTP to the chain. The second Mg²⁺ will hold on to the pyrophosphate of the NTP. [19] The overall reaction equation is:
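In the usual notation for nucleotidyl transfer (given here as a reconstruction, with $(\mathrm{NMP})_n$ denoting an RNA chain of $n$ nucleotides, NTP the incoming nucleoside triphosphate, and $\mathrm{PP_i}$ the released pyrophosphate), each elongation step can be written as:

$$
(\mathrm{NMP})_n + \mathrm{NTP} \longrightarrow (\mathrm{NMP})_{n+1} + \mathrm{PP_i}
$$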
Unlike the proofreading mechanisms of DNA polymerase, those of RNAP have only recently been investigated. Proofreading begins with separation of the mis-incorporated nucleotide from the DNA template. This pauses transcription. The polymerase then backtracks by one position and cleaves the dinucleotide that contains the mismatched nucleotide. In RNA polymerase, this occurs at the same active site used for polymerization and is therefore markedly different from DNA polymerase, where proofreading occurs at a distinct nuclease active site. [20]
The overall error rate is around 10⁻⁴ to 10⁻⁶. [21]
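To see what such error rates mean for a single transcript, the sketch below computes the expected number of misincorporations and the probability that a transcript is error-free. It assumes errors occur independently at every position with the same probability, which is a simplification; the function names and the 1,000-nt example length are hypothetical.

```python
def expected_errors(length_nt: int, error_rate: float) -> float:
    """Expected number of misincorporated nucleotides in one transcript."""
    return length_nt * error_rate


def prob_error_free(length_nt: int, error_rate: float) -> float:
    """Probability that no position in the transcript carries an error."""
    return (1 - error_rate) ** length_nt


# Hypothetical 1,000-nt transcript at both ends of the quoted error-rate range.
for rate in (1e-4, 1e-6):
    print(rate, expected_errors(1000, rate), round(prob_error_free(1000, rate), 4))
```

Because each transcript is only a temporary copy, an occasional error of this kind is generally far less consequential than an error in DNA replication.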
In bacteria, termination of RNA transcription can be rho-dependent or rho-independent. The former relies on the rho factor, which destabilizes the DNA-RNA heteroduplex and causes RNA release. [22] The latter, also known as intrinsic termination, relies on a palindromic region of DNA. Transcribing the region causes the formation of a "hairpin" structure from the RNA transcript looping and binding upon itself. This hairpin structure is often rich in G-C base-pairs, making it more stable than the DNA-RNA hybrid itself. As a result, the 8 bp DNA-RNA hybrid in the transcription complex shifts to a 4 bp hybrid. These last 4 base pairs are weak A-U base pairs, and the entire RNA transcript will fall off the DNA. [23]
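The signature of an intrinsic terminator described above (a G-C-rich self-complementary stem followed by a run of uracils) can be searched for with a simple pattern scan. The sketch below is a toy illustration only: the stem length, loop sizes, G-C threshold, and function names are all assumptions, it requires a perfectly paired stem, and it performs no thermodynamic folding as real terminator-prediction tools do.

```python
# RNA base pairing used to test whether two stretches can form a stem.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (unknown bases become N)."""
    return "".join(PAIR.get(base, "N") for base in reversed(rna))


def find_terminators(rna, stem=6, loops=(3, 4, 5), min_gc=0.6, min_u=4):
    """Return (start, loop_length) for each perfect hairpin followed by a U-tract.

    A hit needs a stem whose left arm is at least `min_gc` G/C, a right arm
    that is the exact reverse complement of the left arm, and `min_u`
    consecutive U residues immediately after the right arm.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - stem + 1):
        left = rna[i:i + stem]
        if sum(base in "GC" for base in left) / stem < min_gc:
            continue
        for loop in loops:
            r = i + stem + loop
            right = rna[r:r + stem]
            tail = rna[r + stem:r + stem + min_u]
            if right == reverse_complement(left) and tail == "U" * min_u:
                hits.append((i, loop))
    return hits


# Hypothetical transcript: GC-rich stem, 4-nt loop, matching stem, then a U-tract.
example = "AA" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU" + "A"
print(find_terminators(example))
```

The default `min_u=4` mirrors the four weak A-U base pairs mentioned above; real terminators tolerate mismatches and bulges in the stem, which this sketch does not.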
Transcription termination in eukaryotes is less well understood than in bacteria, but involves cleavage of the new transcript followed by template-independent addition of adenines at its new 3′ end, in a process called polyadenylation. [24]
Given that DNA and RNA polymerases both carry out template-dependent nucleotide polymerization, it might be expected that the two types of enzymes would be structurally related. However, X-ray crystallographic studies of both types of enzymes reveal that, other than containing a critical Mg²⁺ ion at the catalytic site, they are virtually unrelated to each other; indeed, template-dependent nucleotide polymerizing enzymes seem to have arisen independently twice during the early evolution of cells. One lineage led to the modern DNA polymerases and reverse transcriptases, as well as to a few single-subunit RNA polymerases (ssRNAP) from phages and organelles. [2] The other multi-subunit RNAP lineage formed all of the modern cellular RNA polymerases. [25] [1]
In bacteria, the same enzyme catalyzes the synthesis of mRNA and non-coding RNA (ncRNA).
RNAP is a large molecule. The core enzyme has five subunits (~400 kDa): [26]
In order to bind promoters, RNAP core associates with the transcription initiation factor sigma (σ) to form RNA polymerase holoenzyme. Sigma reduces the affinity of RNAP for nonspecific DNA while increasing specificity for promoters, allowing transcription to initiate at correct sites. The complete holoenzyme therefore has 6 subunits: β′, β, αI, αII, ω and σ (~450 kDa).
Eukaryotes have multiple types of nuclear RNAP, each responsible for synthesis of a distinct subset of RNA. All are structurally and mechanistically related to each other and to bacterial RNAP:
Eukaryotic chloroplasts contain a multi-subunit RNAP ("PEP, plastid-encoded polymerase"). Due to its bacterial origin, the organization of PEP resembles that of current bacterial RNA polymerases: it is encoded by the RPOA, RPOB, RPOC1 and RPOC2 genes on the plastome, which as proteins form the core subunits of PEP, respectively named α, β, β′ and β″. [35] Similar to the RNA polymerase in E. coli, PEP requires the presence of sigma (σ) factors for the recognition of its promoters, containing the −10 and −35 motifs. [36] Despite the many commonalities between plant organellar and bacterial RNA polymerases and their structure, PEP additionally requires the association of a number of nuclear encoded proteins, termed PAPs (PEP-associated proteins), which form essential components that are closely associated with the PEP complex in plants. Initially, a group consisting of 10 PAPs was identified through biochemical methods, which was later extended to 12 PAPs. [37] [38]
Chloroplasts also contain a second, structurally and mechanistically unrelated, single-subunit RNAP ("nucleus-encoded polymerase, NEP"). Eukaryotic mitochondria use POLRMT (human), a nucleus-encoded single-subunit RNAP. [2] Such phage-like polymerases are referred to as RpoT in plants. [39]
Archaea have a single type of RNAP, responsible for the synthesis of all RNA. Archaeal RNAP is structurally and mechanistically similar to bacterial RNAP and eukaryotic nuclear RNAP I-V, and is especially closely structurally and mechanistically related to eukaryotic nuclear RNAP II. [8] [40] The history of the discovery of the archaeal RNA polymerase is quite recent. The first analysis of the RNAP of an archaeon was performed in 1971, when the RNAP from the extreme halophile Halobacterium cutirubrum was isolated and purified. [41] Crystal structures of RNAPs from Sulfolobus solfataricus and Sulfolobus shibatae set the total number of identified archaeal subunits at thirteen. [8] [42]
In archaea, the subunit corresponding to eukaryotic Rpb1 is split into two. There is no homolog to eukaryotic Rpb9 (POLR2I) in the S. shibatae complex, although TFS (TFIIS homolog) has been proposed as one based on similarity. There is an additional subunit dubbed Rpo13; together with Rpo5 it occupies a space filled by an insertion found in bacterial β′ subunits (1,377–1,420 in Taq). [8] An earlier, lower-resolution study on S. solfataricus structure did not find Rpo13 and only assigned the space to Rpo5/Rpb5. Rpo3 is notable in that it is an iron–sulfur protein. The RNAP I/III subunit AC40 found in some eukaryotes shares similar sequences, [42] but does not bind iron. [43] This domain, in either case, serves a structural function. [44]
Archaeal RNAP subunits previously used an "RpoX" nomenclature in which each subunit is assigned a letter in a way unrelated to any other system. [1] In 2009, a new nomenclature based on eukaryotic Pol II subunit "Rpb" numbering was proposed. [8]
Orthopoxviruses and some other nucleocytoplasmic large DNA viruses synthesize RNA using a virally encoded multi-subunit RNAP. These enzymes are most similar to eukaryotic RNAPs, with some subunits reduced in size or removed. [45] Exactly which RNAP they are most similar to is a topic of debate. [46] Most other viruses that synthesize RNA use unrelated mechanisms.
Many viruses use a single-subunit DNA-dependent RNAP (ssRNAP) that is structurally and mechanistically related to the single-subunit RNAP of eukaryotic chloroplasts (RpoT) and mitochondria (POLRMT) and, more distantly, to DNA polymerases and reverse transcriptases. Perhaps the most widely studied such single-subunit RNAP is bacteriophage T7 RNA polymerase. ssRNAPs cannot proofread. [2]
B. subtilis prophage SPβ uses YonO, a homolog of the β+β′ subunits of msRNAPs to form a monomeric (both barrels on the same chain) RNAP distinct from the usual "right hand" ssRNAP. It probably diverged very long ago from the canonical five-unit msRNAP, before the time of the last universal common ancestor. [47] [48]
Other viruses use an RNA-dependent RNAP (an RNAP that employs RNA as a template instead of DNA). This occurs in negative strand RNA viruses and dsRNA viruses, both of which exist for a portion of their life cycle as double-stranded RNA. However, some positive strand RNA viruses, such as poliovirus, also contain RNA-dependent RNAP. [49]
RNAP was discovered independently by Charles Loe, Audrey Stevens, and Jerard Hurwitz in 1960. [50] By this time, one half of the 1959 Nobel Prize in Medicine had been awarded to Severo Ochoa for the discovery of what was believed to be RNAP, [51] but instead turned out to be polynucleotide phosphorylase.
RNA polymerase can be isolated in the following ways:
And also combinations of the above techniques.
In genetics, a promoter is a sequence of DNA to which proteins bind to initiate transcription of a single RNA transcript from the DNA downstream of the promoter. The RNA transcript may encode a protein (mRNA), or can have a function in and of itself, such as tRNA or rRNA. Promoters are located near the transcription start sites of genes, upstream on the DNA. Promoters can be about 100–1000 base pairs long, the sequence of which is highly dependent on the gene and product of transcription, type or class of RNA polymerase recruited to the site, and species of organism.
Transcription is the process of copying a segment of DNA into RNA. The segments of DNA transcribed into RNA molecules that can encode proteins produce messenger RNA (mRNA). Other segments of DNA are transcribed into RNA molecules called non-coding RNAs (ncRNAs).
A DNA polymerase is a member of a family of enzymes that catalyze the synthesis of DNA molecules from nucleoside triphosphates, the molecular precursors of DNA. These enzymes are essential for DNA replication and usually work in groups to create two identical DNA duplexes from a single original DNA duplex. During this process, DNA polymerase "reads" the existing DNA strands to create two new strands that match the existing ones. These enzymes catalyze the chemical reaction
DNA primase is an enzyme involved in the replication of DNA and is a type of RNA polymerase. Primase catalyzes the synthesis of a short RNA segment called a primer complementary to a ssDNA template. After this elongation, the RNA piece is removed by a 5' to 3' exonuclease and refilled with DNA.
A sigma factor is a protein needed for initiation of transcription in bacteria. It is a bacterial transcription initiation factor that enables specific binding of RNA polymerase (RNAP) to gene promoters. It is homologous to archaeal transcription factor B and to eukaryotic factor TFIIB. The specific sigma factor used to initiate transcription of a given gene will vary, depending on the gene and on the environmental signals needed to initiate transcription of that gene. Selection of promoters by RNA polymerase is dependent on the sigma factor that associates with it. They are also found in plant chloroplasts as a part of the bacteria-like plastid-encoded polymerase (PEP).
The preinitiation complex is a complex of approximately 100 proteins that is necessary for the transcription of protein-coding genes in eukaryotes and archaea. The preinitiation complex positions RNA polymerase II at gene transcription start sites, denatures the DNA, and positions the DNA in the RNA polymerase II active site for transcription.
RNA polymerase II is a multiprotein complex that transcribes DNA into precursors of messenger RNA (mRNA) and most small nuclear RNA (snRNA) and microRNA. It is one of the three RNAP enzymes found in the nucleus of eukaryotic cells. A 550 kDa complex of 12 subunits, RNAP II is the most studied type of RNA polymerase. A wide range of transcription factors are required for it to bind to upstream gene promoters and begin transcription.
General transcription factors (GTFs), also known as basal transcriptional factors, are a class of protein transcription factors that bind to specific sites (promoter) on DNA to activate transcription of genetic information from DNA to messenger RNA. GTFs, RNA polymerase, and the mediator constitute the basic transcriptional apparatus that first bind to the promoter, then start transcription. GTFs are also intimately involved in the process of gene regulation, and most are required for life.
A DNA clamp, also known as a sliding clamp, is a protein complex that serves as a processivity-promoting factor in DNA replication. As a critical component of the DNA polymerase III holoenzyme, the clamp protein binds DNA polymerase and prevents this enzyme from dissociating from the template DNA strand. The clamp-polymerase protein–protein interactions are stronger and more specific than the direct interactions between the polymerase and the template DNA strand; because one of the rate-limiting steps in the DNA synthesis reaction is the association of the polymerase with the DNA template, the presence of the sliding clamp dramatically increases the number of nucleotides that the polymerase can add to the growing strand per association event. The presence of the DNA clamp can increase the rate of DNA synthesis up to 1,000-fold compared with a nonprocessive polymerase.
T7 RNA Polymerase is an RNA polymerase from the T7 bacteriophage that catalyzes the formation of RNA from DNA in the 5'→ 3' direction.
A transcription bubble is a molecular structure formed during DNA transcription when a limited portion of the DNA double helix is unwound. The size of a transcription bubble ranges from 12 to 14 base pairs. A transcription bubble is formed when the RNA polymerase enzyme binds to a promoter and causes two DNA strands to detach. It presents a region of unpaired DNA, where a short stretch of nucleotides are exposed on each strand of the double helix.
RNA-dependent RNA polymerase (RdRp) or RNA replicase is an enzyme that catalyzes the replication of RNA from an RNA template. Specifically, it catalyzes synthesis of the RNA strand complementary to a given RNA template. This is in contrast to typical DNA-dependent RNA polymerases, which all organisms use to catalyze the transcription of RNA from a DNA template.
Bacterial transcription is the process in which a segment of bacterial DNA is copied into a newly synthesized strand of messenger RNA (mRNA) with use of the enzyme RNA polymerase.
Eukaryotic transcription is the elaborate process that eukaryotic cells use to copy genetic information stored in DNA into units of transportable complementary RNA replica. Gene transcription occurs in both eukaryotic and prokaryotic cells. Unlike prokaryotic RNA polymerase, which initiates the transcription of all different types of RNA, RNA polymerase in eukaryotes comes in three variations, each transcribing a different type of gene. A eukaryotic cell has a nucleus that separates the processes of transcription and translation. Eukaryotic transcription occurs within the nucleus where DNA is packaged into nucleosomes and higher order chromatin structures. The complexity of the eukaryotic genome necessitates a great variety and complexity of gene expression control.
Intrinsic, or rho-independent, termination is a process to signal the end of transcription and release the newly constructed RNA molecule. In bacteria such as E. coli, transcription is terminated either by a rho-dependent process or rho-independent process. In the rho-dependent process, the rho protein locates and binds the signal sequence in the mRNA and signals for cleavage. In contrast, intrinsic termination does not require a special protein to signal for termination and is controlled by the specific sequences of RNA. When the termination process begins, the transcribed mRNA forms a stable secondary structure hairpin loop, also known as a stem-loop. This RNA hairpin is followed by multiple uracil nucleotides. The bonds between uracil (rU) and adenine (dA) are very weak. A protein bound to RNA polymerase (NusA) binds to the stem-loop structure tightly enough to cause the polymerase to temporarily stall. This pausing of the polymerase coincides with transcription of the poly-uracil sequence. The weak adenine-uracil bonds lower the energy of destabilization for the RNA-DNA duplex, allowing it to unwind and dissociate from the RNA polymerase. Overall, the modified RNA structure is what terminates transcription.
RNA polymerase II holoenzyme is a form of eukaryotic RNA polymerase II that is recruited to the promoters of protein-coding genes in living cells. It consists of RNA polymerase II, a subset of general transcription factors, and regulatory proteins known as SRB proteins.
DSIF is a protein complex that can either negatively or positively affect transcription by RNA polymerase II. It can interact with the negative elongation factor (NELF) to promote the stalling of Pol II at some genes, which is called promoter proximal pausing. The pause occurs soon after initiation, once 20–60 nucleotides have been transcribed. This stalling is relieved by positive transcription elongation factor b (P-TEFb) and Pol II enters productive elongation to resume synthesis till finish. In humans, DSIF is composed of hSPT4 and hSPT5. hSPT5 has a direct role in mRNA capping which occurs while the elongation is paused.
In molecular biology, bacterial DNA binding proteins are a family of small, usually basic proteins of about 90 residues that bind DNA and are known as histone-like proteins. Since bacterial binding proteins have a diversity of functions, it has been difficult to identify a common function for all of them. They are commonly referred to as histone-like and share many traits with the eukaryotic histone proteins. Eukaryotic histones package DNA to help it to fit in the nucleus, and they are known to be the most conserved proteins in nature. Examples include the HU protein, which in Escherichia coli is a dimer of closely related alpha and beta chains and in other bacteria can be a dimer of identical chains. HU-type proteins have been found in a variety of bacteria and archaea, and are also encoded in the chloroplast genome of some algae. Other examples are the integration host factor (IHF), a dimer of closely related chains found in Enterobacteria that is suggested to function in genetic recombination as well as in translational and transcriptional control, and viral proteins such as the African swine fever virus protein A104R.
Archaeal transcription factor B is a protein family of extrinsic transcription factors that guide the initiation of RNA transcription in organisms that fall under the domain of Archaea. It is homologous to eukaryotic TFIIB and, more distantly, to bacterial sigma factor. Like these proteins, it is involved in forming transcription preinitiation complexes. Its structure includes several conserved motifs which interact with DNA and other transcription factors, notably the single type of RNA polymerase that performs transcription in Archaea.
Archaeal transcription is the process in which a segment of archaeal DNA is copied into a newly synthesized strand of RNA using the sole Pol II-like RNA polymerase (RNAP). The process occurs in three main steps: initiation, elongation, and termination; and the end result is a strand of RNA that is complementary to a single strand of DNA. A number of transcription factors govern this process with homologs in both bacteria and eukaryotes, with the core machinery more similar to eukaryotic transcription.