Enhancer RNAs (eRNAs) represent a class of relatively long non-coding RNA molecules (50-2000 nucleotides) transcribed from the DNA sequence of enhancer regions. They were first detected in 2010 through the use of genome-wide techniques such as RNA-seq and ChIP-seq. [1] [2] eRNAs can be subdivided into two main classes: 1D eRNAs and 2D eRNAs, which differ primarily in terms of their size, polyadenylation state, and transcriptional directionality. [3] The expression of a given eRNA correlates with the activity of its corresponding enhancer in target genes. [4] Increasing evidence suggests that eRNAs actively play a role in transcriptional regulation in cis and in trans, and while their mechanisms of action remain unclear, a few models have been proposed. [3]
Enhancers as sites of extragenic transcription were initially discovered in genome-wide studies that identified enhancers as common regions of RNA polymerase II (RNA Pol II) binding and non-coding RNA transcription. [1] [2] The levels of RNA Pol II–enhancer interaction and RNA transcript formation were found to be highly variable among these initial studies. Using explicit chromatin signature peaks, a significant proportion (~70%) of extragenic RNA Pol II transcription start sites were found to overlap enhancer sites in murine macrophages. [5] Out of 12,000 neuronal enhancers in the mouse genome, almost 25% of the sites were found to bind RNA Pol II and generate transcripts. [6] In parallel studies, 4,588 high-confidence extragenic RNA Pol II binding sites were identified in murine macrophages stimulated with the inflammatory mediator lipopolysaccharide to induce transcription. [2] These eRNAs, unlike messenger RNAs (mRNAs), lacked modification by polyadenylation, were generally short and non-coding, and were bidirectionally transcribed. Later studies revealed the transcription of another type of eRNA, generated through unidirectional transcription, that was longer and contained a poly(A) tail. [7] Furthermore, eRNA levels were correlated with mRNA levels of nearby genes, suggesting a potential regulatory and functional role for these non-coding enhancer RNA molecules. [1]
eRNAs are transcribed from DNA sequences upstream and downstream of extragenic enhancer regions. [8] Previously, several model enhancers have demonstrated the capability to directly recruit RNA Pol II and general transcription factors and form the pre-initiation complex (PIC) prior to the transcription start site at the promoter of genes. In certain cell types, activated enhancers have demonstrated the ability to both recruit RNA Pol II and also provide a template for active transcription of their local sequences. [2] [1]
Depending on the directionality of transcription, enhancer regions generate two different types of non-coding transcripts, 1D-eRNAs and 2D-eRNAs. The nature of the pre-initiation complex and specific transcription factors recruited to the enhancer may control the type of eRNAs generated. After transcription, the majority of eRNAs remain in the nucleus. [9] In general, eRNAs are very unstable and actively degraded by the nuclear exosome. Not all enhancers are transcribed: in any given cell type, non-transcribed enhancers greatly outnumber transcribed ones, by a margin on the order of tens of thousands. [5]
In most cases, unidirectional transcription of enhancer regions generates long (>4kb) and polyadenylated eRNAs. Enhancers that generate polyA+ eRNAs have a lower H3K4me1/me3 ratio in their chromatin signature than enhancers that generate 2D-eRNAs. [7] PolyA+ eRNAs are distinct from long multiexonic polyadenylated transcripts (meRNAs) that are generated by transcription initiation at intragenic enhancers. These long non-coding RNAs, which accurately reflect the host gene's structure except for the alternative first exon, display poor coding potential. [10] As a result, polyA+ 1D-eRNAs may represent a mixed group of true enhancer-templated RNAs and multiexonic RNAs.
Bidirectional transcription at enhancer sites generates comparatively shorter (0.5-2kb) and non-polyadenylated eRNAs. Enhancers that generate polyA- eRNAs have a chromatin signature with a higher H3K4me1/me3 ratio than enhancers that generate 1D-eRNAs. In general, enhancer transcription and the production of bidirectional eRNAs correlate strongly with the effect of enhancer activity on gene transcription. [11]
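The distinction between the two classes can be summarized as a small decision rule. The following Python sketch is illustrative only: the `EnhancerTranscript` record and the `classify_erna` function are hypothetical names, and the length cutoffs are simply the typical values quoted above rather than an established classification standard. Transcripts that do not match either profile are left unclassified.

```python
from dataclasses import dataclass


@dataclass
class EnhancerTranscript:
    """Hypothetical record describing one enhancer-derived transcript."""
    length_nt: int        # transcript length in nucleotides
    polyadenylated: bool  # True if a poly(A) tail was detected
    bidirectional: bool   # True if the enhancer is transcribed in both directions


def classify_erna(t: EnhancerTranscript) -> str:
    """Return '1D-eRNA', '2D-eRNA', or 'unclassified'.

    Cutoffs are the typical values given in the text (>4 kb for 1D,
    0.5-2 kb for 2D); they are assumptions, not hard biological limits.
    """
    if not t.bidirectional and t.polyadenylated and t.length_nt > 4000:
        return "1D-eRNA"
    if t.bidirectional and not t.polyadenylated and 500 <= t.length_nt <= 2000:
        return "2D-eRNA"
    return "unclassified"
```

Because the text describes these properties as holding "in most cases", a real analysis would likely also weigh chromatin features such as the H3K4me1/me3 ratio rather than rely on three attributes alone.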
Arner et al. [12] identified 65,423 transcribed enhancers (producing eRNA) among 33 different cell types under different conditions and different timings of stimulation. The transcription of enhancers generally preceded transcription of transcription factors which, in turn, generally preceded messenger RNA (mRNA) transcription of genes.
Carullo et al. [13] examined one particular cell type, neurons (from primary neuron cultures). They identified 28,492 putative enhancers generating eRNAs. These eRNAs were often transcribed from both strands of the enhancer DNA in opposite directions. Carullo et al. [13] used these cultured neurons to examine the timing of specific enhancer eRNAs compared to the mRNAs of their target genes. The cultured neurons were activated and RNA was isolated from those neurons at 0, 3.75, 5, 7.5, 15, 30, and 60 minutes after activation. In these experimental conditions, they found that 2 of the 5 enhancers of the immediate early gene (IEG) FOS, that is FOS enhancer 1 and FOS enhancer 3, became activated and initiated transcription of their eRNAs (eRNA1 and eRNA3). FOS eRNA1 and eRNA3 were significantly up-regulated within 7.5 minutes, whereas FOS mRNA was only up-regulated 15 minutes after stimulation. Similar patterns occurred at IEGs FOSb and NR4A1, indicating that for many IEGs, eRNA induction precedes mRNA induction in response to neuronal activation.
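The timing comparison described above amounts to finding the earliest sampled time point at which each transcript is induced. The sketch below illustrates that idea under simplifying assumptions: it uses a plain fold-change threshold instead of the statistical testing a real study would apply, the function name `first_induction_time` is hypothetical, and the fold-change values are invented for illustration and are not data from Carullo et al. Only the sampling times are taken from the text.

```python
from typing import Optional, Sequence

# Sampling times in minutes after activation, as listed in the text.
TIME_POINTS_MIN = [0, 3.75, 5, 7.5, 15, 30, 60]


def first_induction_time(
    times: Sequence[float],
    fold_changes: Sequence[float],
    threshold: float = 2.0,  # assumed cutoff, not taken from the study
) -> Optional[float]:
    """Return the earliest time whose fold change reaches the threshold.

    Returns None if the transcript never reaches the threshold.
    """
    if len(times) != len(fold_changes):
        raise ValueError("times and fold_changes must have the same length")
    for t, fc in zip(times, fold_changes):
        if fc >= threshold:
            return t
    return None


# Invented values for illustration only, NOT measurements from any study.
toy_erna = [1.0, 1.1, 1.4, 2.3, 3.0, 2.5, 1.8]
toy_mrna = [1.0, 1.0, 1.1, 1.3, 2.6, 4.0, 3.2]

erna_onset = first_induction_time(TIME_POINTS_MIN, toy_erna)
mrna_onset = first_induction_time(TIME_POINTS_MIN, toy_mrna)
```

With these toy profiles the eRNA crosses the threshold at an earlier time point than the mRNA, which is the kind of ordering the study reports for FOS.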
While some enhancers can activate their target promoters at their target genes without transcribing eRNA, most active enhancers do transcribe eRNA during activation of their target promoters. [14]
The functions for eRNA described below have been reported in diverse biological systems, often demonstrated with a small number of specific enhancer-target gene pairs. It is not clear to what extent the functions of eRNA described here can be generalized to most eRNAs.
The chromosome loops shown in the figure, bringing an enhancer to the promoter of its target gene, may be directed and formed by the eRNA transcribed from the enhancer after the enhancer is activated.
In the five genes studied by Lai et al., a transcribed enhancer RNA (eRNA) interacting with the complex of Mediator proteins (see Figure), especially Mediator subunit 12 (MED12), appears to be essential in forming the chromosome loop that brings the enhancer into close association with the promoter of its target gene. [15] [16] [17] Hou and Kraus [18] describe two other studies reporting similar results. Arnold et al. [19] review another 5 instances where eRNA is active in forming the enhancer-promoter loop.
One well-studied eRNA is the eRNA of the enhancer that interacts with the promoter of the prostate specific antigen (PSA) gene. [20] The PSA eRNA is strongly up-regulated by the androgen receptor. High PSA eRNA then has a domino effect. PSA eRNA binds to and activates the positive transcription elongation factor P-TEFb protein complex, which can then phosphorylate RNA polymerase II (RNAP II), initiating its activity in producing mRNA. P-TEFb can also phosphorylate the negative elongation factor NELF (which pauses RNAP II within 60 nucleotides after mRNA initiation begins). Phosphorylated NELF is released from RNAP II, allowing RNAP II to proceed with productive mRNA elongation (see Figure). Up-regulated PSA eRNA thereby increases expression of 586 androgen receptor-responsive genes. Knockdown of PSA eRNA, or deletion of a set of nucleotides from PSA eRNA, decreases the presence of phosphorylated (active) RNAP II at these genes and thereby reduces their transcription.
The negative elongation factor NELF protein can also be released from its interaction with RNAP II by direct interaction with some eRNAs. Schaukowitch et al. [21] showed that the eRNAs of two immediate early genes (IEGs) directly interacted with the NELF protein to release NELF from the RNAP II paused at the promoters of these two genes, allowing these two genes to then be expressed.
In addition, eRNAs appear to interact with as many as 30 other proteins. [19] [17] [18]
The notions that not all enhancers are transcribed at the same time and that eRNA transcription correlates with enhancer-specific activity support the idea that individual eRNAs carry distinct and relevant biological functions. [3] However, there is still no consensus on the functional significance of eRNAs. Furthermore, eRNAs can easily be degraded through exosomes and nonsense-mediated decay, which limits their potential as important transcriptional regulators. [22] To date, four main models of eRNA function have been proposed, [3] each supported by different lines of experimental evidence.
Since multiple studies have shown that RNA Pol II can be found at a very large number of extragenic regions, it is possible that eRNAs simply represent the product of random “leaky” transcription and carry no functional significance. [5] The non-specific activity of RNA Pol II would therefore allow extragenic transcriptional noise at sites where chromatin is already in an open and transcriptionally competent state. This would explain even tissue-specific eRNA expression [23] as open sites are tissue-specific as well.
RNA Pol II-mediated gene transcription induces a local opening of chromatin state through the recruitment of histone acetyltransferases and other histone modifiers that promote euchromatin formation. It was proposed that the presence of these enzymes could also induce an opening of chromatin at enhancer regions, which are usually present at distant locations but can be recruited to target genes through looping of DNA. [24] In this model, eRNAs are therefore expressed in response to RNA Pol II transcription and therefore carry no biological function.
While the two previous models implied that eRNAs were not functionally relevant, the third model proposes that eRNAs are functional molecules that exhibit cis activity. In this model, eRNAs can locally recruit regulatory proteins at their own site of synthesis. Supporting this hypothesis, transcripts originating from enhancers upstream of the Cyclin D1 gene are thought to serve as adaptors for the recruitment of histone acetyltransferases. It was found that depletion of these eRNAs led to Cyclin D1 transcriptional silencing. [9]
The last model involves transcriptional regulation by eRNAs at distant chromosomal locations. Through the differential recruitment of protein complexes, eRNAs can affect the transcriptional competency of specific loci. Evf-2 represents a good example of such trans regulatory eRNA as it can induce the expression of Dlx2, which in turn can increase the activity of the Dlx5 and Dlx6 enhancers. [25] Trans-acting eRNAs might also be working in cis, and vice versa.
The detection of eRNAs is fairly recent (2010) and has been made possible through the use of genome-wide investigation techniques such as RNA sequencing (RNA-seq) and chromatin immunoprecipitation-sequencing (ChIP-seq). [1] RNA-seq permits the direct identification of eRNAs by matching the detected transcript to the corresponding enhancer sequence through bioinformatic analyses. [26] [4] ChIP-seq represents a less direct way to assess enhancer transcription but can also provide crucial information as specific chromatin marks are associated with active enhancers. [27] Although some data remain controversial, the consensus in the literature is that the best combination of histone post-translational modifications at active enhancers is made of H2AZ, H3K27ac, and a high ratio of H3K4me1 over H3K4me3. [27] [28] [29] ChIP experiments can also be conducted with antibodies that recognize RNA Pol II, which can be found at sites of active transcription. [5] The experimental detection of eRNAs is complicated by their low endogenous stability conferred by exosome degradation and nonsense-mediated decay. [22] A comparative study showed that assays enriching for capped and nascent RNAs (with strategies like nuclei run-on and size selection) could capture more eRNAs compared to canonical RNA-seq. [30] These assays include Global/Precision Run-on with cap-selection (GRO/PRO-cap), capped-small RNA-seq (csRNA-seq), Native Elongating Transcript-Cap Analysis of Gene Expression (NET-CAGE), and Precision Run-On sequencing (PRO-seq). [31] Nonetheless, the fact that eRNAs tend to be expressed from active enhancers might make their detection a useful tool to distinguish between active and inactive enhancers.
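Matching detected transcripts to enhancer sequences is, at its core, an interval-overlap problem on genomic coordinates. The following is a minimal sketch of that step, not the pipeline used in any of the cited studies. The names `Interval`, `overlaps`, and `transcribed_enhancers` are hypothetical, coordinates are assumed to be half-open (start inclusive, end exclusive), and strand is ignored for simplicity.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumed convention.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals lie on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def transcribed_enhancers(
    enhancers: List[Interval], transcripts: List[Interval]
) -> List[Interval]:
    """Return the enhancers overlapped by at least one detected transcript."""
    return [e for e in enhancers if any(overlaps(e, t) for t in transcripts)]


# Toy coordinates for illustration only.
example_enhancers = [("chr1", 100, 600), ("chr1", 5000, 5500), ("chr2", 100, 600)]
example_transcripts = [("chr1", 550, 900), ("chr2", 700, 800)]
example_hits = transcribed_enhancers(example_enhancers, example_transcripts)
```

This quadratic scan is adequate for small examples; genome-scale analyses would normally sort the intervals or use an interval tree, and would take strand into account when assessing bidirectional transcription.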
Evidence that eRNAs cause downstream effects on the efficiency of enhancer activation and gene transcription suggests their functional capabilities and potential importance. [4] The transcription factor p53 has been demonstrated to bind enhancer regions and generate eRNAs in a p53-dependent manner. [32] In cancer, p53 plays a central role in tumor suppression, as mutations of the gene are shown to appear in 50% of tumors. [33] These p53-bound enhancer regions (p53BERs) are shown to interact with multiple local and distal gene targets involved in cell proliferation and survival. Furthermore, eRNAs generated by the activation of p53BERs are shown to be required for efficient transcription of the p53 target genes, indicating the likely important regulatory role of eRNAs in tumor suppression and cancer. Generally, mutations in eRNAs have been shown to produce phenotypic effects in oncogenesis similar to those of mutations in protein-coding RNA. [34]
Variations in enhancers have been implicated in human disease but a therapeutic approach to manipulate enhancer activity is currently not available. With the emergence of eRNAs as important components in enhancer activity, powerful therapeutic tools such as RNAi may provide promising routes to target disruption of gene expression.
In biology, histones are highly basic proteins abundant in lysine and arginine residues that are found in eukaryotic cell nuclei and in most Archaeal phyla. They act as spools around which DNA winds to create structural units called nucleosomes. Nucleosomes in turn are wrapped into 30-nanometer fibers that form tightly packed chromatin. Histones prevent DNA from becoming tangled and protect it from DNA damage. In addition, histones play important roles in gene regulation and DNA replication. Without histones, unwound DNA in chromosomes would be very long. For example, each human cell has about 1.8 meters of DNA if completely stretched out; however, when wound about histones, this length is reduced to about 90 micrometers (0.09 mm) of 30 nm diameter chromatin fibers.
In genetics, a promoter is a sequence of DNA to which proteins bind to initiate transcription of a single RNA transcript from the DNA downstream of the promoter. The RNA transcript may encode a protein (mRNA), or can have a function in and of itself, such as tRNA or rRNA. Promoters are located near the transcription start sites of genes, upstream on the DNA. Promoters can be about 100–1000 base pairs long, the sequence of which is highly dependent on the gene and product of transcription, type or class of RNA polymerase recruited to the site, and species of organism.
Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product, either a protein or a non-coding RNA, and ultimately affects a phenotype. These products are often proteins, but in non-protein-coding genes such as transfer RNA (tRNA) and small nuclear RNA (snRNA), the product is a functional non-coding RNA. The process of gene expression is used by all known life (eukaryotes and prokaryotes) and is utilized by viruses to generate the macromolecular machinery for life.
Transcription is the process of copying a segment of DNA into RNA. Some segments of DNA are transcribed into RNA molecules that can encode proteins, called messenger RNA (mRNA). Other segments of DNA are transcribed into RNA molecules called non-coding RNAs (ncRNAs).
In genetics, an enhancer is a short region of DNA that can be bound by proteins (activators) to increase the likelihood that transcription of a particular gene will occur. These proteins are usually referred to as transcription factors. Enhancers are cis-acting. They can be located up to 1 Mbp away from the gene, upstream or downstream from the start site. There are hundreds of thousands of enhancers in the human genome. They are found in both prokaryotes and eukaryotes. Active enhancers typically get transcribed as enhancer or regulatory non-coding RNA, whose expression levels correlate with mRNA levels of target genes.
A regulatory sequence is a segment of a nucleic acid molecule which is capable of increasing or decreasing the expression of specific genes within an organism. Regulation of gene expression is an essential feature of all living organisms and viruses.
In molecular biology and genetics, transcriptional regulation is the means by which a cell regulates the conversion of DNA to RNA (transcription), thereby orchestrating gene activity. A single gene can be regulated in a range of ways, from altering the number of copies of RNA that are transcribed, to the temporal control of when the gene is transcribed. This control allows the cell or organism to respond to a variety of intra- and extracellular signals and thus mount a response. Some examples of this include producing the mRNA that encode enzymes to adapt to a change in a food source, producing the gene products involved in cell cycle specific activities, and producing the gene products responsible for cellular differentiation in multicellular eukaryotes, as studied in evolutionary developmental biology.
In biology, the epigenome of an organism is the collection of chemical changes to its DNA and histone proteins that affects when, where, and how the DNA is expressed; these changes can be passed down to an organism's offspring via transgenerational epigenetic inheritance. Changes to the epigenome can result in changes to the structure of chromatin and changes to the function of the genome. The human epigenome, including DNA methylation and histone modification, is maintained through cell division. The epigenome is essential for normal development and cellular differentiation, enabling cells with the same genetic code to perform different functions. The human epigenome is dynamic and can be influenced by environmental factors such as diet, stress, and toxins.
RNA polymerase II is a multiprotein complex that transcribes DNA into precursors of messenger RNA (mRNA) and most small nuclear RNA (snRNA) and microRNA. It is one of the three RNAP enzymes found in the nucleus of eukaryotic cells. A 550 kDa complex of 12 subunits, RNAP II is the most studied type of RNA polymerase. A wide range of transcription factors are required for it to bind to upstream gene promoters and begin transcription.
Eukaryotic transcription is the elaborate process that eukaryotic cells use to copy genetic information stored in DNA into units of transportable complementary RNA replica. Gene transcription occurs in both eukaryotic and prokaryotic cells. Unlike prokaryotic RNA polymerase that initiates the transcription of all different types of RNA, RNA polymerase in eukaryotes comes in three variations, each transcribing a different type of gene. A eukaryotic cell has a nucleus that separates the processes of transcription and translation. Eukaryotic transcription occurs within the nucleus where DNA is packaged into nucleosomes and higher order chromatin structures. The complexity of the eukaryotic genome necessitates a great variety and complexity of gene expression control.
ChIP-sequencing, also known as ChIP-seq, is a method used to analyze protein interactions with DNA. ChIP-seq combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. It can be used to map global binding sites precisely for any protein of interest. Previously, ChIP-on-chip was the most common technique utilized to study these protein–DNA relations.
Long non-coding RNAs are a type of RNA, generally defined as transcripts of more than 200 nucleotides that are not translated into protein. This arbitrary limit distinguishes long ncRNAs from small non-coding RNAs, such as microRNAs (miRNAs), small interfering RNAs (siRNAs), Piwi-interacting RNAs (piRNAs), small nucleolar RNAs (snoRNAs), and other short RNAs. Given that some lncRNAs have been reported to have the potential to encode small proteins or micro-peptides, the latest definition of lncRNA is a class of transcripts of over 200 nucleotides that have no or limited coding capacity. However, John S. Mattick and colleagues suggested changing the definition of long non-coding RNAs to transcripts of more than 500 nt, which are mostly generated by Pol II. The exact definition of lncRNA is therefore still under discussion in the field. Long intervening/intergenic noncoding RNAs (lincRNAs) are sequences of transcripts that do not overlap protein-coding genes.
RNA polymerase II holoenzyme is a form of eukaryotic RNA polymerase II that is recruited to the promoters of protein-coding genes in living cells. It consists of RNA polymerase II, a subset of general transcription factors, and regulatory proteins known as SRB proteins.
Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome. The field is analogous to genomics and proteomics, which are the study of the genome and proteome of a cell. Epigenetic modifications are reversible modifications on a cell's DNA or histones that affect gene expression without altering the DNA sequence. Epigenomic maintenance is a continuous process and plays an important role in stability of eukaryotic genomes by taking part in crucial biological mechanisms like DNA repair. Plant flavones are said to be inhibiting epigenomic marks that cause cancers. Two of the most characterized epigenetic modifications are DNA methylation and histone modification. Epigenetic modifications play an important role in gene expression and regulation, and are involved in numerous cellular processes such as in differentiation/development and tumorigenesis. The study of epigenetics on a global level has been made possible only recently through the adaptation of genomic high-throughput assays.
RNA polymerase IV is an enzyme that synthesizes small interfering RNA (siRNA) in plants, which silence gene expression. RNAP IV belongs to a family of enzymes that catalyze the process of transcription known as RNA Polymerases, which synthesize RNA from DNA templates. Discovered via phylogenetic studies of land plants, genes of RNAP IV are thought to have resulted from multistep evolution processes that occurred in RNA Polymerase II phylogenies. Such an evolutionary pathway is supported by the fact that RNAP IV is composed of 12 protein subunits that are either similar or identical to RNA polymerase II, and is specific to plant genomes. Via its synthesis of siRNA, RNAP IV is involved in regulation of heterochromatin formation in a process known as RNA directed DNA Methylation (RdDM).
H3K4me3 is an epigenetic modification to the DNA packaging protein Histone H3 that indicates tri-methylation at the 4th lysine residue of the histone H3 protein and is often involved in the regulation of gene expression. The name denotes the addition of three methyl groups (trimethylation) to the lysine 4 on the histone H3 protein.
H3K27me3 is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the tri-methylation of lysine 27 on histone H3 protein.
H3R17me2 is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the di-methylation at the 17th arginine residue of the histone H3 protein. In epigenetics, arginine methylation of histones H3 and H4 is associated with a more accessible chromatin structure and thus higher levels of transcription. The existence of arginine demethylases that could reverse arginine methylation is controversial.
H3S10P is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the phosphorylation of the 10th serine residue of the histone H3 protein.
H3S28P is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the phosphorylation of the 28th serine residue of the histone H3 protein.