A typical enhancer (TE), as illustrated in the top panel of the Figure, is a region of DNA several hundred base pairs long[1][2] that binds transcription factors at sequence motifs within it. A typical enhancer can come into proximity with its target gene through a large chromosome loop. A Mediator complex (consisting of about 26 proteins in an interacting structure) communicates regulatory signals from the enhancer-located DNA-bound transcription factors to the promoter of a gene, regulating RNA transcription of the target gene.
A super-enhancer, illustrated in the lower panel of the Figure, is a region of the mammalian genome comprising multiple typical enhancers that is collectively bound by an array of transcription factor proteins to drive transcription of genes involved in cell identity,[3][4][5] or of genes involved in cancer.[6] Because super-enhancers frequently occur near genes important for controlling and defining cell identity, they may be used to quickly identify key nodes regulating cell identity.[5][7] Super-enhancers are also central to mediating dysregulation of signaling pathways and promoting cancer cell growth.[6][8] Super-enhancers differ from typical enhancers, however, in that they are strongly dependent on additional specialized proteins that create and maintain their formation, including BRD4 (shown in the lower panel of Figure) and co-factors including p300.[9]
Enhancers have several quantifiable traits that span a range of values, and these traits are generally elevated at super-enhancers. Super-enhancers are bound by higher levels of transcription-regulating proteins and are associated with genes that are more highly expressed.[3][10][11][12] Expression of genes associated with super-enhancers is particularly sensitive to perturbations, which may facilitate cell state transitions or explain the sensitivity of super-enhancer-associated genes to small molecules that target transcription.[3][10][11][13][14]
Frequency of super-enhancers
In many cell types, only a minority of activated enhancers are located in super-enhancers (SEs). In specialized tissue, such as skeletal muscle, a reduced number of genes are expressed and a low number of specialized and activated super-enhancers are found. In human skeletal muscle, there are nine identified types of cells. On average, the number of expressed genes in these nine cell types is 1,331.[15] There are also about 22 super-enhancers specific to skeletal muscle cells among the nine types of skeletal muscle cells, indicating that specialized super-enhancers in these cells are about 1.7% of the number of typical enhancers (TEs).[16] In immune-system B cells, a study identified 140 SEs and 4,290 TEs in non-stimulated B cells (SEs were 3.2% of activated transcription areas). In stimulated B cells SEs were 3.6% of activated transcription areas.[17] Similarly, in mouse embryonic stem cells, 231 SEs were found, compared to 8,794 TEs, with SEs comprising 2.6% of activated chromatin regions.[18] A study of neural stem cells found 445 SEs and 9,436 TEs, so that SEs were about 4.5% of active enhancer regions (4.7% relative to the number of TEs).[19]
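The percentages quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the SE and TE counts given in the text, and the function names are hypothetical. It distinguishes the share of SEs among all active enhancer regions (SE + TE) from the ratio of SEs to TEs, since the two differ slightly. For the neural stem cell counts, the share works out to about 4.5%, while 4.7% corresponds to the SE-to-TE ratio.

```python
# (SE count, TE count) pairs quoted in the text; names are illustrative.
counts = {
    "non-stimulated B cells": (140, 4290),
    "mouse embryonic stem cells": (231, 8794),
    "neural stem cells": (445, 9436),
}


def se_share(n_se, n_te):
    """Percentage of all active enhancer regions (SE + TE) that are SEs."""
    return 100.0 * n_se / (n_se + n_te)


def se_to_te_ratio(n_se, n_te):
    """Number of SEs expressed as a percentage of the number of TEs."""
    return 100.0 * n_se / n_te


shares = {name: round(se_share(*c), 1) for name, c in counts.items()}
ratios = {name: round(se_to_te_ratio(*c), 1) for name, c in counts.items()}

for name in counts:
    print(f"{name}: share {shares[name]}%, SE/TE ratio {ratios[name]}%")
```

Which denominator a given study used is not always stated, so small discrepancies between reported percentages may simply reflect this choice.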
Formation of super-enhancers
Hundreds of thousands of sites in the human genome can potentially act as enhancers. In one large 2020 study, 78 different types of human cells were examined for links between activated enhancers and genes coding for messenger RNA to produce gene products. Distributed among the 78 types of cells there were a total of 449,627 activated enhancers linked to 17,643 protein-coding genes.[20] With this large number of potentially active enhancers, some genome regions contain a cluster of enhancers that, when all are activated, can all loop to the same promoter and form a super-enhancer, driving a gene to have very high messenger RNA output.
One well-studied gene, MYC, has amplified expression in as many as 70% of all cancers.[21] While about 28% of its over-expressions are due to genetic focal amplifications or translocations,[22] the majority of cases of over-expression of MYC are due to activated super-enhancers.[23] There are more than 10 different super-enhancers that can cause MYC over-expression. In each of four tumor cell types grown in culture (HCT-116, MCF7, K562 and Jurkat), three to five super-enhancers were specific to that cell type.
In one 2013 study,[24] the length of typical enhancers was found to be about 700 base pairs while in the case of super-enhancers the length was about 9,000 base pairs (encompassing multiple single enhancers). A later study, in 2020, indicated that typical enhancers were about 200 nucleotides long and that there may be as many as 3.6 million potentially active enhancers occupying 21.55% of the human genome.[25]
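The 2020 figures can be cross-checked against each other. The sketch below is a back-of-the-envelope calculation, not taken from the cited study: it assumes a human genome size of roughly 3.1 billion base pairs (an assumption, not a value from the text) and asks what mean enhancer length is implied if 3.6 million enhancers occupy 21.55% of the genome.

```python
# Values from the text
N_ENHANCERS = 3.6e6        # potentially active enhancers
GENOME_FRACTION = 0.2155   # fraction of the genome they occupy

# Assumption: approximate human genome size in base pairs
GENOME_SIZE_BP = 3.1e9


def implied_mean_length(n_enhancers, fraction, genome_size_bp):
    """Mean enhancer length implied by total coverage and enhancer count."""
    return fraction * genome_size_bp / n_enhancers


mean_len = implied_mean_length(N_ENHANCERS, GENOME_FRACTION, GENOME_SIZE_BP)
print(f"Implied mean enhancer length: {mean_len:.0f} bp")
```

Under this assumed genome size, the implied mean length is a little under 200 base pairs, in line with the stated length of about 200 nucleotides.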
In the nucleus of mammalian cells, almost all the DNA is wrapped around regularly spaced protein complexes, called nucleosomes (see top panel in Figure "Chromatin").[26] The protein complexes are composed of four pairs of histones: H2A, H2B, H3 and H4. The DNA plus these protein complexes is called chromatin (see Figure illustrating chromatin). Enhancer regions, as described above, are several hundred nucleotides long. To be activated, the enhancer region must have the nucleosomes evicted from the DNA so that the multiple transcription factors that bind to that enhancer DNA have access to their binding sites (see bottom panel in Figure "Chromatin"). (To be an active enhancer, more than 10 different binding sites must be occupied by different transcription factors in the enhancer.[25])
In the eviction of nucleosomes from enhancer DNA, a pioneer transcription factor first loosens the attachment of DNA to the nucleosome of an enhancer region. One transcription factor that does this is the pioneer transcription factor NF-kB.[28] Five steps follow: (1) NF-kB is acetylated by p300/CBP. (2) Acetylated NF-kB recruits a specific histone acetyltransferase enzyme, BRD4.[29] (3) BRD4 acetylates histone H3 at lysine 122 (see Figure “Nucleosome at enhancer with H3K122 acetylated”). (4) When histone H3 lysine 122 is acetylated, the nucleosome is evicted from the enhancer sequence.[30] (5) Opening up the enhancer DNA allows binding of the other transcription factors needed to form an activated enhancer. Presumably, when the activating signal for NF-kB is very strong, much more NF-kB is activated, and the greatly increased NF-kB can then start the process of activating multiple nearby enhancers at the same time, forming a super-enhancer.
Super-enhancers promote high levels of transcription
As described above, in forming a super-enhancer, BRD4 is complexed with NF-kB. This complex also recruits and forms a further complex with cyclin T1 and Cdk9. Cyclin T1/Cdk9 is also known as P-TEFb. P-TEFb acts as a kinase that phosphorylates RNA polymerase II (RNAP II), which then activates (in conjunction with the Mediator complex described below) the polymerase on the promoter of a gene to initiate transcription and to continue transcription (instead of pausing).[31]
The transcription factors, bound to their sites on each enhancer within the super-enhancer, recruit the Mediator complex between each enhancer and the RNA polymerase II that will initiate transcription of the gene to be actively transcribed (see Figure at top of article that illustrates a super-enhancer). The Mediator complex in humans is 1.4 MDa in size and includes 26 sub-units.[32] The tail modules of the Mediator complex protein sub-units interact with the activation domains of transcription factors bound at enhancers, and the head and middle modules interact with the pre-initiation complex (PIC) at gene promoters.[33] When certain sub-units of the Mediator complex are phosphorylated and activated by particular cyclin-dependent kinases (Cdk8, Cdk9, Cdk19, etc.), the complex promotes higher levels of transcription.
History
The regulation of transcription by enhancers has been studied since the 1980s.[34][35][36][37][38] Large or multi-component transcription regulators with a range of mechanistic properties, including locus control regions, clustered open regulatory elements, and transcription initiation platforms, were observed shortly thereafter.[39][40][41][42] More recent research has suggested that these different categories of regulatory elements may represent subtypes of super-enhancer.[5][43]
In 2013, two labs identified large enhancers near several genes especially important for establishing cell identities. While Richard A. Young and colleagues identified super-enhancers, Francis Collins and colleagues identified stretch enhancers.[3][4] Both super-enhancers and stretch enhancers are clusters of enhancers that control cell-specific genes and may be largely synonymous.[4][44]
As currently defined, the term “super-enhancer” was introduced by Young’s lab to describe regions identified in mouse embryonic stem cells (ESCs).[3] These particularly large, potent enhancer regions were found to control the genes that establish the embryonic stem cell identity, including Oct-4, Sox2, Nanog, Klf4, and Esrrb. Perturbation of the super-enhancers associated with these genes showed a range of effects on their target genes’ expression.[44] Super-enhancers have since been identified near cell identity-regulators in a range of mouse and human tissues.[4][5][45][46][47][48][49][50][51][52][53][54][55][56][57][58][59][60][61]
Function
The enhancers comprising super-enhancers share the functions of enhancers, including binding transcription factor proteins, looping to target genes, and activating transcription.[3][5][43][44] Three notable traits of enhancers comprising super-enhancers are their clustering in genomic proximity, their exceptional signal of transcription-regulating proteins, and their high frequency of physical interaction with each other. Perturbing the DNA of enhancers comprising super-enhancers showed a range of effects on the expression of cell identity genes, suggesting a complex relationship between the constituent enhancers.[44] Super-enhancers separated by tens of megabases cluster in three-dimensions inside the nucleus of mouse embryonic stem cells.[62][63]
High levels of many transcription factors and co-factors are seen at super-enhancers (e.g., CDK7, BRD4, and Mediator).[3][5][10][11][13][14][43] This high concentration of transcription-regulating proteins may explain why their target genes tend to be more highly expressed than other classes of genes. However, housekeeping genes tend to be more highly expressed than super-enhancer-associated genes.[3]
Super-enhancers may have evolved at key cell identity genes to render the transcription of these genes responsive to an array of external cues.[44] The enhancers comprising a super-enhancer can each be responsive to different signals, which allows the transcription of a single gene to be regulated by multiple signaling pathways.[44] Pathways seen to regulate their target genes using super-enhancers include Wnt, TGFb, LIF, BDNF, and NOTCH.[44][64][65][66][67] The constituent enhancers of super-enhancers physically interact with each other and with their target genes over long genomic distances.[12][46][68] Super-enhancers that control the expression of major cell surface receptors with a crucial role in the function of a given cell lineage have also been defined. This is notably the case for B-lymphocytes, whose survival, activation and differentiation rely on the expression of membrane-form immunoglobulins (Ig). The Ig heavy chain locus super-enhancer is a very large (25 kb) cis-regulatory region that includes multiple enhancers and controls several major modifications of the locus (notably somatic hypermutation, class-switch recombination and locus suicide recombination).
Relevance to Disease
Mutations in super-enhancers have been noted in various diseases, including cancers, type 1 diabetes, Alzheimer’s disease, lupus, rheumatoid arthritis, multiple sclerosis, systemic scleroderma, primary biliary cirrhosis, Crohn’s disease, Graves’ disease, vitiligo, and atrial fibrillation.[4][5][11][49][56][59][69][70][71][72][73] A similar enrichment in disease-associated sequence variation has also been observed for stretch enhancers.[4]
Super-enhancers may play important roles in the misregulation of gene expression in cancer. During tumor development, tumor cells acquire super-enhancers at key oncogenes, which drive higher levels of transcription of these genes than in healthy cells.[5][10][68][69][74][75][76][77][78][79][80][81][82][83] Altered super-enhancer function is also induced by mutations of chromatin regulators.[84] Acquired super-enhancers may thus be biomarkers that could be useful for diagnosis and therapeutic intervention.[44]
Proteins enriched at super-enhancers include the targets of small molecules that inhibit transcription-regulating proteins and have been deployed against cancers.[10][11][49][85] For instance, super-enhancers rely on exceptional amounts of CDK7, and, in cancer, multiple papers report the loss of expression of their target genes when cells are treated with the CDK7 inhibitor THZ1.[10][13][14][86] Similarly, super-enhancers are enriched in BRD4, the target of the small molecule JQ1, so treatment with JQ1 causes exceptional losses in expression for super-enhancer-associated genes.[11]
Identification
Super-enhancers have been most commonly identified by locating genomic regions that are highly enriched in ChIP-Seq signal. ChIP-Seq experiments targeting master transcription factors and co-factors like Mediator or BRD4 have been used, but the most frequently used target is H3K27ac-marked nucleosomes.[3][5][11][87][88][89] The program “ROSE” (Rank Ordering of Super-Enhancers) is commonly used to identify super-enhancers from ChIP-Seq data. This program stitches together previously identified enhancer regions and ranks these stitched enhancers by their ChIP-Seq signal.[3] The stitching distance selected to combine multiple individual enhancers into larger domains can vary. Because some markers of enhancer activity are also enriched in promoters, regions within promoters of genes can be disregarded. ROSE separates super-enhancers from typical enhancers by their exceptional enrichment in a mark of enhancer activity. Homer is another tool that can identify super-enhancers.[90]
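The stitch-and-rank idea described above can be sketched in a few lines. This is a minimal illustration, not ROSE's actual code or interface: the function names, the example regions, the 12,500 bp stitching distance and the fixed `signal_cutoff` are all assumptions chosen for the example, and the fixed cutoff is a simplification rather than how ROSE selects its threshold. Promoter exclusion is omitted, and all regions are assumed to lie on one chromosome.

```python
from typing import List, Tuple

# (start, end, signal) for an enhancer on a single chromosome; illustrative type
Region = Tuple[int, int, float]


def stitch(enhancers: List[Region], max_gap: int) -> List[Region]:
    """Merge enhancers separated by at most max_gap bp, summing their signal."""
    stitched: List[Region] = []
    for start, end, signal in sorted(enhancers):
        if stitched and start - stitched[-1][1] <= max_gap:
            prev_start, prev_end, prev_signal = stitched[-1]
            stitched[-1] = (prev_start, max(prev_end, end), prev_signal + signal)
        else:
            stitched.append((start, end, signal))
    return stitched


def rank_and_split(stitched: List[Region], signal_cutoff: float):
    """Rank stitched regions by signal and split them at a fixed cutoff."""
    ranked = sorted(stitched, key=lambda r: r[2], reverse=True)
    supers = [r for r in ranked if r[2] >= signal_cutoff]
    typical = [r for r in ranked if r[2] < signal_cutoff]
    return supers, typical


# Hypothetical ChIP-Seq-derived enhancers (coordinates and signal are made up)
enhancers = [
    (1000, 1500, 40.0),
    (3000, 3600, 55.0),
    (9000, 9400, 60.0),
    (50000, 50700, 12.0),
    (120000, 120300, 8.0),
]

stitched = stitch(enhancers, max_gap=12500)
supers, typical = rank_and_split(stitched, signal_cutoff=100.0)
print("super-enhancers:", supers)
print("typical enhancers:", typical)
```

In this toy input the three closely spaced enhancers are merged into one region whose summed signal exceeds the cutoff, while the two isolated enhancers remain typical. Changing `max_gap` changes which enhancers are merged, which is why the choice of stitching distance matters.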
Cameron A, Wakelin G, Gaulton N, Young LV, Wotherspoon S, Hodson N, Lees MJ, Moore DR, Johnston AP (December 2022). "Identification of underexplored mesenchymal and vascular-related cell populations in human skeletal muscle". Am J Physiol Cell Physiol. 323 (6): C1586–C1600. doi:10.1152/ajpcell.00364.2022. PMID 36342160.
Michida H, Imoto H, Shinohara H, Yumoto N, Seki M, Umeda M, Hayashi T, Nikaido I, Kasukawa T, Suzuki Y, Okada-Hatakeyama M (June 2020). "The Number of Transcription Factors at an Enhancer Determines Switch-like Gene Expression". Cell Rep. 31 (9): 107724. doi:10.1016/j.celrep.2020.107724. PMID 32492432.
Jang MK, Mochizuki K, Zhou M, Jeong HS, Brady JN, Ozato K (August 2005). "The bromodomain protein Brd4 is a positive regulatory component of P-TEFb and stimulates RNA polymerase II-dependent transcription". Mol Cell. 19 (4): 523–34. doi:10.1016/j.molcel.2005.06.027. PMID 16109376.
Cellier M, Belouchi A, Gros P (June 1996). "Resistance to intracellular infections: comparative genomic analysis of Nramp". Trends in Genetics. 12 (6): 201–4. doi:10.1016/0168-9525(96)30042-5. PMID 8928221.
Koch F, Fenouil R, Gut M, Cauchy P, Albert TK, Zacarias-Cabeza J, Spicuglia S, de la Chapelle AL, Heidemann M, Hintermair C, Eick D, Gut I, Ferrier P, Andrau JC (August 2011). "Transcription initiation platforms and GTF recruitment at tissue-specific enhancers and promoters". Nature Structural & Molecular Biology. 18 (8): 956–63. doi:10.1038/nsmb.2085. PMID 21765417. S2CID 12778976.