R-loop

Schematic representation of factors promoting R-loop formation and stabilization

An R-loop is a three-stranded nucleic acid structure, composed of a DNA:RNA hybrid and the associated non-template single-stranded DNA. R-loops may be formed in a variety of circumstances and may be tolerated or cleared by cellular components. The term "R-loop" was given to reflect the similarity of these structures to D-loops; the "R" in this case represents the involvement of an RNA moiety.

In the laboratory, R-loops may also be created by the hybridization of mature mRNA with double-stranded DNA under conditions favoring the formation of a DNA-RNA hybrid; in this case, the intron regions (which have been spliced out of the mRNA) form single-stranded DNA loops, as they cannot hybridize with complementary sequence in the mRNA. [1]

History

An illustration showing how a DNA-mRNA hybrid forms R-Loops in the regions where introns have been removed through splicing exons.

R-looping was first described in 1976. [2] Independent R-looping studies from the laboratories of Richard J. Roberts and Phillip A. Sharp showed that protein-coding adenovirus genes contained DNA sequences that were not present in the mature mRNA. [3] [4] Roberts and Sharp were awarded the Nobel Prize in Physiology or Medicine in 1993 for independently discovering introns. After their discovery in adenovirus, introns were found in a number of eukaryotic genes, such as the chicken ovalbumin gene (first by the O'Malley laboratory, then confirmed by other groups), [5] [6] hexon DNA, [3] and extrachromosomal rRNA genes of Tetrahymena thermophila. [7]

In the mid-1980s, development of an antibody that binds specifically to the R-loop structure opened the door for immunofluorescence studies, as well as genome-wide characterization of R-loop formation by DRIP-seq. [8]

R-loop mapping

R-loop mapping is a laboratory technique used to distinguish introns from exons in double-stranded DNA. [9] These R-loops are visualized by electron microscopy and reveal intron regions of DNA by creating unbound loops at these regions. [10]
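
As a rough illustration of the idea behind R-loop mapping, the following Python sketch takes a hypothetical list of exon coordinates and reports the gaps between them. These gaps correspond to the intron regions that have no complementary sequence in the mature mRNA and therefore appear as loops. The function name, the half-open coordinate convention, and the example gene are assumptions made for illustration; they do not come from any published protocol or software.

```python
from typing import List, Tuple

# Half-open interval [start, end) in gene coordinates (an assumed convention).
Interval = Tuple[int, int]


def unpaired_intron_loops(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between exons, i.e. the stretches of the gene
    that cannot pair with the spliced mRNA."""
    loops: List[Interval] = []
    covered_end = None  # furthest position covered by exons seen so far
    for start, end in sorted(exons):
        # A gap opens only if this exon starts beyond everything seen so far.
        if covered_end is not None and start > covered_end:
            loops.append((covered_end, start))
        covered_end = end if covered_end is None else max(covered_end, end)
    return loops


# Hypothetical gene with three exons.
example_exons = [(0, 120), (450, 600), (900, 1000)]
print(unpaired_intron_loops(example_exons))  # [(120, 450), (600, 900)]
```

In an actual experiment the direction is reversed: the loop positions are measured on electron micrographs and the exon and intron boundaries are inferred from them.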

R-loops in vivo

The potential for R-loops to serve as replication primers was demonstrated in 1980. [11] In 1994, R-loops were demonstrated to be present in vivo through analysis of plasmids isolated from E. coli mutants carrying mutations in topoisomerase. [12] This discovery of endogenous R-loops, in conjunction with rapid advances in genetic sequencing technologies, inspired a blossoming of R-loop research in the early 2000s that continues to this day. [13]

Regulation of R-loop formation and resolution

RNase H enzymes are the primary proteins responsible for the dissolution of R-loops; they degrade the RNA moiety so that the two complementary DNA strands can anneal. [14] Research over the past decade has identified more than 50 proteins that appear to influence R-loop accumulation. Many of them are believed to contribute by sequestering or processing newly transcribed RNA to prevent re-annealing to the template, but the mechanisms of R-loop interaction for many of these proteins remain to be determined. [15]

Roles of R-loops in genetic regulation

R-loop formation is a key step in immunoglobulin class switching, a process that allows activated B cells to modulate antibody production. [16] R-loops also appear to play a role in protecting some active promoters from methylation. [17] The presence of R-loops can also inhibit transcription. [18] Additionally, R-loop formation appears to be associated with “open” chromatin, characteristic of actively transcribed regions. [19] [20]

R-loops as genetic damage

When unscheduled R-loops form, they can cause damage by a number of different mechanisms. [21] The exposed single-stranded DNA can come under attack by endogenous mutagens, including DNA-modifying enzymes such as activation-induced cytidine deaminase, and R-loops can block replication forks, inducing fork collapse and subsequent double-strand breaks. [22] R-loops may also induce unscheduled replication by acting as a primer. [11] [20]

R-loop accumulation has been associated with a number of diseases, including amyotrophic lateral sclerosis type 4 (ALS4), ataxia oculomotor apraxia type 2 (AOA2), Aicardi–Goutières syndrome, Angelman syndrome, Prader–Willi syndrome, and cancer. [13]

R-loops, introns and DNA damage

Introns are non-coding regions within genes that are transcribed along with the coding regions but are subsequently removed from the primary RNA transcript by splicing. Actively transcribed regions of DNA often form R-loops that are vulnerable to DNA damage. Introns reduce R-loop formation and DNA damage in highly expressed yeast genes. [23] Genome-wide analysis showed that intron-containing genes display decreased R-loop levels and decreased DNA damage compared to intron-less genes of similar expression in both yeast and humans. [23] Inserting an intron within an R-loop-prone gene can also suppress R-loop formation and recombination. Bonnet et al. (2017) [23] speculated that the function of introns in maintaining genetic stability may explain their evolutionary maintenance at certain locations, particularly in highly expressed genes.

References

  1. Wang K, Wang H, Li C, Yin Z, Xiao R, Li Q, Xiang Y, Wang W, Huang J, Chen L, Fang P, Liang K (February 2021). "Genomic profiling of native R loops with a DNA-RNA hybrid recognition sensor". Science Advances. 7 (8). doi:10.1126/sciadv.abe3516. ISSN 2375-2548. PMC 7888926. PMID 33597247.
  2. Thomas M, White RL, Davis RW (July 1976). "Hybridization of RNA to double-stranded DNA: formation of R-loops". Proceedings of the National Academy of Sciences of the United States of America. 73 (7): 2294–8. Bibcode:1976PNAS...73.2294T. doi:10.1073/pnas.73.7.2294. PMC 430535. PMID 781674.
  3. Berget SM, Moore C, Sharp PA (August 1977). "Spliced segments at the 5' terminus of adenovirus 2 late mRNA". Proceedings of the National Academy of Sciences of the United States of America. 74 (8): 3171–5. Bibcode:1977PNAS...74.3171B. doi:10.1073/pnas.74.8.3171. PMC 431482. PMID 269380.
  4. Chow LT, Gelinas RE, Broker TR, Roberts RJ (September 1977). "An amazing sequence arrangement at the 5' ends of adenovirus 2 messenger RNA". Cell. 12 (1): 1–8. doi:10.1016/0092-8674(77)90180-5. PMID 902310. S2CID 2099968.
  5. Lai EC, Woo SL, Dugaiczyk A, Catterall JF, O'Malley BW (May 1978). "The ovalbumin gene: structural sequences in native chicken DNA are not contiguous". Proceedings of the National Academy of Sciences of the United States of America. 75 (5): 2205–9. Bibcode:1978PNAS...75.2205L. doi:10.1073/pnas.75.5.2205. PMC 392520. PMID 276861.
  6. O'Hare K, Breathnach R, Benoist C, Chambon P (September 1979). "No more than seven interruptions in the ovalbumin gene: comparison of genomic and double-stranded cDNA sequences". Nucleic Acids Research. 7 (2): 321–34. doi:10.1093/nar/7.2.321. PMC 328020. PMID 493147.
  7. Cech TR, Rio DC (October 1979). "Localization of transcribed regions on extrachromosomal ribosomal RNA genes of Tetrahymena thermophila by R-loop mapping". Proceedings of the National Academy of Sciences of the United States of America. 76 (10): 5051–5. Bibcode:1979PNAS...76.5051C. doi:10.1073/pnas.76.10.5051. PMC 413077. PMID 291921.
  8. Boguslawski SJ, Smith DE, Michalak MA, Mickelson KE, Yehle CO, Patterson WL, Carrico RJ (May 1986). "Characterization of monoclonal antibody to DNA.RNA and its application to immunodetection of hybrids". Journal of Immunological Methods. 89 (1): 123–30. doi:10.1016/0022-1759(86)90040-2. PMID 2422282.
  9. Woolford JL, Rosbash M (June 1979). "The use of R-looping for structural gene identification and mRNA purification". Nucleic Acids Research. 6 (7): 2483–97. doi:10.1093/nar/6.7.2483. PMC 327867. PMID 379820.
  10. King RC, Stansfield WD, Mulligan PK (2007). A Dictionary of Genetics (7th ed.). Oxford University Press.
  11. Itoh T, Tomizawa J (May 1980). "Formation of an RNA primer for initiation of replication of ColE1 DNA by ribonuclease H". Proceedings of the National Academy of Sciences of the United States of America. 77 (5): 2450–4. Bibcode:1980PNAS...77.2450I. doi:10.1073/pnas.77.5.2450. PMC 349417. PMID 6156450.
  12. Drolet M, Bi X, Liu LF (January 1994). "Hypernegative supercoiling of the DNA template during transcription elongation in vitro". The Journal of Biological Chemistry. 269 (3): 2068–74. doi:10.1016/S0021-9258(17)42136-3. PMID 8294458.
  13. Groh M, Gromak N (September 2014). "Out of balance: R-loops in human disease". PLOS Genetics. 10 (9): e1004630. doi:10.1371/journal.pgen.1004630. PMC 4169248. PMID 25233079.
  14. Cerritelli SM, Crouch RJ (March 2009). "Ribonuclease H: the enzymes in eukaryotes". The FEBS Journal. 276 (6): 1494–505. doi:10.1111/j.1742-4658.2009.06908.x. PMC 2746905. PMID 19228196.
  15. Chan YA, Aristizabal MJ, Lu PY, Luo Z, Hamza A, Kobor MS, Stirling PC, Hieter P (April 2014). "Genome-wide profiling of yeast DNA:RNA hybrid prone sites with DRIP-chip". PLOS Genetics. 10 (4): e1004288. doi:10.1371/journal.pgen.1004288. PMC 3990523. PMID 24743342.
  16. Roy D, Yu K, Lieber MR (January 2008). "Mechanism of R-loop formation at immunoglobulin class switch sequences". Molecular and Cellular Biology. 28 (1): 50–60. doi:10.1128/mcb.01251-07. PMC 2223306. PMID 17954560.
  17. Ginno PA, Lott PL, Christensen HC, Korf I, Chédin F (March 2012). "R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters". Molecular Cell. 45 (6): 814–25. doi:10.1016/j.molcel.2012.01.017. PMC 3319272. PMID 22387027.
  18. D'Souza AD, Belotserkovskii BP, Hanawalt PC (February 2018). "A novel mode for transcription inhibition mediated by PNA-induced R-loops with a model in vitro system". Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms. 1861 (2): 158–166. doi:10.1016/j.bbagrm.2017.12.008. PMC 5820110. PMID 29357316.
  19. Castellano-Pozo M, Santos-Pereira JM, Rondón AG, Barroso S, Andújar E, Pérez-Alegre M, García-Muse T, Aguilera A (November 2013). "R loops are linked to histone H3 S10 phosphorylation and chromatin condensation". Molecular Cell. 52 (4): 583–90. doi:10.1016/j.molcel.2013.10.006. PMID 24211264.
  20. Costantino L, Koshland D (June 2015). "The Yin and Yang of R-loop biology". Current Opinion in Cell Biology. 34: 39–45. doi:10.1016/j.ceb.2015.04.008. PMC 4522345. PMID 25938907.
  21. Belotserkovskii BP, Tornaletti S, D'Souza AD, Hanawalt PC (November 2018). "R-loop generation during transcription: Formation, processing and cellular outcomes". DNA Repair. 71: 69–81. doi:10.1016/j.dnarep.2018.08.009. PMC 6340742. PMID 30190235.
  22. Sollier J, Cimprich KA (September 2015). "Breaking bad: R-loops and genome integrity". Trends in Cell Biology. 25 (9): 514–22. doi:10.1016/j.tcb.2015.05.003. PMC 4554970. PMID 26045257.
  23. Bonnet A, Grosso AR, Elkaoutari A, Coleno E, Presle A, Sridhara SC, Janbon G, Géli V, de Almeida SF, Palancade B (August 2017). "Introns Protect Eukaryotic Genomes from Transcription-Associated Genetic Instability". Molecular Cell. 67 (4): 608–621.e6. doi:10.1016/j.molcel.2017.07.002. PMID 28757210.