DRIP-seq

DRIP-seq (DRIP-sequencing) is a technology for genome-wide profiling of a type of DNA-RNA hybrid called an "R-loop". [1] DRIP-seq utilizes a sequence-independent but structure-specific antibody for DNA-RNA immunoprecipitation (DRIP) to capture R-loops for massively parallel DNA sequencing. [1]

Introduction

Figure: R-loop and S9.6 monoclonal antibody.

An R-loop is a three-stranded nucleic acid structure consisting of a DNA-RNA hybrid duplex and a displaced single-stranded DNA (ssDNA). [2] R-loops are predominantly formed in cytosine-rich genomic regions during transcription [2] and are known to be involved in gene expression and immunoglobulin class switching. [1] [3] [4] They have been found in a variety of species, ranging from bacteria to mammals. [2] They are preferentially localized at CpG island promoters in human cells and at highly transcribed regions in yeast. [1] [3]

Under abnormal conditions, namely elevated production of DNA-RNA hybrids, R-loops can cause genome instability by exposing single-stranded DNA to endogenous damage from enzymes such as AID and APOBEC, or to chemically reactive species. [4] Therefore, understanding where and under what circumstances R-loops form across the genome is crucial for understanding genome instability. R-loop characterization was initially limited to locus-specific approaches. [5] However, with the arrival of massively parallel sequencing technologies and derivatives such as DRIP-seq, it has become possible to investigate entire genomes for R-loops.

DRIP-seq relies on the high specificity and affinity of the S9.6 monoclonal antibody (mAb) for DNA-RNA hybrids of various lengths. The S9.6 mAb was first created and characterized in 1986 and is currently used for the selective immunoprecipitation of R-loops. [6] Since then, it has been used in diverse immunoprecipitation methods for R-loop characterization. [1] [3] [7] [8] The concept behind DRIP-seq is similar to that of ChIP-sequencing, except that the immunoprecipitated material consists mainly of R-loop fragments.

Uses and Current Research

DRIP-seq is mainly used for genome-wide mapping of R-loops. Identifying R-loop formation sites allows the study of diverse cellular events, such as the function of R-loop formation at specific regions, the characterization of these regions, and the impact on gene expression. It can also be used to study the influence of R-loops on other processes such as DNA replication and synthesis. DRIP-seq can also be performed on mutant cell lines deficient in genes involved in R-loop resolution. [3] Such studies provide information about the role of the mutated gene in suppressing DNA-RNA hybrid formation and, potentially, about the significance of R-loops in genome instability.

DRIP-seq was first used for genome-wide profiling of R-loops in humans, which showed widespread R-loop formation at CpG island promoters. [1] In particular, the researchers found that R-loop formation is associated with the unmethylated state of CpG islands.

DRIP-seq was later used to profile R-loop formation at transcription start and termination sites in human pluripotent Ntera2 cells. [7] In this study, the researchers showed that R-loops at the 3' ends of genes may be correlated with transcription termination.

Workflow of DRIP-seq

Figure: Workflow of DRIP-seq.

Genomic DNA extraction

First, genomic DNA (gDNA) is extracted from cells of interest by proteinase K treatment followed by phenol-chloroform extraction and ethanol precipitation. Additional zymolyase digestion is necessary for yeast cells to remove the cell wall prior to proteinase K treatment. gDNA can also be extracted with a variety of other methods, such as column-based methods.

Genomic DNA fragmentation

gDNA is treated with S1 nuclease to remove undesired ssDNA and RNA, followed by ethanol precipitation to remove the S1 nuclease. The gDNA is then fragmented with restriction endonucleases, yielding double-stranded DNA (dsDNA) fragments of different sizes. Alternatively, gDNA fragments can be generated by sonication.
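The reason restriction digestion yields fragments of different sizes can be illustrated with a small in-silico digest. The sketch below is only an illustration: the toy sequence and the function name `digest` are hypothetical, and EcoRI (recognition site GAATTC, cut after the first G) is used purely as an example enzyme, not as the cocktail used in any particular DRIP-seq protocol.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut `sequence` at every occurrence of `site` and return the fragments.

    `cut_offset` is the position inside the site where the top strand is cut
    (1 for EcoRI, which cuts G^AATTC). Only the top strand is modelled.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments


# Hypothetical toy sequence containing two EcoRI sites.
toy = "ATGCGAATTCTTAGGCCGAATTCAAAT"
fragment_lengths = [len(f) for f in digest(toy)]
print(fragment_lengths)
```

Because recognition sites are spaced irregularly along a real genome, the fragment lengths vary widely, which is one reason sonication is sometimes preferred when more uniform fragments are needed.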

Immunoprecipitation

Fragmented gDNA is incubated with the DNA-RNA structure-specific S9.6 mAb. This step is unique to the DRIP-seq protocol, since it relies entirely on the high specificity and affinity of the S9.6 mAb for DNA-RNA hybrids. The antibody recognizes and binds these regions dispersed across the genome, allowing them to be immunoprecipitated. The S9.6 antibodies are bound to magnetic beads through interaction with specific ligands (e.g. protein A or protein G) on the surface of the beads. Thus, the fragments containing DNA-RNA hybrids are bound to the beads by means of the antibody.

Elution

The magnetic beads are washed several times to remove any gDNA not bound to the beads, and the DNA-RNA hybrids are recovered by elution. To remove the antibody bound to the nucleic acid hybrids, proteinase K treatment is performed, followed by phenol-chloroform extraction and ethanol precipitation. This results in the isolation of purified DNA-RNA hybrids of different sizes.

Sequencing

For massively parallel sequencing of these fragments, the immunoprecipitated material is sonicated, size-selected and ligated to barcoded oligonucleotide adaptors for cluster enrichment and sequencing.

Computational Analysis

To detect sites of R-loop formation, the hundreds of millions of sequencing reads from DRIP-seq are first aligned to a reference genome with a short-read sequence aligner; peak-calling methods designed for ChIP-seq can then be used to evaluate DRIP signals. [1] If different cocktails of restriction enzymes were used for different DRIP-seq experiments on the same sample, consensus DRIP-seq peaks are called. [7] Typically, peaks are compared against those from a corresponding RNase H1-treated sample, which serves as a negative control because RNase H degrades the RNA strand of DNA-RNA hybrids. [1] [7]
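The interval logic behind consensus peak calling and comparison with the RNase H1-treated sample can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used in the cited studies: the peak coordinates are invented, the function names are hypothetical, peaks are assumed to have already been produced by a peak caller, and discarding any peak that overlaps a control peak is only one possible way of using the control.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def consensus_peaks(peaks_a, peaks_b):
    """Return the overlapping portions of peaks present in both peak sets."""
    shared = []
    for a in peaks_a:
        for b in peaks_b:
            if overlaps(a, b):
                shared.append((a[0], max(a[1], b[1]), min(a[2], b[2])))
    return sorted(shared)


def remove_control_peaks(peaks, control_peaks):
    """Drop peaks that overlap any peak called in the control sample."""
    return [p for p in peaks if not any(overlaps(p, c) for c in control_peaks)]


# Invented peak calls from two restriction-enzyme cocktails on the same sample.
digest_1 = [("chr1", 100, 500), ("chr1", 900, 1200), ("chr2", 50, 300)]
digest_2 = [("chr1", 150, 450), ("chr1", 1000, 1400), ("chr2", 400, 600)]
# Invented peak that persists after RNase H1 treatment (likely non-specific).
rnase_h_control = [("chr1", 1050, 1100)]

consensus = consensus_peaks(digest_1, digest_2)
filtered = remove_control_peaks(consensus, rnase_h_control)
print(consensus)
print(filtered)
```

The nested loop is quadratic in the number of peaks, which is acceptable for a sketch; genome-scale peak sets would normally be handled with sorted sweeps or interval trees.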

Limitations

Due to the absence of another antibody-based method for R-loop immunoprecipitation, validation of DRIP-seq results is difficult. However, results of other R-loop profiling methods, such as DRIVE-seq, may be used to measure consensus.

In addition, DRIP-seq relies on existing short-read sequencing platforms for the sequencing of R-loops, so all inherent limitations of these platforms also apply to DRIP-seq. In particular, typical short-read sequencing platforms produce uneven read coverage in GC-rich regions. Because R-loops are predominantly formed in cytosine-rich DNA regions, sequencing long R-loops might pose a challenge. Moreover, GC-rich regions tend to have low sequence complexity, which makes it difficult for short-read aligners to produce unique alignments.
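One simple way to flag regions where GC-related coverage bias might be expected is to compute the GC fraction in sliding windows along a sequence. The sketch below assumes a toy sequence and arbitrary window and step sizes; the function names are hypothetical, and the choice of what GC fraction counts as problematic is left open because it depends on the sequencing platform and library preparation.

```python
def gc_fraction(sequence):
    """Fraction of bases in `sequence` that are G or C (0.0 for empty input)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(sequence, window=10, step=5):
    """Return (window_start, gc_fraction) for each full sliding window."""
    return [
        (start, gc_fraction(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]


# Toy sequence: an AT-only stretch, a GC-only stretch, then AT-only again.
toy = "ATATATATATGCGCGCGCGCATATATATAT"
profile = windowed_gc(toy)
for start, gc in profile:
    print(start, gc)
```

Comparing such a profile with the observed read depth in the same windows gives a first impression of whether low coverage coincides with GC-rich sequence.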

Other R-loop Profiling Methods

Although there are several other methods for analysis and profiling of R-loop formation, [5] [9] [10] [11] [12] [13] [14] [15] [16] only a few provide coverage and robustness at the genome-wide scale. [1] [3]

References

  1. Ginno, PA; Lott, PL; Christensen, HC; Korf, I; Chédin, F (30 March 2012). "R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters". Molecular Cell. 45 (6): 814–25. doi:10.1016/j.molcel.2012.01.017. PMC 3319272. PMID 22387027.
  2. Aguilera, A; García-Muse, T (27 April 2012). "R loops: from transcription byproducts to threats to genome stability". Molecular Cell. 46 (2): 115–24. doi:10.1016/j.molcel.2012.04.009. PMID 22541554.
  3. Chan, YA; Aristizabal, MJ; Lu, PY; Luo, Z; Hamza, A; Kobor, MS; Stirling, PC; Hieter, P (April 2014). "Genome-wide profiling of yeast DNA:RNA hybrid prone sites with DRIP-chip". PLOS Genetics. 10 (4): e1004288. doi:10.1371/journal.pgen.1004288. PMC 3990523. PMID 24743342.
  4. Chaudhuri, J; Tian, M; Khuong, C; Chua, K; Pinaud, E; Alt, FW (17 April 2003). "Transcription-targeted DNA deamination by the AID antibody diversification enzyme". Nature. 422 (6933): 726–30. Bibcode:2003Natur.422..726C. doi:10.1038/nature01574. PMID 12692563. S2CID 771802.
  5. Yu, K; Chedin, F; Hsieh, CL; Wilson, TE; Lieber, MR (May 2003). "R-loops at immunoglobulin class switch regions in the chromosomes of stimulated B cells". Nature Immunology. 4 (5): 442–51. doi:10.1038/ni919. PMID 12679812. S2CID 1652440.
  6. Boguslawski, SJ; Smith, DE; Michalak, MA; Mickelson, KE; Yehle, CO; Patterson, WL; Carrico, RJ (1 May 1986). "Characterization of monoclonal antibody to DNA-RNA and its application to immunodetection of hybrids". Journal of Immunological Methods. 89 (1): 123–30. doi:10.1016/0022-1759(86)90040-2. PMID 2422282.
  7. Ginno, PA; Lim, YW; Lott, PL; Korf, I; Chédin, F (October 2013). "GC skew at the 5' and 3' ends of human genes links R-loop formation to epigenetic regulation and transcription termination". Genome Research. 23 (10): 1590–600. doi:10.1101/gr.158436.113. PMC 3787257. PMID 23868195.
  8. Skourti-Stathaki, K; Proudfoot, NJ; Gromak, N (24 June 2011). "Human senataxin resolves RNA/DNA hybrids formed at transcriptional pause sites to promote Xrn2-dependent termination". Molecular Cell. 42 (6): 794–805. doi:10.1016/j.molcel.2011.04.026. PMC 3145960. PMID 21700224.
  9. El Hage, A; French, SL; Beyer, AL; Tollervey, D (15 July 2010). "Loss of Topoisomerase I leads to R-loop-mediated transcriptional blocks during ribosomal RNA synthesis". Genes & Development. 24 (14): 1546–58. doi:10.1101/gad.573310. PMC 2904944. PMID 20634320.
  10. Duquette, ML; Huber, MD; Maizels, N (15 March 2007). "G-rich proto-oncogenes are targeted for genomic instability in B-cell lymphomas". Cancer Research. 67 (6): 2586–94. doi:10.1158/0008-5472.can-06-2419. PMID 17363577.
  11. Huertas, P; Aguilera, A (September 2003). "Cotranscriptionally formed DNA:RNA hybrids mediate transcription elongation impairment and transcription-associated recombination". Molecular Cell. 12 (3): 711–21. doi:10.1016/j.molcel.2003.08.010. PMID 14527416.
  12. Drolet, M; Bi, X; Liu, LF (21 January 1994). "Hypernegative supercoiling of the DNA template during transcription elongation in vitro". The Journal of Biological Chemistry. 269 (3): 2068–74. doi:10.1016/S0021-9258(17)42136-3. PMID 8294458.
  13. Gómez-González, B; Aguilera, A (15 May 2007). "Activation-induced cytidine deaminase action is strongly stimulated by mutations of the THO complex". Proceedings of the National Academy of Sciences of the United States of America. 104 (20): 8409–14. Bibcode:2007PNAS..104.8409G. doi:10.1073/pnas.0702836104. PMC 1895963. PMID 17488823.
  14. Pohjoismäki, JL; Holmes, JB; Wood, SR; Yang, MY; Yasukawa, T; Reyes, A; Bailey, LJ; Cluett, TJ; Goffart, S; Willcox, S; Rigby, RE; Jackson, AP; Spelbrink, JN; Griffith, JD; Crouch, RJ; Jacobs, HT; Holt, IJ (16 April 2010). "Mammalian mitochondrial DNA replication intermediates are essentially duplex but contain extensive tracts of RNA/DNA hybrid". Journal of Molecular Biology. 397 (5): 1144–55. doi:10.1016/j.jmb.2010.02.029. PMC 2857715. PMID 20184890.
  15. Mischo, HE; Gómez-González, B; Grzechnik, P; Rondón, AG; Wei, W; Steinmetz, L; Aguilera, A; Proudfoot, NJ (7 January 2011). "Yeast Sen1 helicase protects the genome from transcription-associated instability". Molecular Cell. 41 (1): 21–32. doi:10.1016/j.molcel.2010.12.007. PMC 3314950. PMID 21211720.
  16. Wahba, L; Amon, JD; Koshland, D; Vuica-Ross, M (23 December 2011). "RNase H and multiple RNA biogenesis factors cooperate to prevent RNA:DNA hybrids from generating genome instability". Molecular Cell. 44 (6): 978–88. doi:10.1016/j.molcel.2011.10.017. PMC 3271842. PMID 22195970.
  17. Chan, K; Sterling, JF; Roberts, SA; Bhagwat, AS; Resnick, MA; Gordenin, DA (2012). "Base damage within single-strand DNA underlies in vivo hypermutability induced by a ubiquitous environmental agent". PLOS Genetics. 8 (12): e1003149. doi:10.1371/journal.pgen.1003149. PMC 3521656. PMID 23271983.