Reduced representation bisulfite sequencing

Reduced representation bisulfite sequencing (RRBS) is an efficient, high-throughput technique for analyzing genome-wide methylation profiles at single-nucleotide resolution. It combines restriction enzyme digestion with bisulfite sequencing to enrich for areas of the genome with a high CpG content. Because sequencing the entire genome at the depth needed to determine methylation status is costly, Meissner et al. developed this technique in 2005 to reduce the number of nucleotides that must be sequenced to about 1% of the genome. [1] The fragments that comprise the reduced genome still include the majority of promoters, as well as regions such as repeated sequences that are difficult to profile using conventional bisulfite sequencing approaches. [2]

[Figure: An outline of the protocol for reduced representation bisulfite sequencing]

Overview of protocol

  1. Enzyme digestion: First, genomic DNA is digested using a methylation-insensitive restriction enzyme. It is essential that the enzyme not be influenced by the methylation status of the CpGs (sites within the genome where a cytosine is next to a guanine), as this allows both methylated and unmethylated areas to be digested. MspI is commonly used. This enzyme targets 5′-CCGG-3′ sequences and cleaves the phosphodiester bond immediately upstream of the CpG dinucleotide. With this enzyme, each fragment has a CpG at each end. The digestion yields DNA fragments of various sizes.
  2. End repair and A-tailing: Because of the way MspI cleaves double-stranded DNA, the digestion produces fragments with sticky ends. End repair fills in the recessed 3′ ends of the strands. Next, an extra adenosine is added to both the plus and minus strands. This is referred to as A-tailing and is necessary for adapter ligation in the subsequent step. End repair and A-tailing are carried out in the same reaction, which contains dCTP, dGTP and dATP deoxyribonucleotides. To increase the efficiency of A-tailing, dATP is added in excess.
  3. Sequence adapters: Methylated sequence adapters are ligated to the DNA fragments. In the methylated adapter oligonucleotides, all cytosines are replaced with 5-methylcytosines to prevent their deamination in the bisulfite conversion reaction. For sequencing on Illumina instruments, the sequence adapters hybridize to the complementary adapters on the flow cell.
  4. Fragment purification: Fragments of the desired size are selected for purification. Fragments of different sizes are separated by gel electrophoresis and purified by gel excision. According to Gu et al., DNA fragments of 40-220 base pairs are representative of the majority of promoter sequences and CpG islands. [2]
  5. Bisulfite conversion: The DNA fragments are then treated with bisulfite, which deaminates unmethylated cytosines to uracil. Methylated cytosines remain unchanged because the methyl group protects them from the reaction.
  6. PCR amplification: The bisulfite converted DNA is then amplified using PCR with primers that are complementary to the sequence adapters.
  7. PCR purification: Before sequencing, the PCR product must be free of unused reaction reagents such as unincorporated dNTPs or salts. Thus, a step for PCR purification is required. This can be done by running another electrophoresis gel or by using kits designed specifically for PCR purification.
  8. Sequencing: The fragments are then sequenced. When RRBS was first developed, Sanger sequencing was used. Now, next-generation sequencing approaches are used. For Illumina sequencing, 36-base single-end reads are most commonly performed.
  9. Sequence alignment and analysis: Due to the unique properties of RRBS, special software is needed for alignment and analysis. [3] Digesting genomic DNA with MspI yields fragments that always start with a C (if the cytosine is methylated) or a T (if the cytosine was not methylated and was converted to a uracil in the bisulfite conversion reaction). This results in a non-random base composition at the start of each read. Additionally, the overall base composition is skewed by the biased frequencies of C and T within the samples. Several software packages for alignment and analysis are available, such as Maq, BS Seeker, Bismark and BSMAP. Alignment to a reference genome allows these programs to identify the cytosines within the genome that are methylated.
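
The steps above can be mimicked in silico to build intuition. The following Python sketch is illustrative only: it works on the top strand of a toy sequence, ignores end repair, adapters and PCR, and the function names (`mspi_digest`, `size_select`, `bisulfite_convert`, `call_cpg_methylation`) are hypothetical rather than taken from Maq, BS Seeker, Bismark or BSMAP.

```python
import re


def mspi_digest(seq):
    """Cut seq at every MspI site (C^CGG); return (start, fragment) pairs."""
    # MspI cuts after the first C of CCGG, so each cut lies 1 bp into the site.
    cuts = [m.start() + 1 for m in re.finditer("CCGG", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [(a, seq[a:b]) for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, lo=40, hi=220):
    """Keep fragments whose length falls in the 40-220 bp window."""
    return [(start, frag) for start, frag in fragments if lo <= len(frag) <= hi]


def bisulfite_convert(frag, methylated):
    """Unmethylated C -> U (read as T after PCR); methylated C is protected."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(frag)
    )


def call_cpg_methylation(reference, read):
    """Compare a converted read with its reference at each CpG cytosine."""
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG" and read[i] in "CT":
            calls[i] = read[i] == "C"  # C retained = methylated, T = unmethylated
    return calls


# Toy sequence with two MspI sites and one internal CpG (made up for illustration).
genome = "AT" + "CCGG" + "A" * 20 + "CG" + "T" * 20 + "CCGG" + "TA"
selected = size_select(mspi_digest(genome))
start, fragment = selected[0]
# Assume only the internal CpG (fragment position 23) is methylated.
read = bisulfite_convert(fragment, methylated={23})
calls = call_cpg_methylation(fragment, read)
print(start, len(fragment), read[0], calls)
```

In this toy case the only fragment that survives size selection is 46 bp long and begins with the CGG left by the MspI cut. Because its first cytosine is assumed unmethylated, the converted read starts with a T, which is the non-random first base described in step 9.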
[Figure: Reduced representation bisulfite sequencing protocol]

Advantages

Enrichment of CpGs

RRBS uniquely uses a specific restriction enzyme to enrich for CpGs. Digestion with MspI, or with any restriction enzyme that recognizes and cuts at CpGs, produces only fragments with a CG at each end. [2] Because this approach enriches for CpG-rich regions of the genome, it decreases both the amount of sequencing required and the cost. [2] The technique is especially cost-effective when the focus is on common CpG regions.
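
A toy calculation can make the enrichment idea concrete. The sketch below is a hypothetical illustration, not a measurement from [2]: it builds an artificial sequence with a CpG-rich "island" flanked by CpG-free DNA, digests it at MspI sites, keeps fragments of 40-220 bp, and compares the fraction of bases retained with the fraction of CpGs retained. The function names and the toy sequence are assumptions made for this example.

```python
import re


def selected_fragments(seq, lo=40, hi=220):
    """MspI-digest seq (C^CGG) and keep fragments of lo..hi bp."""
    cuts = [m.start() + 1 for m in re.finditer("CCGG", seq)]
    bounds = [0] + cuts + [len(seq)]
    frags = (seq[a:b] for a, b in zip(bounds, bounds[1:]))
    return [f for f in frags if lo <= len(f) <= hi]


def cpg_enrichment(seq):
    """Return (fraction of bases kept, fraction of CpGs kept, their ratio)."""
    frags = selected_fragments(seq)
    base_fraction = sum(len(f) for f in frags) / len(seq)
    cpg_fraction = sum(f.count("CG") for f in frags) / seq.count("CG")
    return base_fraction, cpg_fraction, cpg_fraction / base_fraction


# Toy genome: a CpG-rich island with MspI sites, flanked by CpG-free sequence.
island = ("CCGG" + "ACGTCGACGA" * 5) * 3 + "CCGG"
genome = "AT" * 200 + island + "TA" * 200
base_fraction, cpg_fraction, ratio = cpg_enrichment(genome)
print(f"bases kept: {base_fraction:.1%}, CpGs kept: {cpg_fraction:.1%}, "
      f"enrichment: {ratio:.1f}x")
```

In this contrived example a small share of the bases carries nearly all of the CpGs. Real genomes are far less tidy, so the numbers should not be read as typical RRBS performance.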

Low sample input

Only a small amount of input DNA, between 10 and 300 ng, is required for accurate data analysis. [2] The technique can therefore be used when sample material is scarce. Another advantage is that fresh or live samples are not required: formalin-fixed, paraffin-embedded inputs can also be used. [2]

Limitations

Restriction enzyme

Individual protocol steps also impose limitations. MspI digestion covers the majority, but not all, of the CpG-containing regions in the genome, so some CpGs are missed. [2] CpGs can also be missed because the protocol captures only a representative sample of the genome, [2] so some regions have lower coverage. Other variations of this protocol use alternative enzymes. [4]

PCR

During the PCR portion of the protocol, a non-proofreading polymerase must be used, because a proofreading enzyme would stop at the uracil residues found in the ssDNA template. [1] Using a polymerase that does not proofread can, however, lead to increased PCR and sequencing errors. [1]

Bisulfite sequencing

Bisulfite treatment converts cytosines only in single-stranded DNA (ssDNA). Complete bisulfite conversion therefore requires thorough denaturation and the absence of re-annealed double-stranded DNA (dsDNA). [1] Simple protocol measures have been shown to promote complete denaturation: using small fragments produced by shearing or digestion, fresh reagents, and sufficient denaturing time are all crucial. [5] Another suggested technique is to carry out the bisulfite reaction at 95 °C, although DNA degradation also occurs at high temperatures. [5] In the first hour of the bisulfite reaction, about 90% of the sample DNA is predicted to be lost to degradation. [6] The reaction temperature must therefore balance complete denaturation against DNA degradation. Reagents such as urea, which prevent dsDNA from forming, can also be used. [5] Contamination with dsDNA makes the data difficult to interpret accurately. [1] When an unconverted cytosine is observed, it is challenging to tell whether it reflects genuine methylation or an incomplete-conversion artifact. [1]
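
One way to gauge how complete the conversion was is to count how often cytosines that are expected to be unmethylated appear as T in the reads. The sketch below assumes, as an approximation that does not come from the sources cited here, that cytosines outside a CpG context are unmethylated in the sample. The function name `conversion_rate` and the toy reads are hypothetical, and the reads are taken to be already aligned to the top strand of the reference.

```python
def conversion_rate(reference, aligned_reads):
    """Estimate bisulfite conversion from non-CpG cytosines.

    aligned_reads is a list of (start, read) pairs on the reference top strand.
    Returns converted / (converted + unconverted), or None if no sites are seen.
    """
    converted = unconverted = 0
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if reference[pos] != "C":
                continue
            if reference[pos + 1:pos + 2] == "G":
                continue  # CpG cytosines may be truly methylated; skip them
            if base == "T":
                converted += 1
            elif base == "C":
                unconverted += 1  # assumed artifact: should have been converted
    total = converted + unconverted
    return converted / total if total else None


# Toy data (made up): non-CpG cytosines sit at reference positions 1, 7 and 8.
reference = "ACATCGACCTA"
reads = [(0, "ATATCGATTTA"), (0, "ACATTGATCTA")]
rate = conversion_rate(reference, reads)
print(f"estimated conversion rate: {rate:.2f}")
```

A low estimate suggests that unconverted cytosines at CpG sites should be interpreted with caution, since some of them may be conversion artifacts rather than methylation.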

Significance

The significance of this technique is that it allows the sequencing of methylated areas that cannot be properly profiled using conventional bisulfite sequencing techniques. Current sequencing technologies are limited with regard to profiling areas of repeated sequences. [7] This is unfortunate for methylation studies, as these repeated sequences often contain methylated cytosines. It is especially limiting for studies that profile cancer genomes, as a loss of methylation in these repeated sequences is observed in many cancer types. [8] RRBS eliminates the problems encountered due to these large areas of repeated sequences and thus lets these regions be more fully annotated.

Applications

Methylomes in cancer genomics

Aberrant methylation has been observed in cancer: both hypermethylation and hypomethylation have been seen in tumors. [9] Since RRBS is highly sensitive, it can be used to quickly look at aberrant methylation in cancer. [10] If samples of a patient's tumor and normal cells can be obtained, the two cell types can be compared. [2] [10] A profile of the overall methylation can be produced quite rapidly, [2] which makes the technique a cost- and time-effective way to determine the overall methylation status of cancer genomes.
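
A tumor-versus-normal comparison of this kind can be sketched as a per-CpG difference in methylation level. The example below is a simplified illustration and not a method described in the cited sources: the coverage and difference thresholds, the function name `compare_methylation`, and the counts are all hypothetical, and no statistical test is applied.

```python
def compare_methylation(tumor, normal, min_coverage=10, min_diff=0.25):
    """Flag CpGs whose methylation level differs between two samples.

    tumor and normal map a CpG identifier to (methylated_reads, total_reads).
    Returns {cpg: (label, difference)} with label "hyper" or "hypo",
    where difference = tumor level - normal level.
    """
    flagged = {}
    for cpg in sorted(tumor.keys() & normal.keys()):
        t_meth, t_total = tumor[cpg]
        n_meth, n_total = normal[cpg]
        if t_total < min_coverage or n_total < min_coverage:
            continue  # too few reads for a stable estimate
        diff = t_meth / t_total - n_meth / n_total
        if abs(diff) >= min_diff:
            flagged[cpg] = ("hyper" if diff > 0 else "hypo", diff)
    return flagged


# Made-up read counts for four CpGs.
tumor = {"cpg_a": (18, 20), "cpg_b": (2, 20), "cpg_c": (10, 20), "cpg_d": (3, 4)}
normal = {"cpg_a": (4, 20), "cpg_b": (16, 20), "cpg_c": (11, 20), "cpg_d": (0, 20)}
result = compare_methylation(tumor, normal)
for cpg, (label, diff) in result.items():
    print(cpg, label, f"{diff:+.2f}")
```

Here `cpg_a` is flagged as hypermethylated and `cpg_b` as hypomethylated in the tumor, while `cpg_c` changes too little and `cpg_d` has too few tumor reads to be called.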

Methylation states in development

Stage-specific changes in methylation can be observed during the development of living organisms. Tracking changes in overall methylation levels with reduced representation bisulfite sequencing can therefore be useful in developmental biology.

Comparison with other techniques

Results from RRBS and MethylC-seq are highly concordant. [11] MethylC-seq naturally has greater genome-wide coverage of CpGs than RRBS, but RRBS has greater coverage of CpG islands. [11] Another commonly used technique for profiling methylation is MeDIP-seq, which relies on immunoprecipitation of methylated cytosines followed by sequencing. [11] RRBS has greater resolution than MeDIP-seq, which is limited to 150 base pairs compared with the single-nucleotide resolution of RRBS. [11] Bisulfite-based methods such as RRBS were also found to be more accurate than enrichment-based methods such as MeDIP-seq. [7] The data obtained with RRBS and the Illumina Infinium methylation assay are highly comparable, with a Pearson correlation of 0.92. [7] The data from both platforms are also directly comparable, as both provide an absolute measurement of DNA methylation.
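
A platform comparison of this kind reduces to correlating methylation levels at the CpGs that both methods cover. The sketch below shows the arithmetic with made-up values; the function names and site labels are hypothetical, and it does not reproduce the analysis in [7].

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def platform_correlation(levels_a, levels_b):
    """Correlate methylation levels at the CpGs covered by both platforms."""
    shared = sorted(levels_a.keys() & levels_b.keys())
    return pearson([levels_a[s] for s in shared], [levels_b[s] for s in shared])


# Made-up methylation levels (fraction methylated) keyed by CpG position.
rrbs = {"chr1:100": 0.1, "chr1:200": 0.5, "chr1:300": 0.9, "chr1:400": 0.3}
array = {"chr1:100": 0.2, "chr1:200": 0.6, "chr1:300": 1.0, "chr2:50": 0.4}
r = platform_correlation(rrbs, array)
print(f"Pearson r over shared CpGs: {r:.2f}")
```

Only the three CpGs present in both dictionaries contribute. Because the toy array values are the RRBS values shifted by a constant, the correlation here is 1, which also shows that a high Pearson correlation alone does not rule out a systematic offset between platforms.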

Finally, Anchor-Based Bisulfite Sequencing (ABBS) was developed by Ben Delatte's group at Active Motif. This technology uses specialized primers that capture DNA methylation allowing for increased coverage (approx. 10x more than WGBS) and lowering sequencing costs. They also showed that ABBS is not as restricted as RRBS and can be used as an alternative for MeDIP-seq while maintaining base-resolution. [12]

References

  1. Meissner A, Gnirke A, Bell GW, Ramsahoye B, Lander ES, Jaenisch R. 2005. "Reduced representation bisulfite sequencing for comparative high-resolution DNA methylation analysis." Nucleic Acids Res. 33(18):5868-77.
  2. Gu H, Smith ZD, Bock C, Boyle P, Gnirke A, Meissner A. 2011. "Preparation of reduced representation bisulfite sequencing libraries for genome-scale DNA methylation profiling." Nat Protoc. 6(4):468-81. doi:10.1038/nprot.2010.190.
  3. Chatterjee A, Rodger EJ, Stockwell PA, Weeks RJ, Morison IM. 2012. "Technical considerations for reduced representation bisulfite sequencing with multiplexed libraries." J Biomed Biotechnol. 2012:741542. doi:10.1155/2012/741542.
  4. Trucchi E, Mazzarella AB, Gilfillan GD, Lorenzo MT, Schönswetter P, Paun O. 2016. "BsRADseq: screening DNA methylation in natural populations of non-model species." Molecular Ecology. 25(8):1697-1713. doi:10.1111/mec.13550. PMC 4949719. PMID 26818626.
  5. Warnecke PM, Stirzaker C, Song J, Grunau C, Melki JR, Clark SJ. 2002. "Identification and resolution of artifacts in bisulphite sequencing." Methods. 27:101-107.
  6. Grunau C, Clark SJ, Rosenthal A. 2001. "Bisulfite genomic sequencing: systematic investigation of critical experimental parameters." Nucleic Acids Res. 29:E65-65.
  7. Bock C, Tomazou EM, Brinkman AB, Müller F, Simmer F, Gu H, Jäger N, Gnirke A, Stunnenberg HG, Meissner A. 2010. "Quantitative comparison of genome-wide DNA methylation mapping technologies." Nat Biotechnol. 28(10):1106-14. doi:10.1038/nbt.1681.
  8. Ehrlich M. 2009. "DNA hypomethylation in cancer cells." Epigenomics. 1(2):239-59. doi:10.2217/epi.09.33.
  9. Versteeg R. 1997. "Aberrant methylation in cancer." American Journal of Human Genetics. 60:751-754.
  10. Smith ZD, Gu H, Bock C, Gnirke A, Meissner A. 2009. "High-throughput bisulfite sequencing in mammalian genomes." Methods. 48:226-232.
  11. Harris A, et al. 2010. "Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications." Nature Biotechnology. 28:1097-1105.
  12. Chapin N, Fernandez J, Poole J, et al. 2022. "Anchor-based bisulfite sequencing determines genome-wide DNA methylation." Commun Biol. 5:596.