ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) is a technique used in molecular biology to assess genome-wide chromatin accessibility. [1] The technique was first described in 2013 as an alternative to MNase-seq, FAIRE-Seq and DNase-Seq. [1] ATAC-seq provides a faster analysis of the epigenome than DNase-seq or MNase-seq. [2] [3] [4]
ATAC-seq identifies accessible DNA regions by probing open chromatin with a hyperactive mutant Tn5 transposase that inserts sequencing adapters into open regions of the genome. [2] [5] While naturally occurring transposases have a low level of activity, ATAC-seq employs the mutated hyperactive transposase. [6] In a process called "tagmentation", Tn5 transposase cleaves and tags double-stranded DNA with sequencing adapters. [7] [8] The tagged DNA fragments are then purified, PCR-amplified, and sequenced using next-generation sequencing. [8] Sequencing reads can then be used to infer regions of increased accessibility as well as to map transcription factor binding sites and nucleosome positions. [2] The number of reads for a region correlates with how open that chromatin is, at single-nucleotide resolution. [2] ATAC-seq requires no sonication or phenol-chloroform extraction like FAIRE-seq; [9] no antibodies like ChIP-seq; [10] and no sensitive enzymatic digestion like MNase-seq or DNase-seq. [11] ATAC-seq preparation can be completed in under three hours. [12]
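The relationship between read count and accessibility can be illustrated with a minimal sketch. It assumes the transposase insertion positions have already been extracted from aligned reads as (chromosome, position) pairs; the function names, the 100 bp bin size and the toy data are hypothetical and are not part of any published ATAC-seq pipeline.

```python
from collections import Counter


def insertion_counts(read_starts, bin_size=100):
    """Count insertion positions per fixed-width genomic bin.

    read_starts: iterable of (chromosome, position) tuples, one per read end.
    This input format is an assumption made for illustration.
    """
    counts = Counter()
    for chrom, pos in read_starts:
        counts[(chrom, pos // bin_size)] += 1
    return counts


def counts_per_million(counts):
    """Scale raw bin counts by the total so that samples sequenced
    to different depths can be compared."""
    total = sum(counts.values())
    return {bin_key: 1e6 * n / total for bin_key, n in counts.items()}


# Toy data: five insertions fall in one bin, one falls elsewhere.
toy_reads = [
    ("chr1", 1005), ("chr1", 1010), ("chr1", 1042),
    ("chr1", 1077), ("chr1", 1099), ("chr1", 5230),
]
raw = insertion_counts(toy_reads)
cpm = counts_per_million(raw)

# The bin with the most insertions is the most accessible in this toy example.
most_open = max(raw, key=raw.get)
print(most_open, raw[most_open])
```

In practice, accessible regions are usually identified with dedicated peak-calling software rather than fixed bins; the binning here only shows why a higher read count is read as more open chromatin.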
ATAC-seq analysis is used to investigate a number of chromatin-accessibility signatures. The most common use is nucleosome mapping experiments, [3] but it can be applied to mapping transcription factor binding sites, [13] adapted to map DNA methylation sites, [14] or combined with sequencing techniques. [15]
The utility of high-resolution enhancer mapping ranges from studying the evolutionary divergence of enhancer usage (e.g. between chimps and humans) during development [16] to uncovering a lineage-specific enhancer map used during blood cell differentiation. [17]
ATAC-seq has also been applied to defining the genome-wide chromatin accessibility landscape in human cancers, [18] and to revealing an overall decrease in chromatin accessibility in macular degeneration. [19] Computational footprinting methods can be applied to ATAC-seq data to find cell-specific binding sites and transcription factors with cell-specific activity. [13]
Modifications to the ATAC-seq protocol have been made to accommodate single-cell analysis. Microfluidics can be used to separate single nuclei and perform ATAC-seq reactions individually. [12] With this approach, single cells are captured by either a microfluidic device or a liquid deposition system before tagmentation. [12] [20] An alternative technique that does not require single-cell isolation is combinatorial cellular indexing. [21] This technique uses barcoding to measure chromatin accessibility in thousands of individual cells; it can generate epigenomic profiles from 10,000-100,000 cells per experiment. [22] However, combinatorial cellular indexing requires additional, custom-engineered equipment or a large quantity of custom, modified Tn5. [23] A pooled barcode method called sci-CAR has also been developed, allowing joint profiling of chromatin accessibility and gene expression in single cells. [24]
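The barcoding idea behind combinatorial cellular indexing can be sketched as follows. The sketch assumes two rounds of barcoding, so that each read carries a pair of barcodes and reads sharing the same pair are attributed to the same cell. The barcode names, the 96 barcodes per round and the read format are hypothetical values chosen for illustration.

```python
from collections import defaultdict


def group_reads_by_cell(reads):
    """Group reads by their barcode combination.

    reads: iterable of (round1_barcode, round2_barcode, region) tuples.
    Each distinct barcode pair is treated as one cell (an assumption of
    this sketch; it ignores collisions between cells sharing a pair).
    """
    cells = defaultdict(list)
    for bc1, bc2, region in reads:
        cells[(bc1, bc2)].append(region)
    return dict(cells)


# Toy reads: barcode names and regions are made up.
toy_reads = [
    ("A1", "P7", "chr1:1000-1500"),
    ("A1", "P7", "chr2:300-800"),
    ("A1", "P9", "chr1:1000-1500"),
    ("B4", "P7", "chr3:50-550"),
]
cells = group_reads_by_cell(toy_reads)
print(len(cells), "barcode combinations observed")

# With a hypothetical 96 barcodes in each round, the number of
# distinguishable combinations grows multiplicatively.
n_round1, n_round2 = 96, 96
max_combinations = n_round1 * n_round2
print(max_combinations)
```

Because the number of combinations is the product of the barcodes used in each round, many cells can be labelled without isolating each one physically.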
Computational analysis of scATAC-seq is based on the construction of a count matrix holding the number of reads per open chromatin region. Open chromatin regions can be defined, for example, by standard peak calling of pseudo-bulk ATAC-seq data. Further steps include data reduction with PCA and clustering of cells. [20] scATAC-seq matrices can be extremely large (hundreds of thousands of regions) and are extremely sparse, i.e. less than 3% of entries are non-zero. [25] Therefore, imputation of the count matrix is another crucial step, performed using various methods such as non-negative matrix factorization. As with bulk ATAC-seq, scATAC-seq allows finding regulators, such as transcription factors, that control gene expression in cells. This can be achieved by looking at the number of reads around TF motifs [26] or by footprinting analysis. [25]
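The count matrix and its sparsity can be illustrated with a small sketch. It assumes that reads have already been assigned to cells and that open chromatin regions are given as (chromosome, start, end) intervals; the function names and toy data are hypothetical, and a real analysis would use a sparse matrix library rather than nested lists.

```python
def build_count_matrix(fragments, cells, regions):
    """Return a cells x regions matrix (list of lists) of read counts.

    fragments: iterable of (cell, chromosome, position) tuples.
    regions:   list of (chromosome, start, end), half-open intervals.
    Reads outside every region, or from unknown cells, are ignored.
    """
    cell_index = {cell: i for i, cell in enumerate(cells)}
    matrix = [[0] * len(regions) for _ in cells]
    for cell, chrom, pos in fragments:
        if cell not in cell_index:
            continue
        for j, (r_chrom, start, end) in enumerate(regions):
            if chrom == r_chrom and start <= pos < end:
                matrix[cell_index[cell]][j] += 1
                break
    return matrix


def nonzero_fraction(matrix):
    """Fraction of matrix entries that are non-zero."""
    total = sum(len(row) for row in matrix)
    nonzero = sum(1 for row in matrix for value in row if value != 0)
    return nonzero / total


# Toy input: three cells, four regions, five reads (one outside all regions).
regions = [("chr1", 1000, 1500), ("chr1", 4000, 4600),
           ("chr2", 200, 900), ("chr2", 7000, 7400)]
cells = ["cellA", "cellB", "cellC"]
fragments = [("cellA", "chr1", 1100), ("cellA", "chr1", 1200),
             ("cellB", "chr2", 250), ("cellC", "chr1", 4100),
             ("cellC", "chr9", 10)]

matrix = build_count_matrix(fragments, cells, regions)
print(matrix)
print(nonzero_fraction(matrix))
```

Even in this toy example most entries are zero. Downstream steps such as PCA, clustering and imputation operate on a matrix of this shape.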
Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome. The field is analogous to genomics and proteomics, which are the study of the genome and proteome of a cell. Epigenetic modifications are reversible modifications on a cell's DNA or histones that affect gene expression without altering the DNA sequence. Epigenomic maintenance is a continuous process and plays an important role in the stability of eukaryotic genomes by taking part in crucial biological mechanisms like DNA repair. Plant flavones are reported to inhibit epigenomic marks that cause cancers. Two of the most characterized epigenetic modifications are DNA methylation and histone modification. Epigenetic modifications play an important role in gene expression and regulation, and are involved in numerous cellular processes such as differentiation, development and tumorigenesis. The study of epigenetics on a global level has been made possible only recently through the adaptation of genomic high-throughput assays.
H3K27ac is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates acetylation of the lysine residue at N-terminal position 27 of the histone H3 protein.
H3K9me3 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the tri-methylation at the 9th lysine residue of the histone H3 protein and is often associated with heterochromatin.
H3K4me1 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the mono-methylation at the 4th lysine residue of the histone H3 protein and is often associated with gene enhancers.
H3K36me3 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the tri-methylation at the 36th lysine residue of the histone H3 protein and is often associated with gene bodies.
H3K79me2 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the di-methylation at the 79th lysine residue of the histone H3 protein. H3K79me2 is detected in the transcribed regions of active genes.
H2BK5ac is an epigenetic modification to the DNA packaging protein Histone H2B. It is a mark that indicates the acetylation at the 5th lysine residue of the histone H2B protein. H2BK5ac is involved in stem cell maintenance and in colon cancer.
H4K20me is an epigenetic modification to the DNA packaging protein Histone H4. It is a mark that indicates the mono-methylation at the 20th lysine residue of the histone H4 protein. This mark can be di- and tri-methylated. It is critical for genome integrity including DNA damage repair, DNA replication and chromatin compaction.
H4K5ac is an epigenetic modification to the DNA packaging protein histone H4. It is a mark that indicates the acetylation at the 5th lysine residue of the histone H4 protein. H4K5 is the closest lysine residue to the N-terminal tail of histone H4. It is enriched at the transcription start site (TSS) and along gene bodies. Acetylation of histone H4 at K5 and K12 is enriched at centromeres.
H4K8ac, representing an epigenetic modification to the DNA packaging protein histone H4, is a mark indicating the acetylation at the 8th lysine residue of the histone H4 protein. It has been implicated in the prevalence of malaria.
H3K36ac is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the acetylation at the 36th lysine residue of the histone H3 protein.
H3K36me2 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the di-methylation at the 36th lysine residue of the histone H3 protein.
MNase-seq, short for micrococcal nuclease digestion with deep sequencing, is a molecular biological technique that was first pioneered in 2006 to measure nucleosome occupancy in the C. elegans genome, and was subsequently applied to the human genome in 2008. The term "MNase-seq", however, was not coined until a year later, in 2009. Briefly, this technique relies on the use of the non-specific endo-exonuclease micrococcal nuclease, an enzyme derived from the bacterium Staphylococcus aureus, to bind and cleave protein-unbound regions of DNA on chromatin. DNA bound to histones or other chromatin-bound proteins may remain undigested. The uncut DNA is then purified from the proteins and sequenced through one or more of the various next-generation sequencing methods.
H3K36me is an epigenetic modification to the DNA packaging protein Histone H3, specifically, the mono-methylation at the 36th lysine residue of the histone H3 protein.
H3R42me is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the mono-methylation at the 42nd arginine residue of the histone H3 protein. In epigenetics, arginine methylation of histones H3 and H4 is associated with a more accessible chromatin structure and thus higher levels of transcription. The existence of arginine demethylases that could reverse arginine methylation is controversial.
H3R17me2 is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the di-methylation at the 17th arginine residue of the histone H3 protein. In epigenetics, arginine methylation of histones H3 and H4 is associated with a more accessible chromatin structure and thus higher levels of transcription. The existence of arginine demethylases that could reverse arginine methylation is controversial.
H3R26me2 is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the di-methylation at the 26th arginine residue of the histone H3 protein. In epigenetics, arginine methylation of histones H3 and H4 is associated with a more accessible chromatin structure and thus higher levels of transcription. The existence of arginine demethylases that could reverse arginine methylation is controversial.
H3R8me2 is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the di-methylation at the 8th arginine residue of the histone H3 protein. In epigenetics, arginine methylation of histones H3 and H4 is associated with a more accessible chromatin structure and thus higher levels of transcription. The existence of arginine demethylases that could reverse arginine methylation is controversial.
H3R2me2 is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the di-methylation at the 2nd arginine residue of the histone H3 protein. In epigenetics, arginine methylation of histones H3 and H4 is associated with a more accessible chromatin structure and thus higher levels of transcription. The existence of arginine demethylases that could reverse arginine methylation is controversial.
H4R3me2 is an epigenetic modification to the DNA packaging protein histone H4. It is a mark that indicates the di-methylation at the 3rd arginine residue of the histone H4 protein. In epigenetics, arginine methylation of histones H3 and H4 is associated with a more accessible chromatin structure and thus higher levels of transcription. The existence of arginine demethylases that could reverse arginine methylation is controversial.