The mechanism of identifying chromatin accessibility using the Tn5 transposase. (a) Open and closed states of chromatin. (b) When chromatin accessibility is increased, the Tn5 transposase inserts into open chromatin more often than into inaccessible chromatin. The green and red symbols represent adapters.
ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) is a laboratory technique used in molecular biology to assess genome-wide chromatin accessibility.[1] The technique was first described in 2013 as an alternative to MNase-seq, FAIRE-seq and DNase-seq,[1] offering a faster turnaround time, a simpler protocol, and a lower DNA input requirement.[2][3][4]
ATAC-seq identifies accessible DNA regions by probing open chromatin with a hyperactive mutant Tn5 transposase that inserts sequencing adapters into open regions of the genome.[2][5] While naturally occurring transposases have a low level of activity, ATAC-seq employs a mutated, hyperactive transposase.[6] In a process called "tagmentation", Tn5 transposase cleaves and tags double-stranded DNA with sequencing adapters in a single enzymatic step.[7][8] The tagged DNA fragments are then purified, PCR-amplified, and sequenced using next-generation sequencing.[8] Sequencing reads can then be used to infer regions of increased accessibility as well as to map transcription factor binding sites and nucleosome positions.[2] The number of reads mapping to a region correlates with how open the chromatin is in that region, and the signal can be resolved down to single-nucleotide resolution.[2]
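The relationship between read counts and accessibility can be illustrated with a minimal sketch that converts aligned reads into Tn5 insertion coordinates and counts them per region. It assumes a single chromosome, 0-based half-open coordinates, and the commonly used convention of shifting plus-strand reads by +4 bp and minus-strand reads by -5 bp; the function names, region names and example reads are hypothetical.

```python
from bisect import bisect_left

# Assumed convention: shift read ends so they point at the centre of the
# 9-bp target-site duplication created by Tn5 (+4 on plus, -5 on minus).
PLUS_SHIFT = 4
MINUS_SHIFT = -5


def insertion_site(start, end, strand):
    """Return the 0-based Tn5 insertion coordinate for one aligned read.

    start and end are 0-based, half-open alignment coordinates.
    """
    if strand == "+":
        return start + PLUS_SHIFT
    # For a minus-strand read the 5' end is the rightmost aligned base.
    return (end - 1) + MINUS_SHIFT


def count_insertions(reads, regions):
    """Count insertion sites falling inside each half-open region."""
    sites = sorted(insertion_site(s, e, strand) for s, e, strand in reads)
    counts = {}
    for name, (r_start, r_end) in regions.items():
        # Number of sorted sites with r_start <= site < r_end.
        counts[name] = bisect_left(sites, r_end) - bisect_left(sites, r_start)
    return counts


# Hypothetical example data: (start, end, strand) per read.
reads = [
    (100, 150, "+"),
    (120, 170, "-"),
    (130, 180, "+"),
    (900, 950, "+"),
    (5000, 5050, "-"),
]
regions = {"promoter": (90, 200), "distal": (880, 1000), "closed": (2000, 3000)}

print(count_insertions(reads, regions))
```

In this toy input the "promoter" region collects the most insertions and the "closed" region none, which is the pattern that would be read as higher and lower accessibility respectively.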
Unlike FAIRE-seq, ATAC-seq requires no sonication or phenol-chloroform extraction;[9] unlike ChIP-seq, it requires no antibodies;[10] and unlike MNase-seq or DNase-seq, it requires no sensitive enzymatic digestion.[11] ATAC-seq preparation can be completed in under three hours.[12]
Applications
Comparative epigenomics applications of ATAC-seq such as cancer profiling, identifying cellular subtypes, and cell differentiation analysis.
ATAC-seq analysis is used to investigate a number of chromatin-accessibility signatures. The most common use is nucleosome mapping experiments,[3] but it can also be applied to mapping transcription factor binding sites,[13] adapted to map DNA methylation sites,[14] or combined with other sequencing techniques.[15]
The utility of high-resolution enhancer mapping ranges from studying the evolutionary divergence of enhancer usage (e.g. between chimpanzees and humans) during development[16] to uncovering a lineage-specific enhancer map used during blood cell differentiation.[17]
ATAC-seq has also been applied to defining the genome-wide chromatin accessibility landscape in human cancers,[18] and to revealing an overall decrease in chromatin accessibility in macular degeneration.[19] Computational footprinting methods can be applied to ATAC-seq data to find cell-specific binding sites and transcription factors with cell-specific activity.[13]
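The idea behind footprinting can be sketched as follows: a bound transcription factor protects its motif from Tn5, so insertions are depleted over the motif core relative to the flanking sequence. The example below is a simplified illustration, not the method of any particular footprinting tool; the function names, the flank and core widths, and the insertion positions are all hypothetical.

```python
def aggregate_profile(insertions, motif_centers, flank=5):
    """Sum insertion counts at each offset in [-flank, +flank] around motif centres."""
    per_position = {}
    for pos in insertions:
        per_position[pos] = per_position.get(pos, 0) + 1
    profile = [0] * (2 * flank + 1)
    for center in motif_centers:
        for offset in range(-flank, flank + 1):
            profile[offset + flank] += per_position.get(center + offset, 0)
    return profile


def footprint_depth(profile, core_half_width=1):
    """Mean flank signal minus mean core signal; positive values suggest protection."""
    mid = len(profile) // 2
    core = profile[mid - core_half_width: mid + core_half_width + 1]
    flanks = profile[: mid - core_half_width] + profile[mid + core_half_width + 1:]
    return sum(flanks) / len(flanks) - sum(core) / len(core)


# Hypothetical insertion sites around two motif occurrences (centres 100 and 200).
insertions = [95, 96, 97, 98, 100, 102, 103, 104, 105, 195, 196, 204, 205]
motif_centers = [100, 200]

profile = aggregate_profile(insertions, motif_centers, flank=5)
depth = footprint_depth(profile, core_half_width=1)
print(profile, round(depth, 3))
```

Real footprinting methods additionally correct for the sequence bias of Tn5 and assess statistical significance, neither of which this sketch attempts.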
ATAC-seq has found increasing applications in clinical research and disease studies. EPIC-ATAC has been developed as a deconvolution framework to quantify cell-type heterogeneity in bulk tumor ATAC-seq data, enabling analysis of regulatory processes underlying tumor development and correlation with clinical variables in cancer research.[20][21] In immunology, ATAC-seq has been used to characterize dynamic epigenetic changes in T cell exhaustion, revealing that exhausted T cells possess unique chromatin accessibility patterns compared to naive, effector, and memory T cells, with implications for cancer immunotherapy.[22] The Cancer Genome Atlas has generated genome-wide chromatin accessibility profiles of 410 tumor samples spanning 23 cancer types, identifying 562,709 transposase-accessible DNA elements and revealing genetic risk loci of cancer predisposition as active DNA regulatory elements.[23] Integrative analysis combining ATAC-seq with RNA-seq has been used to identify novel oncogenes and elucidate regulatory mechanisms in hepatocellular carcinoma.[24]
Modifications to the ATAC-seq protocol have been made to accommodate single-cell analysis. Microfluidics can be used to separate single nuclei and perform ATAC-seq reactions individually.[12] With this approach, single cells are captured by either a microfluidic device or a liquid deposition system before tagmentation.[12][25] An alternative technique that does not require single-cell isolation is combinatorial cellular indexing.[26] This technique uses barcoding to measure chromatin accessibility in thousands of individual cells; it can generate epigenomic profiles from 10,000 to 100,000 cells per experiment.[27] However, combinatorial cellular indexing requires additional, custom-engineered equipment or a large quantity of custom, modified Tn5.[28] A pooled barcode method called sci-CAR has since been developed, allowing joint profiling of chromatin accessibility and gene expression in single cells.[29]
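The principle of combinatorial indexing, in which a cell is identified by the combination of barcodes it receives across successive rounds rather than by physical isolation, can be sketched as below. This is an illustration only: it assumes two indexing rounds and nuclei that are assigned to barcode combinations uniformly at random, and the barcode labels, read tuples and function names are hypothetical.

```python
from collections import Counter


def barcode_space(round_sizes):
    """Number of distinct barcode combinations across all indexing rounds."""
    total = 1
    for n in round_sizes:
        total *= n
    return total


def collision_probability(n_nuclei, n_combinations):
    """Chance that a given nucleus shares its combination with at least one other.

    Assumes each nucleus picks a combination uniformly and independently.
    """
    if n_nuclei <= 1:
        return 0.0
    return 1.0 - (1.0 - 1.0 / n_combinations) ** (n_nuclei - 1)


def reads_per_cell(reads):
    """Group reads by (round-1 barcode, round-2 barcode); each pair is treated as one cell."""
    return Counter((bc1, bc2) for bc1, bc2, _fragment in reads)


# Hypothetical reads: (round-1 barcode, round-2 barcode, fragment id).
reads = [
    ("A01", "B07", "frag1"),
    ("A01", "B07", "frag2"),
    ("A02", "B07", "frag3"),
]

print(barcode_space([96, 96]))
print(reads_per_cell(reads))
print(collision_probability(1000, barcode_space([96, 96])))
```

The collision estimate shows why the barcode space must be much larger than the number of nuclei profiled: two nuclei that receive the same combination are indistinguishable downstream.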
Computational analysis of scATAC-seq is based on the construction of a count matrix holding the number of reads per open chromatin region for each cell. Open chromatin regions can be defined, for example, by standard peak calling on pseudo-bulk ATAC-seq data. Further steps include dimensionality reduction with PCA and clustering of cells.[25] scATAC-seq matrices can be extremely large (hundreds of thousands of regions) and are extremely sparse, i.e. less than 3% of entries are non-zero.[30] Therefore, imputation of the count matrix is another crucial step, performed using various methods such as non-negative matrix factorization. As with bulk ATAC-seq, scATAC-seq allows the identification of regulators, such as transcription factors, that control gene expression in cells. This can be achieved by looking at the number of reads around TF motifs[31] or by footprinting analysis.[30]
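The first step described above, building a cell-by-region count matrix and measuring its sparsity, can be sketched in plain Python. The sketch assumes a single chromosome, half-open coordinates, and a dense list-of-lists matrix (real data sets would need a sparse representation); the barcodes, peaks, fragments and function names are hypothetical. Dimensionality reduction, clustering and imputation would follow on the resulting matrix and are not shown.

```python
def build_count_matrix(fragments, peaks, cells):
    """Return a cells x peaks matrix of fragment counts.

    fragments: iterable of (cell_barcode, start, end)
    peaks:     list of (start, end) open-chromatin regions
    cells:     list of cell barcodes to keep (others are ignored)
    """
    row_of = {barcode: i for i, barcode in enumerate(cells)}
    matrix = [[0] * len(peaks) for _ in cells]
    for barcode, start, end in fragments:
        row = row_of.get(barcode)
        if row is None:
            continue  # barcode not in the list of called cells
        for col, (p_start, p_end) in enumerate(peaks):
            if start < p_end and end > p_start:  # intervals overlap
                matrix[row][col] += 1
    return matrix


def nonzero_fraction(matrix):
    """Fraction of matrix entries that are non-zero."""
    total = sum(len(row) for row in matrix)
    nonzero = sum(1 for row in matrix for value in row if value)
    return nonzero / total if total else 0.0


# Hypothetical example data.
cells = ["cellA", "cellB", "cellC"]
peaks = [(100, 200), (500, 600), (900, 1000), (1500, 1600)]
fragments = [
    ("cellA", 120, 180),
    ("cellA", 150, 260),
    ("cellA", 950, 1010),
    ("cellB", 510, 590),
    ("cellC", 2000, 2100),  # falls outside every peak
    ("cellX", 120, 180),    # barcode not among the called cells
]

matrix = build_count_matrix(fragments, peaks, cells)
print(matrix)
print(nonzero_fraction(matrix))
```

Even in this tiny example most entries are zero; with hundreds of thousands of regions and only a few thousand fragments per cell, the non-zero fraction falls to the low percentages quoted above.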
Spatial ATAC-seq
Spatial ATAC-seq combines chromatin accessibility profiling with spatial information, enabling researchers to map epigenetic landscapes while preserving tissue architecture. This method combines in situ Tn5 transposition chemistry with microfluidic deterministic barcoding to perform spatially resolved chromatin accessibility analysis on tissue sections at the cellular level and genome scale.[32][33] The technique has been applied to co-profiling of the epigenome and transcriptome, facilitating investigation of the correlation between accessible peaks and expressed genes pixel by pixel in the tissue context.[32] Recent developments include SPACE-seq (SPatial assay for Accessible chromatin, Cell lineages, and gene Expression with sequencing), which enables simultaneous analysis of gene expression, chromatin accessibility, and mitochondrial DNA mutations using commercially available spatial transcriptomics platforms.[34] Laser capture microdissection coupled to ATAC-seq (LCM-ATAC-seq) has also been developed for targeted chromatin accessibility analysis of discrete contiguous or scattered cell populations in tissues, enabling analysis at mini-bulk resolution with the possibility to integrate cellular or morphological stainings.[35]
Multimodal ATAC-seq
Recent advances have enabled simultaneous profiling of chromatin accessibility alongside other molecular modalities in the same cells or tissue sections. Spatial ATAC–RNA-seq and spatial CUT&Tag–RNA-seq allow co-profiling of genome-wide chromatin accessibility or histone modifications in conjunction with whole transcriptome on the same tissue section at near-single-cell resolution.[32] ISSAAC-seq (In Situ Sequencing of chromatin Accessibility And Cellular transcriptomes) represents a multimodal update to ATAC-seq, providing a powerful method for investigating gene expression and chromatin accessibility within the same cell at high sensitivity and lower cost than commercially available kits.[36] These multimodal approaches have led to the development of computational tools like SCRIPro, which combines transcription factor-target importance from epigenomic data with transcription factor-target expression from transcriptomic data to construct gene regulatory networks from single-cell and spatial multiomics data.[37]
Computational tools and analysis
ATAC-seq data analysis presents unique methodological challenges due to data sparsity and the need for specialized bioinformatics tools. The major steps include pre-analysis (quality check and alignment), core analysis (peak calling), and advanced analysis (peak differential analysis and annotation, motif enrichment, footprinting, and nucleosome position analysis).[38][39] MACS2 (Model-based Analysis of ChIP-seq 2) remains the most widely used peak caller for ATAC-seq data analysis, serving as the default peak caller in the ENCODE ATAC-seq pipeline. Originally developed for ChIP-seq, MACS2 has been adapted for ATAC-seq analysis and performs well for identifying regions of enriched transposase accessibility, though it requires parameter optimization for ATAC-seq-specific characteristics.[40][41] Standardized analysis workflows have been developed, including the nf-core/atacseq pipeline, which provides a comprehensive, reproducible framework for ATAC-seq data processing from raw reads to final peak calls and quality control metrics. This Nextflow-based pipeline incorporates best practices for adapter trimming, alignment, duplicate removal, peak calling, and downstream analysis, facilitating standardized processing across different research groups and computational environments.[42][43]
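To make the peak-calling step concrete, the following toy example flags fixed-width windows whose insertion count is unexpectedly high under a Poisson background and merges adjacent significant windows. It is a deliberately simplified stand-in, not the MACS2 algorithm, which models local background and is considerably more elaborate; the window size, p-value threshold, function names and input positions are assumptions chosen for illustration.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_peaks(insertions, genome_length, window=100, p_threshold=1e-3):
    """Return merged (start, end) windows enriched for insertions."""
    n_windows = math.ceil(genome_length / window)
    counts = [0] * n_windows
    for pos in insertions:
        if 0 <= pos < genome_length:
            counts[pos // window] += 1
    # Background rate: mean insertions per window across the whole sequence.
    lam = sum(counts) / n_windows
    peaks = []
    for i, count in enumerate(counts):
        if count == 0 or poisson_sf(count, lam) >= p_threshold:
            continue
        start = i * window
        end = min(start + window, genome_length)
        if peaks and peaks[-1][1] == start:
            peaks[-1] = (peaks[-1][0], end)  # extend the previous peak
        else:
            peaks.append((start, end))
    return peaks


# Hypothetical insertions: two dense neighbouring windows plus sparse background.
insertions = list(range(300, 330)) + list(range(400, 428)) + [50, 750, 950]
print(call_peaks(insertions, genome_length=1000))
```

Because the background rate here is a single genome-wide mean that includes the peaks themselves, the test is conservative on this toy input; production peak callers estimate background more carefully.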
Hendrickson DG, Soifer I, Wranik BJ, Botstein D, Scott McIsaac R (2018), "Simultaneous Profiling of DNA Accessibility and Gene Expression Dynamics with ATAC-Seq and RNA-Seq", Computational Cell Biology, Methods in Molecular Biology, vol. 1819, Springer New York, pp. 317–333, doi:10.1007/978-1-4939-8618-7_15, ISBN 9781493986170, PMID 30421411