Proximity ligation-assisted chromatin immunoprecipitation sequencing (PLAC-seq) is a chromatin conformation capture (3C)-based technique for detecting and quantifying genomic chromatin structure from a protein-centric perspective. [1] PLAC-seq combines in situ Hi-C and chromatin immunoprecipitation (ChIP), which allows long-range chromatin interactions to be identified at high resolution with low sequencing costs. [1] Mapping long-range three-dimensional (3D) chromatin interactions is important for identifying transcriptional enhancers and non-coding variants that can be linked to human diseases. [2]
Different 3C-based techniques have been used to study higher-order 3D chromatin structure, and they have been combined with high-throughput sequencing to determine chromatin structure on a genome-wide level. [3] Hi-C is one of the most widely used 3C-based techniques because it allows for high-resolution (kilobase-scale) genome-topology identification. However, it requires billions of sequencing reads, which has limited its application. [2] Another commonly used 3C-based technique is chromatin interaction analysis by paired-end tag sequencing (ChIA-PET). [2] ChIA-PET can identify long-range interactions of transcription promoters and enhancers at high resolution but requires millions of cells. [2]
PLAC-seq alleviates these issues by using in situ Hi-C, which creates long-range DNA contacts in situ in the nucleus before lysis. [3] Unlike ChIA-PET, which performs ChIP and proximity ligation after chromatin shearing, PLAC-seq performs proximity ligation in the nuclei first, which prevents large disruptions of protein/DNA complexes. [2] This decreases false-positive interactions and improves DNA contact capture efficiency, meaning that PLAC-seq is more accurate and requires fewer cells. [1]
PLAC-seq was developed in 2016, [2] and an almost identical technique called HiChIP was developed in the same year. [3] Both methods combine in situ Hi-C and ChIP but differ in library preparation. [1] While PLAC-seq uses biotin pull-down followed by end-repair, adapter ligation, and PCR, HiChIP uses Tn5 tagmentation, biotin pull-down, and PCR. [1] However, both techniques can use the same quality control and data analysis techniques. [1]
Various computational tools can be used to analyze PLAC-seq data, for example Fit-Hi-C, [4] HiCCUPS, [5] Mango, [6] Hichipper, [7] MAPS, [8] and FitHiChIP. [9] Many of the earlier tools were developed for other 3C-based technologies and were not optimized for PLAC-seq/HiChIP data. Fit-Hi-C and HiCCUPS, both developed in 2014, were mainly designed for Hi-C data and use a matrix-balancing-based normalization approach. [4] [5] Mango was developed in 2015 and is mainly used for ChIA-PET data, but it has high false-positive rates when analyzing PLAC-seq/HiChIP data because the biases differ. [6] [8] Hichipper was developed in 2018 to alleviate this issue and introduced a bias-correcting algorithm, but it still has difficulty identifying interactions between protein-binding and non-protein-binding regions on the chromosome. [7] [8] MAPS and FitHiChIP were developed in 2019 as PLAC-seq/HiChIP-specific analysis pipelines and are generally thought to be more effective than the earlier tools for analyzing PLAC-seq/HiChIP data. [8] [9]
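The matrix-balancing normalization mentioned for Fit-Hi-C and HiCCUPS can be illustrated on a toy contact matrix. The following is only a minimal sketch of symmetric iterative scaling, assuming a small dense matrix; it is not the code used by either tool, and the function name `balance_contacts` and the three-bin example counts are hypothetical.

```python
import math


def balance_contacts(matrix, iterations=200):
    """Rescale a symmetric contact matrix so that every non-empty row sums to ~1.

    Illustrative only: each pass divides entry (i, j) by
    sqrt(row_sum_i) * sqrt(row_sum_j), which keeps the matrix symmetric.
    """
    balanced = [[float(value) for value in row] for row in matrix]
    n = len(balanced)
    for _ in range(iterations):
        row_sums = [sum(row) for row in balanced]
        # Rows with no contacts are left untouched (scale factor 1).
        scale = [math.sqrt(s) if s > 0 else 1.0 for s in row_sums]
        for i in range(n):
            for j in range(n):
                balanced[i][j] /= scale[i] * scale[j]
    return balanced


# Hypothetical raw contact counts between three genomic bins.
raw_counts = [
    [20, 5, 1],
    [5, 10, 3],
    [1, 3, 8],
]
balanced = balance_contacts(raw_counts)
row_sums = [sum(row) for row in balanced]
print([round(s, 6) for s in row_sums])
```

Balancing of this kind treats every bin as if it should be equally visible, which may not hold once immunoprecipitation has enriched some bins over others; this is one way to read why the earlier tools are described as not optimized for PLAC-seq/HiChIP data.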
The general workflow of PLAC-seq involves cell harvesting and crosslinking, in situ digestion and proximity ligation, ChIP, library construction, sequencing, and data analysis.
The first step is the preparation and crosslinking of cell or tissue samples, which typically begins with cell collection by centrifugation. The cells are then treated with a crosslinking agent such as formaldehyde (HCHO), and glycine is added to stop the crosslinking reaction. The crosslinked cells can be pelleted by centrifugation and either stored at -80 °C or used directly in the next step of the procedure.
In situ digestion involves cell lysis with a lysis buffer followed by digestion with the restriction enzyme MboI. This step allows for uniform digestion of genetic material while keeping the crosslinked regions of the chromosome intact. After the digestion reaction is inactivated, dNTPs and biotin are added to repair the overhangs and to mark the DNA for pull-down, respectively. In situ proximity ligation then joins the biotinylated ends of the crosslinked DNA to each other.
Chromatin fragmentation by sonication allows for the shearing of non-crosslinked fragments of DNA. This is followed by immunoprecipitation of the biotinylated chromatin using antibody-coated beads. The DNA is then reverse-crosslinked and purified using column-based DNA purification or phenol-chloroform extraction.
Library construction first involves the pull-down of biotinylated DNA and the addition of sequencing adapters. The number of amplification cycles needs to be determined before the final amplification and library purification.
PLAC-seq sequencing data can be analyzed in multiple ways; common methods involve the use of Fit-Hi-C, [4] FitHiChIP, [9] and MAPS. [8] Data analysis involves mapping reads to a reference genome, using software tools such as Hichipper [7] to identify peaks, and downstream analysis involving peak comparison and functional enrichment analysis. [8] The resulting data can also be integrated with other genomic data such as Hi-C or RNA-seq in order to identify potential regulatory networks.
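The in situ digestion step in the workflow can be mimicked computationally, which is roughly how analysis pipelines derive restriction-fragment boundaries from a reference sequence. The sketch below assumes the MboI recognition site GATC with the cut placed immediately before the G, and considers only one strand of a short made-up sequence; the function name `digest` and the toy sequence are hypothetical and not taken from any PLAC-seq tool.

```python
def digest(sequence, site="GATC", cut_offset=0):
    """Split a DNA string at every occurrence of a restriction site.

    cut_offset is the position of the cut relative to the start of the site;
    0 places the cut immediately before the site (assumed here for MboI).
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        # Continue searching one base further so no site is skipped.
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    # Drop empty fragments, e.g. when a site sits at the very start.
    return [sequence[a:b] for a, b in zip(boundaries, boundaries[1:]) if b > a]


# Made-up 21-base sequence containing three GATC sites.
toy_sequence = "AAGATCTTTTGATCCCGATCA"
fragments = digest(toy_sequence)
print(fragments)
```

Because a four-base site is expected to occur frequently in a genome, a digest of this kind yields many short fragments, which is consistent with the high resolution described for the method.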
PLAC-seq was developed to map and analyze long-range chromatin interactions. These interactions have important implications for the transcriptional regulation of genes. [10]
One challenge for mammalian cells is fitting roughly two meters of genetic material into a nucleus that is only a few microns in diameter while keeping that material organized so that genetic and epigenetic information can be accessed and used. To do this, DNA is compacted around histone octamers into 2D structures, and then further packaged into 3D compartments by various mechanisms such as cis-regulatory interactions and repressive interactions. As a result, chromosomal regions that are distant in 2D may have intra- and interchromosomal long-range interactions in 3D. These 3D structures are involved in the induction and repression of genes, with biological implications for basic cell functions such as the cell cycle, replication, and development. Aberrant 3D structures have roles in the development of diseases and abnormalities such as cancer. [11] This can involve interactions between promoters and terminators/enhancers through the formation of long-range chromatin loops. [12] [13]
PLAC-seq has been used to study H3K4me3 and H3K27ac PLACE (PLAC-Enriched) interactions. [2] It has also been used to call significant H3K4me3-mediated chromatin interactions, thereby allowing for the identification of differential epigenetic modifications in different cell types such as those found in the developing human cortex. [14]
Advantages: Compared to ChIA-PET, PLAC-seq requires significantly less starting biological material. [1] Because shearing is one of the first steps in ChIA-PET, protein and DNA complexes are disrupted before ChIP and proximity ligation take place. PLAC-seq avoids this by performing proximity ligation in the nuclei before the shearing process. Furthermore, PLAC-seq requires fewer sequencing reads than Hi-C. [1] While ChIA-PET requires 100 million starting cells, PLAC-seq requires only 5 million cells. [2] Even with 20-fold fewer cells, PLAC-seq was able to produce more reads (175 million) with a lower PCR duplication rate (33%) than ChIA-PET (16 million and 44%, respectively). [2] PLAC-seq was also nearly 100 times more cost-effective than ChIA-PET. [1]
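The figures quoted in this comparison can be combined into a short worked calculation. The sketch below assumes that each PCR duplication rate applies to the total read count given for that method, so the "non-duplicate reads" values are derived here for illustration and are not reported in the cited study; the variable and function names are hypothetical.

```python
# Figures as quoted in the comparison above.
chia_pet = {"cells": 100_000_000, "reads": 16_000_000, "dup_pct": 44}
plac_seq = {"cells": 5_000_000, "reads": 175_000_000, "dup_pct": 33}


def non_duplicate_reads(run):
    """Reads remaining after removing the stated percentage of PCR duplicates."""
    return run["reads"] * (100 - run["dup_pct"]) // 100


# Ratio of starting cells required by the two methods.
cell_fold = chia_pet["cells"] // plac_seq["cells"]

# Derived (not reported) counts of non-duplicate reads.
plac_unique = non_duplicate_reads(plac_seq)
chia_unique = non_duplicate_reads(chia_pet)
unique_fold = plac_unique / chia_unique

print(cell_fold, plac_unique, chia_unique, round(unique_fold, 1))
```

Under that assumption, the gap between the two methods is wider for non-duplicate reads than for raw reads, because the lower duplication rate compounds the higher read count.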
Disadvantages: While many of the 3C-based techniques carry different biases from their protocols, PLAC-seq (and HiChIP) data have biases from immunoprecipitation efficiencies that need to be corrected for in the computational step. [15] Effective ways of reducing or removing the different biases in 3C-based technologies are still being studied. [15]
Chromosome conformation capture techniques are a set of molecular biology methods used to analyze the spatial organization of chromatin in a cell. These methods quantify the number of interactions between genomic loci that are nearby in 3-D space, but may be separated by many nucleotides in the linear genome. Such interactions may result from biological functions, such as promoter-enhancer interactions, or from random polymer looping, where undirected physical motion of chromatin causes loci to collide. Interaction frequencies may be analyzed directly, or they may be converted to distances and used to reconstruct 3-D structures.
SOLiD (Sequencing by Oligonucleotide Ligation and Detection) is a next-generation DNA sequencing technology developed by Life Technologies that has been commercially available since 2006. This next-generation technology generates 10^8 to 10^9 small sequence reads at one time. It uses two-base encoding to decode the raw data generated by the sequencing platform into sequence data.
ChIP-sequencing, also known as ChIP-seq, is a method used to analyze protein interactions with DNA. ChIP-seq combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. It can be used to map global binding sites precisely for any protein of interest. Previously, ChIP-on-chip was the most common technique utilized to study these protein–DNA relations.
Paired-end tags (PET) are the short sequences at the 5’ and 3' ends of a DNA fragment which are unique enough that they (theoretically) exist together only once in a genome, therefore making the sequence of the DNA in between them available upon search or upon further sequencing. Paired-end tags (PET) exist in PET libraries with the intervening DNA absent, that is, a PET "represents" a larger fragment of genomic or cDNA by consisting of a short 5' linker sequence, a short 5' sequence tag, a short 3' sequence tag, and a short 3' linker sequence. It was shown conceptually that 13 base pairs are sufficient to map tags uniquely. However, longer sequences are more practical for mapping reads uniquely. The endonucleases used to produce PETs give longer tags but sequences of 50–100 base pairs would be optimal for both mapping and cost efficiency. After extracting the PETs from many DNA fragments, they are linked (concatenated) together for efficient sequencing. On average, 20–30 tags could be sequenced with the Sanger method, which has a longer read length. Since the tag sequences are short, individual PETs are well suited for next-generation sequencing that has short read lengths and higher throughput. The main advantages of PET sequencing are its reduced cost by sequencing only short fragments, detection of structural variants in the genome, and increased specificity when aligning back to the genome compared to single tags, which involves only one end of the DNA fragment.
Chromatin Interaction Analysis by Paired-End Tag Sequencing is a technique that incorporates chromatin immunoprecipitation (ChIP)-based enrichment, chromatin proximity ligation, Paired-End Tags, and High-throughput sequencing to determine de novo long-range chromatin interactions genome-wide.
Chromatin immunoprecipitation (ChIP) is a type of immunoprecipitation experimental technique used to investigate the interaction between proteins and DNA in the cell. It aims to determine whether specific proteins are associated with specific genomic regions, such as transcription factors on promoters or other DNA binding sites, and possibly define cistromes. ChIP also aims to determine the specific location in the genome that various histone modifications are associated with, indicating the target of the histone modifiers. ChIP is crucial for the advancements in the field of epigenomics and learning more about epigenetic phenomena.
FAIRE-Seq is a method in molecular biology used for determining the sequences of DNA regions in the genome associated with regulatory activity. The technique was developed in the laboratory of Jason D. Lieb at the University of North Carolina, Chapel Hill. In contrast to DNase-Seq, the FAIRE-Seq protocol doesn't require the permeabilization of cells or isolation of nuclei, and can analyse any cell type. In a study of seven diverse human cell types, DNase-seq and FAIRE-seq produced strong cross-validation, with each cell type having 1-2% of the human genome as open chromatin.
Single cell epigenomics is the study of epigenomics in individual cells by single cell sequencing. Since 2013, methods have been created including whole-genome single-cell bisulfite sequencing to measure DNA methylation, whole-genome ChIP-sequencing to measure histone modifications, whole-genome ATAC-seq to measure chromatin accessibility and chromosome conformation capture.
CUT&RUN sequencing, also known as cleavage under targets and release using nuclease, is a method used to analyze protein interactions with DNA. CUT&RUN sequencing combines antibody-targeted controlled cleavage by micrococcal nuclease with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. It can be used to map global DNA binding sites precisely for any protein of interest. Currently, ChIP-Seq is the most common technique used to study protein–DNA relations; however, it suffers from a number of practical and economical limitations that CUT&RUN sequencing does not.
H3K9me3 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the tri-methylation at the 9th lysine residue of the histone H3 protein and is often associated with heterochromatin.
H3K4me1 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the mono-methylation at the 4th lysine residue of the histone H3 protein and is often associated with gene enhancers.
H3K36me3 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the tri-methylation at the 36th lysine residue of the histone H3 protein and is often associated with gene bodies.
H3K79me2 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the di-methylation at the 79th lysine residue of the histone H3 protein. H3K79me2 is detected in the transcribed regions of active genes.
CUT&Tag-sequencing, also known as cleavage under targets and tagmentation, is a method used to analyze protein interactions with DNA. CUT&Tag-sequencing combines antibody-targeted controlled cleavage by a protein A-Tn5 fusion with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. It can be used to map global DNA binding sites precisely for any protein of interest. Currently, ChIP-Seq is the most common technique used to study protein–DNA relations; however, it suffers from a number of practical and economical limitations that CUT&RUN and CUT&Tag sequencing do not. CUT&Tag sequencing is an improvement over CUT&RUN because it does not require cells to be lysed or chromatin to be fractionated. CUT&RUN is not suitable for single-cell platforms, so CUT&Tag is advantageous for these.
ChIL sequencing (ChIL-seq), also known as Chromatin Integration Labeling sequencing, is a method used to analyze protein interactions with DNA. ChIL-sequencing combines antibody-targeted controlled cleavage by Tn5 transposase with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. It can be used to map global DNA binding sites precisely for any protein of interest. Currently, ChIP-Seq is the most common technique used to study protein–DNA relations; however, it suffers from a number of practical and economical limitations that ChIL-Sequencing does not. ChIL-Seq is a precise technique that reduces sample loss and could be applied to single cells.
H3K36me2 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the di-methylation at the 36th lysine residue of the histone H3 protein.
MNase-seq, short for micrococcal nuclease digestion with deep sequencing, is a molecular biological technique that was first pioneered in 2006 to measure nucleosome occupancy in the C. elegans genome, and was subsequently applied to the human genome in 2008. The term 'MNase-seq', however, was not coined until a year later, in 2009. Briefly, this technique relies on the use of the non-specific endo-exonuclease micrococcal nuclease, an enzyme derived from the bacterium Staphylococcus aureus, to bind and cleave protein-unbound regions of DNA on chromatin. DNA bound to histones or other chromatin-bound proteins may remain undigested. The uncut DNA is then purified from the proteins and sequenced through one or more of the various next-generation sequencing methods.
H3R17me2 is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the di-methylation at the 17th arginine residue of the histone H3 protein. In epigenetics, arginine methylation of histones H3 and H4 is associated with a more accessible chromatin structure and thus higher levels of transcription. The existence of arginine demethylases that could reverse arginine methylation is controversial.
Hi-C is a high-throughput genomic and epigenomic technique first described in 2009 by Lieberman-Aiden et al. to capture chromatin conformation. In general, Hi-C is considered as a derivative of a series of chromosome conformation capture technologies, including but not limited to 3C, 4C, and 5C. Hi-C comprehensively detects genome-wide chromatin interactions in the cell nucleus by combining 3C and next-generation sequencing (NGS) approaches and has been considered as a qualitative leap in C-technology development and the beginning of 3D genomics.
Pore-C is an emerging genomic technique which utilizes chromatin conformation capture (3C) and Oxford Nanopore Technologies' (ONT) long-read sequencing to characterize three-dimensional (3D) chromatin structure. To characterize concatemers, the originators of Pore-C developed an algorithm to identify alignments that are assigned to a restriction fragment; concatemers with greater than two associated fragments are deemed high order. Pore-C attempts to improve on previous 3C technologies, such as Hi-C and SPRITE, by not requiring DNA amplification prior to sequencing. This technology was developed as a simpler and more easily scalable method of capturing higher-order chromatin structure and mapping regions of chromatin contact. In addition, Pore-C can be used to visualize epigenomic interactions due to the capability of ONT long-read sequencing to detect DNA methylation. Applications of this technology include analysis of combinatorial chromatin interactions, the generation of de novo chromosome scale assemblies, visualization of regions associated with multi-locus histone bodies, and detection and resolution of structural variants.