DNA footprinting is a method of investigating the sequence specificity of DNA-binding proteins in vitro. This technique can be used to study protein-DNA interactions both outside and within cells.
The regulation of transcription has been studied extensively, and yet there is still much that is unknown. Transcription factors and associated proteins that bind promoters, enhancers, or silencers to drive or repress transcription are fundamental to understanding the unique regulation of individual genes within the genome. Techniques like DNA footprinting help elucidate which proteins bind to these associated regions of DNA and unravel the complexities of transcriptional control.
In January 1978, David J. Galas and Albert Schmitz developed the DNA footprinting technique to study the binding specificity of the lac repressor protein. It was originally a modification of the Maxam-Gilbert chemical sequencing technique. [1] The method was submitted and published without revision in Nucleic Acids Research. After its publication, Galas and Schmitz's method was cited in a 1980 article by David R. Engelke and colleagues describing eukaryotic proteins and their binding sites. [2] Thomas D. Tullius and colleagues further refined the technique in August 1986, publishing a paper that used more accurate DNA cleavage mechanisms to improve the rigour of their own and subsequent research. [3]
In January 2008, Alan P. Boyle and colleagues developed a genome-wide DNA footprinting method, which involved running pre-digested nuclei through multiple rounds of digestion and repair to produce DNase-seq libraries. DNase-seq is a sequencing-based assay that relies on DNase I, the enzyme used by Galas and Schmitz, to map genomic open chromatin. [4]
In recent years, many laboratories and researchers have developed computational methods to statistically analyze deeply sequenced DNase-seq data, a task that originally required an extensive background in bioinformatics. [5] [6] [7] [8] [9] [10] [11] [12]
The simplest application of this technique is to assess whether a given protein binds to a region of interest within a DNA molecule. [13] The basic workflow is:
1. Use polymerase chain reaction (PCR) to amplify and label the region of interest that contains a potential protein-binding site; ideally the amplicon is between 50 and 200 base pairs in length.
2. Add the protein of interest to one portion of the labeled template DNA; keep a second portion without protein for later comparison.
3. Add a cleavage agent to both portions of the DNA template. The cleavage agent is a chemical or enzyme that cuts at random locations in a sequence-independent manner. The reaction should run just long enough to cut each DNA molecule at only one location. A protein that specifically binds a region within the DNA template protects the DNA it is bound to from the cleavage agent.
4. Run both samples side by side on a polyacrylamide gel. The template without protein is cut at random locations and therefore produces a ladder-like distribution of bands. The template with protein produces a ladder with a gap in it, the "footprint", where the DNA was protected from the cleavage agent.
Note: Maxam-Gilbert chemical DNA sequencing reactions can be run alongside the samples on the polyacrylamide gel to determine the exact location of the ligand binding site.
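The logic of the workflow above can be illustrated with a small simulation. This is only a toy sketch under simplifying assumptions: every molecule receives exactly one cut, the cleavage agent is perfectly sequence-neutral, and the protein fully blocks cutting within its binding site. The function names, template length, and protected range are hypothetical and chosen for illustration.

```python
import random

def simulate_ladder(length, protected=None, n_molecules=20000, seed=0):
    """Count single-cut events per position on an end-labeled template.

    protected: optional (start, end) half-open range of positions that a
    bound protein shields from the cleavage agent (hypothetical values).
    """
    rng = random.Random(seed)
    counts = [0] * length
    for _ in range(n_molecules):
        pos = rng.randrange(1, length)  # one random cut attempt per molecule
        if protected and protected[0] <= pos < protected[1]:
            continue  # the bound protein blocks this cut
        counts[pos] += 1
    return counts

def find_footprint(free, bound, threshold=0.2):
    """Return positions where the bound lane is depleted relative to the free lane."""
    return [i for i, (f, b) in enumerate(zip(free, bound))
            if f > 0 and b / f < threshold]

free = simulate_ladder(100, seed=0)                       # lane without protein
bound = simulate_ladder(100, protected=(40, 60), seed=1)  # lane with protein
footprint = find_footprint(free, bound)
print(f"Footprint spans positions {footprint[0]} to {footprint[-1]}")
```

Comparing the two lanes position by position mirrors reading the gel: the gap in the protein-containing lane marks the protected region.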
The DNA template is labeled at the 3' or 5' end, depending on the location of the binding site(s). Two kinds of label can be used: radioactive and fluorescent. Radioactivity has traditionally been used to label DNA fragments for footprinting analysis, as the method was originally developed from the Maxam-Gilbert chemical sequencing technique. Radioactive labeling is very sensitive and is well suited to visualizing small amounts of DNA. Fluorescence is a desirable advancement because it avoids the hazards of handling radiochemicals. However, it has been more difficult to optimize because it is not always sensitive enough to detect the low concentrations of target DNA strands used in DNA footprinting experiments. Electrophoretic sequencing gels and capillary electrophoresis have both been used successfully to analyze footprints of fluorescently tagged fragments. [13]
A variety of cleavage agents can be chosen. A desirable agent is one that is sequence neutral, easy to use, and easy to control. Unfortunately, no available agent meets all of these standards, so an appropriate agent must be chosen depending on the DNA sequence and ligand of interest. The following cleavage agents are described in detail:
DNase I is a large protein that functions as a double-strand endonuclease. It binds the minor groove of DNA and cleaves the phosphodiester backbone. It is a good cleavage agent for footprinting because its size makes it easy to hinder physically, so its action is more likely to be blocked by a protein bound to the DNA sequence. In addition, the DNase I reaction is easily controlled by adding EDTA to stop it. There are, however, some limitations to using DNase I. The enzyme does not cut DNA randomly; its activity is affected by local DNA structure and sequence, and it therefore produces an uneven ladder. This can limit the precision with which a protein's binding site on the DNA molecule can be predicted. [13] [14]
Hydroxyl radicals are created by the Fenton reaction, which involves oxidizing Fe2+ with H2O2 to form free hydroxyl radicals. These radicals react with the DNA backbone, resulting in a break. Because of their small size, the resulting DNA footprint has high resolution. Unlike DNase I, they have no sequence dependence and produce a much more evenly distributed ladder. The drawback of hydroxyl radicals is that they are more time-consuming to use, owing to a slower reaction and digestion time. [15]
Ultraviolet irradiation can be used to excite nucleic acids and create photoreactions, which result in damaged bases in the DNA strand. [16] Photoreactions can include single-strand breaks, interactions between or within DNA strands, reactions with solvents, and crosslinks with proteins.
The workflow for this method has an additional step: once both the protected and unprotected DNA have been treated, the cleaved products undergo primer extension. [17] [18] The extension terminates upon reaching a damaged base, so when the extension products are run side by side on a gel, the protected sample shows an additional band where the DNA was crosslinked to a bound protein. The advantages of UV are that it reacts very quickly and can therefore capture interactions that are only momentary. Additionally, it can be applied to in vivo experiments, because UV can penetrate cell membranes. A disadvantage is that the gel can be difficult to interpret, as the bound protein does not protect the DNA; it merely alters the photoreactions in its vicinity. [19]
In vivo footprinting is a technique used to analyze the protein-DNA interactions that are occurring in a cell at a given time point. [16] [20] DNase I can be used as a cleavage agent if the cellular membrane has been permeabilized. The most common cleavage agent, however, is UV irradiation, because it penetrates the cell membrane without disrupting the cell state and can thus capture interactions that are sensitive to cellular changes. The trade-off is that DNase I provides higher specificity and accuracy: it cleaves unprotected DNA regions and leaves a footprint where proteins are bound, which allows precise identification of protein-DNA interactions. UV radiation, by contrast, induces widespread damage, which can make it difficult to find the exact binding sites.
Once the DNA has been cleaved or damaged by UV, the cells can be lysed and DNA purified for analysis of a region of interest. Ligation-mediated PCR is an alternative method to footprint in vivo. Once a cleavage agent has been used on the genomic DNA, resulting in single strand breaks, and the DNA is isolated, a linker is added onto the break points. A region of interest is amplified between the linker and a gene-specific primer, and when run on a polyacrylamide gel, will have a footprint where a protein was bound. [21]
In vivo footprinting combined with immunoprecipitation can be used to assess protein specificity at many locations throughout the genome. In this assay, the DNA is crosslinked to the protein under in vivo conditions using chemical crosslinkers or UV light. The chromatin is then digested, leaving only the DNA-protein complexes, which are detected using an antibody against the protein. Once detected, the immunoprecipitated DNA can be purified, released from the crosslink, and analyzed using techniques such as PCR or sequencing of the region of interest. [22] In other words, the DNA bound to a protein of interest is immunoprecipitated with an antibody to that protein, and binding to specific regions is then assessed using the DNA footprinting technique. [23]
The DNA footprinting technique can be modified to assess the binding strength of a protein to a region of DNA. By running the footprinting experiment at varying concentrations of the protein, the appearance of the footprint can be followed as the concentration increases, and the protein's binding affinity can then be estimated. [13]
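As an illustration of how such a titration might be turned into an affinity estimate, the following sketch fits a simple one-site binding isotherm, θ = [P] / (Kd + [P]), to fractional protection values. It assumes that the protein is in large excess over the DNA (so free protein is approximately total protein) and that protection is proportional to occupancy. The data points, the function names, and the grid-search fit are hypothetical and chosen only to keep the example self-contained.

```python
def fraction_bound(protein_conc, kd):
    """One-site binding isotherm: theta = [P] / (Kd + [P])."""
    return protein_conc / (kd + protein_conc)

def estimate_kd(concs, protection, kd_grid):
    """Return the candidate Kd that minimizes squared error against the data."""
    def sse(kd):
        return sum((fraction_bound(c, kd) - y) ** 2
                   for c, y in zip(concs, protection))
    return min(kd_grid, key=sse)

# Hypothetical titration: protein concentrations (nM) and the fraction of
# footprint protection measured from band intensities at each concentration.
concs = [1, 3, 10, 30, 100, 300]
protection = [0.09, 0.23, 0.50, 0.75, 0.91, 0.97]

# Log-spaced candidate Kd values from 0.1 nM to 1000 nM.
kd_grid = [10 ** (e / 20) for e in range(-20, 61)]
kd = estimate_kd(concs, protection, kd_grid)
print(f"Estimated Kd: about {kd:.1f} nM")
```

In this simple model, the Kd corresponds to the protein concentration at which the footprint is half formed. A nonlinear least-squares routine would normally replace the grid search in practice.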
To adapt the footprinting technique to updated detection methods, the labelled DNA fragments are detected by a capillary electrophoresis device instead of being run on a polyacrylamide gel. If the DNA fragment to be analyzed is produced by polymerase chain reaction (PCR), it is straightforward to couple a fluorescent molecule such as carboxyfluorescein (FAM) to the primers. This way, the fragments produced by DNase I digestion will contain FAM and will be detectable by the capillary electrophoresis machine. Typically, carboxy-X-rhodamine (ROX)-labelled size standards are also added to the mixture of fragments to be analyzed. Binding sites of transcription factors have been successfully identified this way. [24]
Capillary electrophoresis can be used to detect length differences of DNA fragments of interest within a sample. Additionally, this technique offers high resolution, allowing for the detection of even minor variations in fragment length. The automation and high-throughput capabilities of capillary electrophoresis make it a valuable tool for large-scale studies and applications where rapid and accurate results are required. Furthermore, the use of fluorescent labelling enhances sensitivity and allows for multiplexing, enabling the simultaneous analysis of multiple samples or target regions within a single run. [25]
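The role of the size standards can be sketched as follows: the standard peaks have known lengths, so the length of a sample peak can be estimated from where it migrates relative to them. The example below uses plain linear interpolation between the two flanking standard peaks. This is an assumption made for simplicity, since instrument software may use more elaborate sizing methods. The migration values and the function name are hypothetical.

```python
from bisect import bisect_left

def call_size(migration, std_migrations, std_sizes):
    """Estimate fragment size (bp) by linear interpolation between the
    two size-standard peaks that flank the sample peak.

    std_migrations must be sorted in ascending order and paired with std_sizes.
    """
    if not std_migrations[0] <= migration <= std_migrations[-1]:
        raise ValueError("peak lies outside the size-standard range")
    i = bisect_left(std_migrations, migration)
    if std_migrations[i] == migration:
        return float(std_sizes[i])  # peak coincides with a standard
    x0, x1 = std_migrations[i - 1], std_migrations[i]
    y0, y1 = std_sizes[i - 1], std_sizes[i]
    return y0 + (y1 - y0) * (migration - x0) / (x1 - x0)

# Hypothetical size-standard peaks (migration in arbitrary scan units, size in bp)
std_migrations = [1000, 1500, 2000, 2500]
std_sizes = [50, 100, 150, 200]

size = call_size(1750, std_migrations, std_sizes)
print(f"Sample peak at scan 1750 is roughly {size:.0f} bp")
```

Once every sample peak has been sized, the peak heights in the protein-containing and protein-free traces can be compared to locate the footprint.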
Cell-free DNA (cfDNA) fragmentomics analyzes the fragmentation patterns of cfDNA. DNA footprinting is applied to cfDNA to study the binding sites of DNA-binding proteins, which allows researchers to identify and analyze protein-DNA interactions non-invasively. This approach is used in particular for early cancer detection, by assessing disease-associated fragmentation patterns. DNA is extracted from a body fluid sample and then sequenced and analyzed, and specific features such as fragment size, end motifs, and fragment distribution across different genomic regions are used to detect disease. [26]
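Two of the features named above, fragment size and end motifs, can be summarized with very little code. The sketch below is a toy illustration that works directly on a list of fragment sequences. Real analyses typically derive these features from aligned sequencing reads, and the four-base motif length, the function name, and the example fragments are assumptions made here for illustration.

```python
from collections import Counter

def fragment_features(fragments, motif_len=4):
    """Summarize fragment lengths and 5' end-motif frequencies.

    fragments: list of DNA sequences, one per sequenced cfDNA fragment.
    Returns (length_counts, motif_frequencies).
    """
    lengths = Counter(len(seq) for seq in fragments)
    motifs = Counter(seq[:motif_len].upper()
                     for seq in fragments if len(seq) >= motif_len)
    total = sum(motifs.values())
    motif_freq = {motif: n / total for motif, n in motifs.items()}
    return lengths, motif_freq

# Hypothetical, unrealistically short fragments used only to show the output shape
toy_fragments = ["CCCAGGTT", "CCCATTAA", "TGGAACCT", "ACGT"]
lengths, motif_freq = fragment_features(toy_fragments)
print(dict(lengths))
print(motif_freq)
```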
Next-generation sequencing has enabled a genome-wide approach to identify DNA footprints. Open chromatin assays such as DNase-Seq [27] and FAIRE-Seq [28] have proven to provide a robust regulatory landscape for many cell types. [29] However, these assays require some downstream bioinformatics analyses in order to provide genome-wide DNA footprints. The computational tools proposed can be categorized in two classes: segmentation-based and site-centric approaches.
Segmentation-based methods apply hidden Markov models or sliding-window methods to segment the genome into open and closed chromatin regions. Examples of such methods are HINT, [30] the Boyle method, [31] and the Neph method. [32] Site-centric methods, on the other hand, find footprints given the open chromatin profile around motif-predicted binding sites, i.e., regulatory regions predicted using DNA-protein sequence information (encoded in structures such as position weight matrices). Examples of these methods are CENTIPEDE [33] and the Cuellar-Partida method. [34]
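To give a feel for the sliding-window idea, the sketch below scores each window of a per-base cut-count profile by how depleted its centre is relative to its flanks. It is a deliberately simplified heuristic and does not reproduce the algorithm of any of the published tools named above. The window widths, the cut counts, and the function name are hypothetical.

```python
def footprint_scores(cuts, center_w=6, flank_w=6):
    """Score candidate footprints by flank-versus-centre cut depletion.

    For each window start, score = mean cuts in the two flanks minus mean
    cuts in the central window. High scores suggest a protected core
    surrounded by accessible DNA.
    """
    scores = {}
    for start in range(flank_w, len(cuts) - center_w - flank_w + 1):
        center = cuts[start:start + center_w]
        left = cuts[start - flank_w:start]
        right = cuts[start + center_w:start + center_w + flank_w]
        flank_mean = (sum(left) + sum(right)) / (2 * flank_w)
        scores[start] = flank_mean - sum(center) / center_w
    return scores

# Hypothetical per-base cleavage counts with a protected stretch at indices 8-13
cuts = [9, 8, 10, 9, 11, 10, 9, 10, 1, 0, 1, 0, 0, 1,
        10, 9, 11, 10, 9, 8, 10, 9]
scores = footprint_scores(cuts)
best = max(scores, key=scores.get)
print(f"Best-scoring footprint starts at index {best}")
```

A site-centric method would instead start from motif-predicted positions and evaluate the cut profile around each of them.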