DNA footprinting

DNA footprinting is a method of investigating the sequence specificity of DNA-binding proteins. It was developed as an in vitro technique, and variants of it can be used to study protein-DNA interactions both outside and within cells.

The regulation of transcription has been studied extensively, and yet there is still much that is unknown. Transcription factors and associated proteins that bind promoters, enhancers, or silencers to drive or repress transcription are fundamental to understanding the unique regulation of individual genes within the genome. Techniques like DNA footprinting help elucidate which proteins bind to these associated regions of DNA and unravel the complexities of transcriptional control.

History

In January 1978, David J. Galas and Albert Schmitz developed the DNA footprinting technique to study the binding specificity of the lac repressor protein. It was originally a modification of the Maxam-Gilbert chemical sequencing technique. [1] The method was submitted and published without revision in Nucleic Acids Research. Galas and Schmitz’s method was subsequently cited in a 1980 article by David R. Engelke and colleagues describing eukaryotic proteins and their binding sites. [2] The technique was further refined by Thomas D. Tullius and colleagues, who published a paper in August 1986 that used hydroxyl radical cleavage to obtain higher-resolution information about DNA-protein contacts. [3]

In January 2008, Alan P. Boyle and colleagues described DNase-seq, a genome-wide method that couples digestion of nuclei with DNase I, the same enzyme used by Galas and Schmitz, to high-throughput sequencing in order to map open chromatin across the genome. [4]

In recent years, many laboratories have developed computational methods to statistically analyze deeply sequenced DNase-seq data and identify footprints, a task that originally required an extensive background in bioinformatics. [5] [6] [7] [8] [9] [10] [11] [12]

Method

The simplest application of this technique is to assess whether a given protein binds to a region of interest within a DNA molecule. [13] The procedure is as follows:

1. Use polymerase chain reaction (PCR) to amplify and label the region of interest that contains a potential protein-binding site. Ideally, the amplicon is between 50 and 200 base pairs in length.
2. Add the protein of interest to a portion of the labeled template DNA. A second portion should remain separate, without protein, for later comparison.
3. Add a cleavage agent to both portions of the DNA template. The cleavage agent is a chemical or enzyme that cuts at random locations in a sequence-independent manner. The reaction should run just long enough to cut each DNA molecule at only one location. A protein that specifically binds a region within the DNA template will protect the DNA it is bound to from the cleavage agent.
4. Run both samples side by side on a polyacrylamide gel. The portion of DNA template without protein is cut at random locations and therefore produces a ladder-like distribution on the gel. The DNA template with the protein produces a ladder with a break in it, the "footprint", where the DNA was protected from the cleavage agent.

Note: Maxam-Gilbert chemical DNA sequencing can be run alongside the samples on the polyacrylamide gel to allow the exact location of the ligand binding site to be predicted.
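The logic of the experiment can be illustrated with a small simulation. The following is a minimal sketch, not a model of any real assay: it assumes perfectly uniform single-hit cleavage, a label at position 0, and complete protection inside the bound region. The function names, template length, and protected range are all hypothetical choices made for illustration.

```python
import random
from collections import Counter

def simulate_footprint(length=100, protected=range(40, 60),
                       n_molecules=20000, seed=0):
    """Simulate single-hit cleavage of an end-labeled template.

    Each molecule is cut once at a uniformly random position. The labeled
    fragment length equals the cut position (label assumed at position 0).
    For simplicity the same random cuts are reused for both lanes.
    """
    rng = random.Random(seed)
    protected = set(protected)
    free, bound = Counter(), Counter()
    for _ in range(n_molecules):
        cut = rng.randint(1, length - 1)
        free[cut] += 1              # control lane: every position can be cut
        if cut not in protected:    # protein lane: protected positions are never cut
            bound[cut] += 1
    return free, bound

def find_footprint(free, bound, length, ratio=0.1):
    """Positions whose band intensity falls below `ratio` of the control lane."""
    return [p for p in range(1, length)
            if free[p] > 0 and bound[p] < ratio * free[p]]

free, bound = simulate_footprint()
gap = find_footprint(free, bound, 100)
print(gap[0], gap[-1])  # first and last position of the gap in the ladder
```

Comparing the two band-intensity profiles position by position mirrors what is done by eye when the two lanes are read side by side on the gel.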

Labeling

The DNA template is labeled at the 3' or 5' end, depending on the location of the binding site(s). The labels that can be used are radioactivity and fluorescence. Radioactivity has traditionally been used to label DNA fragments for footprinting analysis, as the method was originally developed from the Maxam-Gilbert chemical sequencing technique. Radioactive labeling is very sensitive and is optimal for visualizing small amounts of DNA. Fluorescence is a desirable advancement because it avoids the hazards of using radiochemicals. However, it has been more difficult to optimize because it is not always sensitive enough to detect the low concentrations of the target DNA strands used in DNA footprinting experiments. Electrophoretic sequencing gels and capillary electrophoresis have both been used successfully to analyze footprints of fluorescently tagged fragments. [13]

Cleavage agent

A variety of cleavage agents can be chosen. A desirable agent is one that is sequence neutral, easy to use, and easy to control. Unfortunately, no available agent meets all of these standards, so an appropriate agent is chosen depending on the DNA sequence and ligand of interest. The following cleavage agents are described in detail:

DNase I is a large protein that functions as a double-strand endonuclease. It binds the minor groove of DNA and cleaves the phosphodiester backbone. It is a good cleavage agent for footprinting because its size makes it easy to hinder physically, so its action is more likely to be blocked by a protein bound to the DNA sequence. In addition, the DNase I reaction is easily controlled, since adding EDTA stops it. There are, however, some limitations to using DNase I. The enzyme does not cut DNA randomly; its activity is affected by local DNA structure and sequence, which results in an uneven ladder. This can limit the precision with which a protein's binding site on the DNA molecule is predicted. [13] [14]

Hydroxyl radicals are created by the Fenton reaction, in which Fe2+ is oxidized by H2O2 to form free hydroxyl radicals. These radicals react with the DNA backbone, resulting in a break. Because of their small size, the resulting DNA footprint has high resolution. Unlike DNase I, they have no sequence dependence and produce a much more evenly distributed ladder. The negative aspect of using hydroxyl radicals is that they are more time-consuming to use, owing to a slower reaction and digestion time. [15]

Ultraviolet irradiation can be used to excite nucleic acids and create photoreactions, which result in damaged bases in the DNA strand. [16] Photoreactions can include single-strand breaks, interactions between or within DNA strands, reactions with solvents, or crosslinks with proteins. The workflow for this method has an additional step: once both the protected and unprotected DNA have been treated, the cleaved products undergo primer extension. [17] [18] The extension terminates upon reaching a damaged base, so when the extension products are run side by side on a gel, the protected sample shows an additional band where the DNA was crosslinked with a bound protein. The advantages of UV are that it reacts very quickly and can therefore capture interactions that are only momentary. It can also be applied to in vivo experiments, because UV can penetrate cell membranes. A disadvantage is that the gel can be difficult to interpret, as the bound protein does not protect the DNA but merely alters the photoreactions in its vicinity. [19]
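For reference, the Fenton reaction mentioned above for generating hydroxyl radicals is usually written as the following net equation. This is the textbook form only; it does not show any chelators or reducing agents that a particular protocol may add.

```latex
\mathrm{Fe^{2+}} + \mathrm{H_2O_2} \;\longrightarrow\; \mathrm{Fe^{3+}} + \mathrm{OH^{-}} + {}^{\bullet}\mathrm{OH}
```

The iron is oxidized from Fe2+ to Fe3+, while hydrogen peroxide is reduced and split into a hydroxide ion and the hydroxyl radical that attacks the DNA backbone.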

Advanced applications

In vivo footprinting

In vivo footprinting is a technique used to analyze the protein-DNA interactions that are occurring in a cell at a given time point. [16] [20] DNase I can be used as a cleavage agent if the cellular membrane has been permeabilized. However, the most common cleavage agent is UV irradiation, because it penetrates the cell membrane without disrupting cell state and can thus capture interactions that are sensitive to cellular changes. The drawback is lower precision: DNase I cleaves unprotected DNA and leaves a clear footprint where proteins are bound, which allows precise identification of protein-DNA interactions, whereas UV irradiation induces widespread damage that can make the exact binding sites difficult to locate.

Once the DNA has been cleaved or damaged by UV, the cells can be lysed and DNA purified for analysis of a region of interest. Ligation-mediated PCR is an alternative method to footprint in vivo. Once a cleavage agent has been used on the genomic DNA, resulting in single strand breaks, and the DNA is isolated, a linker is added onto the break points. A region of interest is amplified between the linker and a gene-specific primer, and when run on a polyacrylamide gel, will have a footprint where a protein was bound. [21]

In vivo footprinting combined with immunoprecipitation can be used to assess protein specificity at many locations throughout the genome. In this assay, the protein is crosslinked to DNA under in vivo conditions using chemical crosslinkers or UV light. The chromatin is then digested so that only DNA-protein complexes remain, and these are captured with an antibody against the protein of interest. The immunoprecipitated DNA is then purified, released from the crosslink, and analyzed by PCR or by sequencing the region of interest. [22] Binding to specific regions can then be assessed using the DNA footprinting technique. [23]

Quantitative footprinting

The DNA footprinting technique can be modified to assess the binding strength of a protein to a region of DNA. When the footprinting experiment is repeated with varying concentrations of the protein, the footprint appears progressively as the concentration increases, and the protein's binding affinity can then be estimated. [13]
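One common way to turn such a titration into an affinity estimate is to fit a single-site binding isotherm to the fractional protection measured at each protein concentration. The sketch below is illustrative only: it assumes a single independent site and free protein concentration approximately equal to total protein, and it uses synthetic, noise-free data with a simple grid search instead of a proper nonlinear fit. All names and values are hypothetical.

```python
def fraction_bound(protein_conc, kd):
    """Single-site binding isotherm: theta = [P] / (Kd + [P])."""
    return protein_conc / (kd + protein_conc)

def estimate_kd(concs, protection, kd_grid):
    """Return the Kd on the grid that minimizes the squared error
    between predicted and measured fractional protection."""
    def sse(kd):
        return sum((fraction_bound(c, kd) - y) ** 2
                   for c, y in zip(concs, protection))
    return min(kd_grid, key=sse)

# Hypothetical titration: protein concentrations (nM) and fractional
# protection of the footprint, generated here from a known Kd of 25 nM.
concs = [1, 3, 10, 30, 100, 300]
protection = [fraction_bound(c, 25.0) for c in concs]

kd_grid = [k * 0.5 for k in range(1, 401)]   # candidate Kd values, 0.5 to 200 nM
kd_hat = estimate_kd(concs, protection, kd_grid)
print(kd_hat)
```

In practice the fractional protection would come from band intensities inside the footprint, normalized to the protein-free lane, and would carry experimental noise.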

Detection by capillary electrophoresis

To adapt the footprinting technique to updated detection methods, the labelled DNA fragments are detected by a capillary electrophoresis device instead of being run on a polyacrylamide gel. If the DNA fragment to be analyzed is produced by polymerase chain reaction (PCR), it is straightforward to couple a fluorescent molecule such as carboxyfluorescein (FAM) to the primers. This way, the fragments produced by DNase I digestion will contain FAM and will be detectable by the capillary electrophoresis machine. Typically, carboxy-X-rhodamine (ROX)-labelled size standards are also added to the mixture of fragments to be analyzed. Binding sites of transcription factors have been successfully identified this way. [24]

Capillary electrophoresis can be used to detect length differences of DNA fragments of interest within a sample. Additionally, this technique offers high resolution, allowing for the detection of even minor variations in fragment length. The automation and high-throughput capabilities of capillary electrophoresis make it a valuable tool for large-scale studies and applications where rapid and accurate results are required. Furthermore, the use of fluorescent labelling enhances sensitivity and allows for multiplexing, enabling the simultaneous analysis of multiple samples or target regions within a single run. [25]
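The role of the co-injected size standards can be illustrated with a short sketch that assigns a length to a sample peak by interpolating between the two flanking standard peaks. This is a simplification under stated assumptions: real instruments typically use more elaborate sizing methods, and the function name, migration times, and standard sizes below are hypothetical.

```python
from bisect import bisect_left

def call_size(migration_time, standards):
    """Linearly interpolate a fragment size from flanking size standards.

    standards: list of (migration_time, size_in_bases), sorted by time.
    """
    times = [t for t, _ in standards]
    if not times[0] <= migration_time <= times[-1]:
        raise ValueError("peak lies outside the range of the size standards")
    i = bisect_left(times, migration_time)
    if times[i] == migration_time:          # peak coincides with a standard
        return float(standards[i][1])
    (t0, s0), (t1, s1) = standards[i - 1], standards[i]
    return s0 + (s1 - s0) * (migration_time - t0) / (t1 - t0)

# Hypothetical standard peaks: (migration time in minutes, size in bases)
standards = [(10.0, 50), (14.0, 100), (18.0, 150), (22.0, 200)]
size = call_size(16.0, standards)
print(size)
```

Sizing every sample peak in the protein-free and protein-bound runs this way places both traces on a common base-pair axis, so a region of reduced peak height can be mapped back to a position in the template.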

Cell-free fragmentomics

Cell-free DNA (cfDNA) fragmentomics analyzes the fragmentation patterns of cfDNA. DNA footprinting is applied to cfDNA to study the binding sites of DNA-binding proteins, which allows researchers to identify and analyze protein-DNA interactions non-invasively. The approach is used in particular for early cancer detection by assessing disease-associated fragmentation patterns. DNA is extracted from a body fluid sample and sequenced, and features such as fragment size, end motifs, and fragment distribution across different genomic regions are then analyzed to detect disease. [26]
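As a rough illustration of the features named above, the following sketch computes fragment-size statistics and 5' end-motif frequencies from a list of fragment sequences. It is a toy example, not a published pipeline: the input format, the 4-base motif length, and the 150-base cutoff for "short" fragments are assumptions made here for illustration.

```python
from collections import Counter

def fragment_features(fragments, motif_len=4, short_cutoff=150):
    """Summarize cfDNA fragments given as a list of sequence strings.

    Returns the mean fragment size, the fraction of fragments shorter than
    `short_cutoff` (an assumed threshold), and 5' end-motif frequencies.
    """
    if not fragments:
        raise ValueError("no fragments supplied")
    sizes = [len(f) for f in fragments]
    motifs = Counter(f[:motif_len].upper()
                     for f in fragments if len(f) >= motif_len)
    total = sum(motifs.values())
    return {
        "mean_size": sum(sizes) / len(sizes),
        "short_fraction": sum(s < short_cutoff for s in sizes) / len(sizes),
        "end_motif_freq": {m: c / total for m, c in motifs.items()},
    }

# Toy fragments (synthetic sequences, for illustration only)
fragments = ["CCCA" + "A" * 163, "CCCA" + "T" * 140,
             "GGTA" + "C" * 100, "ccca" + "G" * 170]
features = fragment_features(fragments)
print(features)
```

With real data the fragments would come from aligned sequencing reads, and the distribution of fragments across genomic regions would need the alignment coordinates as well.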

Genome-wide assays

Next-generation sequencing has enabled a genome-wide approach to identify DNA footprints. Open chromatin assays such as DNase-Seq [27] and FAIRE-Seq [28] have proven to provide a robust regulatory landscape for many cell types. [29] However, these assays require some downstream bioinformatics analyses in order to provide genome-wide DNA footprints. The computational tools proposed can be categorized in two classes: segmentation-based and site-centric approaches.

Segmentation-based methods apply hidden Markov models or sliding-window methods to segment the genome into open and closed chromatin regions. Examples of such methods are HINT, [30] the Boyle method [31] and the Neph method. [32] Site-centric methods, on the other hand, find footprints given the open chromatin profile around motif-predicted binding sites, i.e., regulatory regions predicted using DNA-protein sequence information (encoded in structures such as position weight matrices). Examples of these methods are CENTIPEDE [33] and the Cuellar-Partida method. [34]
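The sliding-window idea can be sketched in a few lines: a footprint appears as a stretch with few cleavage events flanked by regions with many. The example below is a deliberately simple score and is not the algorithm used by HINT, the Boyle method, or the Neph method. The window sizes and the toy cut-count profile are hypothetical.

```python
def footprint_scores(cuts, center=10, flank=5):
    """Score every window as (mean cuts in flanks) - (mean cuts in center).

    cuts: per-base cleavage counts along a region.
    Returns {start position of the center window: score}.
    """
    scores = {}
    width = flank + center + flank
    for start in range(len(cuts) - width + 1):
        left = cuts[start:start + flank]
        mid = cuts[start + flank:start + flank + center]
        right = cuts[start + flank + center:start + width]
        flank_mean = (sum(left) + sum(right)) / (2 * flank)
        scores[start + flank] = flank_mean - sum(mid) / center
    return scores

# Toy profile: accessible DNA (8 cuts per base) with a protected
# stretch (1 cut per base) at positions 20 to 29.
cuts = [8] * 20 + [1] * 10 + [8] * 20
scores = footprint_scores(cuts)
best = max(scores, key=scores.get)
print(best, scores[best])
```

A site-centric method would instead evaluate such a profile only around positions where a motif match has been predicted, rather than scanning every window.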

References

  1. Galas, D; Schmitz, A (1978). "DNAse footprinting: a simple method for the detection of protein-DNA binding specificity". Nucleic Acids Research. 5 (9): 3157–70. doi:10.1093/nar/5.9.3157. PMC 342238. PMID 212715.
  2. Engelke, D; Ng, S; Shastry, B; Roeder, R (March 1980). "Specific interaction of a purified transcription factor with an internal control region of 5S RNA genes". Cell. 19 (3): 717–728. doi:10.1016/S0092-8674(80)80048-1.
  3. Tullius, T D; Dombroski, B A (August 1986). "Hydroxyl radical "footprinting": high-resolution information about DNA-protein contacts and application to lambda repressor and Cro protein". Proceedings of the National Academy of Sciences. 83 (15): 5469–5473. doi:10.1073/pnas.83.15.5469. PMC 386308. PMID 3090544.
  4. Boyle, Alan P.; Davis, Sean; Shulha, Hennady P.; Meltzer, Paul; Margulies, Elliott H.; Weng, Zhiping; Furey, Terrence S.; Crawford, Gregory E. (2008-01-25). "High-Resolution Mapping and Characterization of Open Chromatin across the Genome". Cell. 132 (2): 311–322. doi:10.1016/j.cell.2007.12.014. ISSN 0092-8674.
  5. Pique-Regi, Roger; Degner, Jacob F.; Pai, Athma A.; Gaffney, Daniel J.; Gilad, Yoav; Pritchard, Jonathan K. (2011-03-01). "Accurate inference of transcription factor binding from DNA sequence and chromatin accessibility data". Genome Research. 21 (3): 447–455. doi:10.1101/gr.112623.110. ISSN 1088-9051. PMC 3044858. PMID 21106904.
  6. Neph, Shane; Vierstra, Jeff; Stergachis, Andrew B.; Reynolds, Alex P.; Haugen, Eric; Vernot, Benjamin; Thurman, Robert E.; John, Sam; Sandstrom, Richard; Johnson, Audra K.; Maurano, Matthew T.; Humbert, Richard; Rynes, Eric; Wang, Hao; Vong, Shinny (September 2012). "An expansive human regulatory lexicon encoded in transcription factor footprints". Nature. 489 (7414): 83–90. doi:10.1038/nature11212. ISSN 1476-4687. PMC 3736582. PMID 22955618.
  7. Sung, Myong-Hee; Guertin, Michael J.; Baek, Songjoon; Hager, Gordon L. (2014-10-23). "DNase Footprint Signatures Are Dictated by Factor Dynamics and DNA Sequence". Molecular Cell. 56 (2): 275–285. doi:10.1016/j.molcel.2014.08.016. ISSN 1097-2765. PMC 4272573. PMID 25242143.
  8. Luo, Kaixuan; Hartemink, Alexander J (2013). "Using DNAse Digestion Data to Accurately Identify Transcription Factor Binding Sites". Pacific Symposium on Biocomputing. Archived from the original on 2025-02-02.
  9. Gusmao, Eduardo G.; Dieterich, Christoph; Zenke, Martin; Costa, Ivan G. (2014-11-15). "Detection of active transcription factor binding sites with the combination of DNase hypersensitivity and histone modifications". Bioinformatics. 30 (22): 3143–3151. doi:10.1093/bioinformatics/btu519. ISSN 1367-4803.
  10. Sherwood, Richard I.; Hashimoto, Tatsunori; O'Donnell, Charles W.; Lewis, Sophia; Barkal, Amira A.; van Hoff, John Peter; Karun, Vivek; Jaakkola, Tommi; Gifford, David K. (February 2014). "Discovery of directional and nondirectional pioneer transcription factors by modeling DNase profile magnitude and shape". Nature Biotechnology. 32 (2): 171–178. doi:10.1038/nbt.2798. ISSN 1546-1696. PMC 3951735. PMID 24441470.
  11. Piper, Jason; Elze, Markus C.; Cauchy, Pierre; Cockerill, Peter N.; Bonifer, Constanze; Ott, Sascha (2013-11-01). "Wellington: a novel method for the accurate identification of digital genomic footprints from DNase-seq data". Nucleic Acids Research. 41 (21): e201. doi:10.1093/nar/gkt850. ISSN 0305-1048. PMC 3834841. PMID 24071585.
  12. Cuellar-Partida, Gabriel; Buske, Fabian A.; McLeay, Robert C.; Whitington, Tom; Noble, William Stafford; Bailey, Timothy L. (2012-01-01). "Epigenetic priors for identifying active transcription factor binding sites". Bioinformatics. 28 (1): 56–62. doi:10.1093/bioinformatics/btr614. ISSN 1367-4803. PMC 3244768. PMID 22072382.
  13. Hampshire, A; Rusling, D; Broughton-Head, V; Fox, K (2007). "Footprinting: A method for determining the sequence selectivity, affinity and kinetics of DNA-binding ligands". Methods. 42 (2): 128–140. doi:10.1016/j.ymeth.2007.01.002. PMID 17472895.
  14. LeBlanc B and Moss T. (2001) DNase I Footprinting. Methods in Molecular Biology. 148: 31–8.
  15. Zaychikov E, Schickor P, Denissova L, and Heumann H. (2001) Hydroxyl radical footprinting. Methods in Molecular Biology. 148: 49–61.
  16. Becker, M.M.; Wang, J.C. (1984). "Use of Light for Footprinting DNA in vivo". Nature. 309 (5970): 682–687. Bibcode:1984Natur.309..682B. doi:10.1038/309682a0. PMID 6728031. S2CID 31638231.
  17. Axelrod, J.D.; Majors, J (1989). "An Improved Method for Photofootprinting Yeast Genes In Vivo Using Taq Polymerase". Nucleic Acids Res. 17 (1): 171–183. doi:10.1093/NAR/17.1.171. PMC 331543. PMID 2643080.
  18. Becker, M.M.; Wang, Z.; Grossmann, G.; Becherer, K.A. (1989). "Genomic Footprinting in Mammalian Cells With Ultraviolet Light". PNAS. 86 (14): 5315–5319. Bibcode:1989PNAS...86.5315B. doi:10.1073/PNAS.86.14.5315. PMC 297612. PMID 2748587.
  19. Geiselmann J and Boccard F. (2001) Ultraviolet-laser footprinting. Methods in Molecular Biology. 148: 161–73.
  20. Ephrussi, A.; Church, G.M.; Tonegawa, S.; Gilbert, W. (1985). "B Lineage-Specific Interactions of an Immunoglobulin Enhancer With Cellular Factors In Vivo". Science. 227 (4683): 134–140. Bibcode:1985Sci...227..134E. doi:10.1126/science.3917574. PMID 3917574.
  21. Dai S, Chen H, Chang C, Riggs A, Flanagan S. (2000) Ligation-mediated PCR for quantitative in vivo footprinting. Nature Biotechnology. 18: 1108–1111.
  22. Dey, Bipasha; Thukral, Sameer; Krishnan, Shruti; Chakrobarty, Mainak; Gupta, Sahil; Manghani, Chanchal; Rani, Vibha (2012-06-01). "DNA–protein interactions: methods for detection and analysis". Molecular and Cellular Biochemistry. 365 (1): 279–299. doi:10.1007/s11010-012-1269-z. ISSN 1573-4919.
  23. Zaret, K (1997). "Editorial". Methods. 11 (2): 149–150. doi:10.1006/meth.1996.0400. PMID 8993026.
  24. Kovacs, Krisztian A.; Steinmann, Myriam; Magistretti, Pierre J.; Halfon, Olivier; Cardinaux, Jean-Rene (2006). "C/EBPβ couples dopamine signalling to substance P precursor gene expression in striatal neurones". Journal of Neurochemistry. 98 (5): 1390–1399. doi:10.1111/j.1471-4159.2006.03957.x. PMID 16771829. S2CID 36225447.
  25. Ferraz, Ricardo André Campos; Lopes, Ana Lúcia Gonçalves; da Silva, Jessy Ariana Faria; Moreira, Diana Filipa Viana; Ferreira, Maria João Nogueira; de Almeida Coimbra, Sílvia Vieira (2021-07-23). "DNA–protein interaction studies: a historical and comparative analysis". Plant Methods. 17 (1): 82. doi:10.1186/s13007-021-00780-z. ISSN 1746-4811. PMC 8299673. PMID 34301293.
  26. Chiu, Rossa W K; Heitzer, Ellen; Lo, Y M Dennis; Mouliere, Florent; Tsui, Dana W Y (2020-12-01). "Cell-Free DNA Fragmentomics: The New "Omics" on the Block". Clinical Chemistry. 66 (12): 1480–1484. doi:10.1093/clinchem/hvaa258. ISSN 0009-9147.
  27. Song, L; Crawford, GE (Feb 2010). "DNase-seq: a high-resolution technique for mapping active gene regulatory elements across the genome from mammalian cells". Cold Spring Harbor Protocols. 2010 (2): pdb.prot5384+. doi:10.1101/pdb.prot5384. PMC 3627383. PMID 20150147.
  28. Giresi, PG; Kim, J; McDaniell, RM; Iyer, VR; Lieb, JD (Jun 2007). "FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) isolates active regulatory elements from human chromatin". Genome Research. 17 (6): 877–85. doi:10.1101/gr.5533506. PMC 1891346. PMID 17179217.
  29. Thurman, RE; et al. (Sep 2012). "The accessible chromatin landscape of the human genome". Nature. 489 (7414): 75–82. Bibcode:2012Natur.489...75T. doi:10.1038/nature11232. PMC 3721348. PMID 22955617.
  30. Gusmao, EG; Dieterich, C; Zenke, M; Costa, IG (Aug 2014). "Detection of Active Transcription Factor Binding Sites with the Combination of DNase Hypersensitivity and Histone Modifications". Bioinformatics. 30 (22): 3143–51. doi:10.1093/bioinformatics/btu519. PMID 25086003.
  31. Boyle, AP; et al. (Mar 2011). "High-resolution genome-wide in vivo footprinting of diverse transcription factors in human cells". Genome Research. 21 (3): 456–464. doi:10.1101/gr.112656.110. PMC 3044859. PMID 21106903.
  32. Neph, S; et al. (Sep 2012). "An expansive human regulatory lexicon encoded in transcription factor footprints". Nature. 489 (7414): 83–90. Bibcode:2012Natur.489...83N. doi:10.1038/nature11212. PMC 3736582. PMID 22955618.
  33. Pique-Regi, R; et al. (Mar 2011). "Accurate inference of transcription factor binding from DNA sequence and chromatin accessibility data". Genome Research. 21 (3): 447–455. doi:10.1101/gr.112623.110. PMC 3044858. PMID 21106904.
  34. Cuellar-Partida, G; et al. (Jan 2012). "Epigenetic priors for identifying active transcription factor binding sites". Bioinformatics. 28 (1): 56–62. doi:10.1093/bioinformatics/btr614. PMC 3244768. PMID 22072382.