Zinc finger protein chimeras are chimeric proteins composed of a DNA-binding zinc finger protein domain and another domain through which the protein exerts its effect. The effector domain may be a transcriptional activator (A) or repressor (R), [1] a methylation domain (M) or a nuclease (N). [2]
Modification of the endogenous DNA-binding zinc finger domain is the basis of the most advanced field in the construction of gene-specific artificial transcription factors. [1] Linking together six ZFPs produces a target site of 18-19 bp. Assuming specificity to that one sequence and that the sequence of the genome is random, 18 bp is long enough to be unique in all known genomes. [3] [4] Indeed, the spacing between subsites becomes part of the target sequence, owing to restrictions in the flexibility of the protein that can be controlled. [1] Targeting sites as small as 9 bp provides some degree of specificity, almost certainly attributable in part to chromatin occlusion. [4]
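The uniqueness argument can be illustrated with a short calculation. The sketch below assumes a random genome with equal base frequencies, counts both strands, and uses an approximate human haploid genome size of 3.1 × 10^9 bp; the function name and the genome size constant are illustrative assumptions, not values taken from the cited studies.

```python
def expected_occurrences(site_length_bp: int, genome_size_bp: float) -> float:
    """Expected number of exact matches to one specific site in a random genome.

    Assumes equal base frequencies and counts both DNA strands, so the
    number of candidate positions is roughly twice the genome size.
    """
    candidate_positions = 2 * genome_size_bp
    return candidate_positions / 4 ** site_length_bp


# Approximate haploid human genome size (assumption used for illustration only).
HUMAN_GENOME_BP = 3.1e9

for site_length in (9, 12, 18):
    matches = expected_occurrences(site_length, HUMAN_GENOME_BP)
    print(f"{site_length} bp site: ~{matches:.3g} expected chance matches")
```

Under these assumptions a 9 bp site is expected to occur many thousands of times by chance, whereas an 18 bp site is expected to occur less than once, which is the sense in which 18 bp is "long enough to be unique".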
Depending upon the requirements of the investigation, there are several techniques available to define a DNA-recognition domain that will confer the specificity of a ZFP-based transcription factor. Three phage display strategies have been described, involving either parallel, sequential or bipartite selection of the constituent zinc fingers.
The parallel selection (Fig. 1 (A)) approach assumes that the individual zinc finger domains are functionally independent. On this basis, existing predetermined domains should be usable with no additional design or selection, making it a rapid technique accessible to any laboratory. [5] [6] The assumption does not hold in every case, so this strategy is liable to suffer from target-site overlap at a number of target sequences, as discussed later. If necessary, it may be possible to surmount the problem of target-site overlap by randomising the amino acid residues at the interface of the two zinc fingers at which it occurs. [5]
Sequential selection (Fig. 1 (B)), put forward by the Pabo group in 1997, embraces the cooperative binding between zinc fingers to produce DNA-binding domains of great affinity and specificity. [7] As suggested by the name, each finger is selected from a randomised library in the context of the previously selected finger. The techniques used in selection are similar to those described below except that the target oligonucleotide used in selection contains the entire target sequence. As shown in Fig. 1, a library is created in which finger three contains the randomised alpha-helix. The domain with the best binding characteristics is selected and then included in another library in which the finger-one anchor is removed and another randomised finger is added to the opposite end. This continues until every finger in the DNA-binding domain has been selected in the context of its neighbouring finger. Because each round of selection is applied to the same final target sequence, target-site overlap still occurs but is an asset rather than a hindrance.
The main drawback of this approach is the necessity to produce a separate library for each zinc finger, which exceeds the capacity of most laboratories.
The bipartite selection (Fig. 1 (C)) method was proposed by Isalan et al. in 2001 [8] as a compromise between the parallel and sequential selection strategies. The first and last 5 bp of the 9 bp target site are selected in parallel and combined to produce a library from which the final ZFP is chosen.
In order to keep the library size within reasonable limits, this technique is restricted to randomisation of only the key residues involved in base recognition. Furthermore, unlike in parallel selection, this technique requires multiple pannings before a novel ZFP can be constructed. [6]
In order to determine the most appropriate sequence of amino acids in the alpha-helix of a zinc finger for binding to a given DNA sequence, a technique involving phage display may be employed. By altering the genome of selected bacteriophage, it is possible to create a phage that will display a ZFP as part of its protein coat. Such phage can subsequently be tested for adherence, via the attached zinc fingers, to an oligonucleotide containing the sequence of interest, whilst other, non-adherent phage are washed away. The DNA within the phage codes for the ZFPs expressed, so extracting and sequencing the DNA of bound phage provides information as to suitable amino acid configurations for binding a specific sequence. This forms the basis of the investigation of ZFPs binding by phage display. [9] [10]
Work is typically performed using the murine ZFP-TF Zif268, or one of its derivatives, as the basis of the zinc finger sequence modifications, since it is the best characterised of all zinc finger proteins. [3] [10] Its derivatives C7 and C7.GAT are often used for their superior binding affinity and specificity. C7.GAT has been used to investigate the 5'-ANN-3' and 5'-CNN-3' families of sequences since the third finger of C7 defines a guanine or thymine in the 5' position of the finger two sequence (target-site overlap). [4] [10] [11] Filamentous helper phage and the DNA from lambda phage are utilised in phage display. Due to limitations in the size of libraries that can be routinely constructed, randomisation may be limited to the most influential amino acids in the ZFP sequence as inferred by X-ray crystallography. A study published by Wu et al. (1995) identified these as helix positions -1, 2, 3, 4, 5 and 6 in fingers one and three, and positions -2, -1, 1, 2, 3 and 4 in finger two. [10] However, another study by Segal et al. (1999) [3] suggests that all positions from -2 to 6 are important, owing to the non-specific affinity of some amino acids and the ability of others to stabilise adjacent interactions.
A short (~34 nt) hairpin DNA containing the ZFP binding site, with alterations occurring in a single subsite, is used as the target. The oligonucleotide may be synthesised to include a primary n-hexyl amino group at its 5' end, later utilised to attach bovine serum albumin (BSA). In this case, the conjugate is used to precoat a microtitre well before applying ~10^13 colony-forming units of phage. Following incubation, the phage are removed and the plate is washed with buffer containing 0.5% Tween 20 to remove non-adherent phage. [10] Using an acidic elution buffer, the adherent phage are removed and neutralised with Tris base. [9] Further rounds of panning are completed to ensure enrichment of the sample, by infecting bacterial cells with the eluted phage and helper phage and then collecting the ZFP-displaying phage produced for the next round of panning. As an alternative to BSA, the hairpin target DNA may be biotinylated and later extracted using streptavidin-coated magnetic beads (streptavidin forms very strong bonds with biotin). [3]
To increase the specificity of the selected phage, especially where larger libraries are being investigated, competitor oligonucleotides are used to sequester those zinc finger proteins of lesser specificity before the biotinylated target oligonucleotide is added. Sheared herring sperm DNA, for example, will bind phage with a non-specific adherence to DNA. Subsequent rounds of panning involve increasing concentrations of specifically synthesised non-target oligonucleotides in which everything but the sequence of the target subsite remains the same, down to a single nucleotide difference. In particular, the target sequence of the original ZFP that was subjected to mutagenesis is used in high quantity to select against 'parental phage' contaminating the library. Non-specific binding to the streptavidin-coated magnetic beads can be blocked with blotto and irrelevant antibody-displaying phage, so that binding occurs only to molecules with an affinity as high as that of biotin. Non-specific phage are removed as before, using a buffer including dilute Tween 20. Bound phage are collected by virtue of the magnetic beads and may be eluted by incubation with trypsin. Only those phage displaying highly specific ZFPs will thus be selected. [3] [11]
After elution, the phage can be plated and DNA extracted from individual plaque forming units, restricted and the appropriately sized fragments extracted after separation by PAGE. The DNA can then be sequenced to discover the protein primary structure that produces adherence to the target sequence. This process is repeated for each of the 5'-NNN-3' single finger subsites being investigated.
Generating arrays of engineered Cys2His2 zinc fingers is the most developed method for creating proteins capable of targeting desired genomic DNA sequences. The majority of engineered zinc finger arrays are based on the zinc finger domain of the murine transcription factor Zif268, although some groups have used zinc finger arrays based on the human transcription factor SP1. Zif268 has three individual zinc finger motifs that collectively bind a 9 bp sequence with high affinity. [12] The structure of this protein bound to DNA was solved in 1991 [13] and stimulated a great deal of research into engineered zinc finger arrays. In 1994 and 1995, a number of groups used phage display to alter the specificity of a single zinc finger of Zif268. [14] [15] [16] [17] Carlos F. Barbas et al. also reported the development of zinc finger technology in the patent literature and have been granted a number of patents that have been important for the commercial development of zinc finger technology. [18] [19] Typical engineered zinc finger arrays have between 3 and 6 individual zinc finger motifs and bind target sites ranging from 9 basepairs to 18 basepairs in length. Arrays with 6 zinc finger motifs are particularly attractive because they bind a target site that is long enough to have a good chance of being unique in a mammalian genome. [20] There are two main methods currently used to generate engineered zinc finger arrays, modular assembly and a bacterial selection system, and there is some debate about which method is best suited for most applications. [21] [22]
The most straightforward method to generate new zinc finger arrays is to combine smaller zinc finger "modules" of known specificity. The structure of the zinc finger protein Zif268 bound to DNA described by Pavletich and Pabo in their 1991 publication has been key to much of this work and describes the concept of obtaining fingers for each of the 64 possible base pair triplets and then mixing and matching these fingers to design proteins with any desired sequence specificity. [13] The most common modular assembly process involves combining separate zinc fingers that can each recognize a 3 basepair DNA sequence to generate 3-, 4-, 5-, or 6-finger arrays that recognize target sites ranging from 9 basepairs to 18 basepairs in length. Another method uses 2-finger modules to generate zinc finger arrays with up to six individual zinc fingers. [23] The Barbas Laboratory of The Scripps Research Institute used phage display to develop and characterize zinc finger domains that recognize most DNA triplet sequences [24] [25] [26] while another group isolated and characterized individual fingers from the human genome. [27] A potential drawback with modular assembly in general is that the specificities of individual zinc fingers can overlap and can depend on the context of the surrounding zinc fingers and DNA. A recent study demonstrated that a high proportion of 3-finger zinc finger arrays generated by modular assembly fail to bind their intended target with sufficient affinity in a bacterial two-hybrid assay and fail to function as zinc finger nucleases, but the success rate was somewhat higher when sites of the form GNNGNNGNN were targeted. [28] A subsequent study used modular assembly to generate zinc finger nucleases with both 3-finger arrays and 4-finger arrays and observed a much higher success rate with 4-finger arrays. [29] A variant of modular assembly that takes the context of neighboring fingers into account has also been reported and this method tends to yield proteins with improved performance relative to standard modular assembly. [30]
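The bookkeeping behind modular assembly can be sketched as follows. The code splits a 9 to 18 bp target into 3 bp subsites, numbers the fingers so that finger 1 (the N-terminal finger) reads the 3'-most triplet, as in the Zif268-DNA structure, and checks whether the site has the GNNGNNGNN form. The function names are hypothetical, and no real module library (helix sequences) is included; choosing an actual finger for each triplet is left out.

```python
def split_into_subsites(target: str) -> list[tuple[int, str]]:
    """Split a 5'->3' target site into (finger_number, triplet) pairs.

    Fingers bind antiparallel to the written sequence: finger 1
    (N-terminal) reads the 3'-most triplet, and the highest-numbered
    finger reads the 5'-most triplet.
    """
    target = target.upper()
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G and T")
    if len(target) % 3 != 0 or not 9 <= len(target) <= 18:
        raise ValueError("target must be 9, 12, 15 or 18 bp long")
    triplets = [target[i:i + 3] for i in range(0, len(target), 3)]
    finger_count = len(triplets)
    return [(finger_count - i, triplet) for i, triplet in enumerate(triplets)]


def is_gnn_site(target: str) -> bool:
    """Return True if every 3 bp subsite has the form 5'-GNN-3'."""
    return all(triplet.startswith("G") for _, triplet in split_into_subsites(target))


# The Zif268 site: finger 3 reads the 5' GCG, finger 1 the 3' GCG.
print(split_into_subsites("GCGTGGGCG"))
print(is_gnn_site("GCGTGGGCG"))  # the middle subsite is TGG, not GNN
```

A real design pipeline would look up each triplet in a library of characterised modules and, as noted above, would still need to test the assembled array, since independence of the modules is only an approximation.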
Numerous selection methods have been used to generate zinc finger arrays capable of targeting desired sequences. Initial selection efforts utilized phage display to select proteins that bound a given DNA target from a large pool of partially randomized zinc finger arrays. This technique is difficult to use on more than a single zinc finger at a time, so a multi-step process that generated a completely optimized 3-finger array by adding and optimizing a single zinc finger at a time was developed. [31] More recent efforts have utilized yeast one-hybrid systems, bacterial one-hybrid and two-hybrid systems, and mammalian cells. A promising new method to select novel 3-finger zinc finger arrays utilizes a bacterial two-hybrid system and has been dubbed "OPEN" by its creators. [32] This system combines pre-selected pools of individual zinc fingers that were each selected to bind a given triplet and then utilizes a second round of selection to obtain 3-finger arrays capable of binding a desired 9-bp sequence. This system was developed by the Zinc Finger Consortium as an alternative to commercial sources of engineered zinc finger arrays. It is somewhat difficult to directly compare the binding properties of proteins generated with this method to proteins generated by modular assembly as the specificity profiles of proteins generated by the OPEN method have never been reported.
Engineered zinc finger arrays can then be used in numerous applications such as artificial transcription factors, zinc finger methylases, zinc finger recombinases, and zinc finger nucleases. [33] While initial studies with another DNA-binding domain from bacterial TAL effectors show promise, [34] [35] [36] [37] it remains to be seen whether these domains are suitable for some or all of the applications where engineered zinc fingers are currently used. Artificial transcription factors with engineered zinc finger arrays have been used in numerous scientific studies, and an artificial transcription factor that activates expression of VEGF is currently being evaluated in humans as a potential treatment for several clinical indications. Zinc finger nucleases have become useful reagents for manipulating genomes of many higher organisms including Drosophila melanogaster, Caenorhabditis elegans, tobacco, corn, [23] zebrafish, [38] various types of mammalian cells, [39] and rats. [40] An ongoing clinical trial is evaluating zinc finger nucleases that disrupt the CCR5 gene in CD4+ human T-cells as a potential treatment for HIV/AIDS. [41]
These investigations require the use of soluble ZFPs, since attachment to phage can alter the binding characteristics of the zinc fingers. [3] Once a ZFP has been selected, its sequence is subcloned from pComb3H into a modified bacterial expression vector, pMal-c2, linking it to a sequence coding for the maltose binding protein. The recombinant is then transformed into XL1-Blue cells and expression is induced by the addition of isopropyl β-D-thiogalactoside (IPTG). Freeze/thaw extracts may then be purified for use in the following experiments. Whilst purification is not necessary for multitarget ELISA, it is essential for measuring binding affinity by plasmon resonance and DNase footprints. It can be performed using a Heparin-Sepharose FPLC column equilibrated with zinc buffer, followed by confirmation of homogeneity by SDS-PAGE gel densitometry. [4] The same techniques are used to examine the binding properties of completed polydactyl ZFP chimeras. [42]
The specificity of ZFPs selected by phage display is tested using a multitarget enzyme-linked immunosorbent assay (ELISA). The ZFPs are applied to microtitre wells coated with streptavidin and a biotinylated target oligonucleotide. After incubation, the wells are washed to remove zinc fingers that are not adherent to the target sequence, followed by the application of mouse anti-MBP (maltose binding protein) antibody and incubation. Goat anti-mouse antibody coupled to alkaline phosphatase is added and allowed to bind, followed by washing to remove antibody that is not bound to zinc fingers. Alkaline phosphatase substrate is added and, after stopping the reaction, the optical density at 405 nm (OD405) is determined by spectrophotometry. [4]
The reading from the spectrophotometer is dependent on the amount of alkaline phosphatase present in the well, which in turn is proportional to the binding of the ZFP in question. If the ZFP binds to a sequence for which it was not selected with too great an affinity, it is not specific enough for most medical purposes and will most likely be rejected.
These assays are repeated using different target oligonucleotides. When investigating zinc fingers binding 5'-XNN-3' sequences, for example, all 16 of the possible oligonucleotide sequences will need to be investigated. Further, to test specificity to the 5' nucleotide, the full complement of the four 5'-ANN-3', 5'-CNN-3', 5'-GNN-3' and 5'-TNN-3' families are used as targets in four separate reactions and the relative binding in each is compared. [4]
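The set of subsites tested in such a panel is easy to enumerate. The following sketch lists the 16 members of one 5'-XNN-3' family and groups all 64 triplets into the four families named above; the function and variable names are illustrative only, and the real oligonucleotides would embed each triplet in a longer hairpin target.

```python
from itertools import product

BASES = "ACGT"


def subsite_family(five_prime_base: str) -> list[str]:
    """Return the 16 triplets of the form 5'-XNN-3' for a fixed 5' base X."""
    if five_prime_base not in BASES:
        raise ValueError("5' base must be one of A, C, G or T")
    return [five_prime_base + "".join(pair) for pair in product(BASES, repeat=2)]


# One family per 5' base: ANN, CNN, GNN and TNN.
families = {base + "NN": subsite_family(base) for base in BASES}

for name, triplets in families.items():
    print(name, len(triplets), triplets[:4], "...")
```

Each family contains 16 triplets, and the four families together cover all 64 possible 3 bp subsites.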
Kinetic analysis provides information regarding both the affinity and the specificity of zinc finger adherence to the target. It can be performed using commercially available equipment utilising surface plasmon resonance. The surface of the sensor chip is coated with affinity purified streptavidin before application of biotinylated oligonucleotides which also adhere to the surface. [10] The association rate (kon) is calculated by measuring the rate of ZFP binding to the surface using several different protein concentrations whilst the dissociation rate (koff) can be calculated by increasing the rate of flow after association. The mathematics is performed by software provided with the instrument. [10]
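Although the instrument software performs the fitting, the relationship between the two rate constants and the equilibrium dissociation constant is simple: Kd = koff / kon. The sketch below applies that standard relationship; the function name and the example rate constants are hypothetical and are not measurements from the cited work.

```python
def dissociation_constant(k_on: float, k_off: float) -> float:
    """Return Kd in M from k_on (M^-1 s^-1) and k_off (s^-1)."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    if k_off < 0:
        raise ValueError("k_off must not be negative")
    return k_off / k_on


# Hypothetical rate constants chosen only to illustrate the arithmetic.
example_kd = dissociation_constant(k_on=1.0e6, k_off=1.0e-3)
print(f"Kd = {example_kd:.2e} M")
```

A slower dissociation rate or a faster association rate both lower Kd, corresponding to tighter binding.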
Alternatively, Kd can be calculated from a gel mobility shift assay in which the same purified protein is incubated with serial dilutions of gel-purified, 32P-end-labelled target oligonucleotide. The incubation reactions are then resolved, over a short period, on a polyacrylamide gel and quantitated using a commercially available imager and software. Kd is calculated via Scatchard analysis using the binding isotherm equation; θb = [peptide]/([peptide] + Kd). [3] [43]
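The binding isotherm and a Scatchard-style estimate of Kd can be sketched in a few lines. Rearranging θb = [peptide]/([peptide] + Kd) gives θb/[peptide] = 1/Kd - θb/Kd, so a straight-line fit of θb/[peptide] against θb has slope -1/Kd. The code below assumes that the free peptide concentration is approximately equal to the total concentration and uses synthetic data; the function names are illustrative and this is not the imager software's own procedure.

```python
def fraction_bound(peptide_conc: float, kd: float) -> float:
    """Binding isotherm: theta_b = [peptide] / ([peptide] + Kd)."""
    return peptide_conc / (peptide_conc + kd)


def scatchard_kd(peptide_concs: list[float], fractions: list[float]) -> float:
    """Estimate Kd from a least-squares Scatchard fit.

    Fits y = theta_b / [peptide] against x = theta_b; the slope of the
    line is -1/Kd, so Kd = -1 / slope.
    """
    xs = list(fractions)
    ys = [theta / conc for theta, conc in zip(fractions, peptide_concs)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -1.0 / slope


# Synthetic, noise-free data generated from an assumed Kd of 2 nM.
assumed_kd = 2.0e-9
concs = [0.5e-9, 1.0e-9, 2.0e-9, 4.0e-9, 8.0e-9]
thetas = [fraction_bound(c, assumed_kd) for c in concs]
print(f"recovered Kd = {scatchard_kd(concs, thetas):.2e} M")
```

With real gel-shift data the points scatter around the line, and a non-linear fit of the isotherm itself is often preferred because the Scatchard transformation distorts the error structure.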
To determine the space occupied by a ZFP when bound to its DNA target, a fragment of the target gene is amplified by PCR and mixed with a 32P-end-labelled promoter fragment. This reaction is then incubated with several different concentrations of ZFP, produced and purified using one of the previously described overexpression (e.g. pMal-c2 and XL1-Blue) and purification methods. Digestion with DNase I will produce fragments of varying lengths, but where the ZFP has been allowed to bind at high concentration, the corresponding fragment lengths will not be present in the mixture, since DNase activity has been occluded by the ZFP at these locations. The samples are separated on an acrylamide (~6%), urea (8 M) gel, used to expose phosphorimaging plates and recorded by a commercially available phosphorimaging machine. Software analysis can also be used to produce Kd values. [4]
Certain sequences of amino acid residues are able to recognise, and are specific to, an extended target site of four or even five nucleotides. [44] When this occurs in a ZFP in which the three-nucleotide subsites are contiguous, one zinc finger interferes with the target site of the zinc finger adjacent to it, a situation known as target-site overlap.
ZFP-TFs, which comprise activators and repressors, are transcription factors composed of a zinc finger protein domain and any of a variety of transcription-factor effector domains, which exert their modulatory effect around any sequence to which the ZFP domain binds.
Zinc finger nucleases include a nuclease domain such as FokI, capable of introducing double-stranded breaks at the locus of any sequence to which the zinc finger protein domain binds.
Gene knockouts are a widely used genetic engineering technique that involves the targeted removal or inactivation of a specific gene within an organism's genome. This can be done through a variety of methods, including homologous recombination, CRISPR-Cas9, and TALENs.
A zinc finger is a small protein structural motif that is characterized by the coordination of one or more zinc ions (Zn2+) which stabilizes the fold. It was originally coined to describe the finger-like appearance of a hypothesized structure from the African clawed frog (Xenopus laevis) transcription factor IIIA. However, it has been found to encompass a wide variety of differing protein structures in eukaryotic cells. Xenopus laevis TFIIIA was originally demonstrated to contain zinc and require the metal for function in 1983, the first such reported zinc requirement for a gene regulatory protein followed soon thereafter by the Krüppel factor in Drosophila. It often appears as a metal-binding domain in multi-domain proteins.
DNA-binding proteins are proteins that have DNA-binding domains and thus have a specific or general affinity for single- or double-stranded DNA. Sequence-specific DNA-binding proteins generally interact with the major groove of B-DNA, because it exposes more functional groups that identify a base pair.
EGR-1 also known as ZNF268 or NGFI-A is a protein that in humans is encoded by the EGR1 gene.
Two-hybrid screening is a molecular biology technique used to discover protein–protein interactions (PPIs) and protein–DNA interactions by testing for physical interactions between two proteins or a single protein and a DNA molecule, respectively.
A DNA-binding domain (DBD) is an independently folded protein domain that contains at least one structural motif that recognizes double- or single-stranded DNA. A DBD can recognize a specific DNA sequence or have a general affinity to DNA. Some DNA-binding domains may also include nucleic acids in their folded structure.
C7 protein is an engineered zinc finger protein based on the murine ZFP, Zif268 and discovered by Wu et al. in 1994. It shares the same zinc finger 2 and zinc finger 3 of Zif268, but differs in the sequence of finger 1. It also shares the same DNA target, 5'-GCGTGGGCG-3'.
Therapeutic gene modulation refers to the practice of altering the expression of a gene at one of various stages, with a view to alleviate some form of ailment. It differs from gene therapy in that gene modulation seeks to alter the expression of an endogenous gene whereas gene therapy concerns the introduction of a gene whose product aids the recipient directly.
Zinc-finger nucleases (ZFNs) are artificial restriction enzymes generated by fusing a zinc finger DNA-binding domain to a DNA-cleavage domain. Zinc finger domains can be engineered to target specific desired DNA sequences and this enables zinc-finger nucleases to target unique sequences within complex genomes. By taking advantage of endogenous DNA repair machinery, these reagents can be used to precisely alter the genomes of higher organisms. Alongside CRISPR/Cas9 and TALEN, ZFN is a prominent tool in the field of genome editing.
Artificial transcription factors (ATFs) are engineered individual or multi-molecule transcription factors that either activate or repress gene transcription.
In a zinc finger protein, certain sequences of amino acid residues are able to recognise and bind to an extended target site of four or even five nucleotides. When this occurs in a ZFP in which the three-nucleotide subsites are contiguous, one zinc finger interferes with the target site of the zinc finger adjacent to it, a situation known as target-site overlap. For example, a zinc finger containing arginine at position -1 and aspartic acid at position 2 along its alpha-helix will recognise an extended sequence of four nucleotides of the form 5'-NNG(G/T)-3'. The hydrogen bond between Asp2 and the N4 of a cytosine or the N6 of an adenine, base-paired to the guanine or thymine respectively, defines these two nucleotides at the 3' position, producing a sequence that overlaps into the subsite of any zinc finger that may be attached N-terminally.
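The 5'-NNG(G/T)-3' rule can be turned into a simple check on a candidate target site. The sketch below assumes, for illustration only, that every finger reading a subsite ending in G carries the Arg(-1)/Asp(2) pair; for each such subsite it reports whether the following base, which is the 5' base of the neighbouring subsite, is the G or T that the extended recognition would require. The function name is hypothetical.

```python
def overlap_constraints(target: str) -> list[tuple[str, str, bool]]:
    """Check the 5'-NNG(G/T)-3' extended-site rule along a target.

    The target is read 5'->3' in contiguous triplets. For each triplet
    ending in G that has a downstream neighbour, return
    (triplet, next_base, satisfied), where `satisfied` is True if the
    next base is G or T. Assumes every such finger has Arg-1/Asp2.
    """
    target = target.upper()
    if len(target) % 3 != 0 or set(target) - set("ACGT"):
        raise ValueError("target must be a multiple of 3 bp of A, C, G, T")
    results = []
    for start in range(0, len(target) - 3, 3):
        triplet = target[start:start + 3]
        next_base = target[start + 3]
        if triplet.endswith("G"):
            results.append((triplet, next_base, next_base in "GT"))
    return results


# Zif268 site: both internal junctions satisfy the rule.
print(overlap_constraints("GCGTGGGCG"))
# Changing the 5' base of the last subsite to A breaks one junction.
print(overlap_constraints("GCGTGGACG"))
```

A junction that fails this check marks a place where modules selected independently may not behave as expected, which is the situation that sequential and bipartite selection are designed to handle.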
Chimeric nucleases are an example of engineered proteins which must comprise a DNA-binding domain to give sequence specificity and a nuclease domain for DNA cleavage.
The bacterial one-hybrid (B1H) system is a method for identifying the sequence-specific target site of a DNA-binding domain. In this system, a given transcription factor (TF) is expressed as a fusion to a subunit of RNA polymerase. In parallel, a library of randomized oligonucleotides representing potential TF target sequences are cloned into a separate vector containing the selectable genes HIS3 and URA3. If the DNA-binding domain (bait) binds a potential DNA target site (prey) in vivo, it will recruit RNA polymerase to the promoter and activate transcription of the reporter genes in that clone. The two reporter genes, HIS3 and URA3, allow for positive and negative selections, respectively. At the end of the process, positive clones are sequenced and examined with motif-finding tools in order to resolve the favoured DNA target sequence.
TAL effectors are proteins secreted by some β- and γ-proteobacteria. Most of these are Xanthomonads. Plant pathogenic Xanthomonas bacteria are especially known for TALEs, produced via their type III secretion system. These proteins can bind promoter sequences in the host plant and activate the expression of plant genes that aid bacterial infection. The TALE domain responsible for binding to DNA is known to have 1.5 to 33.5 short sequences that are repeated multiple times. Each of these repeats was found to be specific for a certain base pair of the DNA. These repeats also have repeat variable residues (RVD) that can detect specific DNA base pairs. They recognize plant DNA sequences through a central repeat domain consisting of a variable number of ~34 amino acid repeats. There appears to be a one-to-one correspondence between the identity of two critical amino acids in each repeat and each DNA base in the target sequence. These proteins are interesting to researchers both for their role in disease of important crop species and the relative ease of retargeting them to bind new DNA sequences. Similar proteins can be found in the pathogenic bacteria Ralstonia solanacearum and Burkholderia rhizoxinica, as well as in yet unidentified marine microorganisms. The term TALE-likes is used to refer to the putative protein family encompassing the TALEs and these related proteins.
Transcription activator-like effector nucleases (TALEN) are restriction enzymes that can be engineered to cut specific sequences of DNA. They are made by fusing a TAL effector DNA-binding domain to a DNA cleavage domain. Transcription activator-like effectors (TALEs) can be engineered to bind to practically any desired DNA sequence, so when combined with a nuclease, DNA can be cut at specific locations. The restriction enzymes can be introduced into cells, for use in gene editing or for genome editing in situ, a technique known as genome editing with engineered nucleases. Alongside zinc finger nucleases and CRISPR/Cas9, TALEN is a prominent tool in the field of genome editing.
Genome editing, or genome engineering, or gene editing, is a type of genetic engineering in which DNA is inserted, deleted, modified or replaced in the genome of a living organism. Unlike early genetic engineering techniques that randomly insert genetic material into a host genome, genome editing targets the insertions to site-specific locations. The basic mechanism involved in genetic manipulations through programmable nucleases is the recognition of target genomic loci and binding of the effector DNA-binding domain (DBD), the introduction of double-strand breaks (DSBs) in target DNA by the restriction endonucleases, and the repair of DSBs through homology-directed recombination (HDR) or non-homologous end joining (NHEJ).
Zinc finger transcription factors or ZF-TFs, are transcription factors composed of a zinc finger-binding domain and any of a variety of transcription-factor effector-domains that exert their modulatory effect in the vicinity of any sequence to which the protein domain binds.
The WRKY domain is found in the WRKY transcription factor family, a class of transcription factors. The WRKY domain is found almost exclusively in plants although WRKY genes appear present in some diplomonads, social amoebae and other amoebozoa, and fungi incertae sedis. They appear absent in other non-plant species. WRKY transcription factors have been a significant area of plant research for the past 20 years. The WRKY DNA-binding domain recognizes the W-box (T)TGAC(C/T) cis-regulatory element.
Epigenome editing or epigenome engineering is a type of genetic engineering in which the epigenome is modified at specific sites using engineered molecules targeted to those sites. Whereas gene editing involves changing the actual DNA sequence itself, epigenetic editing involves modifying and presenting DNA sequences to proteins and other DNA binding factors that influence DNA function. By "editing" epigenomic features in this manner, researchers can determine the exact biological role of an epigenetic modification at the site in question.
Since antiretroviral therapy requires a lifelong treatment regimen, research to find more permanent cures for HIV infection is currently underway. It is possible to synthesize zinc finger nucleases with zinc finger components that selectively bind to specific portions of DNA. Conceptually, targeting and editing could focus on host cellular co-receptors for HIV or on proviral HIV DNA.