TAL effector repeat |
---|---
Pfam sequence logo for its TALE-like entry, constructed from TALE, RipTAL, and BAT repeats. The repeat starts with the F/L; the RVD is the final N/H. |
Identifiers |
Symbol | TAL_effector
Pfam | PF03377
InterPro | IPR005042
Transcription Activator-Like Effector-Likes (TALE-likes) are a group of bacterial DNA binding proteins named for the first and still best-studied group, the TALEs of Xanthomonas bacteria. TALEs are important factors in the plant diseases caused by Xanthomonas bacteria, but are known primarily for their role in biotechnology as programmable DNA binding proteins, particularly in the context of TALE nucleases. TALE-likes have additionally been found in many strains of the Ralstonia solanacearum bacterial species complex, in Paraburkholderia rhizoxinica strain HKI 454, and in two unknown marine bacteria. Whether or not all these proteins form a single phylogenetic grouping is as yet unclear.
The unifying feature of the TALE-likes is their tandem arrays of DNA binding repeats. These repeats are, with few exceptions, 33-35 amino acids in length and are composed of two alpha-helices on either side of a flexible loop containing the DNA base binding residues, with neighbouring repeats joined by flexible linker loops. [1] Evidence for this common structure comes in part from solved crystal structures of TALEs [2] and a Burkholderia TALE-like (BAT), [3] but also from the conservation of the code that all TALE-likes use to recognise DNA sequences. In fact, TALE, RipTAL, and BAT repeats can be mixed and matched to generate functional DNA-binding proteins with varying affinity. [4]
TALEs are the first identified, best-studied and largest group within the TALE-likes. TALEs are found throughout the bacterial genus Xanthomonas, [5] which comprises mostly plant pathogens. Those TALEs which have been studied have all been shown to be secreted into host plant cells by the Type III secretion system. Once inside the host cell they translocate to the nucleus, bind specific DNA sequences within host promoters and turn on downstream genes. Every part of this process is thought to be conserved across all TALEs. The single meaningful difference between individual TALEs, based on current understanding, is the specific DNA sequence that each TALE binds. TALEs from even closely related strains differ in the composition of repeats that make up their DNA binding domain. [6] Repeat composition determines DNA binding preference. In particular, position 13 of each repeat confers the DNA base preference of that repeat. During early research it was noted that almost all the differences between repeats of a single TALE repeat array are found in positions 12 and 13, and this finding led to the hypothesis that these residues determine base preference. [7] In fact, repeat positions 12 and 13, referred to jointly as the Repeat Variable Diresidue (RVD), are commonly said to confer base specificity despite clear evidence that position 13 is the base-determining residue. [8] In addition to the repeat domain, TALEs also possess a number of conserved features in the domains flanking the repeats. These include domains for type III secretion, nuclear localization and transcriptional activation. These allow TALEs to carry out their biological role as effector proteins secreted into host plant cells to activate expression of specific host genes.
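The relationship between repeat positions 12 and 13 and base preference can be illustrated with a short Python sketch. The RVD-to-base table below is an assumption not taken from this article: it lists only the four most commonly cited RVD preferences reported for Xanthomonas TALEs. The repeat template is a hypothetical 34-residue sequence used purely for illustration, and the function names are invented for this example.

```python
# Commonly cited RVD-to-base preferences for Xanthomonas TALEs.
# Assumption: this table is not given in the article and covers only four RVDs.
RVD_TO_BASES = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "GA",  # NN is usually described as accepting G or A
}


def extract_rvd(repeat: str) -> str:
    """Return the residues at repeat positions 12 and 13 (1-based)."""
    if len(repeat) < 13:
        raise ValueError("repeat too short to contain an RVD")
    return repeat[11:13]


def decode_rvds(rvds):
    """Map each RVD to its preferred base(s); unknown RVDs become 'N'."""
    return [RVD_TO_BASES.get(rvd, "N") for rvd in rvds]


# Hypothetical 34-residue repeats that differ only at positions 12 and 13.
TEMPLATE = "LTPEQVVAIAS{}GGKQALETVQRLLPVLCQAHG"
repeats = [TEMPLATE.format(rvd) for rvd in ("HD", "NI", "NG", "NN")]

rvds = [extract_rvd(r) for r in repeats]
print(rvds)               # ['HD', 'NI', 'NG', 'NN']
print(decode_rvds(rvds))  # ['C', 'A', 'T', 'GA']
```

The table is keyed on both residues because that is how RVDs are conventionally written, even though position 13 is the base-determining residue.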
Whilst the RVD positions are commonly the only variable positions within a single TALE repeat array, there are more differences when comparing repeat arrays of different TALEs. The diversity of TALEs across the Xanthomonas genus is considerable, but a particularly striking finding is that the evolutionary history one arrives at by comparing repeat compositions differs from that found when comparing non-repeat sequences. [6] Repeat arrays of TALEs are thought to evolve rapidly, with a number of recombinatorial processes suggested to shape repeat array evolution. [5] Recombination of TALE repeat arrays has been demonstrated in a forced-selection experiment. [9] This evolutionary dynamism is thought to be made possible by the very high sequence identity of TALE repeats, which is a unique feature of TALEs as opposed to other TALE-likes.
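The notion of sequence identity between the repeats of an array can be made concrete with a small calculation. The following Python sketch computes the mean pairwise identity of equal-length repeats; all sequences are hypothetical toy examples rather than real TALE or TALE-like repeats, and the function names are chosen for illustration only.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (align them first)")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_pairwise_identity(repeats) -> float:
    """Average identity over every unordered pair of repeats in an array."""
    pairs = list(combinations(repeats, 2))
    if not pairs:
        raise ValueError("need at least two repeats")
    return sum(identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical array whose repeats differ only at positions 12 and 13.
TEMPLATE = "LTPEQVVAIAS{}GGKQALETVQRLLPVLCQAHG"
near_identical = [TEMPLATE.format(rvd) for rvd in ("HD", "NI", "NG")]

# Hypothetical toy array with differences scattered along the repeat.
diverged = ["ACDEFGHIKL", "ACDEYGHIKV", "TCDEFGWIKL"]

print(round(mean_pairwise_identity(near_identical), 3))
print(round(mean_pairwise_identity(diverged), 3))
```

A higher value for an array indicates more nearly identical repeats, the property thought to make recombination between repeats possible.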
Another unique feature of TALEs is a set of four repeat structures at the N-terminal flank of the core repeat array. These structures, termed non-canonical or degenerate repeats, have been shown to be vital for DNA binding, [10] though all but one make no contact with DNA bases and thus make no contribution to sequence preference. The one exception is repeat -1, which confers a fixed T-zero preference on all TALEs. This means that the target sequences of TALEs are always preceded by a thymine base. This is thought to be common to all TALEs, with the possible exception of TalC from Xanthomonas oryzae pv. oryzae strain AXO1947 ( G1FM79 ). [11]
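The effect of the T-zero preference on target prediction can be sketched in Python as follows. This is a simplified illustration, not a published prediction tool: the RVD table is an assumption (the four most commonly cited RVD preferences), only the forward strand is searched, and binding is treated as an all-or-nothing pattern match. The function names and the DNA string are hypothetical.

```python
import re

# Assumption: commonly cited RVD preferences, written as IUPAC codes (R = A or G).
RVD_TO_CODE = {"NI": "A", "HD": "C", "NG": "T", "NN": "R"}
IUPAC_TO_REGEX = {"A": "A", "C": "C", "G": "G", "T": "T",
                  "R": "[AG]", "N": "[ACGT]"}


def tale_target(rvds, zero_base="T"):
    """Predicted target: the position-zero base followed by one base per repeat."""
    return zero_base + "".join(RVD_TO_CODE.get(rvd, "N") for rvd in rvds)


def find_sites(target, dna):
    """Start indices (0-based) of every forward-strand match, overlaps included."""
    pattern = "".join(IUPAC_TO_REGEX[code] for code in target)
    # A lookahead lets the regex engine report overlapping matches.
    return [m.start() for m in re.finditer(f"(?={pattern})", dna.upper())]


target = tale_target(["HD", "NI", "NG", "NN"])
print(target)  # TCATR

# Hypothetical promoter fragment: "CATA" at index 8 follows an A, not a T,
# so it is not reported as a site.
print(find_sites(target, "GGTCATGACATATCATA"))  # [2, 12]
```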
TAL effector protein Brg11 |
---|---
Identifiers |
Organism |
Symbol | brg11
UniProt | Q8XYE3
It was noted in the 2002 publication of the genome of reference strain Ralstonia solanacearum GMI1000 that its genome encodes a protein similar to Xanthomonas TALEs. [12] Based on similar domain structure and repeat sequences it was presumed that this gene and homologs in other Ralstonia strains would encode proteins with the same molecular properties as TALEs, including sequence-specific DNA binding. In 2013 this was confirmed by two studies. [13] [14] These genes and the proteins they encode are referred to as RipTALs (Ralstonia injected protein TALE-like) in line with the standard nomenclature of Ralstonia effectors. [15] Whilst the DNA binding code of the core repeats is conserved with TALEs, RipTALs do not share the T-zero preference; instead they have a strict G-zero requirement. [13] In addition, repeats within a single RipTAL repeat array have multiple sequence differences beyond the RVD positions, unlike the near-identical repeats of TALEs.
RipTALs have been found in all four phylotypes of R. solanacearum, suggesting that they are an ancestral feature of this clade. Despite differences in the flanking domains, the sequences their RVDs target are highly similar. [16]
Several lines of evidence support the idea that RipTALs function as effector proteins, promoting bacterial growth or disease by manipulating the expression of plant genes. They are secreted into plant cells by the Type III secretion system, which is the main delivery system for effector proteins. [17] They localize to the cell nucleus and are able to function as sequence-specific transcription factors in plant cells. [13] In addition, a strain lacking its RipTAL was shown to grow more slowly inside eggplant leaf tissue than the wild type. [18] Furthermore, a study based on DNA polymorphisms in ripTAL repeat domain sequences and host plants found a statistically significant connection between host plant and repeat domain variants. [19] This is expected if the RipTALs of different strains are adapted to target genes in specific host plants. Despite this, no target genes had been identified for any RipTAL as of June 2019.
Burkholderia TALE-like protein 1 |
---|---
Identifiers |
Organism |
Symbol | bat1
UniProt | E5AV36
The publication of the genome of bacterial strain Paraburkholderia rhizoxinica HKI 454 in 2011 [20] led to the discovery of a set of TALE-like genes that differed considerably in nature from the TALEs and RipTALs. The proteins encoded by these genes were studied for their DNA binding properties by two groups independently and named the Bats (Burkholderia TALE-likes; E5AV36 ) or BurrH. [21] [22] This research showed that the repeat units of the Burkholderia TALE-likes bind DNA with the same code as TALEs, governed by position 13 of each repeat. There are, however, a number of differences.
Burkholderia TALE-likes are composed almost entirely of repeats, lacking the large non-repetitive domains found flanking the repeats in TALEs and RipTALs. Those domains are key to the functions of TALEs and RipTALs, allowing them to infiltrate the plant nucleus and turn on gene expression. It is therefore currently unclear what the biological roles of Burkholderia TALE-likes are. What is clear is that they are not effector proteins secreted into plant cells to act as transcription factors, the biological role of TALEs and RipTALs. It is not unexpected that they may differ in biological roles from TALEs and RipTALs, since the lifestyle of the bacterium they derive from is very unlike that of TALE- and RipTAL-bearing bacteria. B. rhizoxinica is an endosymbiont, living inside a fungus, Rhizopus microsporus, a plant pathogen. The same fungus is also an opportunistic human pathogen in immunocompromised patients, but whereas B. rhizoxinica is necessary for pathogenicity on plant hosts it is irrelevant to human infection. [23] It is unclear whether the Burkholderia TALE-likes are ever secreted into the fungus, let alone into host plants.
As noted in the publications on Burkholderia TALE-likes, there may be some advantages, compared to TALEs, to using these proteins as a scaffold for programmable DNA-binding proteins that function as transcription factors or designer nucleases. [21] [22] A Burkholderia TALE-like has been fused with a FokI nuclease, analogous to a TALEN. [3] Advantages include a shorter repeat size, a more compact domain structure (no large non-repeat domains), and greater repeat sequence diversity, which enables the use of PCR on the genes encoding them and makes them less vulnerable to recombinatorial repeat loss. In addition, Burkholderia TALE-likes have no T-zero requirement, relaxing the constraints on DNA target selection. However, few uses of Burkholderia TALE-likes as programmable DNA binding proteins have been published outside of the original characterization publications.
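How a position-zero requirement constrains target selection can be illustrated by counting where a target could begin in a given DNA sequence. The Python sketch below encodes the position-zero bases described in this article (T for TALEs, G for RipTALs) and, as a simplifying assumption, treats Burkholderia TALE-likes as having no position-zero constraint at all. The function name and the DNA string are hypothetical, and only the forward strand is considered.

```python
# Position-zero base by family, as described in the text. Treating Bats as
# entirely unconstrained (None) is a simplifying assumption for illustration.
ZERO_BASE = {"TALE": "T", "RipTAL": "G", "Bat": None}


def candidate_starts(dna, n_repeats, family):
    """Indices where the first base bound by the core repeats could lie.

    A start index i is allowed if n_repeats bases fit from i onwards and,
    where the family has a position-zero requirement, the base at i - 1
    matches it.
    """
    zero = ZERO_BASE[family]
    dna = dna.upper()
    starts = []
    for i in range(len(dna) - n_repeats + 1):
        if zero is None or (i > 0 and dna[i - 1] == zero):
            starts.append(i)
    return starts


dna = "ATGCTTGA"  # hypothetical sequence
for family in ZERO_BASE:
    print(family, candidate_starts(dna, 3, family))
```

In this toy sequence the unconstrained case allows every window, whereas the T-zero and G-zero cases allow only windows preceded by the required base.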
In 2007 the results of a metagenomic sweep of the world's oceans by the J. Craig Venter Institute were made publicly available. [24] The 2014 paper on Burkholderia TALE-likes [22] was also the first to report that two entries from that database resembled TALE-likes, based on sequence similarity. These were further characterized and assessed for their DNA-binding potential in 2015. [25] The repeat units encoded by these sequences were found to mediate DNA binding with base preference matching the TALE code, and were judged likely to form structures nearly identical to Bat1 repeats based on molecular dynamics simulations. The proteins encoded by these DNA sequences were therefore designated Marine Organism TALE-likes (MOrTLs) 1 and 2 (GenBank: ECG96325 , EBN91409 ). [25] Similar sequences found in metagenomes include EBN19408 and ECR81667 . [26]
Whilst repeats of MOrTL1 and 2 both conform structurally and functionally to the TALE-like norm, they differ considerably at the sequence level both from all other TALE-likes and from one another. It is not known whether they are truly homologous to the other TALE-likes, and thus constitute together with the TALEs, RipTALs and Bats a true protein-family. Alternatively, they may have evolved independently. It is particularly difficult to judge the relationship to the other TALE-likes because almost nothing is known of the organisms that MOrTL1 and MOrTL2 come from. It is known only that they were found in two separate sea-water samples from the Gulf of Mexico and are likely to be bacteria based on size-exclusion before DNA sequencing. [25]
A patent for BATs and marine TALE-likes in protein engineering was filed in July 2012. As of May 2019, it was pending in all jurisdictions. [27]