Artificial transcription factor

Figure 1. Example of a natural transcription factor up-regulating gene expression. 1. The transcription factors (labeled activator proteins) bind to their specific DNA sequence (labeled enhancers). 2. The transcription factors recruit other proteins and transcription factors to form a protein complex which binds to the gene promoter. 3. After the activating protein complex binds to the promoter, RNA polymerase easily binds and starts transcribing the target gene. 4. and 5. are additional scenarios where in 4. an insulator/inhibitor can bind to the DNA preventing activation for transcription and in 5. methylation can prevent the insulator from binding.

Artificial transcription factors (ATFs) are engineered transcription factors, consisting of a single molecule or of several molecules, that either activate or repress gene transcription. [1]

ATFs often contain two main components linked together: a DNA-binding domain and a regulatory domain, also known as an effector domain or modulatory domain. [1] The DNA-binding domain targets a specific DNA sequence with high affinity, and the regulatory domain is responsible for activating or repressing the bound gene. [1] An ATF can regulate gene expression directly, recruit proteins and other transcription factors to initiate transcription, or recruit proteins and other transcription factors that compact the DNA, which inhibits RNA polymerase from binding and transcribing it. An example of transcription factors up-regulating gene expression is shown in Figure 1. [1] [2] Because the DNA-binding domain and the regulatory domain are separable, they are interchangeable, which permits the design of new ATFs from existing natural transcription factors. [1]

Applications of ATFs include reprogramming cell state, cancer treatment, and a potential treatment for Angelman syndrome. [2] [3] [4]

ATF Design

DNA-Binding Domain

The DNA-binding domain directs the ATF to a specific DNA sequence. Natural DNA-binding proteins are commonly used because of their high affinity for their target sequence. However, no algorithm currently exists that matches a protein's amino acid sequence to the DNA sequence it binds, which limits the rational design of new DNA-binding proteins. [1] Non-peptide, oligonucleotide, and polyamide DNA-binding domains, which permit rational design, have also been explored. [1] The type of DNA-binding domain chosen depends on the desired application of the ATF; common DNA-binding domains are presented in the Types of ATF DNA-Binding Domains section below. [1] [2]

Regulatory Domain

The regulatory domain is responsible for activating or repressing the bound gene. It does so either by regulating gene expression directly or by recruiting other proteins and transcription factors that change transcription levels. [1] [2] One route to upregulating a gene is for the ATF to recruit proteins that loosen the wrapping of DNA around histones, allowing RNA polymerase to bind and transcribe the gene; conversely, compacting the DNA downregulates gene expression by inhibiting RNA polymerase from binding. [1] Regulatory domains that promote gene transcription are usually acidic activators, composed of acidic and hydrophobic amino acids, whereas regulatory domains that repress gene transcription usually contain more basic amino acids. [1] Factors that influence the effect of an ATF on transcription include the distance between the regulatory domain and the transcription site, the cell type, and the number of activating or repressing sequences present in the regulatory domain. [1] Activating domains, the regulatory domains that promote gene transcription, are often capable of upregulating transcription by 5- to 40-fold, and RNA regulatory domains have been shown to produce a 100-fold increase in transcription levels. [1] An alternative strategy for repressing genes is for the ATF to out-compete natural transcription factors and physically block transcription by RNA polymerase; however, creating ATFs with a higher affinity for the DNA sequence than the natural transcription factors remains a challenge. [1]

Linkers

Linkers join the DNA-binding domain and the regulatory domain, either covalently or non-covalently. [1] Peptide linkers are used most frequently, but polyethylene glycol and small-molecule linkers also exist. [1] Linkers make the DNA-binding domains and regulatory domains interchangeable, allowing new ATFs to be designed from natural transcription factor components. [1] Although linkers are less studied than the two domains, linker length is important because it alters how strongly the regulatory domain affects gene expression. [1]
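The modular layout described in this section (DNA-binding domain, linker, regulatory domain) can be sketched as a small data structure. This is only an illustration of the interchangeability idea: the class name `ArtificialTF`, its fields, and the example component labels are hypothetical and do not correspond to any published tool or to specific ATFs from the cited sources.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ArtificialTF:
    """Hypothetical record of the three modular parts of an ATF."""
    dna_binding_domain: str   # e.g. a zinc finger array, a TALE, or an inactivated Cas enzyme with an sgRNA
    linker: str               # e.g. a peptide, polyethylene glycol, or small-molecule linker
    regulatory_domain: str    # e.g. an acidic activator or a repressor domain

    def describe(self) -> str:
        return f"{self.dna_binding_domain} -[{self.linker}]- {self.regulatory_domain}"


# Illustrative labels only; not specific constructs from the literature.
activator = ArtificialTF("zinc finger array", "peptide linker", "acidic activator")

# Swapping the regulatory domain keeps the same targeting but changes the outcome.
repressor = replace(activator, regulatory_domain="KRAB repressor")

print(activator.describe())
print(repressor.describe())
```

Because the record is immutable, each domain swap yields a new ATF description while the original stays unchanged, mirroring the way new ATFs are assembled from existing components.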

History

Most ATFs have been constructed by exchanging existing DNA-binding domains and regulatory domains to generate ATFs with new target sites and new effects on transcription. [1] Designed DNA-binding domains with new targeting capabilities, such as CRISPR-Cas, are being explored to achieve higher specificity and to control potential side effects. [2] ATFs that respond to physiological cues, change transcription levels only in a specific cell type, and can be delivered easily without electroporation are of great interest for future work. [1]

Types of ATF DNA-Binding Domains

CRISPR-Cas

The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system has been extensively studied as a way to target a specific DNA sequence using a single guide RNA (sgRNA). [5] For ATF applications, the Cas enzyme's natural DNA-cutting function is inactivated and a regulatory domain is linked to the Cas enzyme. [2] The CRISPR-Cas system benefits from the high specificity between the sgRNA and the target DNA sequence and from the simplicity of designing new sgRNAs; however, it requires a protospacer adjacent motif (PAM) immediately adjacent to the target DNA site, and the large size of the Cas protein hinders delivery into the cell. [2]
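The PAM constraint can be illustrated with a short scan for candidate target sites. The sketch below assumes the widely used Streptococcus pyogenes Cas9 convention, which is not specified in the text above: a 20-nucleotide target followed immediately on its 3' side by an NGG PAM. The function name `find_target_sites` is hypothetical, only the given strand is scanned, and real sgRNA design tools apply many further criteria such as off-target scoring.

```python
import re


def find_target_sites(dna: str, target_len: int = 20):
    """Return (start, target, pam) for every NGG PAM on the given strand.

    Assumes the S. pyogenes Cas9 convention: the PAM sits immediately
    3' of the target sequence. Other Cas enzymes use different PAMs.
    """
    dna = dna.upper()
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{target_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Toy sequence for illustration only; not a real gene.
example = "A" * 20 + "TGG" + "CCCC"
for start, target, pam in find_target_sites(example):
    print(start, target, pam)
```

A sequence with no NGG motif at the required offset yields an empty list, which is the situation the text describes as a limitation of the system.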

TALEs

Transcription activator-like effectors (TALEs) are peptides composed of repeating segments, each 34 amino acids long, with a total length ranging from 340 to 510 amino acids. [2] Each repeating segment folds into two alpha helices, and the amino acids at positions 12 and 13 of the segment determine the DNA sequence it binds. [2] TALE peptides bind their target DNA with high specificity, preventing side effects, but this specificity also prevents a single ATF from binding to multiple sites, so a different ATF is required for each desired effect. [2]
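The arithmetic behind these figures, and the role of positions 12 and 13, can be sketched as follows. The repeat length and total lengths come from the text above; the mapping from residue pairs to bases (NI to A, HD to C, NG to T, NN to G) is the commonly cited simplified code and is an assumption here, since the text does not list it. All function names are hypothetical, and the example repeat is a placeholder string rather than a real TALE sequence.

```python
REPEAT_LEN = 34  # amino acids per repeating segment (from the text)

# Commonly cited simplified code for residues 12-13; assumed, not from the text.
RESIDUE_PAIR_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def repeat_count(total_length_aa: int) -> int:
    """Number of whole 34-residue repeats in a peptide of the given length."""
    return total_length_aa // REPEAT_LEN


def variable_residues(repeat: str) -> str:
    """Residues at positions 12 and 13 (1-based) of a single repeat."""
    if len(repeat) != REPEAT_LEN:
        raise ValueError(f"expected {REPEAT_LEN} residues, got {len(repeat)}")
    return repeat[11:13]


def target_sequence(residue_pairs) -> str:
    """Translate one residue pair per repeat into the predicted DNA target."""
    return "".join(RESIDUE_PAIR_TO_BASE[pair] for pair in residue_pairs)


print(repeat_count(340), repeat_count(510))

placeholder_repeat = "X" * 11 + "HD" + "X" * 21  # not a real TALE repeat
print(variable_residues(placeholder_repeat))
print(target_sequence(["NI", "HD", "NG", "NN"]))
```

Under these assumptions, the stated length range corresponds to 10 to 15 repeats, with each repeat contributing one base to the recognized sequence.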

Zinc Fingers

Zinc fingers are naturally abundant, are involved in multiple regulatory processes, and are common eukaryotic transcription factors. [6] Cys2/His2 zinc fingers have been extensively studied. Each is composed of 30 amino acids, can bind to non-palindromic sequences, and contains 3 to 4 critical amino acids, at positions 1, 3, and 6 on the alpha helix, which determine the complementary binding sequence. [4] [7] [8] Because zinc fingers are only 30 amino acids long, they are easier to deliver, and multiple zinc fingers can be linked together to target longer DNA sequences with one ATF; however, connecting more than three zinc fingers reduces each zinc finger's specificity and increases off-target binding. [2]

ATF Applications

Reprogramming Cell State

Directing cell differentiation and reprogramming cell fate have traditionally been achieved with a mixture of transcription factors. [9] The field attracted significant interest once four transcription factors (Oct4, Sox2, c-Myc, and Klf4) were found to reprogram cells from a differentiated state into an induced pluripotent stem cell state similar to that of embryonic stem cells. [10] Multiple ATFs, each composed of three linked zinc finger proteins, can activate genes that eventually lead to the production of the Oct4 transcription factor in the cell, causing the cell to reprogram to an induced pluripotent state without the addition of external Oct4. [2] This change in cell state demonstrates that ATFs can replace traditional transcription factors in cell reprogramming. [2]

Angelman Syndrome

Angelman syndrome is a neurodevelopmental disorder caused by the deactivation of the maternal UBE3A gene. [3] Two potential treatment strategies using ATFs are to upregulate the expression of the maternal UBE3A gene or to downregulate the expression of the UBE3A-AS gene, which causes repression of the paternal UBE3A gene. [3] The zinc finger ATF TAT-S1 acts as a strong repressor of the UBE3A-AS gene and, when administered to mice, resulted in increased Ube3a in the brain. [3]

Cancer

Abnormal gene expression is regularly associated with cancer and uncontrolled tumor growth, making ATFs a promising therapeutic for cancer treatment. [4] Linking six zinc fingers together produces an ATF that binds only to an 18-base-pair sequence made up of the shorter subsequences complementary to each zinc finger, so the ATF is more specific than a single zinc finger, which targets only a 3- to 4-base-pair sequence. [4] ATFs linked to the KRAB repressor regulatory domain decrease cancer cells' resistance to chemotherapy drugs, and ATFs linked to activator domains can upregulate Bax gene expression, causing apoptosis; however, these treatments remain in the early stages because of inadequate delivery methods. [4]
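The specificity gain from linking zinc fingers can be made concrete with a back-of-the-envelope calculation. The sketch below assumes 3 base pairs per finger (consistent with six fingers covering 18 base pairs), a genome of roughly 3.1 billion base pairs, and a uniformly random sequence scanned on one strand; these are simplifying assumptions for illustration, not figures from the cited sources, and the function names are hypothetical.

```python
BASES = 4                       # A, C, G, T
BP_PER_FINGER = 3               # assumed; six fingers then cover 18 bp
GENOME_SIZE_BP = 3_100_000_000  # rough human genome size, an assumption


def site_length(n_fingers: int) -> int:
    """Length in base pairs of the site recognized by n linked fingers."""
    return n_fingers * BP_PER_FINGER


def expected_random_matches(n_fingers: int, genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Expected number of chance matches in a random genome of the given size."""
    return genome_size_bp / BASES ** site_length(n_fingers)


for n in (1, 3, 6):
    print(n, site_length(n), f"{expected_random_matches(n):.3g}")
```

Under these assumptions, a single finger's 3-base-pair site is expected to occur tens of millions of times by chance, whereas an 18-base-pair site is expected to occur less than once, which is the sense in which the six-finger ATF is more specific.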

References

  1. Ansari, Aseem Z; Mapp, Anna K (2002-12-01). "Modular design of artificial transcription factors". Current Opinion in Chemical Biology. 6 (6): 765–772. doi:10.1016/S1367-5931(02)00377-0. ISSN 1367-5931. PMID 12470729.
  2. Heiderscheit, Evan A.; Eguchi, Asuka; Spurgat, Mackenzie C.; Ansari, Aseem Z. (2018). "Reprogramming cell fate with artificial transcription factors". FEBS Letters. 592 (6): 888–900. doi:10.1002/1873-3468.12993. ISSN 1873-3468. PMC 5869137. PMID 29389011.
  3. Tan, Wen-Hann; Bird, Lynne M. (December 2016). "Angelman syndrome: Current and emerging therapies in 2016". American Journal of Medical Genetics. Part C, Seminars in Medical Genetics. 172 (4): 384–401. doi:10.1002/ajmg.c.31536. ISSN 1552-4876. PMID 27860204. S2CID 4377191.
  4. Yan, Chunhong; Higgins, Paul J. (2013-01-01). "Drugging the undruggable: Transcription therapy for cancer". Biochimica et Biophysica Acta (BBA) - Reviews on Cancer. 1835 (1): 76–85. doi:10.1016/j.bbcan.2012.11.002. ISSN 0304-419X. PMC 3529832. PMID 23147197.
  5. Nidhi, Sweta; Anand, Uttpal; Oleksak, Patrik; Tripathi, Pooja; Lal, Jonathan A.; Thomas, George; Kuca, Kamil; Tripathi, Vijay (2021-03-24). "Novel CRISPR–Cas Systems: An Updated Review of the Current Achievements, Applications, and Future Research Perspectives". International Journal of Molecular Sciences. 22 (7): 3327. doi:10.3390/ijms22073327. ISSN 1422-0067. PMC 8036902. PMID 33805113.
  6. Cassandri, Matteo; Smirnov, Artem; Novelli, Flavia; Pitolli, Consuelo; Agostini, Massimiliano; Malewicz, Michal; Melino, Gerry; Raschellà, Giuseppe (2017-11-13). "Zinc-finger proteins in health and disease". Cell Death Discovery. 3 (1): 17071. doi:10.1038/cddiscovery.2017.71. ISSN 2058-7716. PMC 5683310. PMID 29152378.
  7. Gommans, Willemijn M.; Haisma, Hidde J.; Rots, Marianne G. (2005-12-02). "Engineering Zinc Finger Protein Transcription Factors: The Therapeutic Relevance of Switching Endogenous Gene Expression On or Off at Command". Journal of Molecular Biology. 354 (3): 507–519. doi:10.1016/j.jmb.2005.06.082. ISSN 0022-2836. PMID 16253273.
  8. Urnov, Fyodor D; Rebar, Edward J (2002-09-01). "Designed transcription factors as tools for therapeutics and functional genomics". Biochemical Pharmacology. Cell Signaling, Transcription and Translation as Therapeutic Targets. 64 (5): 919–923. doi:10.1016/S0006-2952(02)01150-4. ISSN 0006-2952. PMID 12213587.
  9. Takahashi, Kazutoshi; Yamanaka, Shinya (March 2016). "A decade of transcription factor-mediated reprogramming to pluripotency". Nature Reviews Molecular Cell Biology. 17 (3): 183–193. doi:10.1038/nrm.2016.8. ISSN 1471-0080. PMID 26883003. S2CID 7593915.
  10. Qi, Huayu; Pei, Duanqing (July 2007). "The magic of four: induction of pluripotent stem cells from somatic cells by Oct4, Sox2, Myc and Klf4". Cell Research. 17 (7): 578–580. doi:10.1038/cr.2007.59. ISSN 1748-7838. PMID 17632550. S2CID 9643825.