The bacterial one-hybrid (B1H) system is a method for identifying the sequence-specific target site of a DNA-binding domain. In this system, a given transcription factor (TF) is expressed as a fusion to a subunit of RNA polymerase. In parallel, a library of randomized oligonucleotides representing potential TF target sequences is cloned into a separate vector containing the selectable genes HIS3 and URA3. If the DNA-binding domain (bait) binds a potential DNA target site (prey) in vivo, it will recruit RNA polymerase to the promoter and activate transcription of the reporter genes in that clone. The two reporter genes, HIS3 and URA3, allow for positive and negative selection, respectively. At the end of the process, positive clones are sequenced and examined with motif-finding tools in order to resolve the favoured DNA target sequence. [1]
Across all living organisms, gene expression is regulated by interactions between DNA-binding regulatory proteins (transcription factors) and cis-regulatory elements, DNA sequences in or around genes that act as target sites for DNA-binding proteins. By binding to cis-regulatory sequences and to each other, transcription factors fine-tune transcriptional levels by stabilizing or destabilizing the binding of RNA polymerase to a gene's promoter. Despite their importance and ubiquity, little is known about where exactly each of these regulatory proteins binds. Literature suggests that nearly 8% of human genes encode transcription factors, and the functions and specificities of their interactions remain largely unexplored. [2] A convergence of high-throughput technologies and genomic theory is allowing researchers to start mapping these interactions on a genome-wide scale. Only recently has a complete survey of DNA-binding specificities been attempted for a large family of DNA-binding domains. B1H is just one of many emerging techniques that are useful for studying protein–DNA interactions. [3]
The method requires transforming a bacterial host with two different plasmids. One is designed to express a DNA-binding protein of interest as a fusion construct with a subunit of RNA polymerase (bait). The other plasmid contains a region of randomized sequence representing potential binding sites (prey) which, if bound by the chimeric fusion product, drives expression of downstream reporter genes. This reporter region facilitates both positive and negative selection by HIS3 and URA3, respectively, which together allow for isolation of the prey containing the true DNA target sequence. HIS3 and URA3 encode proteins required for biosynthesis of histidine and uracil, respectively.
Using a negative selectable marker is crucial for reducing the incidence of false positives. Self-activating prey, in which the randomized region facilitates reporter expression in the absence of TF binding, are removed by transforming the reporter vector library into bacteria in the absence of bait and assaying for growth on plates containing 5-fluoro-orotic acid (5-FOA). The protein product of URA3 converts 5-FOA into a toxic compound, so only those colonies survive that contain reporter vectors which are not self-activating. Negative selection normally precedes positive selection so that a smaller, purified prey library can be subjected to the more rigorous positive selection process. Upon transformation of the purified prey library with the bait plasmid, positive selection is achieved by growing the host E. coli on minimal medium lacking histidine (NM selective medium) that is usually supplemented with varying concentrations of 3-amino-triazole (3-AT), a competitive inhibitor of the HIS3 gene product. HIS3 encodes a protein required for histidine biosynthesis, and thus only those cells containing bait-prey combinations that activate the reporter genes will be able to grow. Manipulating 3-AT concentrations allows for the characterization of binding stringencies. In this way, researchers can gauge how strongly a bait binds its prey (which correlates with the level of HIS3 expression) and thus determine which positions in the binding site have strong or weak preferences for a given base. In other words, if cells can grow despite a high concentration of 3-AT, bait-prey binding must be strong enough to drive reporter gene expression (HIS3) at a level sufficient to overcome the resulting competitive inhibition. Finally, positive clones are sequenced and examined with preexisting motif-finding tools (e.g., MEME, BioProspector). [3]
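The final analysis step, summarizing sequenced positive clones as a binding preference, can be illustrated with a minimal Python sketch. It is a simplified stand-in, not the algorithm used by MEME or BioProspector: it assumes the selected sites are already aligned and of equal length, whereas real motif finders must discover the motif in unaligned sequences. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter

BASES = "ACGT"


def position_frequency_matrix(sites):
    """Return per-position base frequencies for pre-aligned, equal-length sites."""
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for i in range(length):
        # Count how often each base occurs at position i across all sites.
        counts = Counter(site[i] for site in sites)
        matrix.append({base: counts.get(base, 0) / len(sites) for base in BASES})
    return matrix


def consensus(matrix):
    """Pick the most frequent base at each position (ties resolve in ACGT order)."""
    return "".join(max(BASES, key=lambda base: column[base]) for column in matrix)


# Toy, made-up sequences standing in for sequenced positive clones.
toy_sites = ["TAATTA", "TAATGA", "TAATTG", "CAATTA"]
pfm = position_frequency_matrix(toy_sites)
print(consensus(pfm))
for position, column in enumerate(pfm, start=1):
    print(position, column)
```

Positions where one base dominates the column correspond to strong base preferences, while columns with a flatter distribution indicate positions the factor tolerates more freely. Comparing matrices built from clones recovered at different 3-AT concentrations would be one way to relate these preferences to selection stringency.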
The bacterial one-hybrid system has undergone numerous modifications since its inception in 2005. [3] It ultimately arose as a variation of the bacterial two-hybrid system, conceived in 2000, which itself was inspired by the yeast one- and two-hybrid systems. [4] Whereas the two-hybrid versions can assess both protein–protein and protein–DNA interactions, the one-hybrid system specializes in the latter. Meng et al.’s B1H system differs from the two-hybrid version in two key respects. It uses a randomized prey library consisting of many (<2 × 10^8) unique potential target sequences, and it adds a negative selection step in order to purge this library of self-activating clones. [1] [3] Although these ideas were borrowed from the original yeast one-hybrid system, [5] they had not been applied to a bacterial host before 2005. As the technique grew in popularity, researchers amended their protocols to improve the B1H system. Fusing the bait to the omega, rather than the alpha, subunit of RNA polymerase has recently been favoured in order to improve the chimera’s stereochemistry and dynamic range. [6] A zinc-finger domain on the fusion construct, together with its corresponding DNA target site adjacent to the randomized prey sequence, has also been added to increase the affinity and specificity of protein–DNA interactions. This increased overall binding affinity allows for the characterization of even those DNA-binding domains which interact weakly with a target sequence. [7]
The B1H system has significant advantages over other methods that investigate protein–DNA interactions. Microarray-based readout of chromatin immunoprecipitation (ChIP-chip) for high-throughput binding-site determination relies on specific antibodies, which may not always be available. Methods that rely on protein-binding microarrays also require additional protein purification steps that are not required in the B1H system. Furthermore, these microarray-based techniques are often prohibitive because they require special facilities and expertise to analyze the resulting data. SELEX, another system commonly used to identify the target nucleic acids for DNA-binding proteins, requires multiple rounds of selection. In contrast, the bacterial one-hybrid system requires just one round of in vivo selection and also offers a low-tech alternative to microarray-based technologies. Antibodies are not required for studying the interactions of DNA-binding proteins in the B1H system. A further advantage is that the B1H system works not only for monomeric proteins but also for proteins that bind DNA as complexes. The B1H system should be considered a specialized technique for studying protein–DNA interactions, whereas the two-hybrid variations (B2H and Y2H) can assess both protein–protein and protein–DNA interactions. These two-hybrid systems are multi-purpose but are limited to assaying only a single “prey” library. An advantage of the bacterial one-hybrid system over the yeast one-hybrid system (Y1H) lies in the higher transformation efficiency of plasmids into bacteria, which allows more complex “prey” libraries to be examined. [1] [3]
Despite its aforementioned advantages as a specialized tool, the B1H system does have some drawbacks. First, the B1H selection system is limited in its capacity to determine the binding specificities of transcription factors with lengthy binding sites. This arises from the fact that the number of randomized “prey” clones required to represent all possible target sequences increases exponentially with the number of nucleotides in that target sequence. Second, some eukaryotic factors may not express or fold efficiently in the bacterial system, attributed to differing regulatory networks and transcriptional machinery. Hence when working with DNA-binding proteins of eukaryotic origin, a yeast-based hybrid system may be beneficial. Third, the B1H system may not be ideally suited for transcription factors that recognize binding sites with low affinity. The logic here is that competition created by binding sites elsewhere in the bacterial genome may limit the signal that can be realized from a single binding site that is present upstream of the reporter. [1] [3]
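The exponential growth behind the first limitation can be made concrete with a short Python sketch. With four possible bases per position, a fully randomized site of length n has 4^n possible sequences. The capacity of 2 × 10^8 used below is the upper bound on library size quoted earlier in this article; the function names are hypothetical, and the calculation is an idealization that assumes every clone carries a distinct sequence and each possible sequence needs to be present only once.

```python
def library_size(site_length: int) -> int:
    """Number of distinct DNA sequences of the given length (4 bases per position)."""
    return 4 ** site_length


def max_fully_covered_length(library_capacity: int) -> int:
    """Longest randomized site whose every variant fits within the library capacity."""
    length = 0
    while library_size(length + 1) <= library_capacity:
        length += 1
    return length


# Assumption: capacity taken from the "<2 x 10^8 unique sequences" figure in the text.
CAPACITY = 2 * 10**8

for n in (6, 10, 13, 14, 18):
    print(f"{n:2d} bp -> {library_size(n):,} possible sequences")

print("longest fully covered site:", max_fully_covered_length(CAPACITY), "bp")
```

Under these idealized assumptions, a library of that size can represent every variant of a site up to 13 base pairs long, since 4^13 is about 6.7 × 10^7 while 4^14 already exceeds 2.6 × 10^8. In practice a library must be oversampled to be confident that each sequence is present, so the usable length is likely shorter.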
The B1H system provides a tool for identifying the DNA-binding specificities of transcription factors and thus predicting their target genes and genomic DNA regulatory elements. It also allows for examination of the effects of protein–protein interactions on DNA binding, which may further guide the prediction of cis-regulatory modules based on binding-site clustering. Moreover, the B1H selection system has implications for predicting the regulatory roles of previously uncharacterized transcription factors. [1]
Using the bacterial one-hybrid system, one study has characterized 35 members of the Drosophila melanogaster segmentation network which includes representative members of all the major classes of DNA-binding domain proteins. [8] Implications for medical research are evident from another study that used the B1H system to identify the DNA-binding specificity of a transcriptional regulator for a gene in Mycobacterium tuberculosis. [9] The B1H system has also been used to identify an important turnover element in Escherichia coli. [10]
In genetics, a promoter is a sequence of DNA to which proteins bind that initiate transcription of a single RNA from the DNA downstream of it. This RNA may encode a protein, or can have a function in and of itself, such as tRNA or rRNA. Promoters are located near the transcription start sites of genes, upstream on the DNA. Promoters can be about 100–1000 base pairs long, and their sequence is highly dependent on the gene and product of transcription, the type or class of RNA polymerase recruited to the site, and the species of organism.
In molecular biology, a transcription factor (TF) is a protein that controls the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence. The function of TFs is to regulate—turn on and off—genes in order to make sure that they are expressed in the right cell at the right time and in the right amount throughout the life of the cell and the organism. Groups of TFs function in a coordinated fashion to direct cell division, cell growth, and cell death throughout life; cell migration and organization during embryonic development; and intermittently in response to signals from outside the cell, such as a hormone. There are up to 1600 TFs in the human genome. Transcription factors are members of the proteome as well as regulome.
Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product, either a protein or a non-coding RNA, which ultimately affects a phenotype. These products are often proteins, but in non-protein-coding genes such as transfer RNA (tRNA) and small nuclear RNA (snRNA), the product is a functional non-coding RNA. Gene expression is summarized in the central dogma of molecular biology first formulated by Francis Crick in 1958, further developed in his 1970 article, and expanded by the subsequent discoveries of reverse transcription and RNA replication.
Transcription is the process of copying a segment of DNA into RNA. The segments of DNA transcribed into RNA molecules that can encode proteins are said to produce messenger RNA (mRNA). Other segments of DNA are copied into RNA molecules called non-coding RNAs (ncRNAs). Averaged over multiple cell types in a given tissue, the quantity of mRNA is more than 10 times the quantity of ncRNA. The general preponderance of mRNA in cells is valid even though less than 2% of the human genome can be transcribed into mRNA, while at least 80% of mammalian genomic DNA can be actively transcribed, with the majority of this 80% considered to be ncRNA.
A zinc finger is a small protein structural motif that is characterized by the coordination of one or more zinc ions (Zn2+) in order to stabilize the fold. The term was originally coined to describe the finger-like appearance of a hypothesized structure from the African clawed frog (Xenopus laevis) transcription factor IIIA. However, it has been found to encompass a wide variety of differing protein structures in eukaryotic cells. Xenopus laevis TFIIIA was demonstrated in 1983 to contain zinc and to require the metal for function, the first such reported zinc requirement for a gene regulatory protein, followed soon thereafter by the Krüppel factor in Drosophila. The zinc finger often appears as a metal-binding domain in multi-domain proteins.
A regulatory sequence is a segment of a nucleic acid molecule which is capable of increasing or decreasing the expression of specific genes within an organism. Regulation of gene expression is an essential feature of all living organisms and viruses.
In molecular biology and genetics, transcriptional regulation is the means by which a cell regulates the conversion of DNA to RNA (transcription), thereby orchestrating gene activity. A single gene can be regulated in a range of ways, from altering the number of copies of RNA that are transcribed, to the temporal control of when the gene is transcribed. This control allows the cell or organism to respond to a variety of intra- and extracellular signals and thus mount a response. Some examples of this include producing the mRNA that encode enzymes to adapt to a change in a food source, producing the gene products involved in cell cycle specific activities, and producing the gene products responsible for cellular differentiation in multicellular eukaryotes, as studied in evolutionary developmental biology.
In molecular biology, the TATA box is a sequence of DNA found in the core promoter region of genes in archaea and eukaryotes. The bacterial homolog of the TATA box is called the Pribnow box which has a shorter consensus sequence.
Regulation of gene expression, or gene regulation, includes a wide range of mechanisms that are used by cells to increase or decrease the production of specific gene products. Sophisticated programs of gene expression are widely observed in biology, for example to trigger developmental pathways, respond to environmental stimuli, or adapt to new food sources. Virtually any step of gene expression can be modulated, from transcriptional initiation, to RNA processing, and to the post-translational modification of a protein. Often, one gene regulator controls another, and so on, in a gene regulatory network.
A transcriptional activator is a protein that increases transcription of a gene or set of genes. Activators are considered to have positive control over gene expression, as they function to promote gene transcription and, in some cases, are required for the transcription of genes to occur. Most activators are DNA-binding proteins that bind to enhancers or promoter-proximal elements. The DNA site bound by the activator is referred to as an "activator-binding site". The part of the activator that makes protein–protein interactions with the general transcription machinery is referred to as an "activating region" or "activation domain".
DNA-binding proteins are proteins that have DNA-binding domains and thus have a specific or general affinity for single- or double-stranded DNA. Sequence-specific DNA-binding proteins generally interact with the major groove of B-DNA, because it exposes more functional groups that identify a base pair. However, there are some known minor groove DNA-binding ligands such as netropsin, distamycin, Hoechst 33258, pentamidine, DAPI and others.
Functional genomics is a field of molecular biology that attempts to describe gene functions and interactions. Functional genomics make use of the vast data generated by genomic and transcriptomic projects. Functional genomics focuses on the dynamic aspects such as gene transcription, translation, regulation of gene expression and protein–protein interactions, as opposed to the static aspects of the genomic information such as DNA sequence or structures. A key characteristic of functional genomics studies is their genome-wide approach to these questions, generally involving high-throughput methods rather than a more traditional "gene-by-gene" approach.
Antisense RNA (asRNA), also referred to as antisense transcript, natural antisense transcript (NAT) or antisense oligonucleotide, is a single-stranded RNA that is complementary to a protein-coding messenger RNA (mRNA) with which it hybridizes, thereby blocking its translation into protein. asRNAs have been found in both prokaryotes and eukaryotes, and can be classified into short and long non-coding RNAs (ncRNAs). The primary function of asRNA is regulating gene expression. asRNAs may also be produced synthetically and have found widespread use as research tools for gene knockdown. They may also have therapeutic applications.
Two-hybrid screening is a molecular biology technique used to discover protein–protein interactions (PPIs) and protein–DNA interactions by testing for physical interactions between two proteins or a single protein and a DNA molecule, respectively.
A regulator gene, regulator, or regulatory gene is a gene involved in controlling the expression of one or more other genes. Regulatory sequences, which encode regulatory genes, are often at the five prime end (5') to the start site of transcription of the gene they regulate. In addition, these sequences can also be found at the three prime end (3') to the transcription start site. In both cases, whether the regulatory sequence occurs before (5') or after (3') the gene it regulates, the sequence is often many kilobases away from the transcription start site. A regulator gene may encode a protein, or it may work at the level of RNA, as in the case of genes encoding microRNAs. An example of a regulator gene is a gene that codes for a repressor protein that inhibits the activity of an operator.
Eukaryotic transcription is the elaborate process that eukaryotic cells use to copy genetic information stored in DNA into units of transportable complementary RNA replica. Gene transcription occurs in both eukaryotic and prokaryotic cells. Unlike prokaryotic RNA polymerase, which initiates the transcription of all different types of RNA, RNA polymerase in eukaryotes comes in three variations, each transcribing a different type of gene. A eukaryotic cell has a nucleus that separates the processes of transcription and translation. Eukaryotic transcription occurs within the nucleus, where DNA is packaged into nucleosomes and higher-order chromatin structures. The complexity of the eukaryotic genome necessitates a great variety and complexity of gene expression control.
Transcription factor II B (TFIIB) is a general transcription factor that is involved in the formation of the RNA polymerase II preinitiation complex (PIC) and aids in stimulating transcription initiation. TFIIB is localised to the nucleus and provides a platform for PIC formation by binding and stabilising the DNA-TBP complex and by recruiting RNA polymerase II and other transcription factors. It is encoded by the TFIIB gene, and is homologous to archaeal transcription factor B and analogous to bacterial sigma factors.
ChIP-sequencing, also known as ChIP-seq, is a method used to analyze protein interactions with DNA. ChIP-seq combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. It can be used to map global binding sites precisely for any protein of interest. Previously, ChIP-on-chip was the most common technique utilized to study these protein–DNA relations.
In molecular biology, bacterial DNA binding proteins are a family of small, usually basic proteins of about 90 residues that bind DNA and are known as histone-like proteins. Since bacterial binding proteins have a diversity of functions, it has been difficult to develop a common function for all of them. They are commonly referred to as histone-like and share many traits with the eukaryotic histone proteins. Eukaryotic histones package DNA to help it fit in the nucleus, and they are known to be the most conserved proteins in nature. Examples include the HU protein in Escherichia coli, a dimer of closely related alpha and beta chains, which in other bacteria can be a dimer of identical chains. HU-type proteins have been found in a variety of bacteria and archaea, and are also encoded in the chloroplast genome of some algae. Other examples are the integration host factor (IHF), a dimer of closely related chains found in Enterobacteria that is suggested to function in genetic recombination as well as in translational and transcriptional control, and viral proteins including the African swine fever virus protein A104R.
Selective microfluidics-based ligand enrichment followed by sequencing (SMiLE-seq) is a technique developed for the rapid identification of DNA binding specificities and affinities of full length monomeric and dimeric transcription factors in a fast and semi-high-throughput fashion.