In vitro compartmentalization (IVC) is an emulsion-based technology that generates cell-like compartments in vitro. These compartments are designed so that each contains no more than one gene. When the gene is transcribed and/or translated, its products (RNAs and/or proteins) become 'trapped' with the encoding gene inside the compartment. By coupling the genotype (DNA) to the phenotype (RNA, protein), compartmentalization allows phenotypes to be selected and evolved.
The in vitro compartmentalization method was first developed by Dan Tawfik and Andrew Griffiths. [1] Reasoning that Darwinian evolution relies on the linkage of genotype to phenotype, Tawfik and Griffiths designed the aqueous compartments of water-in-oil (w/o) emulsions to mimic cellular compartments that link genotype and phenotype. Emulsions of cell-like compartments were formed by adding an in vitro transcription/translation reaction mixture to stirred mineral oil containing surfactants. The mean droplet diameter was measured to be 2.6 μm by laser diffraction. As a proof of concept, Tawfik and Griffiths designed a selection experiment using a pool of DNA sequences in which the gene encoding HaeIII DNA methyltransferase (M.HaeIII) was present alongside a 10⁷-fold excess of folA genes, which encode a different enzyme. The 3′ end of each DNA sequence was designed to contain a HaeIII recognition site which, in the presence of expressed methyltransferase, would be methylated and thus become resistant to restriction enzyme digestion. By selecting for DNA sequences that survived endonuclease digestion, Tawfik and Griffiths found that the M.HaeIII genes were enriched at least 1000-fold over the folA genes in the first round of selection.
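The enrichment arithmetic in this proof of concept can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the assumption that every round gives the same fold enrichment is a simplification that the original experiment does not claim.

```python
from fractions import Fraction


def ratio_after_round(ratio: Fraction, enrichment: int) -> Fraction:
    """Target:competitor ratio after one selection round.

    `ratio` is the starting ratio (e.g. M.HaeIII genes per folA gene) and
    `enrichment` is the fold enrichment achieved in that round.
    """
    return ratio * enrichment


def rounds_to_parity(ratio: Fraction, enrichment: int) -> int:
    """Rounds needed until the target is at least as abundant as the competitor.

    Assumes (hypothetically) the same fold enrichment in every round.
    """
    if enrichment <= 1:
        raise ValueError("enrichment must be greater than 1")
    rounds = 0
    while ratio < 1:
        ratio = ratio_after_round(ratio, enrichment)
        rounds += 1
    return rounds


# One M.HaeIII gene per 10^7 folA genes, enriched 1000-fold in one round.
start = Fraction(1, 10**7)
print(ratio_after_round(start, 1000))  # 1/10000
print(rounds_to_parity(start, 1000))   # 3
```

Under these assumptions, a single 1000-fold round moves the ratio from 1:10⁷ to 1:10⁴, so the selected gene would still be rare after the first round and further rounds would be needed before it dominates the pool.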
Water-in-oil (w/o) emulsions are created by mixing aqueous and oil phases with the help of surfactants. A typical IVC emulsion is formed by first preparing the oil-surfactant mixture by stirring, and then gradually adding the aqueous phase to it. For stable emulsion formation, a mixture of high-HLB (hydrophile-lipophile balance) and low-HLB surfactants is needed. [3] Surfactant combinations used to prepare the oil-surfactant mixture include mineral oil / 0.5% Tween 80 / 4.5% Span 80 / sodium deoxycholate [1] and a more heat-stable version, light mineral oil / 0.4% Tween 80 / 4.5% Span 80 / 0.05% Triton X-100. [4] The aqueous phase containing the transcription and/or translation components is slowly added to the oil-surfactant mixture, and formation of the w/o emulsion is facilitated by homogenizing, stirring or using a hand-extruding device.
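As a rough illustration of how high-HLB and low-HLB surfactants combine, the HLB of a blend is commonly estimated as the weight-averaged HLB of its components. The sketch below applies that rule to the Tween 80 / Span 80 part of the first formulation. The HLB values (about 15.0 for Tween 80 and 4.3 for Span 80) are commonly quoted figures and are assumptions here, not values taken from the cited work; sodium deoxycholate is left out because its concentration is not given.

```python
def blend_hlb(components):
    """Weight-averaged HLB of a surfactant blend.

    `components` is a list of (weight_percent, hlb) pairs. The linear
    mixing rule is an approximation, not a measured property.
    """
    total = sum(weight for weight, _ in components)
    if total <= 0:
        raise ValueError("total surfactant weight must be positive")
    return sum(weight * hlb for weight, hlb in components) / total


# Assumed, commonly quoted HLB values (not from the source text).
TWEEN_80_HLB = 15.0
SPAN_80_HLB = 4.3

# 0.5% Tween 80 / 4.5% Span 80, as in the first formulation above.
hlb = blend_hlb([(0.5, TWEEN_80_HLB), (4.5, SPAN_80_HLB)])
print(round(hlb, 2))  # 5.37
```

With these assumed values the blend sits at the low-HLB end of the scale, the range generally associated with water-in-oil emulsifiers.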
Emulsion quality can be assessed by light microscopy and/or dynamic light scattering. Droplet sizes are quite heterogeneous, and higher homogenization speeds help to produce smaller droplets with a narrower size distribution. However, the homogenization speed has to be controlled, since speeds over 13,500 r.p.m. tend to cause a significant loss of enzyme activity at the level of transcription. The most widely used emulsion formulation gives droplets with a mean diameter of 2-3 μm and an average volume of ~5 femtoliters, or 10¹⁰ aqueous droplets per ml of emulsion. [5] The ratio of genes to droplets is chosen so that, statistically, most droplets contain no more than a single gene.
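The droplet figures above can be checked with a short calculation, and the "no more than a single gene" condition can be illustrated with Poisson statistics. This is a sketch under stated assumptions: droplets are treated as uniform spheres, genes are assumed to distribute randomly and independently among droplets, and the mean of 0.1 genes per droplet and the 50 μl aqueous volume are hypothetical inputs chosen for illustration.

```python
import math


def droplet_volume_fl(diameter_um: float) -> float:
    """Volume of a spherical droplet in femtoliters (1 um^3 equals 1 fl)."""
    return math.pi / 6.0 * diameter_um ** 3


def droplet_count(aqueous_ul: float, droplet_fl: float) -> float:
    """Number of droplets obtained from an aqueous volume (1 ul = 1e9 fl)."""
    return aqueous_ul * 1e9 / droplet_fl


def occupancy(mean_genes_per_droplet: float) -> dict:
    """Poisson fractions of droplets holding zero, one, or several genes."""
    lam = mean_genes_per_droplet
    p_empty = math.exp(-lam)
    p_single = lam * p_empty
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
    }


# A 2 um droplet holds roughly 4 fl, close to the ~5 fl average quoted above.
print(droplet_volume_fl(2.0))

# Hypothetical input: 50 ul of aqueous phase split into 5 fl droplets.
print(droplet_count(50.0, 5.0))

# Hypothetical loading of 0.1 genes per droplet on average.
fractions = occupancy(0.1)
print(fractions)
# Share of gene-containing droplets that hold exactly one gene.
print(fractions["single"] / (1.0 - fractions["empty"]))
```

Lowering the mean number of genes per droplet reduces the share of droplets with more than one gene, at the cost of leaving more droplets empty.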
IVC allows techniques that were previously run at large scale to be miniaturized to the microscale, including coupled in vitro transcription and translation (IVTT) experiments. Streamlining and integrating transcription and translation allows for fast and highly controllable experimental designs. [6] [7] [8] IVTT can be done both in bulk emulsions and in microdroplets produced by droplet-based microfluidics. Microdroplets, which have volumes in the picoliter to femtoliter range, have been used successfully as vessels for single DNA molecules. [9] [10] This droplet technology allows high-throughput analysis under many different selection pressures in a single experimental setup. [6] [10] IVTT in microdroplets is preferred when overexpression of the desired protein would be toxic to a host cell, which limits the usefulness of the cell's own transcription and translation machinery. [11]
IVC has used bacterial cell, wheat germ and rabbit reticulocyte lysate (RRL) extracts for transcription and translation. It is also possible to use a reconstituted bacterial translation system such as PURE, in which the translation components are individually purified and later combined. When expressing eukaryotic or complex proteins, it is desirable to use a eukaryotic translation system such as wheat germ extract or, preferably, RRL extract. The traditional emulsion formulation cannot be used with RRL because it abolishes translation. Instead, a new emulsion formulation, 4% Abil EM90 / light mineral oil, was developed and shown to support the expression of luciferase and human telomerase. [12]
Once transcription and/or translation is complete in the droplets, the emulsion is broken by successive steps that remove the mineral oil and surfactants, allowing subsequent selection. At this stage it is crucial to have a way to 'track' each gene product back to its encoding gene, because the products become free-floating in a heterogeneous population of molecules. There are three major approaches for linking each phenotype to its genotype. [13] The first is to attach a biotin group to each DNA molecule and to add a coding sequence for streptavidin (STABLE display). [14] All newly formed proteins/peptides are then fused to streptavidin and bind to their biotinylated coding sequence. An improved version attaches two biotin molecules to the ends of the DNA molecule to increase the avidity between the DNA and the streptavidin-fused peptides, and uses a synthetic streptavidin gene with low GC content to increase efficiency and specificity during PCR amplification. [15] The second approach is to link DNA and protein covalently, for which two strategies have been demonstrated. The first strategy is to form M.HaeIII fusion proteins. [16] Each expressed protein/polypeptide is fused to the HaeIII DNA methyltransferase domain, which binds covalently to DNA fragments containing the sequence 5′-GGC*-3′, where C* is 5-fluoro-2′-deoxycytidine. The second strategy uses a monomeric mutant of the VirD2 enzyme. [17] A protein/peptide expressed in fusion with the Agrobacterium protein VirD2 binds to its DNA coding sequence, which carries a single-stranded overhang comprising the VirD2 T-border recognition sequence. The third approach is to link phenotype and genotype via beads. [18] The beads are coated with streptavidin so that they bind biotinylated DNA, and they also display the cognate binding partner of an affinity tag that is expressed in fusion with the protein/peptide.
Different selection strategies are used depending on the phenotype to be selected. Selection strategies can be divided into three major categories: selection for binding, selection for catalysis and selection for regulation. [19] The phenotype to be selected can range from RNA to peptide to protein. In selections for binding, the most commonly evolved phenotypes are peptides/proteins that have selective affinity for a specific antibody or DNA molecule. An example is the selection of proteins that have affinity to zinc finger DNA by Sepp et al. [20] Selections for catalytic proteins/RNAs usually isolate new variants with novel or improved enzymatic properties. For example, new ribozyme variants with trans-ligase activity were selected and exhibited multiple turnovers. [21] In selections for regulation, inhibitors of DNA nucleases can be selected, such as protein inhibitors of the Colicin E7 DNase. [22]
Compared with other in vitro display technologies, IVC has two major advantages. The first is the ability to control reactions within the droplets. Hydrophobic and hydrophilic components can be delivered to each droplet in a step-wise fashion without compromising the chemical integrity of the droplet, so the reaction in each droplet can be controlled by choosing what is added and when. In addition, depending on the nature of the reaction to be carried out, the pH of each droplet can be changed. More recently, photocaged substrates have been used, with their participation in a reaction regulated by photo-activation. [19] The second advantage is that IVC allows the selection of catalytic molecules. For example, Griffiths et al. were able to select for phosphotriesterase variants with higher kcat by detecting product formation with an anti-product antibody and quantifying it by flow cytometry. [23]