Cell-free protein synthesis, also known as in vitro protein synthesis or CFPS, is the production of protein using biological machinery in a cell-free system, that is, without the use of living cells. The in vitro protein synthesis environment is not constrained by a cell wall or by the homeostasis conditions necessary to maintain cell viability. [1] Thus, CFPS enables direct access to and control of the translation environment, which is advantageous for a number of applications including co-translational solubilisation of membrane proteins, optimisation of protein production, incorporation of non-natural amino acids, and selective and site-specific labelling. [2] [3] Due to the open nature of the system, different expression conditions such as pH, redox potentials, temperatures, and chaperones can be screened. Since there is no need to maintain cell viability, toxic proteins can be produced.
Common components of a cell-free reaction include a cell extract, an energy source, a supply of amino acids, cofactors such as magnesium, and the DNA with the desired genes. A cell extract is obtained by lysing the cell of interest and centrifuging out the cell walls, DNA genome, and other debris. What remains is the necessary cell machinery, including ribosomes, aminoacyl-tRNA synthetases, translation initiation and elongation factors, and nucleases.
Two types of DNA can be used in CFPS: plasmids and linear expression templates (LETs). Plasmids are circular and are made only inside cells. LETs can be made much more efficiently via PCR, which replicates DNA much faster than raising cells in an incubator. While LETs are easier and faster to make, protein yields from plasmids are usually much higher in CFPS. Because of this, much research today is focused on optimizing CFPS LET yields to approach the yields of CFPS with plasmids.
An energy source is an important part of a cell-free reaction. Usually, a separate mixture containing the needed energy source, along with a supply of amino acids, is added to the extract for the reaction. Common sources are phosphoenolpyruvate, acetyl phosphate, and creatine phosphate.
CFPS has many advantages over the traditional in vivo synthesis of proteins. Most notably, a cell-free reaction, including extract preparation, usually takes 1–2 days, whereas in vivo protein expression may take 1–2 weeks. [4] [5] [6] [7]
CFPS is an open reaction. The lack of cell wall allows direct manipulation of the chemical environment. Samples are easily taken, concentrations optimized, and the reaction can be monitored. In contrast, once DNA is inserted into live cells, the reaction cannot be accessed until it is over and the cells are lysed.
Another advantage to CFPS is the lack of concern for toxicity. Some desired proteins and labeled proteins are toxic to cells when synthesized. [8] Since live cells are not being used, the toxicity of the product protein is not a significant concern.
These advantages enable numerous applications. [9] [1] A major application of CFPS is incorporation of unnatural amino acids into protein structures (see expanded genetic code). The openness of the reaction is ideal for inserting the modified tRNAs and unnatural amino acids required for such a reaction.
Synthetic biology has many other uses and a bright future in fields such as protein evolution, nanomachines, nucleic acid circuits, and synthesis of virus-like particles for vaccines and drug therapy. [10] [11] [12]
One challenge associated with CFPS is the degradation of the DNA by endogenous nucleases in the cell extract. This is particularly problematic with LETs. Cells have endonucleases that attack random sites of a DNA strand; however, much more common are the exonucleases, which attack DNA from the ends. Since plasmids are circular and have no end to which the exonucleases may attach, they are not affected by the latter. LETs, however, are susceptible to both. Because of this vulnerability, much research today is focused on optimizing CFPS LET yields to approach the yields of CFPS using plasmids.
One example of improved protection for LETs is the use of the bacteriophage lambda Gam protein. [13] Gam is an inhibitor of RecBCD, an exonuclease found in Escherichia coli (E. coli). [14] With the use of Gam, CFPS yields with LETs were greatly increased, and were comparable to CFPS yields with plasmids. [15] PURE extracts can also be made, eliminating the concern of exonucleases. These extracts are expensive to make and are not currently an economical solution to the issue of exogenous DNA degradation.
Common cell extracts in use today are made from E. coli (ECE), rabbit reticulocytes (RRL), wheat germ (WGE), insect cells (ICE), and the yeast Kluyveromyces (the D2P system). [1] [5] All of these extracts are commercially available.
ECE is the most popular lysate for several reasons. It is the least expensive extract and the least time-intensive to create. Also, large amounts of E. coli are easily grown, and then easily lysed through use of a homogenizer or a sonicator. [1] ECE also provides the highest protein yields. However, high-yield production can limit the complexity of the synthesized protein, particularly in post-translational modification. In that regard, the lower-efficiency eukaryotic systems could be advantageous, provided that modifying enzyme systems have been maintained in the extracts.
Each eukaryotic system has its advantages and disadvantages. For example, WGE produces the highest yields of the three eukaryotic extracts; however, it is not as effective for some post-translational modifications such as glycosylation. [5] When choosing an extract, the type of post-translational modification, desired yields, and cost should be taken into account.
Cell-free protein synthesis has been used for over 60 years, and notably, the first elucidation of a codon was done by Marshall Nirenberg and J. Heinrich Matthaei in 1961 at the National Institutes of Health. [1] [16] They used a cell-free system to translate a poly-uracil RNA sequence (or UUUUU... in biochemical terms) and discovered that the polypeptide they had synthesized consisted of only the amino acid phenylalanine. From this poly-phenylalanine they deduced that the codon UUU specifies the amino acid phenylalanine. Extending this work, Nirenberg and his coworkers were able to determine the nucleotide makeup of each codon.
The genetic code is the set of rules used by living cells to translate information encoded within genetic material into proteins. Translation is accomplished by the ribosome, which links proteinogenic amino acids in an order specified by messenger RNA (mRNA), using transfer RNA (tRNA) molecules to carry amino acids and to read the mRNA three nucleotides at a time. The genetic code is highly similar among all organisms and can be expressed in a simple table with 64 entries.
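The figure of 64 entries follows from reading four possible nucleotides three at a time (4 × 4 × 4). The short Python sketch below enumerates the codons to make that count concrete. The function name all_codons and the base ordering "UCAG" are choices made for this illustration, not part of any standard library.

```python
from itertools import product

# The four RNA nucleotides, in the order commonly used to lay out codon tables.
BASES = "UCAG"

def all_codons(bases=BASES):
    """Return every possible three-nucleotide codon as a list of strings."""
    return ["".join(triplet) for triplet in product(bases, repeat=3)]

codons = all_codons()
print(len(codons))   # 64, i.e. 4**3
print(codons[:4])    # ['UUU', 'UUC', 'UUA', 'UUG']
```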
Protein engineering is the process of developing useful or valuable proteins through the design and production of unnatural polypeptides, often by altering amino acid sequences found in nature. It is a young discipline, with much research taking place into the understanding of protein folding and recognition for protein design principles. It has been used to improve the function of many enzymes for industrial catalysis. It is also a product and services market, with an estimated value of $168 billion by 2017.
In biology, translation is the process in living cells in which proteins are produced using RNA molecules as templates. The generated protein is a sequence of amino acids. This sequence is determined by the sequence of nucleotides in the RNA. The nucleotides are considered three at a time. Each such triple results in addition of one specific amino acid to the protein being generated. The matching from nucleotide triple to amino acid is called the genetic code. The translation is performed by a large complex of functional RNA and proteins called ribosomes. The entire process is called gene expression.
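The triplet-by-triplet reading described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of the ribosome: the table PARTIAL_CODON_TABLE is deliberately incomplete (it holds only a handful of the 64 standard assignments), and the function name translate is an assumption made for this example.

```python
# A deliberately partial codon table, for illustration only.
# The full standard genetic code has 64 entries.
PARTIAL_CODON_TABLE = {
    "AUG": "Met",                     # also the usual start codon
    "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def translate(mrna, table=PARTIAL_CODON_TABLE):
    """Read an mRNA string three nucleotides at a time and return the
    list of amino acids, stopping at the first stop codon.

    Raises KeyError for any codon missing from the partial table.
    """
    peptide = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        residue = table[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

print(translate("UUUUUUUUU"))      # ['Phe', 'Phe', 'Phe']
print(translate("AUGUUCUAAUUU"))   # ['Met', 'Phe']
```

A run of uracils is read as repeated UUU codons and therefore yields only phenylalanine, while the second call stops at the UAA stop codon and ignores everything after it.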
DNA polymerase I is an enzyme that participates in the process of prokaryotic DNA replication. Discovered by Arthur Kornberg in 1956, it was the first known DNA polymerase. It was initially characterized in E. coli and is ubiquitous in prokaryotes. In E. coli and many other bacteria, the gene that encodes Pol I is known as polA. The E. coli Pol I enzyme is composed of 928 amino acids, and is an example of a processive enzyme — it can sequentially catalyze multiple polymerisation steps without releasing the single-stranded template. The physiological function of Pol I is mainly to support repair of damaged DNA, but it also contributes to connecting Okazaki fragments by deleting RNA primers and replacing the ribonucleotides with DNA.
Site-directed mutagenesis is a molecular biology method that is used to make specific and intentional mutating changes to the DNA sequence of a gene and any gene products. Also called site-specific mutagenesis or oligonucleotide-directed mutagenesis, it is used for investigating the structure and biological activity of DNA, RNA, and protein molecules, and for protein engineering.
In molecular biology and genetics, transformation is the genetic alteration of a cell resulting from the direct uptake and incorporation of exogenous genetic material from its surroundings through the cell membrane(s). For transformation to take place, the recipient bacterium must be in a state of competence, which might occur in nature as a time-limited response to environmental conditions such as starvation and cell density, and may also be induced in a laboratory.
The Nirenberg and Matthaei experiment was a scientific experiment performed in May 1961 by Marshall W. Nirenberg and his post-doctoral fellow, J. Heinrich Matthaei, at the National Institutes of Health (NIH). The experiment deciphered the first of the 64 triplet codons in the genetic code by using nucleic acid homopolymers to translate specific amino acids.
Marshall Warren Nirenberg was an American biochemist and geneticist. He shared a Nobel Prize in Physiology or Medicine in 1968 with Har Gobind Khorana and Robert W. Holley for "breaking the genetic code" and describing how it operates in protein synthesis. In the same year, together with Har Gobind Khorana, he was awarded the Louisa Gross Horwitz Prize from Columbia University.
In molecular biology, biosynthesis is a multi-step, enzyme-catalyzed process where substrates are converted into more complex products in living organisms. In biosynthesis, simple compounds are modified, converted into other compounds, or joined to form macromolecules. This process often consists of metabolic pathways. Some of these biosynthetic pathways are located within a single cellular organelle, while others involve enzymes that are located within multiple cellular organelles. Examples of these biosynthetic pathways include the production of lipid membrane components and nucleotides. Biosynthesis is usually synonymous with anabolism.
Exonucleases are enzymes that work by cleaving nucleotides one at a time from the end (exo) of a polynucleotide chain. They catalyze a hydrolysis reaction that breaks phosphodiester bonds at either the 3′ or the 5′ end. Their close relative is the endonuclease, which cleaves phosphodiester bonds in the middle (endo) of a polynucleotide chain. Eukaryotes and prokaryotes have three types of exonucleases involved in the normal turnover of mRNA: 5′ to 3′ exonuclease (Xrn1), which is a dependent decapping protein; 3′ to 5′ exonuclease, an independent protein; and poly(A)-specific 3′ to 5′ exonuclease.
Taq polymerase is a thermostable DNA polymerase I named after the thermophilic eubacterial microorganism Thermus aquaticus, from which it was originally isolated by Chien et al. in 1976. Its name is often abbreviated to Taq or Taq pol. It is frequently used in the polymerase chain reaction (PCR), a method for greatly amplifying the quantity of short segments of DNA.
Johannes Heinrich Matthaei is a German biochemist. He is best known for his unique contribution to solving the genetic code on 15 May 1961.
DNA polymerase II is a prokaryotic DNA-dependent DNA polymerase encoded by the PolB gene.
A cell-free system is an in vitro tool widely used to study biological reactions that happen within cells apart from a full cell system, thus reducing the complex interactions typically found when working in a whole cell. Subcellular fractions can be isolated by ultracentrifugation to provide molecular machinery that can be used in reactions in the absence of many of the other cellular components. Eukaryotic and prokaryotic cell internals have been used for creation of these simplified environments. These systems have enabled cell-free synthetic biology to emerge, providing control over what reaction is being examined, as well as its yield, and lessening the considerations otherwise invoked when working with more sensitive live cells.
Transformation efficiency refers to the ability of a cell to take up and incorporate exogenous DNA, such as plasmids, during a process called transformation. The efficiency of transformation is typically measured as the number of transformants per microgram of DNA added to the cells. A higher transformation efficiency means that more cells are able to take up the DNA, and a lower efficiency means that fewer cells are able to do so.
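As a worked illustration of the "transformants per microgram" measure, the Python sketch below divides a colony count by the amount of DNA those colonies represent. The fraction_plated parameter (the share of the transformation mixture actually spread on the plate) is an assumption added for this example and is not mentioned in the text above; the numbers in the usage line are hypothetical.

```python
def transformation_efficiency(colonies, dna_ug, fraction_plated=1.0):
    """Return transformants per microgram of DNA.

    colonies        -- number of colonies (transformants) counted on the plate
    dna_ug          -- micrograms of DNA added to the cells
    fraction_plated -- assumed share of the mixture that was plated (0 < f <= 1)
    """
    if dna_ug <= 0:
        raise ValueError("dna_ug must be positive")
    if not 0 < fraction_plated <= 1:
        raise ValueError("fraction_plated must be in (0, 1]")
    # Only the plated fraction of the DNA contributed to the counted colonies.
    return colonies / (dna_ug * fraction_plated)

# Hypothetical numbers: 150 colonies from 0.25 ug of DNA, half of which was plated.
print(transformation_efficiency(150, 0.25, fraction_plated=0.5))  # 1200.0
```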
An expanded genetic code is an artificially modified genetic code in which one or more specific codons have been re-allocated to encode an amino acid that is not among the 22 common naturally-encoded proteinogenic amino acids.
ParM is a prokaryotic actin homologue which provides the force to drive copies of the R1 plasmid to opposite ends of rod shaped bacteria before cytokinesis.
T7 DNA polymerase is an enzyme used during the DNA replication of the T7 bacteriophage. During this process, the DNA polymerase “reads” existing DNA strands and creates two new strands that match the existing ones. The T7 DNA polymerase requires a host factor, E. coli thioredoxin, in order to carry out its function. This helps stabilize the binding of the necessary protein to the primer-template to improve processivity by more than 100-fold, which is a feature unique to this enzyme. It is a member of the Family A DNA polymerases, which include E. coli DNA polymerase I and Taq DNA polymerase.
DNA Polymerase V is a polymerase enzyme involved in DNA repair mechanisms in bacteria, such as Escherichia coli. It is composed of a UmuD' homodimer and a UmuC monomer, forming the UmuD'2C protein complex. It is part of the Y-family of DNA polymerases, which are capable of performing DNA translesion synthesis (TLS). Translesion polymerases bypass DNA damage lesions during DNA replication; if a lesion is not repaired or bypassed, the replication fork can stall and lead to cell death. However, Y polymerases have low sequence fidelity during replication. When the UmuC and UmuD' proteins were initially discovered in E. coli, they were thought to be agents that inhibited faithful DNA replication and caused DNA synthesis to have high mutation rates after exposure to UV light. The polymerase function of Pol V was not discovered until the late 1990s, when UmuC was successfully extracted; subsequent experiments unequivocally proved that UmuD'2C is a polymerase. This finding led to the detection of many Pol V orthologs and the discovery of the Y-family of polymerases.
No-SCAR genome editing is an editing method that is able to manipulate the Escherichia coli genome. The system relies on recombineering whereby DNA sequences are combined and manipulated through homologous recombination. No-SCAR is able to manipulate the E. coli genome without the use of the chromosomal markers detailed in previous recombineering methods. Instead, the λ-Red recombination system facilitates donor DNA integration while Cas9 cleaves double-stranded DNA to counter-select against wild-type cells. Although λ-Red and Cas9 genome editing are widely used technologies, the no-SCAR method is novel in combining the two functions; this technique is able to establish point mutations, gene deletions, and short sequence insertions in several genomic loci with increased efficiency and time sensitivity.