Cell-free protein array technology produces protein microarrays by performing in vitro synthesis of the target proteins from their DNA templates. This method of synthesizing protein microarrays overcomes the many obstacles and challenges faced by traditional methods of protein array production [1] that have prevented widespread adoption of protein microarrays in proteomics. Protein arrays made from this technology can be used for testing protein–protein interactions, as well as protein interactions with other cellular molecules such as DNA and lipids. Other applications include enzymatic inhibition assays and screenings of antibody specificity.
The runaway success of DNA microarrays has generated much enthusiasm for protein microarrays. However, protein microarrays have not quite taken off as expected, even with the necessary tools and know-how from DNA microarrays being in place and ready for adaptation. One major reason is that protein microarrays are much more laborious and technically challenging to construct than DNA microarrays.
The traditional methods of producing protein arrays require the separate in vivo expression of hundreds or thousands of proteins, followed by separate purification and immobilization of the proteins on a solid surface. Cell-free protein array technology attempts to simplify protein microarray construction by bypassing the need to express the proteins in bacterial cells and the subsequent need to purify them. It takes advantage of available cell-free protein synthesis technology, which has demonstrated that protein synthesis can occur without an intact cell as long as the DNA template is supplied together with cell extracts containing the raw materials and machinery for transcription and translation. [2] Common sources of cell extracts used in cell-free protein array technology include wheat germ, Escherichia coli, and rabbit reticulocyte. Cell extracts from other sources such as hyperthermophiles, hybridomas, Xenopus oocytes, insect, mammalian and human cells have also been used. [3]
The target proteins are synthesized in situ on the protein microarray, directly from the DNA template, thus skipping many of the steps in traditional protein microarray production and their accompanying technical limitations. More importantly, the expression of the proteins can be done in parallel, meaning all the proteins can be expressed together in a single reaction. This ability to multiplex protein expression is a major time-saver in the production process.
In the in situ method, protein synthesis is carried out on a protein array surface that is pre-coated with a protein-capturing reagent or antibody. Each protein is synthesized with a tag sequence at its N- or C-terminus. Once the newly synthesized protein is released from the ribosome, the capture reagent or antibody binds this tag, thus immobilizing the proteins to form an array. Commonly used tags include polyhistidine (His)6 and glutathione S-transferase (GST).
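The tagging step described above can be illustrated with a minimal sketch that appends a C-terminal (His)6 tag to a coding sequence before it is used as a template. The function name `add_c_terminal_his6`, the choice of the CAC codon, and the default TAA stop codon are assumptions made for illustration; real constructs typically also include linker, promoter and ribosome-binding sequences that are omitted here.

```python
# Illustrative sketch only: the helper name and codon choices are hypothetical.
HIS_CODON = "CAC"                      # histidine is encoded by CAC or CAT
STOP_CODONS = {"TAA", "TAG", "TGA"}    # the three standard stop codons


def add_c_terminal_his6(orf: str) -> str:
    """Return the ORF with six histidine codons inserted before the stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if not orf.startswith("ATG"):
        raise ValueError("ORF must begin with the ATG start codon")

    last_codon = orf[-3:]
    if last_codon in STOP_CODONS:
        body, stop = orf[:-3], last_codon   # keep the original stop codon
    else:
        body, stop = orf, "TAA"             # assumed default stop codon

    return body + HIS_CODON * 6 + stop


if __name__ == "__main__":
    # Toy ORF (Met-Ala-stop), not a real gene.
    print(add_c_terminal_his6("ATGGCCTAA"))
```

Placing the tag before the stop codon keeps it in frame with the protein, so the capture reagent on the array surface can bind the finished polypeptide.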
Various research groups have developed their own methods. These differ in approach but can be grouped into three main categories.
Nucleic acid programmable protein array (NAPPA) [4] uses a DNA template that has already been immobilized onto the same protein capture surface. The DNA template is biotinylated and is bound to avidin that is pre-coated onto the protein capture surface. Newly synthesized proteins, which are tagged with GST, are then immobilized next to the template DNA by binding to the adjacent polyclonal anti-GST capture antibody that is also pre-coated onto the capture surface. The main drawback of this method is the extra, tedious preparation at the beginning of the process: (1) the cloning of cDNAs in an expression-ready vector; and (2) the need to biotinylate the plasmid DNA without interfering with transcription. Moreover, the resulting protein array is not ‘pure’ because the proteins are co-localized with their DNA templates and capture antibodies. [3]
Unlike NAPPA, the protein in situ array (PISA) [5] completely bypasses DNA immobilization, as the DNA template is added as a free molecule in the reaction mixture. In 2006, another group refined and miniaturized this method by using a multiple spotting technique to spot the DNA template and the cell-free transcription and translation mixture on a high-density protein microarray with up to 13,000 spots. [6] This was made possible by an automated system that accurately and sequentially supplies the reagents for the transcription/translation reaction, which occurs in a small, sub-nanolitre droplet.
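To give a sense of the reagent savings that this miniaturization implies, the following back-of-the-envelope sketch multiplies the spot count by a droplet volume. The figure of 13,000 spots comes from the text above; the 0.5 nL droplet volume is an assumed value chosen only because it is sub-nanolitre, and is not reported in the source.

```python
NL_PER_UL = 1_000  # nanolitres per microlitre


def total_reagent_volume_ul(n_spots: int, droplet_nl: float) -> float:
    """Total volume dispensed across the array, in microlitres."""
    if n_spots < 0 or droplet_nl < 0:
        raise ValueError("spot count and droplet volume must be non-negative")
    return n_spots * droplet_nl / NL_PER_UL


if __name__ == "__main__":
    spots = 13_000            # array density quoted in the text
    assumed_droplet_nl = 0.5  # hypothetical sub-nanolitre droplet volume
    volume = total_reagent_volume_ul(spots, assumed_droplet_nl)
    print(f"{volume} µL of reaction mix for {spots} spots")
```

Under this assumption the whole array consumes only a few microlitres of transcription/translation mix.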
The in situ puromycin-capture method is an adaptation of mRNA display technology. PCR DNA is first transcribed to mRNA, and a single-stranded DNA oligonucleotide modified with biotin at one end and puromycin at the other is then hybridized to the 3’-end of the mRNA. The mRNAs are then arrayed on a slide and immobilized by the binding of biotin to streptavidin that is pre-coated on the slide. Cell extract is then dispensed on the slide for in situ translation to take place. When the ribosome reaches the hybridized oligonucleotide, it stalls and incorporates the puromycin molecule into the nascent polypeptide chain, thereby attaching the newly synthesized protein to the microarray via the DNA oligonucleotide. [7] A pure protein array is obtained after the mRNA is digested with RNase. The protein spots generated by this method are very sharply defined and can be produced at a high density.
Nanowell array formats are used to express individual proteins in small-volume reaction vessels, or nanowells [8] [9] (Figure 4). This format is sometimes preferred because it avoids the need to immobilize the target protein, which might otherwise cause a loss of protein activity. The miniaturization of the array also conserves solution and precious compounds that might be used in screening assays. Moreover, the structural properties of individual wells help to prevent cross-contamination among chambers. In 2012, an improved NAPPA was published that used a nanowell array to prevent diffusion. Here the DNA was immobilized in the well together with an anti-GST antibody. Cell-free expression mix was then added and the wells were closed with a lid. The nascent proteins, which carried a GST tag, bound to the well surface, enabling a NAPPA array with higher density and almost no cross-contamination. [10]
DNA array to protein array (DAPA) is a method developed in 2007 to repeatedly produce protein arrays by ‘printing’ them from a single DNA template array, on demand [11] (Figure 5). It starts with the spotting and immobilization of an array of DNA templates onto a glass slide. The slide is then assembled face-to-face with a second slide pre-coated with a protein-capturing reagent, and a membrane soaked with cell extract is placed between the two slides for transcription and translation to take place. The newly synthesized His-tagged proteins are then immobilized onto the second slide to form the array. In the original publication, a protein microarray copy could be generated in 18 of 20 replications. In principle, the process can be repeated as often as needed, as long as the DNA is not damaged by DNases, degradation or mechanical abrasion.
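The reported 18 successful copies out of 20 replications corresponds to a 90% success rate, but with so few trials the uncertainty is wide. The sketch below computes a Wilson score interval for that proportion; the choice of the Wilson method and the 95% level (z = 1.96) are assumptions made here for illustration and do not come from the original publication.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    low, high = wilson_interval(18, 20)  # 18 of 20 replications, as reported
    print(f"success rate 90%, 95% interval: {low:.1%} to {high:.1%}")
```

For 18 of 20, the interval runs from roughly 70% to 97%, so the replication figure is best read as an indication of feasibility rather than a precise yield.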
Many of the advantages of cell-free protein array technology address the limitations of the cell-based expression systems used in traditional methods of protein microarray production.
The method avoids DNA cloning (with the exception of NAPPA) and can quickly convert genetic information into functional proteins by using PCR DNA. The reduced number of production steps and the ability to miniaturize the system save on reagent consumption and cut production costs.
Many proteins, including antibodies, are difficult to express in host cells due to problems with insolubility, disulfide bonds or host cell toxicity. [1] Cell-free protein array technology makes many such proteins available for use in protein microarrays.
Unlike DNA, which is a highly stable molecule, proteins are a heterogeneous class of molecules with different stability and physicochemical properties. Maintaining the proteins’ folding and function in an immobilized state over long periods of storage is a major challenge for protein microarrays. Cell-free methods provide the option of quickly obtaining protein microarrays on demand, thus avoiding the problems associated with long-term storage.
The method is amenable to a range of different templates: PCR products, plasmids and mRNA. Additional components can be included during synthesis to adjust the environment for protein folding, disulfide bond formation, modification or protein activity. [3]
Applications include screening for protein–protein interactions [4] and for protein interactions with other molecules such as metabolites, lipids, DNA and small molecules [14]; enzyme inhibition assays [8] for high-throughput drug candidate screening and for discovering novel enzymes for use in biotechnology; and screening of antibody specificity. [15]