An expression vector, otherwise known as an expression construct, is usually a plasmid or virus designed for gene expression in cells. The vector is used to introduce a specific gene into a target cell, and can commandeer the cell's mechanism for protein synthesis to produce the protein encoded by the gene. Expression vectors are the basic tools in biotechnology for the production of proteins.
The vector is engineered to contain regulatory sequences that act as enhancer and promoter regions and lead to efficient transcription of the gene carried on the expression vector. [1] The goal of a well-designed expression vector is the efficient production of protein, and this may be achieved by the production of a significant amount of stable messenger RNA, which can then be translated into protein. The expression of a protein may be tightly controlled, so that the protein is produced in significant quantity only when necessary, through the use of an inducer. In some systems, however, the protein may be expressed constitutively. Escherichia coli is commonly used as the host for protein production, but other cell types may also be used. An example of the use of an expression vector is the production of insulin, which is used for the medical treatment of diabetes.
An expression vector has features that any vector may have, such as an origin of replication, a selectable marker, and a suitable site for the insertion of a gene like the multiple cloning site. The cloned gene may be transferred from a specialized cloning vector to an expression vector, although it is possible to clone directly into an expression vector. The cloning process is normally performed in Escherichia coli. Vectors used for protein production in organisms other than E. coli may have, in addition to a suitable origin of replication for their propagation in E. coli, elements that allow them to be maintained in another organism; these vectors are called shuttle vectors.
An expression vector must have the elements necessary for gene expression. These may include a promoter, the correct translation initiation sequence such as a ribosomal binding site and start codon, a termination codon, and a transcription termination sequence. [2] The machinery for protein synthesis differs between prokaryotes and eukaryotes, so an expression vector must carry the expression elements appropriate for the chosen host. For example, prokaryotic expression vectors have a Shine-Dalgarno sequence at their translation initiation site for the binding of ribosomes, while eukaryotic expression vectors contain the Kozak consensus sequence.
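As a rough illustration of the prokaryotic elements listed above, the following Python sketch scans a toy DNA sequence for a Shine-Dalgarno-like motif, a start codon a short distance downstream, and an in-frame stop codon. The motif `AGGAGG`, the 5 to 10 nucleotide spacing window, and the function name are assumptions chosen for illustration; real ribosome binding sites vary, and this is not a validated design tool.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_expression_elements(seq, sd_motif="AGGAGG", min_gap=5, max_gap=10):
    """Return (sd_pos, start_pos, stop_pos) as 0-based indices, or None.

    Looks for the first Shine-Dalgarno-like motif that is followed, within
    the assumed spacing window, by an ATG start codon and then by an
    in-frame stop codon.
    """
    seq = seq.upper()
    for match in re.finditer(sd_motif, seq):
        sd_end = match.end()
        for gap in range(min_gap, max_gap + 1):
            start = sd_end + gap
            if seq[start:start + 3] != "ATG":
                continue
            # Walk codon by codon from the start codon to find a stop.
            for i in range(start + 3, len(seq) - 2, 3):
                if seq[i:i + 3] in STOP_CODONS:
                    return match.start(), start, i
    return None

# Toy cassette: spacer, SD-like motif, 7-nt gap, ATG, two codons, stop.
toy = "TTT" + "AGGAGG" + "AACAGCT" + "ATG" + "AAACCC" + "TAA" + "GG"
print(find_expression_elements(toy))
```

A eukaryotic check would instead look for a Kozak-like context around the start codon; that variant is not shown here.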
The promoter initiates transcription and is therefore the point of control for the expression of the cloned gene. The promoters used in expression vectors are normally inducible, meaning that protein synthesis is only initiated when required by the introduction of an inducer such as IPTG. Gene expression may, however, also be constitutive (i.e. the protein is constantly expressed) in some expression vectors. A low level of constitutive protein synthesis may occur even in expression vectors with tightly controlled promoters.
After the expression of the gene product, it may be necessary to purify the expressed protein; however, separating the protein of interest from the great majority of host cell proteins can be a protracted process. To make purification easier, a purification tag may be added to the cloned gene. This tag could be a histidine (His) tag, another marker peptide, or a fusion partner such as glutathione S-transferase or maltose-binding protein. [3] Some of these fusion partners may also help to increase the solubility of some expressed proteins. Other fusion proteins such as green fluorescent protein may act as a reporter for the identification of successfully cloned genes, or they may be used to study protein expression in cellular imaging. [4] [5]
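To make the idea of adding a purification tag to the cloned gene concrete, here is a minimal Python sketch that appends a C-terminal His tag to a coding sequence by inserting histidine codons just before the stop codon. The function name, the choice of six histidines, and the use of the `CAC` codon are illustrative assumptions; in practice the tag is often already present in the vector, may sit at the N-terminus instead, and is commonly separated from the protein by a linker or protease cleavage site.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def add_c_terminal_his_tag(cds, n_his=6, his_codon="CAC"):
    """Insert n_his histidine codons immediately before the stop codon.

    `cds` is a coding sequence (DNA, start codon to stop codon inclusive).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    # Keep the reading frame: body + tag codons + original stop codon.
    return cds[:-3] + his_codon * n_his + cds[-3:]

tagged = add_c_terminal_his_tag("ATGAAATAA")
print(tagged)
```

Because the tag is inserted as whole codons ahead of the stop codon, the reading frame of the original gene is preserved.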
The expression vector is transformed or transfected into the host cell for protein synthesis. Some expression vectors may have elements for transformation or the insertion of DNA into the host chromosome, for example the vir genes for plant transformation, and integrase sites for chromosomal integration.
Some vectors may include a targeting sequence that directs the expressed protein to a specific location such as the periplasmic space of bacteria.
Different organisms may be used to express a gene's target protein, and the expression vector used will therefore have elements specific for use in the particular organism. The most commonly used organism for protein production is the bacterium Escherichia coli. However, not all proteins can be successfully expressed in E. coli, or be expressed with the correct post-translational modifications such as glycosylation, and other systems may therefore be used.
The expression host of choice for many proteins is Escherichia coli, as the production of heterologous protein in E. coli is relatively simple and convenient, as well as being rapid and cheap. A large number of E. coli expression plasmids are also available for a wide variety of needs. Other bacteria used for protein production include Bacillus subtilis.
Most heterologous proteins are expressed in the cytoplasm of E. coli. However, not all proteins formed may be soluble in the cytoplasm, and incorrectly folded proteins formed in the cytoplasm can form insoluble aggregates called inclusion bodies. Such insoluble proteins will require refolding, which can be an involved process and may not necessarily produce a high yield. [6] Proteins which have disulphide bonds are often unable to fold correctly because the reducing environment of the cytoplasm prevents such bonds from forming; a possible solution is to target the protein to the periplasmic space by the use of an N-terminal signal sequence. Another possibility is to manipulate the redox environment of the cytoplasm. [7] Other more sophisticated systems are also being developed; such systems may allow for the expression of proteins previously thought impossible in E. coli, such as glycosylated proteins. [8] [9] [10]
The promoters used for these vectors are usually based on the promoter of the lac operon or the T7 promoter, [11] and they are normally regulated by the lac operator. These promoters may also be hybrids of different promoters; for example, the Tac-Promoter is a hybrid of the trp and lac promoters. [12] Note that most commonly used lac or lac-derived promoters are based on the lacUV5 mutant, which is insensitive to catabolite repression. This mutant allows protein to be expressed under the control of the lac promoter even when the growth medium contains glucose, whereas glucose would inhibit gene expression if the wild-type lac promoter were used. [13] The presence of glucose may nevertheless still be used to reduce background expression through residual inhibition in some systems. [14]
Examples of E. coli expression vectors are the pGEX series of vectors, in which glutathione S-transferase is used as a fusion partner and gene expression is under the control of the tac promoter, [15] [16] [17] and the pET series of vectors, which use a T7 promoter. [18]
It is possible to simultaneously express two or more different proteins in E. coli using different plasmids. However, when two or more plasmids are used, each plasmid needs to use a different antibiotic selection as well as a different origin of replication; otherwise one of the plasmids may not be stably maintained. Many commonly used plasmids are based on the ColE1 replicon and are therefore incompatible with each other; for a ColE1-based plasmid to coexist with another in the same cell, the other would need to be of a different replicon, e.g. a p15A replicon-based plasmid such as the pACYC series of plasmids. [19] Another approach would be to use a single two-cistron vector or to design the coding sequences in tandem as a bi- or poly-cistronic construct. [20] [21]
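The co-expression rule described above (a different replicon and a different antibiotic selection for each plasmid) can be expressed as a simple pairwise check. The sketch below is only an illustration: the plasmid names are hypothetical placeholders, the function name is an assumption, and replicons are compared by label alone rather than by any real incompatibility-group data.

```python
from itertools import combinations

def compatibility_problems(plasmids):
    """List reasons why a set of plasmids may not be stably co-maintained.

    `plasmids` maps a plasmid name to a (replicon, selection) tuple.
    An empty list means no conflict was found under this simple rule.
    """
    problems = []
    for (a, (rep_a, sel_a)), (b, (rep_b, sel_b)) in combinations(plasmids.items(), 2):
        if rep_a == rep_b:
            problems.append(f"{a} and {b} share the {rep_a} replicon")
        if sel_a == sel_b:
            problems.append(f"{a} and {b} share {sel_a} selection")
    return problems

# Hypothetical plasmids, named only for this example.
good_pair = {
    "plasmid_1": ("ColE1", "ampicillin"),
    "plasmid_2": ("p15A", "chloramphenicol"),
}
bad_pair = {
    "plasmid_1": ("ColE1", "ampicillin"),
    "plasmid_3": ("ColE1", "ampicillin"),
}
print(compatibility_problems(good_pair))
print(compatibility_problems(bad_pair))
```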
A yeast commonly used for protein production is Pichia pastoris. [22] Examples of yeast expression vectors in Pichia are the pPIC series of vectors, which use the AOX1 promoter, inducible with methanol. [23] The plasmids may contain elements for insertion of foreign DNA into the yeast genome and a signal sequence for the secretion of the expressed protein. Proteins with disulphide bonds and glycosylation can be efficiently produced in yeast. Another yeast used for protein production is Kluyveromyces lactis, in which expression of the gene is driven by a variant of the strong lactase LAC4 promoter. [24]
Saccharomyces cerevisiae is particularly widely used for gene expression studies in yeast, for example in the yeast two-hybrid system for the study of protein-protein interactions. [25] The vectors used in the yeast two-hybrid system contain fusion partners for two cloned genes that allow the transcription of a reporter gene when there is interaction between the two proteins expressed from the cloned genes.
Baculovirus, a rod-shaped virus which infects insect cells, is used as the expression vector in this system. [26] Insect cell lines derived from Lepidopterans (moths and butterflies), such as Spodoptera frugiperda, are used as hosts. A cell line derived from the cabbage looper is of particular interest, as it has been developed to grow fast and without the expensive serum normally needed to boost cell growth. [27] [28] The shuttle vector is called a bacmid, and gene expression is under the control of the strong promoter pPolh. [29] Baculovirus has also been used with mammalian cell lines in the BacMam system. [30]
Baculovirus is normally used for production of glycoproteins, although the glycosylations may be different from those found in vertebrates. In general, it is safer to use than mammalian viruses, as it has a limited host range and does not infect vertebrates without modification.
Many plant expression vectors are based on the Ti plasmid of Agrobacterium tumefaciens. [31] In these expression vectors, the DNA to be inserted into the plant is cloned into the T-DNA, a stretch of DNA flanked by a 25-bp direct repeat sequence at either end, which can integrate into the plant genome. The T-DNA also contains the selectable marker. The Agrobacterium provides a mechanism for transformation and integration into the plant genome, and the promoters for its vir genes may also be used for the cloned genes. Concerns over the transfer of bacterial or viral genetic material into the plant, however, have led to the development of intragenic vectors, in which functional equivalents from the plant genome are used so that no genetic material from an alien species is transferred into the plant. [32]
Plant viruses may be used as vectors since the Agrobacterium method does not work for all plants. Examples of plant viruses used are the tobacco mosaic virus (TMV), potato virus X, and cowpea mosaic virus. [33] The protein may be expressed as a fusion to the coat protein of the virus and displayed on the surface of assembled viral particles, or as an unfused protein that accumulates within the plant. Expression in plants using plant vectors is often constitutive, [34] and a commonly used constitutive promoter in plant expression vectors is the cauliflower mosaic virus (CaMV) 35S promoter. [35] [36]
Mammalian expression vectors offer considerable advantages over bacterial expression systems for the expression of mammalian proteins: proper folding, post-translational modifications, and relevant enzymatic activity. They may also be more desirable than other eukaryotic non-mammalian systems, in which the expressed proteins may not contain the correct glycosylations. They are of particular use in producing membrane-associated proteins that require chaperones for proper folding and stability and that contain numerous post-translational modifications. The downside, however, is the low yield of product in comparison to prokaryotic vectors, as well as the costly nature of the techniques involved. The complicated technology and the potential contamination of mammalian cell expression systems with animal viruses have also placed a constraint on their use in large-scale industrial production. [37]
Cultured mammalian cell lines such as Chinese hamster ovary (CHO) and COS cells, as well as human cell lines such as HEK and HeLa, may be used to produce protein. Vectors are transfected into the cells, and the DNA may be integrated into the genome by homologous recombination in the case of stable transfection, or the cells may be transiently transfected. Examples of mammalian expression vectors include the adenoviral vectors, [38] the pSV and the pCMV series of plasmid vectors, vaccinia and retroviral vectors, [39] as well as baculovirus. [30] The promoters for cytomegalovirus (CMV) and SV40 are commonly used in mammalian expression vectors to drive gene expression. Non-viral promoters, such as the elongation factor (EF)-1 promoter, are also known. [40]
E. coli cell lysate containing the cellular components required for transcription and translation is used in this in vitro method of protein production. The advantage of such a system is that protein may be produced much faster than in vivo, since no time is needed to culture the cells, but it is also more expensive. Vectors used for E. coli expression can be used in this system, although vectors designed specifically for it are also available. Eukaryotic cell extracts may also be used in other cell-free systems, for example, the wheat germ cell-free expression systems. [41] Mammalian cell-free systems have also been produced. [42]
Expressing a gene from an expression vector in an expression host is now the usual method used in laboratories to produce proteins for research. Most proteins are produced in E. coli, but for glycosylated proteins and those with disulphide bonds, yeast, baculovirus and mammalian systems may be used.
Most protein pharmaceuticals are now produced through recombinant DNA technology using expression vectors. These peptide and protein pharmaceuticals may be hormones, vaccines, antibiotics, antibodies, and enzymes. [43] The first human recombinant protein used for disease management, insulin, was introduced in 1982. [43] Biotechnology allows these peptide and protein pharmaceuticals, some of which were previously rare or difficult to obtain, to be produced in large quantity. It also reduces the risks of contaminants such as host viruses, toxins and prions. Examples from the past include prion contamination in growth hormone extracted from pituitary glands harvested from human cadavers, which caused Creutzfeldt–Jakob disease in patients receiving treatment for dwarfism, [44] and viral contaminants in clotting factor VIII isolated from human blood that resulted in the transmission of viral diseases such as hepatitis and AIDS. [45] [46] Such risk is reduced or removed completely when the proteins are produced in non-human host cells.
In recent years, expression vectors have been used to introduce specific genes into plants and animals to produce transgenic organisms; in agriculture, for example, they are used to produce transgenic plants. Expression vectors have been used to introduce a vitamin A precursor, beta-carotene, into rice plants; the product is called golden rice. The same approach has been used to introduce into plants a gene that produces an insecticide, Bacillus thuringiensis toxin or Bt toxin, which reduces the need for farmers to apply insecticides since the toxin is produced by the modified organism. In addition, expression vectors are used to prolong the ripeness of tomatoes by altering the plant so that it produces less of the chemical that causes the tomatoes to rot. [47] There have been controversies over using expression vectors to modify crops because of possible unknown health risks, the possibility of companies patenting certain genetically modified food crops, and ethical concerns. Nevertheless, this technique is still being used and heavily researched.
Transgenic animals have also been produced to study animal biochemical processes and human diseases, or used to produce pharmaceuticals and other proteins. They may also be engineered to have advantageous or useful traits. Green fluorescent protein is sometimes used as a tag, which results in animals that can fluoresce, and this has been exploited commercially to produce the fluorescent GloFish.
Gene therapy is a promising treatment for a number of diseases in which a "normal" gene carried by the vector is inserted into the genome to replace an "abnormal" gene or to supplement the expression of a particular gene. Viral vectors are generally used, but nonviral methods of delivery are being developed. The treatment is still a risky option because the viral vector used can cause ill effects, for example by giving rise to insertional mutations that can result in cancer. [48] [49] However, there have been promising results. [50] [51]
A plasmid is a small, extrachromosomal DNA molecule within a cell that is physically separated from chromosomal DNA and can replicate independently. Plasmids are most commonly found as small circular, double-stranded DNA molecules in bacteria; however, they are sometimes present in archaea and eukaryotic organisms. Plasmids often carry useful genes, such as those for antibiotic resistance and virulence. While chromosomes are large and contain all the essential genetic information for living under normal conditions, plasmids are usually very small and contain additional genes for special circumstances.
A bacterial artificial chromosome (BAC) is a DNA construct, based on a functional fertility plasmid, used for transforming and cloning in bacteria, usually E. coli. F-plasmids play a crucial role because they contain partition genes that promote the even distribution of plasmids after bacterial cell division. The bacterial artificial chromosome's usual insert size is 150–350 kbp. A similar cloning vector called a PAC has also been produced from the DNA of P1 bacteriophage.
Protein production is the biotechnological process of generating a specific protein. It is typically achieved by the manipulation of gene expression in an organism such that it expresses large amounts of a recombinant gene. This includes the transcription of the recombinant DNA to messenger RNA (mRNA), the translation of mRNA into polypeptide chains, which are ultimately folded into functional proteins and may be targeted to specific subcellular or extracellular locations.
A cloning vector is a small piece of DNA that can be stably maintained in an organism, and into which a foreign DNA fragment can be inserted for cloning purposes. The cloning vector may be DNA taken from a virus, the cell of a higher organism, or it may be the plasmid of a bacterium. The vector contains features that allow for the convenient insertion of a DNA fragment into the vector or its removal from the vector, for example through the presence of restriction sites. The vector and the foreign DNA may be treated with a restriction enzyme that cuts the DNA, and DNA fragments thus generated contain either blunt ends or overhangs known as sticky ends, and vector DNA and foreign DNA with compatible ends can then be joined by molecular ligation. After a DNA fragment has been cloned into a cloning vector, it may be further subcloned into another vector designed for more specific use.
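The restriction step described above can be sketched in a few lines of Python. The example cuts a linear, single-stranded representation of a DNA sequence at each EcoRI recognition site (GAATTC, cut after the first G on the top strand). The function name and the toy sequence are assumptions for illustration; the sketch ignores the complementary strand, circular molecules, and overlapping sites.

```python
def digest_linear(seq, site="GAATTC", cut_offset=1):
    """Cut a linear top-strand sequence at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC). Returns the fragments in
    order; joining them reproduces the input sequence.
    """
    seq = seq.upper()
    fragments = []
    prev = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[prev:cut])
        prev = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[prev:])
    return fragments

toy = "AAAGAATTCCCCGAATTCTT"
fragments = digest_linear(toy)
print(fragments)
```

Each internal fragment begins with AATTC on the top strand, reflecting the AATT overhang (sticky end) that allows compatible vector and insert ends to be ligated.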
In molecular biology, a reporter gene is a gene that researchers attach to a regulatory sequence of another gene of interest in bacteria, cell culture, animals or plants. Such genes are called reporters because the characteristics they confer on organisms expressing them are easily identified and measured, or because they are selectable markers. Reporter genes are often used as an indication of whether a certain gene has been taken up by or expressed in the cell or organism population.
In molecular biology and genetics, transformation is the genetic alteration of a cell resulting from the direct uptake and incorporation of exogenous genetic material from its surroundings through the cell membrane(s). For transformation to take place, the recipient bacterium must be in a state of competence, which might occur in nature as a time-limited response to environmental conditions such as starvation and cell density, and may also be induced in a laboratory.
Recombinant DNA (rDNA) molecules are DNA molecules formed by laboratory methods of genetic recombination that bring together genetic material from multiple sources, creating sequences that would not otherwise be found in the genome.
A DNA construct is an artificially-designed segment of DNA borne on a vector that can be used to incorporate genetic material into a target tissue or cell. A DNA construct contains a DNA insert, called a transgene, delivered via a transformation vector which allows the insert sequence to be replicated and/or expressed in the target cell. This gene can be cloned from a naturally occurring gene, or synthetically constructed. The vector can be delivered using physical, chemical or viral methods. Typically, the vectors used in DNA constructs contain an origin of replication, a multiple cloning site, and a selectable marker. Certain vectors can carry additional regulatory elements based on the expression system involved.
Two-hybrid screening is a molecular biology technique used to discover protein–protein interactions (PPIs) and protein–DNA interactions by testing for physical interactions between two proteins or a single protein and a DNA molecule, respectively.
Baculoviridae is a family of viruses. Arthropods, among the most studied being Lepidoptera, Hymenoptera and Diptera, serve as natural hosts. Currently, 85 species are placed in this family, assigned to four genera.
In genetics, Flp-FRT recombination is a site-directed recombination technology, increasingly used to manipulate an organism's DNA under controlled conditions in vivo. It is analogous to Cre-lox recombination but involves the recombination of sequences between short flippase recognition target (FRT) sites by the recombinase flippase (Flp) derived from the 2 μ plasmid of baker's yeast Saccharomyces cerevisiae.
The blue–white screen is a screening technique that allows for the rapid and convenient detection of recombinant bacteria in vector-based molecular cloning experiments. This method of screening is usually performed using a suitable bacterial strain, but other organisms such as yeast may also be used. The DNA of interest is ligated into a vector. The vector is then inserted into competent host cells, which are then grown in the presence of X-gal. Cells transformed with vectors containing recombinant DNA will produce white colonies; cells transformed with non-recombinant plasmids grow into blue colonies.
In molecular cloning, a vector is any particle used as a vehicle to artificially carry a foreign nucleic acid sequence – usually DNA – into another cell, where it can be replicated and/or expressed. A vector containing foreign DNA is termed recombinant DNA. The four major types of vectors are plasmids, viral vectors, cosmids, and artificial chromosomes. Of these, the most commonly used vectors are plasmids. Common to all engineered vectors are an origin of replication, a multicloning site, and a selectable marker.
pHT01 is a plasmid used as a cloning vector for expressing proteins in Bacillus subtilis. It is 7,956 base pairs in length. pHT01 carries Pgrac, an artificial, strong, IPTG-inducible promoter consisting of the Bacillus subtilis groE promoter, a lac operator, and the gsiB ribosome binding site. It was first found on plasmid pNDH33. The plasmid also carries replication regions from the pMTLBs72. The plasmid also carries genes to confer resistance to ampicillin and chloramphenicol.
Baculovirus gene transfer into Mammalian cells (BacMam) is the use of a baculovirus to deliver genes to mammalian cells. Baculoviruses are insect viruses that are typically not capable of infecting mammalian cells; however, they can be modified to express proteins in mammalian cells. Unmodified baculoviruses are able to enter mammalian cells; however, their genes are not expressed unless a recognizable mammalian promoter is incorporated upstream of a gene of interest. Both the unmodified baculovirus and its modified counterpart are unable to replicate in humans, making them non-infectious.
Heterologous expression refers to the expression of a gene or part of a gene in a host organism that does not naturally have the gene or gene fragment in question. Insertion of the gene in the heterologous host is performed by recombinant DNA technology. The purpose of heterologous expression is often to determine the effects of mutations and differential interactions on protein function. It provides an easy path to efficiently express and experiment with combinations of genes and mutants that do not naturally occur.
Molecular cloning is a set of experimental methods in molecular biology that are used to assemble recombinant DNA molecules and to direct their replication within host organisms. The use of the word cloning refers to the fact that the method involves the replication of one molecule to produce a population of cells with identical DNA molecules. Molecular cloning generally uses DNA sequences from two different organisms: the species that is the source of the DNA to be cloned, and the species that will serve as the living host for replication of the recombinant DNA. Molecular cloning methods are central to many contemporary areas of modern biology and medicine.
The Tac-Promoter, or tac vector, is a synthetically produced DNA promoter, produced from the combination of promoters from the trp and lac operons. It is commonly used for protein production in Escherichia coli.
Transient expression, more frequently referred to as "transient gene expression", is the temporary expression of genes that are expressed for a short time after nucleic acid, most frequently plasmid DNA encoding an expression cassette, has been introduced into eukaryotic cells with a chemical delivery agent like calcium phosphate (CaPi) or polyethyleneimine (PEI). However, unlike "stable expression," the foreign DNA does not fuse with the host cell DNA, resulting in the inevitable loss of the vector after several cell replication cycles. The majority of transient gene expressions are done with cultivated animal cells. The technique is also used in plant cells; however, the transfer of nucleic acids into these cells requires different methods than those with animal cells. In both plants and animals, transient expression should result in a time-limited use of transferred nucleic acids, since any long-term expression would be called "stable expression."
The T7 expression system is used in the field of microbiology to clone recombinant DNA using strains of E. coli. It is the most popular system for expressing recombinant proteins in E. coli.