Site-directed mutagenesis is a molecular biology method that is used to make specific, intentional changes to the DNA sequence of a gene and any gene products. Also called site-specific mutagenesis or oligonucleotide-directed mutagenesis, it is used for investigating the structure and biological activity of DNA, RNA, and protein molecules, and for protein engineering.
Site-directed mutagenesis is one of the most important laboratory techniques for creating DNA libraries by introducing mutations into DNA sequences. There are numerous methods for achieving site-directed mutagenesis, but with decreasing costs of oligonucleotide synthesis, artificial gene synthesis is now occasionally used as an alternative to site-directed mutagenesis. Since 2013, the development of the CRISPR/Cas9 technology, based on a prokaryotic viral defense system, has also allowed for the editing of the genome, and mutagenesis may be performed in vivo with relative ease. [1]
Early attempts at mutagenesis using radiation or chemical mutagens were non-site-specific, generating random mutations. [2] Analogs of nucleotides and other chemicals were later used to generate localized point mutations; [3] examples of such chemicals are aminopurine, [4] nitrosoguanidine, [5] and bisulfite. [6] Site-directed mutagenesis was achieved in 1974 in the laboratory of Charles Weissmann using a nucleotide analogue, N4-hydroxycytidine, which induces transition of GC to AT. [7] [8] These methods of mutagenesis, however, are limited by the kind of mutation they can achieve, and they are not as specific as later site-directed mutagenesis methods.
In 1971, Clyde Hutchison and Marshall Edgell showed that it is possible to produce mutants with small fragments of phage ϕX174 and restriction nucleases. [9] [10] In 1978, Hutchison and his collaborator Michael Smith produced a more flexible approach to site-directed mutagenesis by using oligonucleotides in a primer extension method with DNA polymerase. [11] For his part in the development of this process, Michael Smith shared the Nobel Prize in Chemistry in October 1993 with Kary B. Mullis, who invented the polymerase chain reaction.
The basic procedure requires the synthesis of a short DNA primer. This synthetic primer contains the desired mutation and is complementary to the template DNA around the mutation site so it can hybridize with the DNA in the gene of interest. The mutation may be a single base change (a point mutation), multiple base changes, deletion, or insertion. The single-strand primer is then extended using a DNA polymerase, which copies the rest of the gene. The gene thus copied contains the mutated site, and is then introduced into a host cell in a vector and cloned. Finally, mutants are selected by DNA sequencing to check that they contain the desired mutation.
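The primer design step described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the 0-based position convention, and the fixed flank length are assumptions made for the example, and a real design would also weigh melting temperature, GC content, and secondary structure.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def mutagenic_primer(template: str, pos: int, new_base: str, flank: int = 10) -> str:
    """Build a primer carrying a single-base substitution.

    `pos` is the 0-based index of the base to change (hypothetical convention).
    The primer matches the template for `flank` bases on each side of the
    mutation so that it can still hybridize around the mismatch.
    """
    if pos < flank or pos + flank >= len(template):
        raise ValueError("not enough flanking sequence around the mutation site")
    if template[pos] == new_base:
        raise ValueError("new base is identical to the template base")
    upstream = template[pos - flank:pos]
    downstream = template[pos + 1:pos + 1 + flank]
    return upstream + new_base + downstream


# Illustrative template (not taken from the article); G at index 15 becomes C.
template = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGA"
primer = mutagenic_primer(template, pos=15, new_base="C", flank=5)
print(primer)                      # sense-strand primer carrying the mutation
print(reverse_complement(primer))  # version that anneals to the sense strand
```

The primer is written in the same orientation as the template string; its reverse complement is the sequence that would anneal to that strand.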
The original method using single-primer extension was inefficient due to a low yield of mutants. The resulting mixture contains both the original unmutated template and the mutant strand, producing a mixed population of mutant and non-mutant progeny. Furthermore, the template used is methylated while the mutant strand is unmethylated, and the mutants may be counter-selected due to the presence of a mismatch repair system that favors the methylated template DNA, resulting in fewer mutants. Many approaches have since been developed to improve the efficiency of mutagenesis.
A large number of methods are available to effect site-directed mutagenesis, [12] although most of them have rarely been used in laboratories since the early 2000s, as newer techniques allow for simpler and easier ways of introducing site-specific mutation into genes.
In 1985, Thomas Kunkel introduced a technique that reduces the need to select for the mutants. [13] The DNA fragment to be mutated is inserted into a phagemid such as M13mp18/19 and is then transformed into an E. coli strain deficient in two enzymes, dUTPase (dut) and uracil deglycosidase (udg). Both enzymes are part of a DNA repair pathway that protects the bacterial chromosome from mutations by the spontaneous deamination of dCTP to dUTP. The dUTPase deficiency prevents the breakdown of dUTP, resulting in a high level of dUTP in the cell. The uracil deglycosidase deficiency prevents the removal of uracil from newly synthesized DNA. As the double-mutant E. coli replicates the phage DNA, its enzymatic machinery may therefore misincorporate dUTP instead of dTTP, resulting in single-strand DNA that contains some uracils (ssUDNA). The ssUDNA is extracted from the bacteriophage that is released into the medium, and then used as a template for mutagenesis. An oligonucleotide containing the desired mutation is used for primer extension. The heteroduplex DNA that forms consists of one parental non-mutated strand containing dUTP and a mutated strand containing dTTP. The DNA is then transformed into an E. coli strain carrying the wildtype dut and udg genes. Here, the uracil-containing parental DNA strand is degraded, so that nearly all of the resulting DNA consists of the mutated strand.
Unlike other methods, cassette mutagenesis need not involve primer extension using DNA polymerase. In this method, a fragment of DNA is synthesized, and then inserted into a plasmid. [14] It involves the cleavage by a restriction enzyme at a site in the plasmid and subsequent ligation of a pair of complementary oligonucleotides containing the mutation in the gene of interest to the plasmid. Usually, the restriction enzymes that cut at the plasmid and the oligonucleotide are the same, permitting sticky ends of the plasmid and insert to ligate to one another. This method can generate mutants at close to 100% efficiency, but is limited by the availability of suitable restriction sites flanking the site that is to be mutated.
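The restriction-site constraint can be illustrated with a short Python sketch that looks for the nearest recognition sites on either side of a planned mutation. The three enzymes and their recognition sequences are standard, but the function names and the example sequence are hypothetical, and the sketch ignores practical requirements such as each site being unique within the whole plasmid.

```python
# Recognition sequences of three common restriction enzymes. All three are
# palindromic, so scanning one strand is sufficient.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_sites(seq: str) -> list[tuple[int, str]]:
    """Return sorted (start_index, enzyme_name) pairs for every site in `seq`."""
    hits = []
    for name, site in SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((start, name))
            start = seq.find(site, start + 1)
    return sorted(hits)


def flanking_sites(seq: str, pos: int):
    """Return the closest site ending before `pos` and the closest site starting
    after `pos`, or None if the mutation is not flanked on both sides."""
    hits = find_sites(seq)
    upstream = [h for h in hits if h[0] + len(SITES[h[1]]) <= pos]
    downstream = [h for h in hits if h[0] > pos]
    if not upstream or not downstream:
        return None
    return upstream[-1], downstream[0]


# Illustrative sequence: an EcoRI site, a short insert, then a BamHI site.
plasmid_region = "GAATTC" + "ATGGCTAGCAAA" + "GGATCC"
print(flanking_sites(plasmid_region, pos=10))
```

If `flanking_sites` returns `None`, no usable pair of sites brackets the mutation among the enzymes listed, which is the situation in which cassette mutagenesis is not directly applicable.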
The limitation of restriction sites in cassette mutagenesis may be overcome using polymerase chain reaction with oligonucleotide "primers", such that a larger fragment may be generated, covering two convenient restriction sites. The exponential amplification in PCR produces a fragment containing the desired mutation in sufficient quantity to be separated from the original, unmutated plasmid by gel electrophoresis, which may then be inserted in the original context using standard recombinant molecular biology techniques. There are many variations of the same technique. The simplest method places the mutation site toward one of the ends of the fragment whereby one of two oligonucleotides used for generating the fragment contains the mutation. This involves a single step of PCR, but still has the inherent problem of requiring a suitable restriction site near the mutation site unless a very long primer is used. Other variations, therefore, employ three or four oligonucleotides, two of which may be non-mutagenic oligonucleotides that cover two convenient restriction sites and generate a fragment that can be digested and ligated into a plasmid, whereas the mutagenic oligonucleotide may be complementary to a location within that fragment well away from any convenient restriction site. These methods require multiple steps of PCR so that the final fragment to be ligated can contain the desired mutation. The design process for generating a fragment with the desired mutation and relevant restriction sites can be cumbersome. Software tools like SDM-Assist [15] can simplify the process.
For plasmid manipulations, other site-directed mutagenesis techniques have been supplanted largely by techniques that are highly efficient but relatively simple, easy to use, and commercially available as a kit. An example of these techniques is the "Quikchange" method, [16] wherein a pair of complementary mutagenic primers is used to amplify the entire plasmid in a thermocycling reaction using a high-fidelity non-strand-displacing DNA polymerase such as Pfu polymerase. The reaction generates a nicked, circular DNA. The template DNA must be eliminated by enzymatic digestion with a restriction enzyme such as DpnI, which is specific for methylated DNA. All DNA produced from most Escherichia coli strains would be methylated; the template plasmid that is biosynthesized in E. coli is therefore digested, while the mutated plasmid, which is generated in vitro and is therefore unmethylated, is left undigested. Note that, in these double-strand plasmid mutagenesis methods, while the thermocycling reaction may be used, the DNA is not exponentially amplified if the two primers are designed such that they bind symmetrically to the same region around the mutagenesis site, as described in the original protocol. In this case the amplification is linear, and it is therefore inaccurate to describe the procedure as a PCR, since there is no chain reaction. However, if the primers are designed to bind in an offset manner such that the mutagenesis site is close to the 5' end of both primers, the 3' region of the primers can also bind to the amplified products and thus exponential product formation is observed. The name "Quikchange" originates from the registered trademark "QuikChange mutagenesis" of Stratagene, now Agilent Technologies, for site-directed mutagenesis kits. The method was developed by scientists working at Stratagene. [16]
Note that Pfu polymerase can become strand-displacing at higher extension temperatures (≥70 °C), which can result in the failure of the experiment; the extension reaction should therefore be performed at the recommended temperature of 68 °C. In some applications, this method has been observed to lead to insertion of multiple copies of primers. [17] A variation of this method, called SPRINP, prevents this artifact and has been used in different types of site-directed mutagenesis. [17]
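The difference between linear and exponential amplification can be made concrete with a small calculation. The sketch below is idealized: it assumes every template is copied exactly once per cycle, and the function names and copy numbers are illustrative rather than taken from any protocol.

```python
def linear_yield(template_copies: int, cycles: int) -> int:
    """Newly synthesized molecules when only the original template is copied.

    Products cannot serve as templates, so each cycle adds one new copy
    per original template molecule.
    """
    return template_copies * cycles


def exponential_yield(template_copies: int, cycles: int) -> int:
    """Total molecules when products also act as templates (ideal doubling)."""
    return template_copies * 2 ** cycles


# Illustrative numbers: 1000 template molecules, 18 thermal cycles.
print(linear_yield(1000, 18))       # 18000
print(exponential_yield(1000, 18))  # 262144000
```

Under these assumptions the symmetric primer design yields tens of thousands of product molecules where a true chain reaction would yield hundreds of millions, which is why the distinction matters in practice.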
Other techniques such as scanning mutagenesis of oligo-directed targets (SMOOT) can semi-randomly combine mutagenic oligonucleotides in plasmid mutagenesis. [18] This technique can create plasmid mutagenesis libraries ranging from single mutations to comprehensive codon mutagenesis across an entire gene.
Since 2013, the development of CRISPR-Cas9 technology has allowed for the efficient introduction of various mutations into the genome of a wide variety of organisms. The method does not require a transposon insertion site and leaves no marker, and its efficiency and simplicity have made it the preferred method for genome editing. [21] [22]
Site-directed mutagenesis is used to generate mutations that may produce a rationally designed protein that has improved or special properties (i.e., protein engineering).
Investigative tools – specific mutations in DNA allow the function and properties of a DNA sequence or a protein to be investigated in a rational approach. Furthermore, single amino-acid changes by site-directed mutagenesis in proteins can help in understanding the importance of post-translational modifications. For instance, changing a particular serine (phosphoacceptor) to an alanine (phospho-non-acceptor) in a substrate protein blocks the attachment of a phosphate group, thereby allowing the phosphorylation to be investigated. This approach has been used to uncover the phosphorylation of the protein CBP by the kinase HIPK2. [23] Another comprehensive approach is site saturation mutagenesis, where one codon or a set of codons may be substituted with all possible amino acids at the specific positions. [24]
Commercial applications – Proteins may be engineered to produce mutant forms that are tailored for a specific application. For example, commonly used laundry detergents may contain subtilisin, whose wild-type form has a methionine that can be oxidized by bleach, significantly reducing the activity of the protein in the process. [25] This methionine may be replaced by alanine or other residues, making the enzyme resistant to oxidation and thereby keeping it active in the presence of bleach. [26]
As the cost of DNA oligonucleotide synthesis falls, artificial synthesis of a complete gene is now a viable method for introducing mutations into a gene. This method allows for extensive mutagenesis over multiple sites, including the complete redesign of the codon usage of a gene to optimise it for a particular organism. [27]
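Site saturation at a single codon can be sketched in Python. The example assumes an NNK degenerate codon (N is any base, K is G or T), a commonly used scheme that is not described in the text above; the gene sequence and function names are likewise illustrative. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons() -> list[str]:
    """Enumerate the 32 codons encoded by the degenerate codon NNK."""
    return [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]


def saturate(gene: str, codon_index: int) -> dict[str, str]:
    """Map each NNK codon to the gene variant carrying it at `codon_index`.

    `codon_index` is 0-based and counts codons from the start of `gene`.
    """
    start = 3 * codon_index
    if len(gene) % 3 != 0 or not 0 <= start < len(gene):
        raise ValueError("gene must be whole codons and the index in range")
    return {c: gene[:start] + c + gene[start + 3:] for c in nnk_codons()}


# Illustrative gene fragment Met-Ser-Gly; saturate the serine codon.
variants = saturate("ATGTCTGGA", codon_index=1)
encoded = {CODON_TABLE[c] for c in variants}
print(len(variants), "codons;", len(encoded - {"*"}), "amino acids")
print(variants["GCT"])  # the Ser -> Ala variant
```

In this fragment the `GCT` variant corresponds to the serine-to-alanine substitution used in phosphorylation studies, while the full set of variants covers every amino acid at that position.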
The polymerase chain reaction (PCR) is a method widely used to make millions to billions of copies of a specific DNA sample rapidly, allowing scientists to amplify a very small sample of DNA sufficiently to enable detailed study. PCR was invented in 1983 by American biochemist Kary Mullis at Cetus Corporation. Mullis and biochemist Michael Smith, who had developed other essential ways of manipulating DNA, were jointly awarded the Nobel Prize in Chemistry in 1993.
Protein engineering is the process of developing useful or valuable proteins through the design and production of unnatural polypeptides, often by altering amino acid sequences found in nature. It is a young discipline, with much research taking place into the understanding of protein folding and recognition for protein design principles. It has been used to improve the function of many enzymes for industrial catalysis. It is also a product and services market, with an estimated value of $168 billion by 2017.
In molecular biology, a library is a collection of genetic material fragments that are stored and propagated in a population of microbes through the process of molecular cloning. There are different types of DNA libraries, including cDNA libraries, genomic libraries and randomized mutant libraries. DNA library technology is a mainstay of current molecular biology, genetic engineering, and protein engineering, and the applications of these libraries depend on the source of the original DNA fragments. There are differences in the cloning vectors and techniques used in library preparation, but in general each DNA fragment is uniquely inserted into a cloning vector and the pool of recombinant DNA molecules is then transferred into a population of bacteria or yeast such that each organism contains on average one construct. As the population of organisms is grown in culture, the DNA molecules contained within them are copied and propagated.
Functional genomics is a field of molecular biology that attempts to describe gene functions and interactions. Functional genomics make use of the vast data generated by genomic and transcriptomic projects. Functional genomics focuses on the dynamic aspects such as gene transcription, translation, regulation of gene expression and protein–protein interactions, as opposed to the static aspects of the genomic information such as DNA sequence or structures. A key characteristic of functional genomics studies is their genome-wide approach to these questions, generally involving high-throughput methods rather than a more traditional "candidate-gene" approach.
A DNA construct is an artificially-designed segment of DNA borne on a vector that can be used to incorporate genetic material into a target tissue or cell. A DNA construct contains a DNA insert, called a transgene, delivered via a transformation vector which allows the insert sequence to be replicated and/or expressed in the target cell. This gene can be cloned from a naturally occurring gene, or synthetically constructed. The vector can be delivered using physical, chemical or viral methods. Typically, the vectors used in DNA constructs contain an origin of replication, a multiple cloning site, and a selectable marker. Certain vectors can carry additional regulatory elements based on the expression system involved.
Taq polymerase is a thermostable DNA polymerase I named after the thermophilic eubacterial microorganism Thermus aquaticus, from which it was originally isolated by Chien et al. in 1976. Its name is often abbreviated to Taq or Taq pol. It is frequently used in the polymerase chain reaction (PCR), a method for greatly amplifying the quantity of short segments of DNA.
The overlap extension polymerase chain reaction is a variant of PCR. It is also referred to as Splicing by overlap extension / Splicing by overhang extension (SOE) PCR. It is used to assemble multiple smaller double-stranded DNA fragments into a larger DNA sequence. OE-PCR is widely used to insert mutations at specific points in a sequence or to assemble a custom DNA sequence from smaller DNA fragments into a larger polynucleotide.
DNA shuffling, also known as molecular breeding, is an in vitro random recombination method to generate mutant genes for directed evolution and to enable a rapid increase in DNA library size. Three procedures for accomplishing DNA shuffling are molecular breeding which relies on homologous recombination or the similarity of the DNA sequences, restriction enzymes which rely on common restriction sites, and nonhomologous random recombination which requires the use of hairpins. In all of these techniques, the parent genes are fragmented and then recombined.
SNP genotyping is the measurement of genetic variations of single nucleotide polymorphisms (SNPs) between members of a species. It is a form of genotyping, which is the measurement of more general genetic variation. SNPs are one of the most common types of genetic variation. An SNP is a single base pair mutation at a specific locus, usually consisting of two alleles. SNPs are found to be involved in the etiology of many human diseases and are becoming of particular interest in pharmacogenetics. Because SNPs are conserved during evolution, they have been proposed as markers for use in quantitative trait loci (QTL) analysis and in association studies in place of microsatellites. The use of SNPs is being extended in the HapMap project, which aims to provide the minimal set of SNPs needed to genotype the human genome. SNPs can also provide a genetic fingerprint for use in identity testing. The increase of interest in SNPs has been reflected by the furious development of a diverse range of SNP genotyping methods.
Artificial gene synthesis, or simply gene synthesis, refers to a group of methods that are used in synthetic biology to construct and assemble genes from nucleotides de novo. Unlike DNA synthesis in living cells, artificial gene synthesis does not require template DNA, allowing virtually any DNA sequence to be synthesized in the laboratory. It comprises two main steps, the first of which is solid-phase DNA synthesis, sometimes known as DNA printing. This produces oligonucleotide fragments that are generally under 200 base pairs. The second step then involves connecting these oligonucleotide fragments using various DNA assembly methods. Because artificial gene synthesis does not require template DNA, it is theoretically possible to make a completely synthetic DNA molecule with no limits on the nucleotide sequence or size.
Promoter bashing is a technique used in molecular biology to identify how certain regions of a DNA strand, commonly promoters, affect the transcription of downstream genes. Under normal circumstances, proteins bind to the promoter and activate or repress transcription. In a promoter bashing assay, specific point mutations or deletions are made in specific regions of the promoter and the transcription of the gene is then measured. The contribution of a region of the promoter can be observed by the level of transcription. If a mutation or deletion changes the level of transcription, then it is known that that region of the promoter may be a binding site or other regulatory element.
The versatility of polymerase chain reaction (PCR) has led to modifications of the basic protocol being used in a large number of variant techniques designed for various purposes. This article summarizes many of the most common variations currently or formerly used in molecular biology laboratories; familiarity with the fundamental premise by which PCR works and corresponding terms and concepts is necessary for understanding these variant techniques.
T7 DNA polymerase is an enzyme used during the DNA replication of the T7 bacteriophage. During this process, the DNA polymerase “reads” existing DNA strands and creates two new strands that match the existing ones. The T7 DNA polymerase requires a host factor, E. coli thioredoxin, in order to carry out its function. This helps stabilize the binding of the necessary protein to the primer-template to improve processivity by more than 100-fold, which is a feature unique to this enzyme. It is a member of the Family A DNA polymerases, which include E. coli DNA polymerase I and Taq DNA polymerase.
Clyde A. Hutchison III is an American biochemist and microbiologist notable for his research on site-directed mutagenesis and synthetic biology. He is Professor Emeritus of Microbiology and Immunology at the University of North Carolina at Chapel Hill, distinguished professor at the J Craig Venter Institute, a member of the National Academy of Sciences, and a fellow of the American Academy of Arts and Sciences.
Reverse genetics is a method in molecular genetics that is used to help understand the function(s) of a gene by analysing the phenotypic effects caused by genetically engineering specific nucleic acid sequences within the gene. The process proceeds in the opposite direction to forward genetic screens of classical genetics. While forward genetics seeks to find the genetic basis of a phenotype or trait, reverse genetics seeks to find what phenotypes are controlled by particular genetic sequences.
Cas9 is a 160 kilodalton protein which plays a vital role in the immunological defense of certain bacteria against DNA viruses and plasmids, and is heavily utilized in genetic engineering applications. Its main function is to cut DNA and thereby alter a cell's genome. The CRISPR-Cas9 genome editing technique was a significant contributor to the Nobel Prize in Chemistry in 2020 being awarded to Emmanuelle Charpentier and Jennifer Doudna.
In molecular biology, mutagenesis is an important laboratory technique whereby DNA mutations are deliberately engineered to produce libraries of mutant genes, proteins, strains of bacteria, or other genetically modified organisms. The various constituents of a gene, as well as its regulatory elements and its gene products, may be mutated so that the functioning of a genetic locus, process, or product can be examined in detail. The mutation may produce mutant proteins with interesting properties or enhanced or novel functions that may be of commercial use. Mutant strains may also be produced that have practical application or allow the molecular basis of a particular cell function to be investigated.
Cassette mutagenesis is a type of site-directed mutagenesis that uses a short, double-stranded oligonucleotide sequence to replace a fragment of target DNA. It uses complementary restriction enzyme digest ends on the target DNA and gene cassette to achieve specificity. It differs from methods that use a single oligonucleotide in that a single gene cassette can contain multiple mutations. Unlike many site-directed mutagenesis methods, cassette mutagenesis also does not involve primer extension by DNA polymerase.
No-SCAR genome editing is an editing method that is able to manipulate the Escherichia coli genome. The system relies on recombineering whereby DNA sequences are combined and manipulated through homologous recombination. No-SCAR is able to manipulate the E. coli genome without the use of the chromosomal markers detailed in previous recombineering methods. Instead, the λ-Red recombination system facilitates donor DNA integration while Cas9 cleaves double-stranded DNA to counter-select against wild-type cells. Although λ-Red and Cas9 genome editing are widely used technologies, the no-SCAR method is novel in combining the two functions; this technique is able to establish point mutations, gene deletions, and short sequence insertions in several genomic loci with increased efficiency and time sensitivity.
Sequence saturation mutagenesis (SeSaM) is a chemo-enzymatic random mutagenesis method applied for the directed evolution of proteins and enzymes. It is one of the most common saturation mutagenesis techniques. In four PCR-based reaction steps, phosphorothioate nucleotides are inserted in the gene sequence, cleaved and the resulting fragments elongated by universal or degenerate nucleotides. These nucleotides are then replaced by standard nucleotides, allowing for a broad distribution of nucleic acid mutations spread over the gene sequence with a preference to transversions and with a unique focus on consecutive point mutations, both difficult to generate by other mutagenesis techniques. The technique was developed by Professor Ulrich Schwaneberg at Jacobs University Bremen and RWTH Aachen University.