In genetics, an insertion (also called an insertion mutation) is the addition of one or more nucleotide base pairs into a DNA sequence. This can often happen in microsatellite regions because of DNA polymerase slippage. Insertions can range in size from a single base pair incorrectly inserted into a DNA sequence to a section of one chromosome inserted into another. The mechanism of the smallest, single-base insertion mutations is believed to be base-pair separation between the template and primer strands followed by non-neighbor base stacking, which can occur locally within the DNA polymerase active site. [1] At the chromosome level, an insertion refers to the insertion of a larger sequence into a chromosome. This can happen due to unequal crossover during meiosis.
N region addition is the addition of non-coded nucleotides during recombination by terminal deoxynucleotidyl transferase.
P nucleotide insertion is the insertion of palindromic sequences encoded by the ends of the recombining gene segments.
Trinucleotide repeats are sometimes classified as insertion mutations [2] [3] and sometimes as a separate class of mutations. [4]
Zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and CRISPR gene editing are the three main methods that earlier research used to achieve gene insertion. CRISPR/Cas tools have since become one of the most widely used methods in current research.
Various systems based on CRISPR/Cas tools have been developed to achieve specific functions. For example, one strategy is the double-strand nuclease cutting system, which uses the standard Cas9 protein with a single guide RNA (sgRNA) and then achieves the gene insertion through end joining or, in dividing cells, through the DNA repair system. [5] Another example is the prime editing system, which uses a Cas9 nickase and a prime editing guide RNA (pegRNA) carrying the new genetic information to be inserted. [5]
One limitation of current technology is that the size of DNA that can be precisely inserted is not large enough [6] to meet the demands of genome research. RNA-guided DNA transposition is an emerging approach to this problem. [7] More efficient methods are expected to be developed and applied in genome engineering.
Insertions can be particularly hazardous if they occur in an exon, the amino acid coding region of a gene. A frameshift mutation, an alteration in the normal reading frame of a gene, results if the number of inserted nucleotides is not divisible by three, i.e., the number of nucleotides per codon. Frameshift mutations alter all the amino acids encoded by the gene downstream of the mutation. Usually, an insertion and the resulting frameshift cause translation of the gene to encounter a premature stop codon, which ends translation and produces a truncated protein. Transcripts carrying the frameshift mutation may also be degraded through nonsense-mediated decay during translation, so that no protein product results. If translated, the truncated proteins are frequently unable to function properly or at all and can result in any number of genetic disorders, depending on the gene in which the insertion occurs. [8]
In-frame insertions occur when the number of inserted nucleotides is divisible by three, so the reading frame is not altered by the insertion. The reading frame remains intact and translation will most likely run to completion if the inserted nucleotides do not code for a stop codon. However, the finished protein will contain one or more new amino acids, depending on the size of the insertion, which may affect the function of the protein.
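The difference between a frameshift and an in-frame insertion can be illustrated with a short Python sketch. The toy coding sequence `WILD_TYPE`, the helper names, and the insertion positions below are hypothetical and chosen only for illustration; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, built compactly: codons enumerated with bases in
# T, C, A, G order map onto this 64-character string ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def insert_bases(cds: str, position: int, inserted: str) -> str:
    """Return the sequence with `inserted` placed before index `position`."""
    return cds[:position] + inserted + cds[position:]


def classify_insertion(inserted: str) -> str:
    """An insertion keeps the reading frame only if its length is a multiple of 3."""
    return "in-frame" if len(inserted) % 3 == 0 else "frameshift"


# Hypothetical toy gene: ATG GCT GAC AAA GGT TTC TAA
WILD_TYPE = "ATGGCTGACAAAGGTTTCTAA"

print(translate(WILD_TYPE))                          # MADKGF
print(translate(insert_bases(WILD_TYPE, 3, "T")))    # MC (premature stop at TGA)
print(translate(insert_bases(WILD_TYPE, 3, "CAT")))  # MHADKGF (one extra residue)
```

In this example the single-base insertion shifts the frame and exposes a stop codon after two residues, whereas the three-base insertion adds one amino acid and leaves the rest of the protein unchanged.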
Gene knockouts are a widely used genetic engineering technique that involves the targeted removal or inactivation of a specific gene within an organism's genome. This can be done through a variety of methods, including homologous recombination, CRISPR-Cas9, and TALENs.
Gene knockdown is an experimental technique by which the expression of one or more of an organism's genes is reduced. The reduction can occur either through genetic modification or by treatment with a reagent such as a short DNA or RNA oligonucleotide that has a sequence complementary to either a gene or an mRNA transcript.
A frameshift mutation is a genetic mutation caused by indels (insertions or deletions) of a number of nucleotides that is not divisible by three in a DNA sequence. Due to the triplet nature of gene expression by codons, the insertion or deletion can change the reading frame, resulting in a completely different translation from the original. The earlier in the sequence the deletion or insertion occurs, the more altered the protein. A frameshift mutation is not the same as a single-nucleotide polymorphism in which a nucleotide is replaced, rather than inserted or deleted. A frameshift mutation will in general cause the reading of the codons after the mutation to code for different amino acids. The frameshift mutation will also alter the first stop codon encountered in the sequence. The polypeptide being created could be abnormally short or abnormally long, and will most likely not be functional.
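The observation that earlier frameshifts alter more of the protein follows from simple codon arithmetic. The sketch below uses a hypothetical helper, `codons_in_shifted_frame`, and assumes a coding sequence whose length is a multiple of three. It gives an upper bound only, since a premature stop codon may end translation before all shifted codons are read.

```python
def codons_in_shifted_frame(cds_length: int, indel_position: int) -> int:
    """Count the original codons at or after a frameshifting indel.

    cds_length     -- length of the coding sequence in nucleotides
    indel_position -- 0-based index at which the indel occurs
    """
    if cds_length % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not 0 <= indel_position <= cds_length:
        raise ValueError("indel position lies outside the coding sequence")
    total_codons = cds_length // 3
    # Codons before this index are still read in the original frame.
    first_shifted_codon = indel_position // 3
    return total_codons - first_shifted_codon


# Hypothetical 300-nucleotide coding sequence (100 codons).
print(codons_in_shifted_frame(300, 30))   # 90 codons read in the shifted frame
print(codons_in_shifted_frame(300, 270))  # 10 codons read in the shifted frame
```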
A germline mutation, or germinal mutation, is any detectable variation within germ cells. Mutations in these cells are the only mutations that can be passed on to offspring, which happens when a mutated sperm or oocyte takes part in forming a zygote. After this fertilization event occurs, germ cells divide rapidly to produce all of the cells in the body, causing this mutation to be present in every somatic and germline cell in the offspring; this is also known as a constitutional mutation. Germline mutation is distinct from somatic mutation.
CRISPR is a family of DNA sequences found in the genomes of prokaryotic organisms such as bacteria and archaea. These sequences are derived from DNA fragments of bacteriophages that had previously infected the prokaryote. They are used to detect and destroy DNA from similar bacteriophages during subsequent infections. Hence these sequences play a key role in the antiviral defense system of prokaryotes and provide a form of acquired immunity. CRISPR is found in approximately 50% of sequenced bacterial genomes and nearly 90% of sequenced archaea.
Nonsense-mediated mRNA decay (NMD) is a surveillance pathway that exists in all eukaryotes. Its main function is to reduce errors in gene expression by eliminating mRNA transcripts that contain premature stop codons. Translation of these aberrant mRNAs could, in some cases, lead to deleterious gain-of-function or dominant-negative activity of the resulting proteins.
Guide RNA (gRNA) or single guide RNA (sgRNA) is a short sequence of RNA that functions as a guide for the Cas9 endonuclease or other Cas proteins that cut double-stranded DNA and can thereby be used for gene editing. In bacteria and archaea, gRNAs are a part of the CRISPR-Cas system that serves as an adaptive immune defense that protects the organism from viruses. Here the short gRNAs serve as detectors of foreign DNA and direct the Cas enzymes that degrade the foreign nucleic acid.
Missense mRNA is a messenger RNA bearing one or more mutated codons that yield polypeptides with an amino acid sequence different from the wild-type or naturally occurring polypeptide. Missense mRNA molecules are created when template DNA strands or the mRNA strands themselves undergo a missense mutation in which a protein coding sequence is mutated and an altered amino acid sequence is coded for.
In molecular biology, an insert is a piece of DNA that is inserted into a larger DNA vector by a recombinant DNA technique, such as ligation or recombination. This allows it to be multiplied, selected, further manipulated or expressed in a host organism.
Genome editing, or genome engineering, or gene editing, is a type of genetic engineering in which DNA is inserted, deleted, modified or replaced in the genome of a living organism. Unlike early genetic engineering techniques that randomly insert genetic material into a host genome, genome editing targets the insertions to site-specific locations. The basic mechanism involved in genetic manipulations through programmable nucleases is the recognition of target genomic loci and binding of effector DNA-binding domain (DBD), double-strand breaks (DSBs) in target DNA by the restriction endonucleases, and the repair of DSBs through homology-directed recombination (HDR) or non-homologous end joining (NHEJ).
Genetic engineering techniques allow the modification of animal and plant genomes. Techniques have been devised to insert, delete, and modify DNA at multiple levels, ranging from a specific base pair in a specific gene to entire genes. There are a number of steps that are followed before a genetically modified organism (GMO) is created. Genetic engineers must first choose what gene they wish to insert, modify, or delete. The gene must then be isolated and incorporated, along with other genetic elements, into a suitable vector. This vector is then used to insert the gene into the host genome, creating a transgenic or edited organism.
Cas9 is a 160 kilodalton protein which plays a vital role in the immunological defense of certain bacteria against DNA viruses and plasmids, and is heavily utilized in genetic engineering applications. Its main function is to cut DNA and thereby alter a cell's genome. The CRISPR-Cas9 genome editing technique was a significant contributor to the Nobel Prize in Chemistry in 2020 being awarded to Emmanuelle Charpentier and Jennifer Doudna.
In molecular biology, mutagenesis is an important laboratory technique whereby DNA mutations are deliberately engineered to produce libraries of mutant genes, proteins, strains of bacteria, or other genetically modified organisms. The various constituents of a gene, as well as its regulatory elements and its gene products, may be mutated so that the functioning of a genetic locus, process, or product can be examined in detail. The mutation may produce mutant proteins with interesting properties or enhanced or novel functions that may be of commercial use. Mutant strains may also be produced that have practical application or allow the molecular basis of a particular cell function to be investigated.
A protospacer adjacent motif (PAM) is a 2–6-base pair DNA sequence immediately following the DNA sequence targeted by the Cas9 nuclease in the CRISPR bacterial adaptive immune system. The PAM is a component of the invading virus or plasmid, but is not found in the bacterial host genome and hence is not a component of the bacterial CRISPR locus. Cas9 will not successfully bind to or cleave the target DNA sequence if it is not followed by the PAM sequence. PAM is an essential targeting component which distinguishes bacterial self from non-self DNA, thereby preventing the CRISPR locus from being targeted and destroyed by the CRISPR-associated nuclease.
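The PAM requirement can be sketched as a simple sequence scan. The example below assumes the 5'-NGG-3' PAM of Streptococcus pyogenes Cas9 and a 20-nucleotide protospacer; both values, and the function name `find_spcas9_targets`, are assumptions made for illustration rather than details given above. Only the supplied strand is scanned, so the opposite strand would need to be searched separately using the reverse complement.

```python
import re


def find_spcas9_targets(dna: str, protospacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, pam) for each site on the given strand
    where a protospacer is immediately followed by an NGG PAM."""
    dna = dna.upper()
    # The lookahead matches without consuming characters, so overlapping
    # candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Hypothetical sequence: a 20-nt protospacer followed by the PAM "AGG".
example = "ACGTACGTACGTACGTACGT" + "AGG" + "CC"
for start, protospacer, pam in find_spcas9_targets(example):
    print(start, protospacer, pam)  # 0 ACGTACGTACGTACGTACGT AGG
```

A sequence with no NGG motif in a suitable position yields an empty list, which mirrors the statement that Cas9 will not cleave a target that lacks the PAM.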
Cas12a is a subtype of Cas12 proteins and an RNA-guided endonuclease that forms part of the CRISPR system in some bacteria and archaea. In CRISPR systems, Cas12a serves to destroy the genetic material of viruses and other foreign DNA, thereby protecting the cell from infection. Like other Cas enzymes, Cas12a binds to an RNA to target nucleic acid in a specific and programmable manner. In the host organism, the crRNA contains a constant region that is recognized by the Cas12a protein and a spacer region that is complementary to a piece of foreign nucleic acid that previously infected the cell.
No-SCAR genome editing is an editing method that is able to manipulate the Escherichia coli genome. The system relies on recombineering whereby DNA sequences are combined and manipulated through homologous recombination. No-SCAR is able to manipulate the E. coli genome without the use of the chromosomal markers detailed in previous recombineering methods. Instead, the λ-Red recombination system facilitates donor DNA integration while Cas9 cleaves double-stranded DNA to counter-select against wild-type cells. Although λ-Red and Cas9 genome editing are widely used technologies, the no-SCAR method is novel in combining the two functions; this technique is able to establish point mutations, gene deletions, and short sequence insertions in several genomic loci with increased efficiency and time sensitivity.
CRISPR-Display (CRISP-Disp) is a modification of the CRISPR/Cas9 system for genome editing. The CRISPR/Cas9 system uses a short guide RNA (sgRNA) sequence to direct a Streptococcus pyogenes Cas9 nuclease, acting as a programmable DNA binding protein, to cleave DNA at a site of interest.
Off-target genome editing refers to nonspecific and unintended genetic modifications that can arise through the use of engineered nuclease technologies such as clustered, regularly interspaced, short palindromic repeats (CRISPR)-Cas9, transcription activator-like effector nucleases (TALEN), meganucleases, and zinc finger nucleases (ZFN). These tools use different mechanisms to bind a predetermined sequence of DNA (“target”), which they cleave, creating a double-stranded chromosomal break (DSB) that summons the cell's DNA repair mechanisms and leads to site-specific modifications. If these complexes bind somewhere other than the target, often as a result of homologous sequences and/or mismatch tolerance, they will create off-target DSBs and cause non-specific genetic modifications. Specifically, off-target effects consist of unintended point mutations, deletions, insertions, inversions, and translocations.
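The role of mismatch tolerance can be illustrated by listing every position in a sequence that resembles a guide closely enough to be a candidate binding site. This is a minimal sketch with hypothetical names (`hamming`, `candidate_sites`) and toy sequences. It counts substitutions only on one strand and ignores PAM requirements, so it should not be read as how any particular off-target prediction tool works.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def candidate_sites(genome: str, guide: str, max_mismatches: int = 3) -> list[tuple[int, int]]:
    """Return (position, mismatches) for each window of `genome` that differs
    from `guide` at no more than `max_mismatches` positions."""
    k = len(guide)
    hits = []
    for i in range(len(genome) - k + 1):
        mismatches = hamming(genome[i:i + k], guide)
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


# Toy example: a perfect match at index 2 and a one-mismatch site at index 12.
guide = "ACGTACGT"
genome = "TTACGTACGTTTACGAACGTTT"
print(candidate_sites(genome, guide, max_mismatches=1))  # [(2, 0), (12, 1)]
```

Sites reported with zero mismatches correspond to the intended target, while those with one or more mismatches are the kind of near-identical sequences at which off-target cleavage could occur.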
CRISPR gene editing is a genetic engineering technique in molecular biology by which the genomes of living organisms may be modified. It is based on a simplified version of the bacterial CRISPR-Cas9 antiviral defense system. By delivering the Cas9 nuclease complexed with a synthetic guide RNA (gRNA) into a cell, the cell's genome can be cut at a desired location, allowing existing genes to be removed and/or new ones added in vivo.
Prime editing is a 'search-and-replace' genome editing technology in molecular biology by which the genome of living organisms may be modified. The technology directly writes new genetic information into a targeted DNA site. It uses a fusion protein, consisting of a catalytically impaired Cas9 endonuclease fused to an engineered reverse transcriptase enzyme, and a prime editing guide RNA (pegRNA), capable of identifying the target site and providing the new genetic information to replace the target DNA nucleotides. It mediates targeted insertions, deletions, and base-to-base conversions without the need for double strand breaks (DSBs) or donor DNA templates.