Site-specific recombinase technologies are genome engineering tools that depend on recombinase enzymes to replace targeted sections of DNA.
In the late 1980s, gene targeting in murine embryonic stem cells (ESCs) enabled the transmission of mutations into the mouse germ line and emerged as a novel option to study the genetic basis of regulatory networks as they exist in the genome. Still, classical gene targeting proved to be limited in several ways, as gene functions were irreversibly destroyed by the marker gene that had to be introduced for selecting recombinant ESCs. These early steps led to animals in which the mutation was present in all cells of the body from the beginning, leading to complex phenotypes and/or early lethality. There was a clear need for methods to restrict these mutations to specific points in development and specific cell types. This became possible when groups in the USA introduced bacteriophage- and yeast-derived site-specific recombination (SSR) systems into mammalian cells as well as into the mouse. [1] [2] [3]
Common genetic engineering strategies require a permanent modification of the target genome. To this end, great sophistication has to be invested in designing the routes by which transgenes are delivered. Although random integration is still common for biotechnological purposes, it may result in unpredictable gene expression due to variable transgene copy numbers, lack of control over integration sites, and associated mutations. The molecular requirements in the stem cell field are much more stringent. Here, homologous recombination (HR) can, in principle, provide specificity to the integration process, but in eukaryotes it is compromised by an extremely low efficiency. Although meganucleases, zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) are current tools supporting HR, it was the availability of site-specific recombinases (SSRs) that triggered the rational construction of cell lines with predictable properties. Nowadays both technologies, HR and SSR, can be combined in highly efficient "tag-and-exchange technologies". [4]
Many site-specific recombination systems have been identified to perform these DNA rearrangements for a variety of purposes, but nearly all of these belong to either of two families, tyrosine recombinases (YR) and serine recombinases (SR), depending on their mechanism. These two families can mediate up to three types of DNA rearrangements (integration, excision/resolution, and inversion) along different reaction routes based on their origin and architecture. [5]
The founding member of the YR family is the lambda integrase, encoded by bacteriophage λ, which enables the integration of phage DNA into the bacterial genome. A common feature of this class is a conserved tyrosine nucleophile that attacks the scissile DNA phosphate to form a 3'-phosphotyrosine linkage. Early members of the SR family are the closely related resolvases/DNA invertases from the bacterial transposons Tn3 and γδ, which rely on a catalytic serine that attacks the scissile phosphate to form a 5'-phosphoserine linkage. These well-established facts were nevertheless obscured by a good deal of confusion when other members entered the scene, for instance the YR recombinases Cre and Flp (capable of integration, excision/resolution and inversion), which were welcomed as new members of the "integrase family". The converse examples are PhiC31 and related SRs, which were originally introduced as resolvases/invertases although, in the absence of auxiliary factors, integration is their only function. Nowadays the standard activity of each enzyme determines its classification, and the general term "recombinase" is reserved for family members that, per se, cover all three routes: INT, RES and INV.
The table extends the selection of the conventional SSR systems and groups these according to their performance. All of these enzymes recombine two target sites, which are either identical (subfamily A1) or distinct (phage-derived enzymes in A2, B1 and B2). [6] Whereas for A1 these sites have individual designations ("FRT" in the case of Flp recombinase, "loxP" for Cre recombinase), the terms "attP" and "attB" (attachment sites on the phage and bacterial part, respectively) are used in the other cases. In subfamily A1 the sites are short (usually 34 bp) and consist of two (near-)identical 13 bp arms (arrows) flanking an 8 bp spacer (the crossover region, indicated by red line doublets). [7] Note that for Flp an alternative 48 bp site is available with three arms, each accommodating a Flp unit (a so-called "protomer"). attP and attB sites follow similar architectural rules, but here the arms show only partial identity (indicated by the broken lines) and differ between attP and attB. These features account for relevant differences between the subfamilies.
To streamline this chapter, the following sections focus on two recombinases (Flp and Cre) and just one integrase (PhiC31), since together they cover the tools that are at present most widely used for directed genome modifications. This will be done in the framework of the following overview.
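The 34 bp site architecture described above (two 13 bp arms flanking an 8 bp spacer) can be checked computationally. The following is a minimal sketch, not a validated bioinformatics tool: it assumes the commonly quoted minimal loxP and FRT sequences, and the function names are illustrative only. It compares the left arm with the reverse complement of the right arm and tests whether the spacer is asymmetric, which is what gives a site its orientation.

```python
# Commonly quoted minimal sites (assumed here; written as arm + spacer + arm).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"
FRT = "GAAGTTCCTATTC" + "TCTAGAAA" + "GTATAGGAACTTC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_site(site):
    """Split a 34 bp site into (left arm, spacer, right arm)."""
    if len(site) != 34:
        raise ValueError("expected a 34 bp minimal site")
    return site[:13], site[13:21], site[21:]


def arm_mismatches(site):
    """Count positions where the arms deviate from a perfect inverted repeat."""
    left, _, right = split_site(site)
    return sum(a != b for a, b in zip(left, reverse_complement(right)))


def is_directional(site):
    """A site has an orientation if its spacer is not its own reverse complement."""
    _, spacer, _ = split_site(site)
    return spacer != reverse_complement(spacer)


if __name__ == "__main__":
    for name, site in (("loxP", LOXP), ("FRT", FRT)):
        print(name, split_site(site), arm_mismatches(site), is_directional(site))
```

Under these assumed sequences the loxP arms form a perfect inverted repeat, whereas the FRT arms differ at one position, in line with the "(near-)identical" wording above.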
The modes of integration/resolution and inversion (INT/RES and INV) depend on the orientation of the recombinase target sites (RTS), among these pairs of attP and attB. Section C indicates, in a streamlined fashion, how recombinase-mediated cassette exchange (RMCE) can be reached by synchronous double-reciprocal crossovers (rather than by integration followed by resolution). [8] [9]
Tyr recombinases are reversible, while the Ser integrase is unidirectional. Of note is the way reversible Flp (a Tyr recombinase) integration/resolution is modulated by 48 bp (in place of 34 bp minimal) FRT versions: the extra 13 bp arm serves as a Flp "landing path" contributing to the formation of the synaptic complex, both in the context of Flp-INT and Flp-RMCE functions (see the respective equilibrium situations). While it is barely possible to prevent the (entropy-driven) reversion of integration in section A for Cre, and hard to achieve for Flp, RMCE can be driven to completion if the donor plasmid is provided in excess, owing to the bimolecular character of both the forward and the reverse reaction. Placing the two FRT sites in inverse orientation leads to an equilibrium of both orientations of the insert (green arrow). In contrast to Flp, the Ser integrase PhiC31 (bottom representations) leads to unidirectional integration, at least in the absence of a recombination directionality factor (RDF). [10] Unlike Flp-RMCE, which requires two different ("heterospecific") FRT spacer mutants, the reaction partner (attB) of the first reacting attP site is hit arbitrarily, such that there is no control over the direction in which the donor cassette enters the target (cf. the alternative products). Also unlike Flp-RMCE, several distinct RMCE targets cannot be mounted in parallel, owing to the lack of heterospecific (non-cross-interacting) attP/attB combinations.
Cre recombinase (Cre) is able to recombine specific sequences of DNA without the need for cofactors. The enzyme recognizes 34 base pair DNA sequences called loxP ("locus of crossover in phage P1"). Depending on the orientation of target sites with respect to one another, Cre will integrate/excise or invert DNA sequences. Upon the excision (called "resolution" in case of a circular substrate) of a particular DNA region, normal gene expression is considerably compromised or terminated. [11]
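The dependence of the outcome on site orientation can be illustrated with a toy string model. This is a sketch under simplifying assumptions: exactly two perfect loxP sites on one linear molecule, the commonly quoted loxP sequence, and no modelling of reaction kinetics or of the reverse (integration) reaction. The function names are hypothetical.

```python
# Assumed minimal loxP sequence (arm + spacer + arm).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(dna):
    """Return sorted (position, orientation) pairs for loxP on either strand."""
    sites = []
    for orientation, motif in (("+", LOXP), ("-", reverse_complement(LOXP))):
        start = dna.find(motif)
        while start != -1:
            sites.append((start, orientation))
            start = dna.find(motif, start + 1)
    return sorted(sites)


def cre_recombine(dna):
    """Predict the product of one Cre reaction between two loxP sites.

    Returns (mode, linear_product, excised_circle_or_None).
    """
    sites = find_sites(dna)
    if len(sites) != 2:
        raise ValueError("this sketch handles exactly two loxP sites")
    (first, o1), (second, o2) = sites
    n = len(LOXP)
    if o1 == o2:
        # Directly repeated sites: the floxed segment leaves as a circle
        # carrying one site; one site stays behind on the linear molecule.
        linear = dna[: first + n] + dna[second + n:]
        circle = dna[first + n: second + n]
        return "excision", linear, circle
    # Oppositely oriented sites: the intervening segment is inverted in place.
    inner = dna[first + n: second]
    return "inversion", dna[: first + n] + reverse_complement(inner) + dna[second:], None


if __name__ == "__main__":
    floxed = "GGGG" + LOXP + "AAACCC" + LOXP + "TTTT"
    print(cre_recombine(floxed)[0])
    inverted = "GGGG" + LOXP + "AAACCC" + reverse_complement(LOXP) + "TTTT"
    print(cre_recombine(inverted)[0])
```

Applying the function a second time to an inversion product restores the starting sequence, which mirrors the reversibility of the Tyr recombinases discussed above.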
Due to the pronounced resolution activity of Cre, one of its initial applications was the excision of loxP-flanked ("floxed") genes, leading to a cell-specific knockout of the floxed gene once Cre is expressed in the tissue of interest. Current technologies incorporate methods that allow for both spatial and temporal control of Cre activity. A common method for spatial control of the genetic alteration is to select a tissue-specific promoter to drive Cre expression. Placing Cre under the control of such a promoter results in localized, tissue-specific expression. As an example, Leone et al. placed the transcription unit under the control of the regulatory sequences of the myelin proteolipid protein (PLP) gene, leading to induced removal of targeted gene sequences in oligodendrocytes and Schwann cells. [12] The specific DNA fragment recognized by Cre remains intact in cells that do not express the PLP gene; this in turn facilitates empirical observation of the localized effects of genome alterations in the myelin sheath that surrounds nerve fibers in the central nervous system (CNS) and the peripheral nervous system (PNS). [13] Selective Cre expression has been achieved in many other cell types and tissues as well.
To control the timing of the excision reaction, forms of Cre that take advantage of various ligand-binding domains have been developed. One successful strategy for inducing temporally specific Cre activity involves fusing the enzyme with a mutated ligand-binding domain of the human estrogen receptor (ERt). Upon the introduction of tamoxifen (an estrogen receptor antagonist), the Cre-ERt construct translocates into the nucleus and induces the targeted mutation. ERt binds tamoxifen with greater affinity than endogenous estrogens, which allows Cre-ERt to remain cytoplasmic in animals not treated with tamoxifen. The temporal control of SSR activity by tamoxifen permits genetic changes to be induced later in embryogenesis and/or in adult tissues. [12] This allows researchers to bypass embryonic lethality while still investigating the function of targeted genes.
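The combined spatial and temporal control described above amounts to a logical AND: a floxed gene is removed only in cells where the tissue-specific promoter drives Cre-ERt and only once tamoxifen has been given. The following sketch encodes just that logic; the class and function names, and the hepatocyte example, are hypothetical, and real systems show leakiness and incomplete recombination that are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    promoter_active: bool  # does the tissue-specific promoter drive Cre-ERt here?


def gene_deleted(cell, tamoxifen_given, gene_is_floxed=True):
    """Return True if the floxed gene is expected to be excised in this cell."""
    cre_ert_expressed = cell.promoter_active               # spatial control
    cre_ert_nuclear = cre_ert_expressed and tamoxifen_given  # temporal control
    return gene_is_floxed and cre_ert_nuclear


if __name__ == "__main__":
    cells = [
        Cell("oligodendrocyte", promoter_active=True),   # PLP promoter is active
        Cell("hepatocyte", promoter_active=False),       # illustrative non-target cell
    ]
    for tamoxifen in (False, True):
        for cell in cells:
            print(cell.name, "tamoxifen" if tamoxifen else "untreated",
                  gene_deleted(cell, tamoxifen))
```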
Recent extensions of these general concepts led to generating the "Cre-zoo", i.e. collections of hundreds of mouse strains for which defined genes can be deleted by targeted Cre expression. [3]
In its natural host (S. cerevisiae) the Flp/FRT system enables replication of a "2μ plasmid" by the inversion of a segment that is flanked by two identical but oppositely oriented FRT sites ("flippase" activity). This inversion changes the relative orientation of replication forks within the plasmid, enabling "rolling circle" amplification of the circular 2μ entity before the multimeric intermediates are resolved to release multiple monomeric products. Whereas 34 bp minimal FRT sites favor excision/resolution to a similar extent as the analogous loxP sites for Cre, the natural, more extended 48 bp FRT variants enable a higher degree of integration, while overcoming certain promiscuous interactions as described for phage enzymes such as Cre [5] and PhiC31. [6] An additional advantage is that simple rules can be applied to generate heterospecific FRT sites, which undergo crossovers with identical partners but not with wild-type FRTs. Since 1994, these facts have enabled the development and continuous refinement of recombinase-mediated cassette exchange (RMCE) strategies, which permit the clean exchange of a target cassette for an incoming donor cassette. [6]
Building on RMCE technology, a resource of pre-characterized ES cell strains that lends itself to further elaboration has evolved in the framework of the EUCOMM (European Conditional Mouse Mutagenesis) program. It relies on the now established Cre- and/or Flp-based "FlExing" (Flp-mediated excision/inversion) setups, [6] which involve the excision and inversion activities. Initiated in 2005, this project focused first on saturation mutagenesis to enable complete functional annotation of the mouse genome (coordinated by the International Knockout Mouse Consortium, IKMC), with the ultimate goal of having all protein-coding genes mutated via gene trapping and gene targeting in murine ES cells. [14] These efforts mark the culmination of various "tag-and-exchange" strategies, which are dedicated to tagging a distinct genomic site such that the "tag" can serve as an address to introduce novel (or alter existing) genetic information. The tagging step per se may address certain classes of integration sites by exploiting the integration preferences of retroviruses or even site-specific integrases like PhiC31, both of which act in an essentially unidirectional fashion.
The traditional, laborious "tag-and-exchange" procedures relied on two successive homologous recombination (HR) steps: the first ("HR1") introduced a tag consisting of a selection marker gene, and the second ("HR2") replaced the marker with the gene of interest ("GOI"). In the first ("knock-out") reaction the gene was tagged with a selectable marker, typically by insertion of a hygtk ([+/-]) cassette providing hygromycin resistance. In the following "knock-in" step, the tagged genomic sequence was replaced by homologous genomic sequences carrying certain mutations. Cell clones could then be isolated by their resistance to ganciclovir due to loss of the HSV-tk gene ("negative selection"). This conventional two-step tag-and-exchange procedure [15] could be streamlined after the advent of RMCE, which could take over the knock-in step and make it more efficient.
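The cassette exchange between two heterospecific sites can be sketched with constructs represented as lists of labelled elements. This is an illustrative model only: the labels "F" and "F3" stand for a wild-type FRT and a spacer mutant that does not cross-react with it, the cassette names are placeholders, and the function `rmce` is hypothetical. Each site is assumed to recombine only with its identical partner, so the segment between the two sites is swapped in a double-reciprocal crossover.

```python
def rmce(target, donor, site_a="F", site_b="F3"):
    """Exchange the cassettes flanked by site_a ... site_b in target and donor.

    Both constructs are lists of element labels. Returns the pair
    (new_target, new_donor). A ValueError is raised if a site is missing
    or the sites are not in the expected order.
    """
    if site_a == site_b:
        raise ValueError("heterospecific sites must differ")

    def bounds(construct):
        i, j = construct.index(site_a), construct.index(site_b)
        if i >= j:
            raise ValueError("site_a must precede site_b")
        return i, j

    ti, tj = bounds(target)
    di, dj = bounds(donor)
    # The flanking sites stay in place; only the segment between them is swapped.
    new_target = target[: ti + 1] + donor[di + 1: dj] + target[tj:]
    new_donor = donor[: di + 1] + target[ti + 1: tj] + donor[dj:]
    return new_target, new_donor


if __name__ == "__main__":
    tagged_locus = ["5' flank", "F", "hygtk", "F3", "3' flank"]
    donor_plasmid = ["F", "GOI", "F3"]
    exchanged, spent_donor = rmce(tagged_locus, donor_plasmid)
    print(exchanged)
    print(spent_donor)
```

Because the sites are regenerated unchanged, feeding the two products back into the function restores the starting constructs. This reflects the reversibility of the reaction and is why an excess of donor plasmid is used to drive the exchange.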
Without much doubt, Ser integrases are the current tools of choice for integrating transgenes into a restricted number of well-understood genomic acceptor sites that mostly (but not always) mimic the phage attP site in that they attract an attB-containing donor vector. At this time the most prominent member is PhiC31-INT with proven potential in the context of human and mouse genomes.
In contrast to the Tyr recombinases above, PhiC31-INT as such acts in a unidirectional manner, firmly locking in the donor vector at a genomically anchored target. An obvious advantage of this system is that it can rely on unmodified, native attP (acceptor) and attB (donor) sites. Additional benefits (together with certain complications) may arise from the fact that mouse and human genomes per se contain a limited number of endogenous targets (so-called "attP pseudosites"). Available information suggests that, owing to its considerable DNA sequence requirements, the integrase recognizes fewer sites than retroviral or even transposase-based integration systems. This makes it a superior vehicle for the transport and insertion of transgenes at a number of well-established genomic sites, some of which have so-called "safe-harbor" properties. [10]
Because recombination follows a specific (attP x attB) route, RMCE becomes possible without requiring synthetic, heterospecific att sites. This obvious advantage, however, comes at the expense of certain shortcomings, such as a lack of control over the kind or directionality of the entering (donor) cassette. [6] Further restrictions are imposed by the fact that irreversibility does not permit standard multiplexing RMCE setups, including "serial RMCE" reactions, i.e., repeated cassette exchanges at a given genomic locus.
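The unidirectional character of the integrase can be sketched as a small state model in which attP x attB recombination yields the hybrid sites attL and attR, and these are not recombined again unless a recombination directionality factor (RDF) is supplied. The list representation, the function names and the left/right assignment of attL and attR are illustrative assumptions, not a description of any particular vector system.

```python
def integrate(genome, donor):
    """Insert a circular attB donor at a genomic attP site.

    genome and donor are lists of element labels; the donor is treated as
    a circle that is opened at its attB site.
    """
    p = genome.index("attP")
    b = donor.index("attB")
    payload = donor[b + 1:] + donor[:b]  # read the circle starting after attB
    return genome[:p] + ["attL"] + payload + ["attR"] + genome[p + 1:]


def excise(genome, rdf_present=False):
    """Attempt the reverse reaction (attL x attR).

    Without an RDF the integrated state is returned unchanged, with no circle.
    """
    if not rdf_present:
        return genome, None
    left, right = genome.index("attL"), genome.index("attR")
    circle = ["attB"] + genome[left + 1: right]
    return genome[:left] + ["attP"] + genome[right + 1:], circle


if __name__ == "__main__":
    locus = ["chr_left", "attP", "chr_right"]
    plasmid = ["attB", "GOI", "marker"]
    integrated = integrate(locus, plasmid)
    print(integrated)
    print(excise(integrated))                    # stays integrated
    print(excise(integrated, rdf_present=True))  # reversal only with an RDF
```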
Annotation of the human and mouse genomes has led to the identification of >20 000 protein-coding genes and >3 000 noncoding RNA genes, which guide the development of the organism from fertilization through embryogenesis to adult life. Although dramatic progress is noted, the relevance of rare gene variants has remained a central topic of research.
As one of the most important platforms for studying vertebrate gene functions on a large scale, genome-wide genetic resources of mutant murine ES cells have been established. To this end, four international programs aimed at saturation mutagenesis of the mouse genome have been founded in Europe and North America (EUCOMM, KOMP, NorCOMM, and TIGM). Coordinated by the International Knockout Mouse Consortium (IKMC), these ES cell repositories are available for exchange between international research units. Present resources comprise mutations in 11 539 unique genes, 4 414 of these conditional. [14]
The relevant technologies have now reached a level permitting their extension to other mammalian species and to human stem cells, most prominently those with an iPS (induced pluripotent) status.
Retroviral integrase (IN) is an enzyme produced by a retrovirus that integrates its genetic information into that of the host cell it infects. Retroviral INs are not to be confused with phage integrases (recombinases) used in biotechnology, such as λ phage integrase, as discussed in site-specific recombination.
Gene knockouts are a widely used genetic engineering technique that involves the targeted removal or inactivation of a specific gene within an organism's genome. This can be done through a variety of methods, including homologous recombination, CRISPR-Cas9, and TALENs.
Integrons are genetic mechanisms that allow bacteria to adapt and evolve rapidly through the stockpiling and expression of new genes. These genes are embedded in a specific genetic structure called a gene cassette, which generally carries one promoterless open reading frame (ORF) together with a recombination site (attC). Integron cassettes are incorporated into the attI site of the integron platform by site-specific recombination reactions mediated by the integrase.
A transposase is any of a class of enzymes capable of binding to the end of a transposon and catalysing its movement to another part of a genome, typically by a cut-and-paste mechanism or a replicative mechanism, in a process known as transposition. The word "transposase" was first coined by the individuals who cloned the enzyme required for transposition of the Tn3 transposon. The existence of transposons was postulated in the late 1940s by Barbara McClintock, who was studying the inheritance of maize, but the actual molecular basis for transposition was described by later groups. McClintock discovered that some segments of chromosomes changed their position, jumping between different loci or from one chromosome to another. The repositioning of these transposons allowed other genes for pigment to be expressed. Transposition in maize causes changes in color; however, in other organisms, such as bacteria, it can cause antibiotic resistance. Transposition is also important in creating genetic diversity within species and generating adaptability to changing living conditions.
A transgene is a gene that has been transferred naturally, or by any of a number of genetic engineering techniques, from one organism to another. The introduction of a transgene, in a process known as transgenesis, has the potential to change the phenotype of an organism. Transgene describes a segment of DNA containing a gene sequence that has been isolated from one organism and is introduced into a different organism. This non-native segment of DNA may either retain the ability to produce RNA or protein in the transgenic organism or alter the normal function of the transgenic organism's genetic code. In general, the DNA is incorporated into the organism's germ line. For example, in higher vertebrates this can be accomplished by injecting the foreign DNA into the nucleus of a fertilized ovum. This technique is routinely used to introduce human disease genes or other genes of interest into strains of laboratory mice to study the function or pathology involved with that particular gene.
Cre-Lox recombination is a site-specific recombinase technology, used to carry out deletions, insertions, translocations and inversions at specific sites in the DNA of cells. It allows the DNA modification to be targeted to a specific cell type or be triggered by a specific external stimulus. It is implemented both in eukaryotic and prokaryotic systems. The Cre-lox recombination system has been particularly useful to help neuroscientists to study the brain in which complex cell types and neural circuits come together to generate cognition and behaviors. NIH Blueprint for Neuroscience Research has created several hundreds of Cre driver mouse lines which are currently used by the worldwide neuroscience community.
Cre recombinase is a tyrosine recombinase enzyme derived from the P1 bacteriophage. The enzyme uses a topoisomerase I-like mechanism to carry out site-specific recombination events. It is a member of the integrase family of site-specific recombinases and catalyses site-specific recombination between two DNA recognition sites (loxP sites). The 34 base pair (bp) loxP recognition site consists of two 13 bp palindromic sequences that flank an 8 bp spacer region. The products of Cre-mediated recombination at loxP sites depend on the location and relative orientation of the loxP sites. Two separate DNA species that both contain loxP sites can undergo fusion as the result of Cre-mediated recombination. DNA sequences found between two loxP sites are said to be "floxed". In this case the products of Cre-mediated recombination depend on the orientation of the loxP sites: DNA between two loxP sites oriented in the same direction is excised as a circular loop of DNA, whereas intervening DNA between two oppositely oriented loxP sites is inverted. The enzyme requires no additional cofactors or accessory proteins for its function.
In biology, a gene cassette is a type of mobile genetic element that contains a gene and a recombination site. Each cassette usually contains a single gene and tends to be very small; on the order of 500–1,000 base pairs. They may exist incorporated into an integron or freely as circular DNA. Gene cassettes can move around within an organism's genome or be transferred to another organism in the environment via horizontal gene transfer. These cassettes often carry antibiotic resistance genes. An example would be the kanMX cassette which confers kanamycin resistance upon bacteria.
Recombinases are genetic recombination enzymes.
In genetics, Flp-FRT recombination is a site-directed recombination technology, increasingly used to manipulate an organism's DNA under controlled conditions in vivo. It is analogous to Cre-lox recombination but involves the recombination of sequences between short flippase recognition target (FRT) sites by the recombinase flippase (Flp) derived from the 2 µ plasmid of baker's yeast Saccharomyces cerevisiae.
P1 is a temperate bacteriophage that infects Escherichia coli and some other bacteria. When undergoing a lysogenic cycle the phage genome exists as a plasmid in the bacterium unlike other phages that integrate into the host DNA. P1 has an icosahedral head containing the DNA attached to a contractile tail with six tail fibers. The P1 phage has gained research interest because it can be used to transfer DNA from one bacterial cell to another in a process known as transduction. As it replicates during its lytic cycle it captures fragments of the host chromosome. If the resulting viral particles are used to infect a different host the captured DNA fragments can be integrated into the new host's genome. This method of in vivo genetic engineering was widely used for many years and is still used today, though to a lesser extent. P1 can also be used to create the P1-derived artificial chromosome cloning vector which can carry relatively large fragments of DNA. P1 encodes a site-specific recombinase, Cre, that is widely used to carry out cell-specific or time-specific DNA recombination by flanking the target DNA with loxP sites.
Site-specific recombination, also known as conservative site-specific recombination, is a type of genetic recombination in which DNA strand exchange takes place between segments possessing at least a certain degree of sequence homology. Enzymes known as site-specific recombinases (SSRs) perform rearrangements of DNA segments by recognizing and binding to short, specific DNA sequences (sites), at which they cleave the DNA backbone, exchange the two DNA helices involved, and rejoin the DNA strands. In some cases the presence of a recombinase enzyme and the recombination sites is sufficient for the reaction to proceed; in other systems a number of accessory proteins and/or accessory sites are required. Many different genome modification strategies, among these recombinase-mediated cassette exchange (RMCE), an advanced approach for the targeted introduction of transcription units into predetermined genomic loci, rely on SSRs.
Gene targeting is a biotechnological tool used to change the DNA sequence of an organism. It is based on the natural DNA-repair mechanism of Homology Directed Repair (HDR), including Homologous Recombination. Gene targeting can be used to make a range of sizes of DNA edits, from larger DNA edits such as inserting entire new genes into an organism, through to much smaller changes to the existing DNA such as a single base-pair change. Gene targeting relies on the presence of a repair template to introduce the user-defined edits to the DNA. The user will design the repair template to contain the desired edit, flanked by DNA sequence corresponding (homologous) to the region of DNA that the user wants to edit; hence the edit is targeted to a particular genomic region. In this way Gene Targeting is distinct from natural homology-directed repair, during which the ‘natural’ DNA repair template of the sister chromatid is used to repair broken DNA. The alteration of DNA sequence in an organism can be useful in both a research context – for example to understand the biological role of a gene – and in biotechnology, for example to alter the traits of an organism.
Conditional gene knockout is a technique used to eliminate a specific gene in a certain tissue, such as the liver. This technique is useful for studying the role of individual genes in living organisms. It differs from traditional gene knockout in that the gene is inactivated at a specific time rather than being absent from the beginning of life. Using the conditional gene knockout technique eliminates many of the side effects of traditional gene knockout. In traditional gene knockout, embryonic death from a gene mutation can occur, and this prevents scientists from studying the gene in adults. Some tissues cannot be studied properly in isolation, so the gene must be inactive in a certain tissue while remaining active in others. With this technology, scientists are able to knock out genes at a specific stage in development and study how the knockout of a gene in one tissue affects the same gene in other tissues.
RMCE is a procedure in reverse genetics allowing the systematic, repeated modification of higher eukaryotic genomes by targeted integration, based on the features of site-specific recombination processes (SSRs). For RMCE, this is achieved by the clean exchange of a preexisting gene cassette for an analogous cassette carrying the "gene of interest" (GOI).
The Gateway cloning method, invented and commercialized by Invitrogen in the late 1990s, is a cloning method based on the integration and excision recombination reactions that take place when bacteriophage lambda infects bacteria. This technology provides a fast and highly efficient way to transfer DNA sequences into multi-vector systems for functional analysis and protein expression using Gateway att sites and two proprietary enzyme mixes called BP Clonase and LR Clonase. In vivo, these recombination reactions are facilitated by the recombination of attachment sites from the lambda phage chromosome (attP) and the bacterial chromosome (attB). As a result of recombination between the attP and attB sites, the phage integrates into the bacterial genome flanked by two new recombination sites (attL and attR). Under specific circumstances, recombination between the attL and attR sites removes the phage from the bacterial chromosome and regenerates the attP and attB sites.
Bacteriophage P2, scientific name Escherichia virus P2, is a temperate phage that infects E. coli. It is a tailed virus with a contractile sheath and is thus classified in the genus Peduovirus, subfamily Peduovirinae, family Myoviridae within order Caudovirales. This genus of viruses includes many P2-like phages as well as the satellite phage P4.
In genetics, floxing refers to the sandwiching of a DNA sequence between two loxP sites. The term is constructed from the phrase "flanking/flanked by loxP". Recombination between loxP sites is catalysed by Cre recombinase. Floxing a gene allows it to be deleted, translocated or inverted in a process called Cre-Lox recombination. The floxing of genes is essential in the development of scientific model systems, as it allows researchers to alter gene expression in a spatially and temporally controlled manner. Moreover, animals such as mice can be used as models to study human disease. The Cre-lox system can therefore be used in mice to manipulate gene expression in order to study human diseases and drug development. For example, using the Cre-lox system, researchers can study oncogenes and tumor suppressor genes and their role in the development and progression of cancer in mouse models.
Susan M. Dymecki is an American geneticist and neuroscientist and director of the Biological and Biomedical Sciences PhD Program at Harvard University. Dymecki is also a professor in the Department of Genetics and the principal investigator of the Dymecki Lab at Harvard. Her lab characterizes the development and function of unique populations of serotonergic neurons in the mouse brain. To enable this functional dissection, Dymecki has pioneered several transgenic tools for probing neural circuit development and function. Dymecki also competed internationally as an ice dancer, placing 7th in the 1980 U.S. Figure Skating Championships.
Genome editing of synthetic target arrays for lineage tracing (GESTALT) is a method used to determine the developmental lineages of cells in multicellular systems. GESTALT involves introducing a small DNA barcode that contains regularly spaced CRISPR/Cas9 target sites into the genomes of progenitor cells. Alongside the barcode, Cas9 and sgRNA are introduced into the cells. Mutations in the barcode accumulate during the course of cell divisions and the unique combination of mutations in a cell's barcode can be determined by DNA or RNA sequencing to link it to a developmental lineage.