Site-specific recombination, also known as conservative site-specific recombination, is a type of genetic recombination in which DNA strand exchange takes place between segments possessing at least a certain degree of sequence homology. [1] [2] [3] Enzymes known as site-specific recombinases (SSRs) perform rearrangements of DNA segments by recognizing and binding to short, specific DNA sequences (sites), at which they cleave the DNA backbone, exchange the two DNA helices involved, and rejoin the DNA strands. In some cases the presence of a recombinase enzyme and the recombination sites is sufficient for the reaction to proceed; in other systems a number of accessory proteins and/or accessory sites are required. Many different genome modification strategies, among these recombinase-mediated cassette exchange (RMCE), an advanced approach for the targeted introduction of transcription units into predetermined genomic loci, rely on SSRs.
Site-specific recombination systems are highly specific, fast, and efficient, even when faced with complex eukaryotic genomes. [4] They are employed naturally in a variety of cellular processes, including bacterial genome replication, differentiation and pathogenesis, and movement of mobile genetic elements. [5] For the same reasons, they present a potential basis for the development of genetic engineering tools. [6]
Recombination sites are typically between 30 and 200 nucleotides in length and consist of two motifs with a partial inverted-repeat symmetry, to which the recombinase binds, and which flank a central crossover sequence at which the recombination takes place. The pairs of sites between which the recombination occurs are usually identical, but there are exceptions (e.g. attP and attB of λ integrase). [7]
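The site architecture described above (two recombinase-binding motifs in inverted-repeat orientation flanking a central crossover sequence) can be illustrated with a minimal Python sketch. The sequence used here is the commonly quoted 34 bp loxP site, written as 13 bp + 8 bp + 13 bp; it is supplied as an assumption for illustration rather than taken from this text, and the helper names (`split_site`, `arm_symmetry`) are hypothetical.

```python
# Minimal sketch: split a recombination site into arm / crossover / arm
# and measure how closely the two arms form an inverted repeat.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def split_site(site: str, arm_len: int, spacer_len: int) -> tuple:
    """Split a site into (left arm, crossover/spacer, right arm)."""
    if len(site) != 2 * arm_len + spacer_len:
        raise ValueError("site length does not match arm/spacer lengths")
    left = site[:arm_len]
    spacer = site[arm_len:arm_len + spacer_len]
    right = site[arm_len + spacer_len:]
    return left, spacer, right


def arm_symmetry(site: str, arm_len: int, spacer_len: int) -> float:
    """Fraction of positions at which the left arm matches the reverse
    complement of the right arm (1.0 = perfect inverted repeat)."""
    left, _, right = split_site(site, arm_len, spacer_len)
    mirrored = reverse_complement(right)
    matches = sum(a == b for a, b in zip(left, mirrored))
    return matches / arm_len


# Assumed example: the commonly quoted loxP sequence (13 + 8 + 13 bp).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

left, spacer, right = split_site(LOXP, arm_len=13, spacer_len=8)
symmetry = arm_symmetry(LOXP, arm_len=13, spacer_len=8)
spacer_is_asymmetric = spacer != reverse_complement(spacer)
```

In this sketch the arms score as a perfect inverted repeat while the spacer differs from its own reverse complement, which is what gives such a site a direction. Sites with only partial symmetry would score below 1.0.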
Based on amino acid sequence homologies and mechanistic relatedness, most site-specific recombinases are grouped into one of two families: the tyrosine (Tyr) recombinase family or the serine (Ser) recombinase family. The names stem from the conserved nucleophilic amino acid residue present in each class of recombinase, which is used to attack the DNA and which becomes covalently linked to it during strand exchange. The earliest identified members of the serine recombinase family were known as resolvases or DNA invertases, while the founding member of the tyrosine recombinases, lambda phage integrase (using attP/B recognition sites), differs from the now well-known enzymes such as Cre (from the P1 phage) and FLP (from the yeast Saccharomyces cerevisiae). Prominent serine recombinases include gamma-delta resolvase (from the Tn1000 transposon), Tn3 resolvase (from the Tn3 transposon), and φC31 integrase (from the φC31 phage). [8] There are several classes of serine recombinases: the small serine recombinases, the ISXc5 resolvases, the serine transposases, and the large serine recombinases. [9]
Although the individual members of the two recombinase families can perform reactions with the same practical outcomes, the families are unrelated to each other, having different protein structures and reaction mechanisms. Unlike tyrosine recombinases, serine recombinases are highly modular, as was first hinted by biochemical studies [10] and later shown by crystallographic structures. [11] [12] Knowledge of these protein structures could prove useful when attempting to re-engineer recombinase proteins as tools for genetic manipulation.
Recombination between two DNA sites begins with the recognition and binding of these sites (one site on each of two separate double-stranded DNA molecules, or at least two distant segments of the same molecule) by the recombinase enzyme. This is followed by synapsis, i.e. bringing the sites together to form the synaptic complex. It is within this synaptic complex that the strand exchange takes place, as the DNA is cleaved and rejoined by controlled transesterification reactions. During strand exchange, each double-stranded DNA molecule is cut at a fixed point within the crossover region of the recognition site, releasing a deoxyribose hydroxyl group, while the recombinase enzyme forms a transient covalent bond to a DNA backbone phosphate. This phosphodiester bond between the hydroxyl group of the nucleophilic serine or tyrosine residue and the DNA backbone phosphate conserves the energy that was expended in cleaving the DNA. Energy stored in this bond is subsequently used for the rejoining of the DNA to the corresponding deoxyribose hydroxyl group on the other DNA molecule. The entire reaction therefore proceeds without the need for external energy-rich cofactors such as ATP.
Although the basic chemical reaction is the same for both tyrosine and serine recombinases, there are some differences between them. [13] Tyrosine recombinases, such as Cre or FLP, cleave one DNA strand at a time at points that are staggered by 6–8 bp, linking the 3' end of the strand to the hydroxyl group of the tyrosine nucleophile (Fig. 1). [14] Strand exchange then proceeds via a crossed-strand intermediate analogous to the Holliday junction, in which only one pair of strands has been exchanged. [15] [16]
The mechanism and control of serine recombinases are much less well understood. This group of enzymes was only discovered in the mid-1990s and is still relatively small. The now classical members gamma-delta and Tn3 resolvase, but also new additions like φC31, Bxb1, and R4 integrases, cut all four DNA strands simultaneously at points that are staggered by 2 bp (Fig. 2). [17] During cleavage, a protein–DNA bond is formed via a transesterification reaction, in which a phosphodiester bond is replaced by a phosphoserine bond between a 5' phosphate at the cleavage site and the hydroxyl group of the conserved serine residue (S10 in resolvase). [18] [19]
It is still not entirely clear how the strand exchange occurs after the DNA has been cleaved. However, it has been shown that the strands are exchanged while covalently linked to the protein, with a resulting net rotation of 180°. [20] [21] The most quoted (but not the only) model accounting for these facts is the "subunit rotation model" (Fig. 2). [13] [22] Independent of the model, DNA duplexes are situated outside of the protein complex, and large movement of the protein is needed to achieve the strand exchange. In this case the recombination sites are slightly asymmetric, which allows the enzyme to tell apart the left and right ends of the site. When generating products, left ends are always joined to the right ends of their partner sites, and vice versa. This causes different recombination hybrid sites to be reconstituted in the recombination products. Joining of left ends to left or right to right is avoided due to the asymmetric "overlap" sequence between the staggered points of top and bottom strand exchange, which is in stark contrast to the mechanism employed by tyrosine recombinases. [13]
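The rule that left ends are always joined to the right ends of the partner site, producing hybrid sites, can be sketched in a few lines of Python. This is only a bookkeeping model under stated assumptions: each site is reduced to three labelled parts (left half-site, overlap, right half-site), the `Site` type and `recombine` function are hypothetical names, and the labels P/P'/B/B'/O are placeholders rather than real sequences.

```python
# Bookkeeping sketch of hybrid-site formation: each product keeps the left
# half of one parental site and the right half of the other.
from typing import NamedTuple


class Site(NamedTuple):
    left: str      # left half-site (label only, not a real sequence)
    overlap: str   # sequence between the staggered cut points
    right: str     # right half-site


def recombine(a: Site, b: Site) -> tuple:
    """Join left ends to the right ends of the partner site.

    Raises ValueError if the overlaps differ, since mismatched overlaps
    could not base-pair when the strands are rejoined.
    """
    if a.overlap != b.overlap:
        raise ValueError("overlap sequences must match for rejoining")
    hybrid_1 = Site(b.left, a.overlap, a.right)
    hybrid_2 = Site(a.left, a.overlap, b.right)
    return hybrid_1, hybrid_2


site_p = Site("P", "O", "P'")
site_b = Site("B", "O", "B'")
hybrid_1, hybrid_2 = recombine(site_p, site_b)
```

In systems with non-identical partner sites such as attP and attB, the two hybrid products are conventionally called attL and attR. Neither hybrid equals either parental site, which is the point the paragraph above makes about different hybrid sites being reconstituted in the products.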
The reaction catalysed by Cre-recombinase, for instance, may lead to excision of the DNA segment flanked by the two sites (Fig. 3A), but may also lead to integration or inversion of the orientation of the flanked DNA segment (Fig. 3B). What the outcome of the reaction will be is dictated mainly by the relative locations and orientations of the sites that are to be recombined, but also by the innate specificity of the site-specific system in question. Excisions and inversions occur if the recombination takes place between two sites that are found on the same molecule (intramolecular recombination), and if the sites are in the same (direct repeat) or in an opposite orientation (inverted repeat), respectively. Insertions, on the other hand, take place if the recombination occurs on sites that are situated on two different DNA molecules (intermolecular recombination), provided that at least one of these molecules is circular. Most site-specific systems are highly specialised, catalysing only one of these different types of reaction, and have evolved to ignore the sites that are in the "wrong" orientation.
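How site orientation and location determine the outcome can be made concrete with a toy Python model. It is a sketch under several assumptions: the 6 bp `SITE` is an invented asymmetric placeholder (not a real recombination site), DNA molecules are plain strings, a site in the opposite orientation is represented by its reverse complement, and the function names are hypothetical. It does not model the enzyme, only the topology of the products.

```python
# Toy model: direct repeats give excision, inverted repeats give inversion,
# and a circle plus a linear molecule (one site each) give integration.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
SITE = "AAACCC"  # invented asymmetric placeholder site (assumption)


def revcomp(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(dna: str, site: str = SITE) -> list:
    """Return (position, orientation) for each site; '-' = reversed site."""
    rc = revcomp(site)
    hits = []
    for i in range(len(dna) - len(site) + 1):
        window = dna[i:i + len(site)]
        if window == site:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


def recombine_intramolecular(dna: str, site: str = SITE) -> tuple:
    """Recombine two sites on one linear molecule.

    Returns (outcome, main product, excised circle or None).
    """
    hits = find_sites(dna, site)
    if len(hits) != 2:
        raise ValueError("expected exactly two sites")
    (i, o1), (j, o2) = hits
    if o1 == o2:
        # Same orientation: the segment between the sites leaves as a
        # circle carrying one site; one site stays behind.
        return "excision", dna[:i] + dna[j:], dna[i:j]
    # Opposite orientation: the intervening segment is flipped in place.
    inner = dna[i + len(site):j]
    return "inversion", dna[:i + len(site)] + revcomp(inner) + dna[j:], None


def integrate(linear: str, circle: str, site: str = SITE) -> str:
    """Insert a circle (one '+' site) at the '+' site of a linear molecule."""
    hits_l, hits_c = find_sites(linear, site), find_sites(circle, site)
    if [o for _, o in hits_l] != ["+"] or [o for _, o in hits_c] != ["+"]:
        raise ValueError("expected one forward site on each molecule")
    i, k = hits_l[0][0], hits_c[0][0]
    opened = circle[k:] + circle[:k]  # write the circle starting at its site
    return linear[:i] + opened + linear[i:]


direct = "TGT" + SITE + "GATTGCT" + SITE + "TGA"
inverted = "TGT" + SITE + "GATTGCT" + revcomp(SITE) + "TGA"

excision = recombine_intramolecular(direct)
inversion = recombine_intramolecular(inverted)
reintegrated = integrate(excision[1], excision[2])
```

In this model integration is simply the reverse of excision, and applying the inversion twice restores the original molecule, which mirrors the reversibility implied by the text. The site search treats the circle as a linear string, so a site spanning the string's ends would be missed; that simplification is acceptable only for a sketch.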
Chymotrypsin (EC 3.4.21.1, chymotrypsins A and B, alpha-chymar ophth, avazyme, chymar, chymotest, enzeon, quimar, quimotrase, alpha-chymar, alpha-chymotrypsin A, alpha-chymotrypsin) is a digestive enzyme component of pancreatic juice acting in the duodenum, where it performs proteolysis, the breakdown of proteins and polypeptides. Chymotrypsin preferentially cleaves peptide amide bonds where the side chain of the amino acid N-terminal to the scissile amide bond (the P1 position) is a large hydrophobic amino acid (tyrosine, tryptophan, and phenylalanine). These amino acids contain an aromatic ring in their side chain that fits into a hydrophobic pocket (the S1 position) of the enzyme. It is activated in the presence of trypsin. The hydrophobic and shape complementarity between the peptide substrate P1 side chain and the enzyme S1 binding cavity accounts for the substrate specificity of this enzyme. Chymotrypsin also hydrolyzes other amide bonds in peptides at slower rates, particularly those containing leucine at the P1 position.
Protein primary structure is the linear sequence of amino acids in a peptide or protein. By convention, the primary structure of a protein is reported starting from the amino-terminal (N) end to the carboxyl-terminal (C) end. Protein biosynthesis is most commonly performed by ribosomes in cells. Peptides can also be synthesized in the laboratory. Protein primary structures can be directly sequenced, or inferred from DNA sequences.
Protein biosynthesis is a core biological process, occurring inside cells, balancing the loss of cellular proteins through the production of new proteins. Proteins perform a number of critical functions as enzymes, structural proteins or hormones. Protein synthesis is a very similar process for both prokaryotes and eukaryotes but there are some distinct differences.
Retroviral integrase (IN) is an enzyme produced by a retrovirus that integrates its genetic information into that of the host cell it infects. Retroviral INs are not to be confused with phage integrases (recombinases) used in biotechnology, such as λ phage integrase, as discussed in site-specific recombination.
Exodeoxyribonuclease V is an enzyme of E. coli that initiates recombinational repair from potentially lethal double strand breaks in DNA which may result from ionizing radiation, replication errors, endonucleases, oxidative damage, and a host of other factors. The RecBCD enzyme is both a helicase that unwinds, or separates the strands of DNA, and a nuclease that makes single-stranded nicks in DNA. It catalyses exonucleolytic cleavage in either 5′- to 3′- or 3′- to 5′-direction to yield 5′-phosphooligonucleotides.
A transposase is any of a class of enzymes capable of binding to the end of a transposon and catalysing its movement to another part of a genome, typically by a cut-and-paste mechanism or a replicative mechanism, in a process known as transposition. The word "transposase" was first coined by the individuals who cloned the enzyme required for transposition of the Tn3 transposon. The existence of transposons was postulated in the late 1940s by Barbara McClintock, who was studying the inheritance of maize, but the actual molecular basis for transposition was described by later groups. McClintock discovered that some segments of chromosomes changed their position, jumping between different loci or from one chromosome to another. The repositioning of these transposons allowed other genes for pigment to be expressed. Transposition in maize causes changes in color; however, in other organisms, such as bacteria, it can cause antibiotic resistance. Transposition is also important in creating genetic diversity within species and generating adaptability to changing living conditions.
In biochemistry, dephosphorylation is the removal of a phosphate group from an organic compound by hydrolysis. It is a reversible post-translational modification. Dephosphorylation and its counterpart, phosphorylation, activate and deactivate enzymes by detaching or attaching phosphoric esters and anhydrides. A notable occurrence of dephosphorylation is the conversion of ATP to ADP and inorganic phosphate.
Deoxyribozymes, also called DNA enzymes, DNAzymes, or catalytic DNA, are DNA oligonucleotides that are capable of performing a specific chemical reaction, often but not always catalytic. This is similar to the action of other biological enzymes, such as proteins or ribozymes. However, in contrast to the abundance of protein enzymes in biological systems and the discovery of biological ribozymes in the 1980s, there is only little evidence for naturally occurring deoxyribozymes. Deoxyribozymes should not be confused with DNA aptamers which are oligonucleotides that selectively bind a target ligand, but do not catalyze a subsequent chemical reaction.
Cre-Lox recombination is a site-specific recombinase technology, used to carry out deletions, insertions, translocations and inversions at specific sites in the DNA of cells. It allows the DNA modification to be targeted to a specific cell type or be triggered by a specific external stimulus. It is implemented both in eukaryotic and prokaryotic systems. The Cre-lox recombination system has been particularly useful to help neuroscientists to study the brain in which complex cell types and neural circuits come together to generate cognition and behaviors. NIH Blueprint for Neuroscience Research has created several hundreds of Cre driver mouse lines which are currently used by the worldwide neuroscience community.
Site-specific recombinase technologies are genome engineering tools that depend on recombinase enzymes to replace targeted sections of DNA.
Cre recombinase is a tyrosine recombinase enzyme derived from the P1 bacteriophage. The enzyme uses a topoisomerase I-like mechanism to carry out site-specific recombination events. It is a member of the integrase family of site-specific recombinases and catalyses site-specific recombination between two DNA recognition sites (loxP sites). Each 34 base pair (bp) loxP recognition site consists of two 13 bp palindromic sequences which flank an 8 bp spacer region. The products of Cre-mediated recombination at loxP sites depend on the location and relative orientation of the loxP sites. Two separate DNA species that both contain loxP sites can undergo fusion as the result of Cre-mediated recombination. DNA sequences found between two loxP sites are said to be "floxed". In this case the products of Cre-mediated recombination depend on the orientation of the loxP sites. DNA found between two loxP sites oriented in the same direction will be excised as a circular loop of DNA, whilst intervening DNA between two loxP sites that are opposingly orientated will be inverted. The enzyme requires no additional cofactors or accessory proteins for its function.
A Holliday junction is a branched nucleic acid structure that contains four double-stranded arms joined. These arms may adopt one of several conformations depending on buffer salt concentrations and the sequence of nucleobases closest to the junction. The structure is named after Robin Holliday, the molecular biologist who proposed its existence in 1964.
Recombinases are genetic recombination enzymes.
In genetics, Flp-FRT recombination is a site-directed recombination technology, increasingly used to manipulate an organism's DNA under controlled conditions in vivo. It is analogous to Cre-lox recombination but involves the recombination of sequences between short flippase recognition target (FRT) sites by the recombinase flippase (Flp) derived from the 2 μ plasmid of baker's yeast Saccharomyces cerevisiae.
Protein metabolism denotes the various biochemical processes responsible for the synthesis of proteins and amino acids (anabolism), and the breakdown of proteins by catabolism.
Hin recombinase is a 21 kDa protein composed of 198 amino acids that is found in the bacterium Salmonella. Hin belongs to the serine recombinase family (B2) of DNA invertases, in which it relies on the active-site serine to initiate DNA cleavage and recombination. The related protein gamma-delta resolvase shares high similarity with Hin and has been the subject of much structural work, including structures bound to DNA and reaction intermediates. Hin functions to invert a 900 base pair (bp) DNA segment within the Salmonella genome that contains a promoter for downstream flagellar genes, fljA and fljB. Inversion of the intervening DNA alternates the direction of the promoter and thereby alternates expression of the flagellar genes. This is advantageous to the bacterium as a means of escape from the host immune response.
P1 is a temperate bacteriophage that infects Escherichia coli and some other bacteria. When undergoing a lysogenic cycle the phage genome exists as a plasmid in the bacterium unlike other phages that integrate into the host DNA. P1 has an icosahedral head containing the DNA attached to a contractile tail with six tail fibers. The P1 phage has gained research interest because it can be used to transfer DNA from one bacterial cell to another in a process known as transduction. As it replicates during its lytic cycle it captures fragments of the host chromosome. If the resulting viral particles are used to infect a different host the captured DNA fragments can be integrated into the new host's genome. This method of in vivo genetic engineering was widely used for many years and is still used today, though to a lesser extent. P1 can also be used to create the P1-derived artificial chromosome cloning vector which can carry relatively large fragments of DNA. P1 encodes a site-specific recombinase, Cre, that is widely used to carry out cell-specific or time-specific DNA recombination by flanking the target DNA with loxP sites.
The Tn3 transposon is a 4957 base pair mobile genetic element, found in prokaryotes. It encodes three proteins:
AIDS is caused by the human immunodeficiency virus (HIV). Individuals with HIV have what is referred to as an "HIV infection". When infected semen, vaginal secretions, or blood come in contact with the mucous membranes or broken skin of an uninfected person, HIV may be transferred to the uninfected person, causing another infection. HIV can also be passed from infected pregnant women to their uninfected babies during pregnancy and/or delivery, or via breastfeeding. As a result of HIV infection, a portion of these individuals will go on to develop clinically significant AIDS.
The first human immunodeficiency virus (HIV) case was reported in the United States in the early 1980s. Many drugs have been discovered to treat the disease but mutations in the virus and resistance to the drugs make development difficult. Integrase is a viral enzyme that integrates retroviral DNA into the host cell genome. Integrase inhibitors are a new class of drugs used in the treatment of HIV. The first integrase inhibitor, raltegravir, was approved in 2007 and other drugs were in clinical trials in 2011.