Ligation is the joining of two nucleotides, or two nucleic acid fragments, into a single polymeric chain through the action of an enzyme known as a ligase. The reaction involves the formation of a phosphodiester bond between the 3'-hydroxyl terminus of one nucleotide and the 5'-phosphoryl terminus of another, which links the two nucleotides consecutively on a single strand. Ligation works in fundamentally the same way for both DNA and RNA. A cofactor is generally involved in the reaction, usually ATP or NAD+. Eukaryotic ligases are of the ATP-dependent type, while NAD+-dependent ligases are found in bacteria (e.g. E. coli). [1]
Ligation occurs naturally as part of numerous cellular processes, including DNA replication, transcription, splicing, and recombination, and is also an essential laboratory procedure in molecular cloning, whereby DNA fragments are joined to create recombinant DNA molecules (such as when a foreign DNA fragment is inserted into a plasmid). The discovery of DNA ligase dates back to 1967 and was an important event in the field of molecular biology. [1] Ligation in the laboratory is normally performed using T4 DNA ligase. It is broadly used in vitro due to its capability of joining sticky-ended fragments as well as blunt-ended fragments. [2] However, procedures for ligation without the use of standard DNA ligase are also popular. Human DNA ligase abnormalities have been linked to pathological disorders characterized by immunodeficiency, radiation sensitivity, and developmental problems. [3]
The mechanism of the ligation reaction was first elucidated in the laboratory of I. Robert Lehman. [4] [5] Two fragments of DNA may be joined by DNA ligase which catalyzes the formation of a phosphodiester bond between the 3'-hydroxyl group (-OH) at one end of a strand of DNA and the 5'-phosphate group (-PO4) of another. In animals and bacteriophages, ATP is used as the energy source for the ligation, while in bacteria, NAD+ is used. [6]
The DNA ligase first reacts with ATP or NAD+, forming a ligase-AMP intermediate with the AMP linked to the ε-amino group of lysine in the active site of the ligase via a phosphoramide bond. This adenylyl group is then transferred to the phosphate group at the 5' end of a DNA chain, forming a DNA-adenylate complex. Finally, a phosphodiester bond between the two DNA ends is formed via the nucleophilic attack of the 3'-hydroxyl at the end of a DNA strand on the activated 5′-phosphoryl group of another. [4]
A nick in the DNA (i.e. a break in one strand of a double-stranded DNA) can be repaired very efficiently by the ligase. However, ligating two separate DNA ends is more complicated, because the two ends need to come together before the ligation reaction can proceed. In the ligation of DNA with sticky or cohesive ends, the protruding strands may already be annealed together, so the process is relatively efficient: it is equivalent to repairing two nicks in the DNA. In the ligation of blunt ends, which lack protruding ends with which the DNA can anneal, the process depends on random collisions to align the ends before they can be ligated, and is consequently much less efficient. [7] The DNA ligase from E. coli cannot ligate blunt-ended DNA except under conditions of molecular crowding, and it is therefore not normally used for ligation in the laboratory. Instead, the DNA ligase from phage T4 is used, as it can ligate blunt-ended DNA as well as single-stranded DNA. [8] [6]
In the laboratory, the factors that affect any enzyme-mediated chemical reaction also affect a ligation reaction. These include the concentrations of enzyme and reactants, the reaction temperature, and the length of incubation. Ligation is complicated by the fact that the reaction can involve both inter- and intra-molecular reactions: the desired products of many ligation reactions (e.g. ligating a DNA fragment into a vector) must first form by an inter-molecular reaction, i.e. between two different DNA molecules, followed by an intra-molecular reaction to seal and circularize the molecule. For efficient ligation, an additional annealing step is also necessary.
The three steps that form a new phosphodiester bond during ligation are enzyme adenylylation, adenylyl transfer to DNA, and nick sealing. Mg2+ is a cofactor for catalysis, so ligation efficiency is high at high Mg2+ concentration. If the Mg2+ concentration is limiting, nick sealing becomes the rate-limiting step of the process and the adenylylated DNA intermediate remains in solution. Adenylylation of the enzyme then restrains its rebinding to the adenylylated DNA intermediate; this has been compared to an Achilles' heel of LIG1, and such intermediates represent a risk if they are not repaired. [9]
The concentration of DNA can affect the rate of ligation, and whether the ligation is an inter-molecular or intra-molecular reaction. Ligation involves joining the ends of a DNA molecule to other ends. Each DNA fragment has two ends, however, and if the ends are compatible, a DNA molecule can circularize by joining its own ends. At high DNA concentration, there is a greater chance of one end of a DNA molecule meeting the end of another DNA molecule, resulting in intermolecular ligation. At lower DNA concentration, the chance that one end of a DNA molecule meets the other end of the same molecule increases, so the intramolecular reaction that circularizes the DNA is more likely. The transformation efficiency of linear DNA is also much lower than that of circular DNA, and for the DNA to circularize, the DNA concentration should not be too high. As a general rule, the total DNA concentration should be less than 10 μg/ml. [10]
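As a rough illustration of the concentration argument above, the following sketch converts a mass concentration of linear double-stranded DNA into a molar concentration of ends. The function names and the average mass of 650 g/mol per base pair are assumptions made for illustration and do not come from the cited sources; only the 10 μg/ml guideline is taken from the text.

```python
# Assumed average molar mass of one base pair of double-stranded DNA (g/mol).
# This is a common approximation, not a value from the article's sources.
AVG_BP_MASS_G_PER_MOL = 650.0

# Guideline from the text: total DNA should be below 10 ug/ml.
MAX_TOTAL_DNA_UG_PER_ML = 10.0


def ends_concentration_nM(conc_ug_per_ml: float, length_bp: int) -> float:
    """Molar concentration of ends (nM) for linear double-stranded DNA."""
    if conc_ug_per_ml < 0 or length_bp <= 0:
        raise ValueError("concentration must be >= 0 and length must be > 0")
    grams_per_litre = conc_ug_per_ml * 1e-3  # 1 ug/ml equals 1e-3 g/L
    mol_per_litre = grams_per_litre / (length_bp * AVG_BP_MASS_G_PER_MOL)
    return 2 * mol_per_litre * 1e9  # two ends per linear molecule, in nM


def within_guideline(total_ug_per_ml: float) -> bool:
    """True if the total DNA concentration is below the 10 ug/ml guideline."""
    return total_ug_per_ml < MAX_TOTAL_DNA_UG_PER_ML


if __name__ == "__main__":
    # Hypothetical 3000 bp fragment at 10 ug/ml.
    print(round(ends_concentration_nM(10.0, 3000), 2), "nM ends")
    print(within_guideline(5.0))
```

For the same mass concentration, shorter fragments give a higher molar concentration of ends, which is one reason fragment length influences the balance between intermolecular and intramolecular ligation.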
The relative concentration of the DNA fragments, their length, as well as buffer conditions are also factors that can affect whether intermolecular or intramolecular reactions are favored.
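Because the relative concentration that matters is molar rather than by mass, fragment length has to be taken into account when mixing an insert with a vector. The sketch below shows the arithmetic; the function name and the example values (including the 3:1 molar ratio) are hypothetical inputs chosen for illustration, not recommendations from the cited sources.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   insert_to_vector_molar_ratio: float) -> float:
    """Mass of insert (ng) that gives the requested insert:vector molar ratio.

    Equal masses of DNA contain more molecules when the fragment is shorter,
    so the vector mass is scaled by the length ratio before applying the
    molar ratio.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    if vector_ng < 0 or insert_to_vector_molar_ratio < 0:
        raise ValueError("mass and ratio must be non-negative")
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_molar_ratio


if __name__ == "__main__":
    # Hypothetical example: 50 ng of a 3000 bp vector, 1000 bp insert, 3:1.
    print(insert_mass_ng(50.0, 3000, 1000, 3.0), "ng insert")
```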
The effective concentration of DNA can be increased artificially by adding condensing agents such as cobalt hexamine and biogenic polyamines such as spermidine, or by using crowding agents such as polyethylene glycol (PEG), which also increase the effective concentration of enzymes. [11] [12] Note, however, that additives such as cobalt hexamine can produce exclusively intermolecular reactions, [11] resulting in linear concatemers rather than the circular DNA more suitable for transformation of plasmid DNA, and are therefore undesirable for plasmid ligation. If it is necessary to use additives in plasmid ligation, PEG is preferable as it can promote intramolecular as well as intermolecular ligation. [13]
As is usual for an enzyme, the higher the ligase concentration, the faster is the rate of ligation. Blunt-end ligation is much less efficient than sticky end ligation, so a higher concentration of ligase is used in blunt-end ligations. High DNA ligase concentration may be used in conjunction with PEG for a faster ligation, and they are the components often found in commercial kits designed for rapid ligation. [14] [15]
Two issues are involved when considering the temperature of a ligation reaction: first, the optimum temperature for DNA ligase activity, which is 37°C, and second, the melting temperature (Tm) of the DNA ends to be ligated. The melting temperature depends on the length and base composition of the DNA overhang, with some contribution from the stacking of the bases between fragments. The greater the number of G and C bases, the higher the Tm, since three hydrogen bonds are formed in a G-C base pair compared to two in an A-T base pair. For the ligation reaction to proceed efficiently, the ends should be stably annealed, and in ligation experiments the Tm of the DNA ends is generally much lower than 37°C. The optimal temperature for ligating cohesive ends is therefore a compromise between the best temperature for DNA ligase activity and the Tm at which the ends can associate. [16] However, different restriction enzymes generate different ends, and the base composition of those ends may also differ, so the melting temperature, and therefore the optimal temperature, can vary widely depending on the restriction enzymes used. The optimum temperature for ligation may be between 4 and 15°C depending on the ends. [17] [18] Ligations also often involve ends generated by different restriction enzymes in the same reaction mixture, so it may not be practical to select the optimal temperature for a particular ligation reaction, and most protocols simply choose 12-16°C, room temperature, or 4°C. When conducting a ligation at 4°C, it is necessary to increase the reaction time, for example by leaving the ligation mixture overnight or longer in the fridge.
The ionic strength of the buffer used can affect the ligation. The kinds of cations present can also influence the ligation reaction; for example, an excess of Na+ can cause the DNA to become more rigid and increase the likelihood of intermolecular ligation. At high concentrations of monovalent cations (>200 mM), ligation can also be almost completely inhibited. [19] The standard buffer used for ligation is designed to minimize ionic effects. [20]
Restriction enzymes can generate a wide variety of ends in the DNA they digest, but in cloning experiments most commonly used restriction enzymes generate a 4-base single-stranded overhang called the sticky or cohesive end (exceptions include NdeI, which generates a 2-base overhang, and enzymes that generate blunt ends). These sticky ends can anneal to other compatible ends and become ligated in a sticky-end (or cohesive-end) ligation. EcoRI, for example, generates an AATT end, and since A-T base pairs are less stable than G-C base pairs, its melting temperature Tm is low, at around 6°C. [21] For most restriction enzymes, the overhangs generated have a Tm of around 15°C. [20] For practical purposes, sticky-end ligations are performed at 12-16°C, or at room temperature, or alternatively at 4°C for a longer period.
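The dependence of Tm on base composition can be illustrated with the Wallace rule, which assigns roughly 2°C to each A or T and 4°C to each G or C. The rule is an approximation intended for short oligonucleotides, and applying it to overhangs of only a few bases is an assumption made here for illustration; the function name is hypothetical, and the numbers should be read as a relative ranking of overhangs rather than as measured melting temperatures.

```python
def wallace_tm_celsius(overhang: str) -> int:
    """Rough Tm estimate (Wallace rule): 2 C per A/T plus 4 C per G/C."""
    seq = overhang.upper()
    if not seq or any(base not in "ACGT" for base in seq):
        raise ValueError("overhang must be a non-empty string of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


if __name__ == "__main__":
    # Hypothetical 4-base overhangs with increasing G+C content.
    for overhang in ("AATT", "GATC", "CCGG"):
        print(overhang, wallace_tm_celsius(overhang))
```

An overhang made up only of A and T ranks lowest and one made up only of G and C ranks highest, which is consistent with the qualitative argument above.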
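Whether two sticky ends are compatible can be checked by comparing one overhang with the reverse complement of the other. The following is a minimal sketch under the assumption that both overhangs are of the same type (both 5' or both 3') and are written 5' to 3'; the function names are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def overhangs_compatible(overhang_a: str, overhang_b: str) -> bool:
    """True if two same-type overhangs (written 5' to 3') can anneal fully."""
    return overhang_a.upper() == reverse_complement(overhang_b)


if __name__ == "__main__":
    # AATT is its own reverse complement, so an AATT end (as produced by
    # EcoRI) can anneal to another AATT end.
    print(overhangs_compatible("AATT", "AATT"))
    # AATT and GATC cannot base-pair with each other.
    print(overhangs_compatible("AATT", "GATC"))
```

Palindromic overhangs are self-compatible, which is why a vector cut with a single enzyme can religate without an insert.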
For the insertion of a DNA fragment into a plasmid vector, it is preferable to use two different restriction enzymes to digest the DNA so that different ends are generated. The two different ends can prevent the religation of the vector without any insert, and it also allows the fragment to be inserted in a directional manner.
When it is not possible to use two different sites, the vector DNA may need to be dephosphorylated to avoid a high background of recircularized vector DNA with no insert. Without a phosphate group at its ends the vector cannot ligate to itself, but it can still be ligated to an insert that carries a phosphate group. Dephosphorylation is commonly done using calf-intestinal alkaline phosphatase (CIAP), which removes the phosphate group from the 5′ end of digested DNA. Note, however, that CIAP is not easy to inactivate and, unless it is removed in an additional step, can interfere with ligation and cause it to fail. CIAP should not be used in excessive amounts and should only be used when necessary. Shrimp alkaline phosphatase (SAP) and Antarctic phosphatase (AP) are suitable alternatives as they can be easily inactivated.
Blunt-end ligation does not involve base-pairing of protruding ends, so any blunt end may be ligated to another blunt end. Blunt ends may be generated by restriction enzymes such as SmaI and EcoRV. A major advantage of blunt-end cloning is that the desired insert does not require any restriction sites in its sequence, as blunt ends are usually generated in PCR, and the PCR-generated blunt-ended DNA fragment may then be ligated into a blunt-ended vector generated by restriction digest.
Blunt-end ligation, however, is much less efficient than sticky-end ligation; the reaction is typically 100 times slower. Since blunt ends do not have protruding strands, the ligation reaction depends on random collisions between the blunt ends. To compensate for the lower efficiency, the concentration of ligase used is higher than in sticky-end ligation (10 times or more). The concentration of DNA used in blunt-end ligation is also higher to increase the likelihood of collisions between ends, and a longer incubation time may also be used.
If both ends of the fragment to be ligated into a vector are blunt, the vector needs to be dephosphorylated to minimize self-ligation. This may be done using CIAP, but caution in its use is necessary as noted previously. Since the vector has been dephosphorylated, and ligation requires the presence of a 5'-phosphate, the insert must be phosphorylated. A blunt-ended PCR product normally lacks a 5'-phosphate, so it needs to be phosphorylated by treatment with T4 polynucleotide kinase. [22]
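The phosphate requirement described above can be expressed as a small model. At a junction between two duplex ends there are two strand breaks, and each can be sealed only if the 5' terminus at that break carries a phosphate. The sketch below is a simplified illustration with hypothetical names, not a description of any particular protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DuplexEnd:
    """One end of a double-stranded DNA fragment."""
    name: str
    has_5_phosphate: bool  # True if the 5' terminus at this end is phosphorylated


def sealable_strands(end_a: DuplexEnd, end_b: DuplexEnd) -> int:
    """Number of strand breaks (0, 1 or 2) that ligase can seal at a junction.

    Each end contributes one 5' terminus to the junction, and a break can be
    sealed only when that terminus carries a phosphate.
    """
    return int(end_a.has_5_phosphate) + int(end_b.has_5_phosphate)


def can_join(end_a: DuplexEnd, end_b: DuplexEnd) -> bool:
    """True if at least one strand can be covalently joined."""
    return sealable_strands(end_a, end_b) > 0


if __name__ == "__main__":
    vector = DuplexEnd("dephosphorylated vector", has_5_phosphate=False)
    pcr_product = DuplexEnd("untreated PCR product", has_5_phosphate=False)
    kinased = DuplexEnd("kinase-treated insert", has_5_phosphate=True)

    print(can_join(vector, vector))       # vector self-ligation is blocked
    print(can_join(vector, pcr_product))  # no 5'-phosphate on either side
    print(can_join(vector, kinased))      # one strand per junction is sealed
```

In this model a dephosphorylated vector joined to a phosphorylated insert is sealed on one strand at each junction, leaving a nick on the other strand.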
Blunt-end ligation is also reversibly inhibited by high concentration of ATP. [23]
PCR usually generates blunt-ended products, but note that PCR using Taq polymerase can add an extra adenine (A) to the 3' end of the PCR product. This property may be exploited in TA cloning, where the ends of the PCR product can anneal to the T end of a vector. TA ligation is therefore a form of sticky-end ligation. Blunt-ended vectors may be turned into vectors for TA ligation by adding dideoxythymidine triphosphate (ddTTP) using terminal transferase.
For the cloning of an insert into a circular plasmid:
Sometimes ligation fails to produce the desired ligated products, and some of the possible reasons may be:
A number of commercially available DNA cloning kits use other methods of ligation that do not require the use of the usual DNA ligases. These methods allow cloning to be done much more rapidly, as well as allowing for simpler transfer of cloned DNA insert to different vectors. These methods however require the use of specially designed vectors and components, and may lack flexibility.
Topoisomerase can be used instead of ligase for ligation, and the cloning may be done more rapidly without the need for restriction digest of the vector or insert. In this TOPO cloning method, a linearized vector is activated by attaching topoisomerase I to its ends, and this "TOPO-activated" vector may then accept a PCR product by ligating to both of the 5' ends of the PCR product. The topoisomerase is released and a circular vector is formed in the process. [28]
Another method of cloning without the use of ligase is by DNA recombination, for example as used in the Gateway cloning system. [29] [30] The gene, once cloned into the cloning vector (called entry clone in this method), may be conveniently introduced into a variety of expression vectors by recombination. [31]
Different types of ligases are found in the organisms studied. For instance, a nicotinamide adenine dinucleotide (NAD+)-dependent ligase was found in and isolated from the bacterium E. coli in the second third of the 20th century. Since then, this model has been widely used to study that DNA ligase family. Moreover, this type of ligase is found in all bacteria. Examples of genes present in E. coli are LigA, which has essential functions affecting bacterial growth, and LigB. [32]
In mammals, including humans, three genes have been identified: Lig1, Lig3, and Lig4. All eukaryotes contain multiple types of DNA ligases encoded by Lig genes. [33] The smallest known eukaryotic ligase is the Chlorella virus DNA ligase (ChVLig), which contains only 298 amino acids. When ChVLig is the only source of ligase in the cell, it can continue to support mitotic growth and nonhomologous end joining in budding yeast. [34] DNA ligase I (Lig1) is responsible for the ligation of Okazaki fragments. It consists of 919 amino acids. During the complex process of DNA replication, DNA ligase I is recruited to the replication machinery through protein interactions. Lig1 plays a role in cell division in plants and yeasts, and knockout of the Lig1 gene is lethal in yeasts and in some plant sprouts. Nevertheless, studies of mouse embryogenesis have shown that the embryo can develop without DNA ligase I until the middle of the developmental process. [35]
Enzymatic ligation has been used in various studies of DNA nanostructures, where it leads to increased efficiency and stability. One approach is to seal the nicks in such structures by forming covalent phosphodiester bonds, so that the structures are reinforced by ligation. For instance, T4 DNA ligase serves as a catalyst for sealing a nick between the 3' and 5' ends of DNA to form a strong phosphodiester bond. Ligated structures have higher thermal stability. [36] In addition to this catalytic activity, T4 DNA ligase has other valuable properties, including sealing gaps between DNA strands, nick-closing activity, and repair of DNA damage. [2]
Single-stranded DNA (ssDNA) is an important model in nanostructure architecture and molecular biology research. T4 DNA ligase is used to cyclize short ssDNA fragments, but the process is complicated by the formation of secondary structures. Taq DNA ligase, on the other hand, is a thermostable enzyme that can be used at higher temperatures (45, 55 and 65 °C). Since secondary structures are less stable in this temperature range, the cyclization efficiency of oligonucleotides is enhanced. The kinetic, biological, and other parameters of nanostructures are influenced by the presence of secondary structures in DNA rings. However, ligation by Taq DNA ligase occurs only when two complementary DNA strands are perfectly paired and have no gaps between them. [37]
Analysis of ligase activities, mutations, and deficiencies is widely used in drug design and biological research to investigate diseases, the development of pathologies, and related rare acquired or inherited syndromes (e.g. DNA ligase IV syndrome). [38] [39] [40] [41]
The ligation procedure is prevalent in molecular biology cloning techniques, and it has been applied to define and characterize specific nucleotide sequences in the genome using Ligase Chain Reaction (LCR) or Polymerase Chain Reaction (PCR)-based amplification of ligated probes. [42]
Ligation may also serve as a DNA analysis method. [43] Some techniques employ rolling circle amplification. [43] The most notable of these is described by Smolina et al., 2007 & Smolina et al., 2008 using fluorescence in situ hybridization and peptide nucleic acids. [43] They developed and employed this technique for analyses of bacterial chromosomes. [43]