Dideoxynucleotide

Molecular structure of 2',3'-dideoxyadenosine triphosphate (ddATP)

Dideoxynucleotides are chain-terminating inhibitors of DNA polymerase, used in the Sanger method for DNA sequencing. [1] They are also known as 2',3'-dideoxynucleotides because both the 2' and 3' positions on the ribose lack hydroxyl groups, and are abbreviated as ddNTPs (ddGTP, ddATP, ddTTP and ddCTP). [2]

Role in the Sanger method

Inhibition of a nucleophilic attack due to the absence of the 3'-OH Group

The Sanger method copies a target segment of DNA so that its sequence can be determined precisely. ddNTPs are included in the reaction mixture simply to terminate the synthesis of a growing DNA strand, producing partially replicated DNA fragments. Termination occurs because DNA polymerase needs the 3' OH group of the growing chain and the 5' phosphate group of the incoming dNTP to form a phosphodiester bond. [2] Normally, when the polymerase incorporates an unmodified dNTP, the 3' hydroxyl group of the last nucleotide on the growing strand makes a nucleophilic attack on the 5' phosphate of the incoming nucleotide, pyrophosphate is cleaved off, and this condensation reaction adds the nucleotide to the chain. Occasionally, however, the polymerase incorporates a ddNTP instead. Because the ddNTP has no 3' hydroxyl group, the nucleophilic attack on the next incoming nucleotide cannot happen, and the polymerase can no longer extend the chain. [2]
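The termination step can be illustrated with a small simulation. This is only a toy sketch, not a model of real polymerase kinetics: the function name `extend_strand`, the single parameter `dd_fraction` (an assumed, uniform chance of incorporating a ddNTP at each position) and the example template are all hypothetical.

```python
import random

# Watson-Crick pairing used to pick the incoming nucleotide.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_strand(template, dd_fraction, rng):
    """Copy `template` one base at a time (hypothetical toy model).

    `template` lists the template bases in the order the polymerase
    reads them. `dd_fraction` is the assumed chance that a ddNTP is
    incorporated instead of the matching dNTP at any position.
    Returns (new_strand, terminated).
    """
    new_strand = []
    for base in template:
        new_strand.append(COMPLEMENT[base])
        if rng.random() < dd_fraction:
            # A ddNTP went in: no 3'-OH, so no further nucleotide can attach.
            return "".join(new_strand), True
    # Only dNTPs were incorporated; the copy ran to the end of the template.
    return "".join(new_strand), False


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same fragments
    for _ in range(5):
        print(extend_strand("TACGGATC", 0.2, rng))
```

Running many such extensions yields a population of fragments that all start at the same point and end wherever a ddNTP happened to be incorporated, which is the mixture the sequencing step relies on.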

This behaviour gives them the apt name "chain-terminating nucleotides". [2] Because dideoxyribonucleotides have no 3' hydroxyl group, no further chain elongation can occur once one has been added to the chain, and synthesis of that strand stops. These molecules therefore form the basis of the dideoxy chain-termination method of DNA sequencing, which was reported by Frederick Sanger and his team in 1977 [3] as an extension of earlier work. [4] Sanger's approach was described in 2001 as one of the two fundamental methods for sequencing DNA fragments [1] (the other being the Maxam–Gilbert method [5] ), but the Sanger method is both the "most widely used and the method used by most automated DNA sequencers." [1] Sanger won his second Nobel Prize in Chemistry in 1980, sharing it with Walter Gilbert ("for their contributions concerning the determination of base sequences in nucleic acids") and with Paul Berg ("for his fundamental studies of the biochemistry of nucleic acids, with particular regard to recombinant DNA"), [6] and discussed the use of dideoxynucleotides in his Nobel lecture. [7]

DNA Sequencing

Dideoxynucleotides are useful for sequencing DNA in combination with electrophoresis. When a DNA sample is copied by a polymerase, as in PCR (polymerase chain reaction), in a mixture containing all four deoxynucleotides and one dideoxynucleotide, the reaction produces strands whose lengths correspond to each position at which the template carries the base complementary to that dideoxynucleotide. That is, at each such position there is some probability that the polymerase incorporates the dideoxynucleotide rather than the deoxynucleotide, which ends chain elongation. Therefore, when the sample then undergoes electrophoresis, a band appears for each length at which the complement of the dideoxynucleotide is present in the template. The probability of termination is not the same for all four bases: the Taq polymerase used in PCR has been observed to favor ddGTP. [8] It is now common to use fluorescent dideoxynucleotides, each of the four carrying a different fluorescent label that a sequencer can detect, so that only one reaction is needed.
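The way band positions translate back into a sequence can be sketched as follows. This is a simplified illustration under stated assumptions: fragment lengths are counted from the first base added after the primer, every possible termination is observed, and the helper names `band_lengths` and `read_gel` are hypothetical.

```python
def band_lengths(new_strand):
    """Return the fragment lengths expected in each of four reactions.

    `new_strand` is the full-length synthesized strand. The reaction
    containing ddATP can only stop where the new strand has an A, and
    so on, so each lane holds the positions of one base.
    """
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        lanes[base].append(position)
    return lanes


def read_gel(lanes):
    """Rebuild the synthesized strand by reading bands shortest first."""
    calls = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in calls)


if __name__ == "__main__":
    lanes = band_lengths("GATTC")
    print(lanes)            # one list of band lengths per ddNTP reaction
    print(read_gel(lanes))  # the strand recovered from the bands
```

With four differently labelled fluorescent dideoxynucleotides, the four lanes collapse into a single reaction, and the label of each fragment plays the role of the lane key used here.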

Production

In one patented method, tetrahydrofuran was used as the solvent for a mixture of 50 g (0.205 mol) of uridine (or, presumably, any unprotected nucleoside), 112 ml (1.03 mol) of methyl orthoformate and 10 g (52.6 mmol) of para-toluenesulfonic acid, which was stirred at room temperature. After 24 hours, the reaction mixture was poured into an aqueous sodium bicarbonate solution and extracted five times with chloroform. The extract was dried over sodium sulfate and concentrated to give 49.5 g (0.173 mol) of 2',3'-O-methoxymethylideneuridine (yield 84.5%).

This intermediate was then dissolved in 50 ml of acetic anhydride at room temperature with stirring, and the solution was heated to 140 °C and kept under reflux for 5 hours. After cooling to room temperature, the solvent was removed by distillation under reduced pressure, 50 ml of water was added, and the mixture was extracted three times with 100 ml of chloroform. The extract was concentrated under reduced pressure and treated with 30% ammonia water to give 2',3'-dideoxy-2',3'-didehydrouridine in 82.3% yield.

Finally, the double bond was removed by hydrogenation: the product was dissolved in methanol (10 ml) containing wet 5% palladium on carbon (400 mg) as catalyst and kept under a hydrogen atmosphere for 1 h, giving the dideoxynucleoside (2',3'-dideoxyuridine in this case). [6]

To obtain a dye-labelled nucleotide, the nucleoside would likely be treated with a phosphorylation enzyme and biotinylated, and the biotinylated product then reacted with the dye. Direct reaction with the dye may also be possible, but extending the linker arm is claimed to increase efficiency when a mutant form of DNA polymerase is used. [5]
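The quantities quoted for the first two steps can be checked with simple arithmetic. The sketch below uses only the rounded figures given in the text; the helper `percent_yield` and the variable names are hypothetical. No yield is quoted for the hydrogenation step, so it is left out of the combined figure.

```python
def percent_yield(product_mol, limiting_mol):
    """Percent yield from moles of product and of limiting reagent."""
    return 100.0 * product_mol / limiting_mol


# Step 1: uridine (0.205 mol) to 2',3'-O-methoxymethylideneuridine (0.173 mol).
step1_yield = percent_yield(0.173, 0.205)

# Reagent excess relative to uridine, from the quoted amounts.
orthoformate_equiv = 1.03 / 0.205   # methyl orthoformate
acid_equiv = 0.0526 / 0.205         # para-toluenesulfonic acid

# Step 2 yield as quoted in the text (elimination to the didehydro compound).
step2_yield = 82.3

# Combined yield over the first two steps only.
two_step_yield = step1_yield * step2_yield / 100.0

if __name__ == "__main__":
    print(f"step 1 yield: {step1_yield:.1f}%")
    print(f"methyl orthoformate: {orthoformate_equiv:.1f} equiv")
    print(f"acid catalyst: {acid_equiv:.2f} equiv")
    print(f"yield over two steps: {two_step_yield:.1f}%")
```

The rounded mole figures give a step 1 yield slightly below the quoted 84.5%, which is consistent with the original value having been computed from unrounded masses.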


References

  1. Meis RJ, Raghavachari R (2001). "Near-Infrared Applications in DNA Sequencing and Analysis". In Raghavachari R (ed.). Near-Infrared Applications in Biotechnology. CRC Press. pp. 133–150. ISBN 9781420030242.
  2. Watson JD, Baker TA, Bell SP, Gann A, Levine M, Losick R (2014). Molecular Biology of the Gene (7th ed.). Cold Spring Harbor, NY: Pearson. pp. 160–161. ISBN 978-0-321-76243-6.
  3. Sanger F, Nicklen S, Coulson AR (December 1977). "DNA sequencing with chain-terminating inhibitors". Proceedings of the National Academy of Sciences of the United States of America. 74 (12): 5463–7. Bibcode:1977PNAS...74.5463S. doi:10.1073/pnas.74.12.5463. PMC 431765. PMID 271968.
  4. Sanger F, Coulson AR (May 1975). "A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase". Journal of Molecular Biology. 94 (3): 441–8. doi:10.1016/0022-2836(75)90213-2. PMID 1100841.
  5. Maxam AM, Gilbert W (February 1977). "A new method for sequencing DNA". Proceedings of the National Academy of Sciences of the United States of America. 74 (2): 560–4. Bibcode:1977PNAS...74..560M. doi:10.1073/pnas.74.2.560. PMC 392330. PMID 265521.
  6. Kungliga Vetenskapsakademien (The Royal Swedish Academy of Sciences) (14 October 1980). "The Nobel Prize in Chemistry 1980". nobelprize.org (Press release). Nobel Media. Archived from the original on 31 October 2018. Retrieved 14 December 2019.
  7. Sanger F (8 December 1980). "Determination of Nucleotide Sequences in DNA (Nobel lecture)" (PDF). nobelprize.org. Nobel Media. Archived (PDF) from the original on 14 December 2019. Retrieved 14 December 2019.
  8. Li Y, Mitaxov V, Waksman G (August 1999). "Structure-based design of Taq DNA polymerases with improved properties of dideoxynucleotide incorporation". Proceedings of the National Academy of Sciences of the United States of America. 96 (17): 9491–6. Bibcode:1999PNAS...96.9491L. doi:10.1073/pnas.96.17.9491. PMC 22236. PMID 10449720.