Sequencing by hybridization

Sequencing by hybridization is a class of methods for determining the order in which nucleotides occur on a strand of DNA. It is typically used to look for small changes relative to a known DNA sequence. [1] The binding of one strand of DNA to its complementary strand in the DNA double helix (known as hybridization) is sensitive to even single-base mismatches when the hybrid region is short or when specialized mismatch-detection proteins are present. This sensitivity is exploited in a variety of ways, most notably via DNA chips or microarrays carrying thousands to billions of synthetic oligonucleotides that represent sequences found in a genome of interest, plus many known variations or even all possible single-base variations. [2] [3]
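
As a rough illustration of the array idea described above, the following sketch tiles a known reference with short probes and, at each position, includes all four possible central bases. The sample base is then called as whichever variant finds a perfect match. The function names, the probe length of 7, and the treatment of hybridization as an exact substring match are assumptions made for illustration only; real arrays compare fluorescence intensities rather than strings.

```python
BASES = "ACGT"

def central_variant_probes(reference, k=7):
    """Yield (center_index, {base: probe}) for every window of length k.

    Each probe set holds the reference window with each of the four
    bases substituted at its central position.
    """
    half = k // 2
    for start in range(len(reference) - k + 1):
        window = reference[start:start + k]
        variants = {b: window[:half] + b + window[half + 1:] for b in BASES}
        yield start + half, variants

def call_bases(reference, sample, k=7):
    """Call the sample base at each interrogated reference position.

    Assumption: a probe "hybridizes" only if it occurs exactly in the sample.
    """
    calls = {}
    for center, variants in central_variant_probes(reference, k):
        matched = [b for b, probe in variants.items() if probe in sample]
        # No signal or an ambiguous signal is reported as 'N'.
        calls[center] = matched[0] if len(matched) == 1 else "N"
    return calls

reference = "ATGGCGTACGTTAGCCTAGGATCC"
sample = "ATGGCGTACGATAGCCTAGGATCC"  # one substitution at index 10 (T -> A)

calls = call_bases(reference, sample)
changes = {i: (reference[i], b) for i, b in calls.items()
           if b not in ("N", reference[i])}
print(changes)  # {10: ('T', 'A')}
```

Positions whose probes overlap the substitution off-center return 'N' in this toy model, which mirrors the loss of signal that a mismatch causes in a short hybrid.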

DNA

Deoxyribonucleic acid is a molecule composed of two chains that coil around each other to form a double helix carrying the genetic instructions used in the growth, development, functioning, and reproduction of all known organisms and many viruses. DNA and ribonucleic acid (RNA) are nucleic acids; alongside proteins, lipids and complex carbohydrates (polysaccharides), nucleic acids are one of the four major types of macromolecules that are essential for all known forms of life.

In molecular biology, hybridization is a phenomenon in which single-stranded deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) molecules anneal to complementary DNA or RNA. Though a double-stranded DNA sequence is generally stable under physiological conditions, changing these conditions in the laboratory will cause the molecules to separate into single strands. These strands are complementary to each other but may also be complementary to other sequences present in their surroundings. Lowering the surrounding temperature allows the single-stranded molecules to anneal or “hybridize” to each other.
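
A minimal sketch of what "complementary" means in this context: two strands anneal in antiparallel orientation, so a probe pairs perfectly with a site when it equals the reverse complement of that site. The helper names below are hypothetical and the model ignores temperature, salt, and all other physical conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the strand that would pair perfectly with seq (both written 5' to 3')."""
    return seq.translate(COMPLEMENT)[::-1]

def mismatches(probe, target_site):
    """Count mismatched base pairs when probe anneals to target_site."""
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    expected = reverse_complement(target_site)
    return sum(p != e for p, e in zip(probe, expected))

print(reverse_complement("GATTACA"))     # TGTAATC
print(mismatches("TGTAATC", "GATTACA"))  # 0
print(mismatches("TGTTATC", "GATTACA"))  # 1
```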

The type of sequencing by hybridization described above has largely been displaced by other methods, including sequencing by synthesis, sequencing by ligation, and pore-based methods. However, hybridization of oligonucleotides is still used in some sequencing schemes, including hybridization-assisted pore-based sequencing and reversible hybridization. [4]

Sequencing by ligation is a DNA sequencing method that uses the enzyme DNA ligase to identify the nucleotide present at a given position in a DNA sequence. Unlike most currently popular DNA sequencing methods, this method does not use a DNA polymerase to create a second strand. Instead, the mismatch sensitivity of a DNA ligase enzyme is used to determine the underlying sequence of the target DNA molecule.

Examples of commercial systems

Affymetrix, Inc. was an American company that manufactured DNA microarrays; it was based in Santa Clara, California, United States. The company was acquired by Thermo Fisher Scientific in March 2016.

See also

Related Research Articles

DNA microarray

A DNA microarray is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome. Each DNA spot contains picomoles of a specific DNA sequence, known as a probe. A probe can be a short section of a gene or other DNA element and is used to hybridize a cDNA or cRNA sample under high-stringency conditions. Probe-target hybridization is usually detected and quantified by detection of fluorophore-, silver-, or chemiluminescence-labeled targets to determine the relative abundance of nucleic acid sequences in the target. The original nucleic acid arrays were macro arrays approximately 9 cm × 12 cm, and the first computerized image-based analysis was published in 1981. It was invented by Patrick O. Brown.

Chargaff's rules state that DNA from any cell of any organism should have a 1:1 ratio of pyrimidine and purine bases and, more specifically, that the amount of guanine should be equal to that of cytosine and the amount of adenine should be equal to that of thymine. This pattern is found in both strands of the DNA. The rules were discovered by the Austrian-born chemist Erwin Chargaff in the late 1940s.
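
The base-pairing origin of these ratios can be checked with a short sketch. It builds a duplex from one strand and its reverse complement and counts the bases; the function name and the example sequence are hypothetical. The equalities hold exactly for any duplex built this way, whereas the per-strand version of the rule is only an approximate empirical observation and is not demonstrated here.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def duplex_base_counts(strand):
    """Count bases over both strands of a perfectly paired duplex."""
    partner = strand.translate(COMPLEMENT)[::-1]
    return Counter(strand + partner)

# A deliberately skewed single strand: no T at all, more G than C.
counts = duplex_base_counts("AAGGGC")
purines = counts["A"] + counts["G"]
pyrimidines = counts["C"] + counts["T"]

print(counts["A"], counts["T"])  # 2 2
print(counts["G"], counts["C"])  # 4 4
print(purines, pyrimidines)      # 6 6
```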

Site-directed mutagenesis is a molecular biology method that is used to make specific and intentional changes to the DNA sequence of a gene and any gene products. Also called site-specific mutagenesis or oligonucleotide-directed mutagenesis, it is used for investigating the structure and biological activity of DNA, RNA, and protein molecules, and for protein engineering.

DNA sequencing

DNA sequencing is the process of determining the nucleic acid sequence – the order of nucleotides in DNA. It includes any method or technology that is used to determine the order of the four bases: adenine, guanine, cytosine, and thymine. The advent of rapid DNA sequencing methods has greatly accelerated biological and medical research and discovery.

SNP genotyping is the measurement of genetic variations of single nucleotide polymorphisms (SNPs) between members of a species. It is a form of genotyping, which is the measurement of more general genetic variation. SNPs are one of the most common types of genetic variation. A SNP is a single base pair mutation at a specific locus, usually consisting of two alleles. SNPs have been found to be involved in the etiology of many human diseases and are of particular interest in pharmacogenetics. Because SNPs are conserved during evolution, they have been proposed as markers for use in quantitative trait loci (QTL) analysis and in association studies in place of microsatellites. The use of SNPs is being extended in the HapMap project, which aims to provide the minimal set of SNPs needed to genotype the human genome. SNPs can also provide a genetic fingerprint for use in identity testing. The growing interest in SNPs has been reflected in the rapid development of a diverse range of SNP genotyping methods.

Nucleic acid thermodynamics is the study of how temperature affects the nucleic acid structure of double-stranded DNA (dsDNA). The melting temperature (Tm) is defined as the temperature at which half of the DNA strands are in the random coil or single-stranded (ssDNA) state. Tm depends on the length of the DNA molecule and its specific nucleotide sequence. DNA, when in a state where its two strands are dissociated, is referred to as having been denatured by the high temperature.
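
The dependence of Tm on length and base composition can be illustrated with the Wallace rule, a common rule of thumb for short oligonucleotides that estimates Tm as 2 °C per A or T plus 4 °C per G or C. This rule is not taken from the text above and is only a coarse approximation; it ignores salt concentration, strand concentration, and nearest-neighbor effects. The function name is hypothetical.

```python
def wallace_tm(oligo):
    """Estimate the melting temperature (deg C) of a short oligo with the Wallace rule."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc

print(wallace_tm("ACGTACGTACGTAC"))  # 42
print(wallace_tm("ATATATATATATAT"))  # 28
print(wallace_tm("GCGCGCGCGCGCGC"))  # 56
```

Under this estimate, GC-rich probes of a given length melt at a higher temperature than AT-rich ones, consistent with the sequence dependence described above.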

An allele-specific oligonucleotide (ASO) is a short piece of synthetic DNA complementary to the sequence of a variable target DNA. It acts as a probe for the presence of the target in a Southern blot assay or, more commonly, in the simpler dot blot assay. It is a common tool used in genetic testing, forensics, and molecular biology research.

Polymerase cycling assembly is a method for the assembly of large DNA oligonucleotides from shorter fragments. The process uses the same technology as PCR, but takes advantage of DNA hybridization and annealing as well as DNA polymerase to amplify a complete sequence of DNA in a precise order based on the single stranded oligonucleotides used in the process. It thus allows for the production of synthetic genes and even entire synthetic genomes.

SOLiD is a next-generation DNA sequencing technology developed by Life Technologies that has been commercially available since 2006. This next-generation technology generates hundreds of millions to billions of small sequence reads at one time.

Oligomer restriction

Oligomer restriction (OR) is a procedure to detect an altered DNA sequence in a genome. A labeled oligonucleotide probe is hybridized to a target DNA, and then treated with a restriction enzyme. If the probe exactly matches the target, the restriction enzyme will cleave the probe, changing its size. If, however, the target DNA does not exactly match the probe, the restriction enzyme will have no effect on the length of the probe. The OR technique, now rarely performed, was closely associated with the development of the popular polymerase chain reaction (PCR) method.

The ligase chain reaction (LCR) is a method of DNA amplification. It differs from PCR in that it uses a thermostable ligase to join two probes or other molecules together, which can then be amplified by standard PCR cycling. Each cycle results in a doubling of the target nucleic acid molecule. A key advantage of LCR is greater specificity as compared to PCR. LCR thus requires two completely different enzymes to operate properly: a ligase, to join probe molecules together, and a thermostable polymerase, to amplify those molecules involved in successful ligation. The probes involved in the ligation are designed such that the 5′ end of one probe is directly adjacent to the 3′ end of the other probe, thereby providing the requisite 3′-OH and 5′-PO4 group substrates for the ligase.

T7 DNA polymerase

T7 DNA polymerase is an enzyme used during the DNA replication of the T7 bacteriophage. During this process, the DNA polymerase “reads” existing DNA strands and creates two new strands that match the existing ones. The T7 DNA polymerase requires a host factor, E. coli thioredoxin, in order to carry out its function. This helps stabilize the binding of the necessary protein to the primer-template to improve processivity by more than 100-fold, which is a feature unique to this enzyme. It is a member of the Family A DNA polymerases, which include E. coli DNA polymerase I and Taq DNA polymerase.

Polony sequencing is an inexpensive but highly accurate multiplex sequencing technique that can be used to “read” millions of immobilized DNA sequences in parallel. This technique was first developed by Dr. George Church's group at Harvard Medical School. Unlike other sequencing techniques, Polony sequencing technology is an open platform with freely downloadable, open source software and protocols. Also, the hardware for this technique can be easily set up with a commonly available epifluorescence microscope and a computer-controlled flowcell/fluidics system. Polony sequencing is generally performed on a paired-end tag library in which each DNA template molecule is 135 bp long, with two 17–18 bp paired genomic tags separated and flanked by common sequences. The current read length of this technique is 26 bases per amplicon and 13 bases per tag, leaving a gap of 4–5 bases in each tag.

DNA nanoball sequencing

DNA nanoball sequencing is a high-throughput sequencing technology that is used to determine the entire genomic sequence of an organism. The method uses rolling circle replication to amplify small fragments of genomic DNA into DNA nanoballs. Fluorescent nucleotides bind to complementary nucleotides and are then polymerized to anchor sequences bound to known sequences on the DNA template. The base order is determined via the fluorescence of the bound nucleotides. This DNA sequencing method allows large numbers of DNA nanoballs to be sequenced per run at lower reagent costs compared to other next-generation sequencing platforms. However, a limitation of this method is that it generates only short sequences of DNA, which presents challenges to mapping its reads to a reference genome. After purchasing Complete Genomics, the Beijing Genomics Institute (BGI) refined DNA nanoball sequencing to sequence nucleotide samples on their own platform.

Massive parallel sequencing or massively parallel sequencing is any of several high-throughput approaches to DNA sequencing using the concept of massively parallel processing; it is also called next-generation sequencing (NGS) or second-generation sequencing. Some of these technologies emerged in 1994–1998 and have been commercially available since 2005. These technologies use miniaturized and parallelized platforms for sequencing of 1 million to 43 billion short reads per instrument run.

Illumina dye sequencing

Illumina dye sequencing is a technique used to determine the series of base pairs in DNA, also known as DNA sequencing. The reversible terminated chemistry concept was invented by Bruno Canard and Simon Sarfati at the Pasteur Institute in Paris. It was developed by Shankar Balasubramanian and David Klenerman of Cambridge University, who subsequently founded Solexa, a company later acquired by Illumina. This sequencing method is based on reversible dye-terminators that enable the identification of single bases as they are introduced into DNA strands. It can also be used for whole-genome and region sequencing, transcriptome analysis, metagenomics, small RNA discovery, methylation profiling, and genome-wide protein-nucleic acid interaction analysis.

Magnetic sequencing is a single-molecule sequencing method in development. A DNA hairpin containing the sequence of interest is bound between a magnetic bead and a glass surface. A magnetic field is applied to stretch the hairpin open into single strands, and the hairpin refolds when the magnetic field is decreased. The hairpin length can be determined by direct imaging of the diffraction rings of the magnetic beads using a simple microscope. The DNA sequences are determined by measuring the changes in the hairpin length following successful hybridization of complementary nucleotides.

A hybridization assay comprises any form of quantifiable hybridization, i.e., the quantitative annealing of two complementary strands of nucleic acids, known as nucleic acid hybridization.

Duplex sequencing

Duplex sequencing is a library preparation and analysis method for next-generation sequencing (NGS) platforms that employs random tagging of double-stranded DNA to detect mutations with higher accuracy and a lower error rate. This method uses degenerate molecular tags in addition to sequencing adapters to recognize reads originating from each strand of DNA. The generated sequencing reads are then analyzed in two steps: assembly of single-strand consensus sequences (SSCSs) and assembly of duplex consensus sequences (DCSs). Duplex sequencing can theoretically detect mutations with frequencies as low as 5 × 10⁻⁸, which is more than 10,000-fold higher accuracy than conventional next-generation sequencing methods.
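
The two consensus steps can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the published pipeline: reads are assumed to be equal in length and already oriented to the same reference strand, both strands of a molecule are assumed to share one tag label, and the 0.7 agreement threshold and all function names are hypothetical.

```python
from collections import Counter, defaultdict

def consensus(reads, min_fraction=0.7):
    """Single-strand consensus: per-position majority call, 'N' below the threshold."""
    called = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count / len(column) >= min_fraction else "N")
    return "".join(called)

def duplex_consensus(tagged_reads):
    """Build a duplex consensus per tag from (tag, strand, sequence) records.

    strand is 'top' or 'bottom'. A base is kept only when the two
    single-strand consensus sequences agree; otherwise it becomes 'N'.
    """
    families = defaultdict(lambda: defaultdict(list))
    for tag, strand, seq in tagged_reads:
        families[tag][strand].append(seq)

    result = {}
    for tag, by_strand in families.items():
        if "top" not in by_strand or "bottom" not in by_strand:
            continue  # both strands are needed for a duplex consensus
        top = consensus(by_strand["top"])
        bottom = consensus(by_strand["bottom"])
        result[tag] = "".join(a if a == b else "N" for a, b in zip(top, bottom))
    return result

reads = [
    ("AB", "top", "ACGT"), ("AB", "top", "ACGT"),
    ("AB", "top", "ACGT"), ("AB", "top", "ACTT"),  # one sequencing error
    ("AB", "bottom", "ACGT"), ("AB", "bottom", "ACGT"),
    ("CD", "top", "ACAT"), ("CD", "top", "ACAT"),  # change seen on one strand only
    ("CD", "bottom", "ACGT"), ("CD", "bottom", "ACGT"),
    ("EF", "top", "ACGT"),                          # no bottom-strand reads
]
dcs = duplex_consensus(reads)
print(dcs)  # {'AB': 'ACGT', 'CD': 'ACNT'}
```

In this toy data set, the isolated read error in family AB is outvoted within its strand, while the change in family CD that appears on only one strand is masked because the two strands disagree.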

References

  1. Drmanac, Radoje; Drmanac, Snezana; Chui, Gloria; Diaz, Robert; Hou, Aaron; Jin, Hui; Jin, Paul; Kwon, Sunhee; Lacy, Scott; Moeur, Bill; Shafto, Jay; Swanson, Don; Ukrainczyk, Tatjana; Xu, Chongjun; Little, Deane (2002). "Sequencing by Hybridization (SBH): Advantages, Achievements, and Opportunities". 77: 75–101. doi:10.1007/3-540-45713-5_5. ISSN 0724-6145.
  2. Preparata, FP; Upfal, E (2000). "Sequencing-by-hybridization at the information-theory bound: an optimal algorithm". J. Comput. Biol. 7 (3): 621–30. CiteSeerX 10.1.1.61.3325. doi:10.1089/106652700750050970. PMID 11108482.
  3. Hanna, GJ; et al. (July 2000). "Comparison of Sequencing by Hybridization and Cycle Sequencing for Genotyping of Human Immunodeficiency Virus Type 1 Reverse Transcriptase". J Clin Microbiol. 38 (7): 2715–2721. PMC 87006. PMID 10878069.
  4. Church, George M. (January 2006). "Genomes for all". Scientific American. 294 (1): 46–54. doi:10.1038/scientificamerican0106-46. PMID 16468433.