Diversity arrays technology

Diversity Arrays Technology (DArT) is a high-throughput genetic marker technique that detects allelic variation and provides comprehensive genome coverage for genotyping and other genetic analyses without requiring any DNA sequence information. [1] [2] [3] The general steps are to reduce the complexity of the genomic DNA with specific restriction enzymes, choose diverse fragments to serve as representations of the parent genomes, amplify them via polymerase chain reaction (PCR), and insert the fragments into a vector so that they can be placed as probes on a microarray. Fluorescently labelled targets are then allowed to hybridize with the probes, and the array is read with an imaging system. [1] [2] The objective is to identify and quantify various forms of DNA polymorphism within the genomic DNA of the sampled species. [1]

First reported in 2001 by Damian Jaccoud, Andrzej Kilian, David Feinstein, and Kaiman Peng, DArT offered significant advantages over traditional primer-based methods, such as the ability to analyze large numbers of samples from a small amount of starting DNA. [1] [2] [4] [5] It also offered lower costs and faster results than related solid-state DNA arrays that detect single nucleotide polymorphisms (SNPs). [1] [2] Since its inception, the technology has been a major instrument in the analysis of polyploid plants and in the construction of physical and genetic maps used to understand how species are related, based on similarities and allelic variation among their genomes. [1] [2] [6] [7] [8] [3]

History

The concept was first developed by Damian Jaccoud, Andrzej Kilian, David Feinstein, and Kaiman Peng in 2001. [1] They aimed to establish a technique for detecting and quantifying genomic DNA polymorphisms with higher throughput than more traditional methods such as amplified fragment length polymorphism (AFLP), restriction fragment length polymorphism (RFLP), and simple sequence repeats (SSR). [1] [2] [4] [5] They also aimed to minimize cost and to remove the reliance on sequenced genomes for identifying polymorphisms, a limitation of early immobilized, solid-state DNA arrays such as DNA chips, which identify only SNPs. [1] [2] A byproduct of this fast, low-cost whole-genome profiling method was that it detected SNPs as well as base-pair insertions, deletions, and shifts, adding a further layer of allelic variation between the species analyzed. [1] [2]

Jaccoud, Kilian, Feinstein, and Peng selected nine subspecies of rice as their source of genomic DNA for polymorphism analysis. [1] The analysis consisted of detecting the presence or absence of specific DNA polymorphisms with probe arrays and quantifying the strength of each signal, via fluorescence, within the subspecies. After DNA was extracted from the selected subjects, the samples were digested with three specific restriction enzymes and ligated with T4 ligase. The ligation mixture was then diluted, and a small amount of it was used as a PCR template. The products were placed into a pCR2.1-TOPO vector and subsequently transformed into E. coli, which were selected based on resistance to ampicillin and on colony pigmentation from the X-gal interaction. [1] [2] Inserts from the cloned cells were PCR-amplified, purified, and introduced into a microarray. Reference DNA and samples were labelled with the fluorescent dyes Cy3 or Cy5, mixed, denatured, and allowed to hybridize to the microarray for further analysis. The results showed that DArT could detect the presence or absence of polymorphisms more quickly than RFLP and could also quantify the polymorphisms detected. [1] In addition, DArT significantly reduced the amount of initial DNA required for the analysis compared to other methods. [1]

Procedure

The DArT procedure is broken down into three essential steps: complexity reduction, genomic representation, and the DArT assay. [2]

Complexity reduction

This step reduces the large, complex genomic DNA of the selected species into more manageable fragments through the use of specific restriction enzymes. It relies exclusively on restriction enzymes, rather than on a combination of restriction enzymes and primers, because this approach was reported to identify more polymorphism across the analyzed samples. [2] PstI is a commonly used restriction enzyme for this step because of its specificity for the nonrepetitive, nonmethylated portion of a species' genome. [2] [6] [7] [8] [9]
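The digestion step can be illustrated in silico. The following is a minimal sketch, not part of the published protocol: it cuts a sequence at every PstI recognition site (CTGCAG, cut after the A on the top strand) and keeps fragments within a size window. The function names `digest` and `size_select` and the size limits are hypothetical, chosen only for illustration, and the sketch ignores methylation sensitivity and the second strand.

```python
import re

PSTI_SITE = "CTGCAG"   # PstI recognition sequence
PSTI_CUT_OFFSET = 5    # PstI cuts the top strand after the A: CTGCA^G


def digest(sequence, site=PSTI_SITE, cut_offset=PSTI_CUT_OFFSET):
    """Return top-strand fragments obtained by cutting at every site occurrence."""
    sequence = sequence.upper()
    # A lookahead pattern reports every occurrence, including overlapping ones.
    cut_positions = [m.start() + cut_offset
                     for m in re.finditer(f"(?={site})", sequence)]
    fragments = []
    start = 0
    for pos in cut_positions:
        fragments.append(sequence[start:pos])
        start = pos
    fragments.append(sequence[start:])  # trailing fragment after the last cut
    return fragments


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length lies in [min_len, max_len] (illustrative window)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]
```

For example, `digest("AAACTGCAGTTTTCTGCAGGG")` yields three fragments whose concatenation restores the input sequence.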

Genomic representation

Once the genomic DNA has been reduced to a manageable size in the previous step, using one or two specific restriction enzymes, the next step is to select the fragments that carry the largest amount of significant polymorphism across the gene pool. These selected fragments are termed “representations”, as they are smaller representations of the initial, larger genomic DNA. It is important to avoid repetitive sequences when selecting fragments, as these exhibit the lowest amount of polymorphism within the analyzed genomic DNA. [2]

DArT assay

Digested sequences are ligated using T4 ligase to produce double-stranded DNA. A small amount of the ligation mixture is diluted and then amplified via PCR. During PCR, it is important to use primers complementary to the restriction enzymes’ cutting sites, together with RedTaq polymerase, which is rarely inhibited. The products are mixed into an amplified representation of the gene pool and ligated into the vector pCR2.1-TOPO. After the representation has been inserted, the vector is transformed into E. coli cells by electrical shock or by chemical means. The cells are incubated and selected based on ampicillin resistance and on white pigmentation, which results from an inactive β-galactosidase gene, in a medium containing X-gal. [1] [2] Inserts are then amplified via PCR and spotted onto a microarray slide. Slides are centrifuged to isolate the inserts, which are then purified.

The microarray targets, which are genomic representations, are labelled with the fluorescent dyes Cy3 or Cy5. The labelled targets are then added to the microarray probes, which contain the amplified inserts from the E. coli clones, where denaturation and subsequent hybridization, if possible, take place. Following hybridization, the slides are washed and scanned with an imaging system that detects the fluorescent signals, and the images are analyzed with an open-source software package called DArTsoft. Interactions and dissimilarities between the probes and the various targets are used to build a histogram that identifies and quantifies several forms of DNA polymorphism among the analyzed genomes. [1] [2]
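The scoring idea behind the assay can be sketched as follows. This is a simplified illustration and not the DArTsoft algorithm: for each spotted clone it computes the log2 ratio of the target signal to the reference signal and calls the marker present (1), absent (0), or ambiguous (None). The function names, the pseudocount, and both thresholds are assumptions made for this example; in practice, cut-offs would be derived from the distribution of ratios across samples.

```python
import math


def log_ratio(target_signal, reference_signal, pseudocount=1.0):
    """log2 of target over reference intensity; the pseudocount avoids log(0)."""
    return math.log2((target_signal + pseudocount) /
                     (reference_signal + pseudocount))


def score_marker(ratio, present_above=-1.0, absent_below=-3.0):
    """Call presence (1), absence (0), or ambiguous (None).

    The two thresholds are illustrative placeholders, not published values.
    """
    if ratio >= present_above:
        return 1
    if ratio <= absent_below:
        return 0
    return None


def score_array(spots, **thresholds):
    """Score every clone; `spots` maps clone id -> (target, reference) signal."""
    return {clone: score_marker(log_ratio(target, reference), **thresholds)
            for clone, (target, reference) in spots.items()}
```

Leaving a gap between the two thresholds keeps weak, intermediate signals from being forced into a presence or absence call.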

Applications

Molecular breeding

The ability to identify and quantify allelic variation among genomes without the need for a sequenced genome is one of DArT's main strengths and has large implications for the molecular breeding sector. [1] [2] By comparing crops with phenotypes such as higher yields of produce or resistance to certain environmental parasites, a phenotype can be linked directly to a DNA polymorphism that DArT identifies among related species. DArT is also able to outperform other genotyping techniques on polyploids because it avoids the primer competition found in other techniques. [2] Polyploids are common among agriculturally important crops. [2] For example, DArT has been used to conduct genome-wide analysis of Musa species, which include bananas and plantains, leading to the development of a phylogenetic cladogram based on DArT-derived genetic markers. These developments enhance the breeding knowledge needed to obtain desirable yields and products. [3]

The rapid identification of markers associated with genes responsible for phenotypes is also being studied in animals with the help of DArT. [2] [10] Insecticide resistance in mosquitoes has been linked to specific gene mutations that confer resistance on certain species of mosquitoes but not others. [10] Genotypic variations were found through markers while conducting DArT analysis on relevant samples.

Genomic mapping

Since DArT is able to find genetic relations among species within a metagenome cheaply and quickly, it has been integral to developing physical and genetic maps of closely related species. [2] [8] At its inception, DArT was used to develop phylogenetic cladograms of rice subspecies based on the presence or absence of DNA fragments in each subspecies’ genome. [1] In the same manner, an automated version of DArT was used to construct genetic maps for A. thaliana. [2] [9] Wheat, a hexaploid, is another crop that has benefited from DArT analysis: a bacterial artificial chromosome (BAC)-based physical map of its largest chromosome, 3B, was created using markers detected through DArT assays. [8] [11]
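Presence/absence scores of this kind can be turned into pairwise distances, which are the usual input for building a cladogram. The sketch below uses the Jaccard distance on binary marker profiles; the choice of Jaccard distance and the function names are assumptions made for illustration and are not taken from the cited studies.

```python
def jaccard_distance(profile_a, profile_b):
    """1 - (markers present in both) / (markers present in at least one)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same markers")
    pairs = list(zip(profile_a, profile_b))
    shared = sum(1 for a, b in pairs if a == 1 and b == 1)
    either = sum(1 for a, b in pairs if a == 1 or b == 1)
    if either == 0:
        return 0.0  # no marker present in either sample: treat as identical
    return 1.0 - shared / either


def distance_matrix(profiles):
    """Pairwise distances for a dict mapping sample name -> list of 0/1 scores."""
    names = sorted(profiles)
    return {(p, q): jaccard_distance(profiles[p], profiles[q])
            for p in names for q in names}
```

The resulting matrix could be passed to any standard hierarchical clustering or neighbor-joining routine to draw the tree; shared absences are deliberately ignored, since the absence of a fragment in two samples is weaker evidence of relatedness than its shared presence.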

References

  1. Jaccoud D, Peng K, Feinstein D, Kilian A (February 2001). "Diversity arrays: a solid state technology for sequence information independent genotyping". Nucleic Acids Research. 29 (4): e25. doi:10.1093/nar/29.4.e25. PMC 29632. PMID 11160945.
  2. Kilian A, Wenzl P, Huttner E, Carling J, Xia L, Blois H, et al. (2012). "Diversity arrays technology: a generic genome profiling technology on open platforms". In Pompanon F, Bonin A (eds.). Data Production and Analysis in Population Genomics. Methods in Molecular Biology. Vol. 888. Totowa, NJ: Humana Press. pp. 67–89. doi:10.1007/978-1-61779-870-2_5. ISBN 978-1-61779-870-2. PMID 22665276.
  3. Risterucci AM, Hippolyte I, Perrier X, Xia L, Caig V, Evers M, et al. (October 2009). "Development and assessment of Diversity Arrays Technology for high-throughput DNA analyses in Musa". TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik. 119 (6): 1093–1103. doi:10.1007/s00122-009-1111-5. PMID 19693484. S2CID 23747800.
  4. Vos P, Hogers R, Bleeker M, Reijans M, van de Lee T, Hornes M, et al. (November 1995). "AFLP: a new technique for DNA fingerprinting". Nucleic Acids Research. 23 (21): 4407–4414. doi:10.1093/nar/23.21.4407. PMC 307397. PMID 7501463.
  5. Weber JL, May PE (March 1989). "Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction". American Journal of Human Genetics. 44 (3): 388–396. PMC 1715443. PMID 2916582.
  6. Rabinowicz PD, Schutz K, Dedhia N, Yordan C, Parnell LD, Stein L, et al. (November 1999). "Differential methylation of genes and retrotransposons facilitates shotgun sequencing of the maize genome". Nature Genetics. 23 (3): 305–308. doi:10.1038/15479. PMID 10545948. S2CID 19943394.
  7. Wenzl P, Carling J, Kudrna D, Jaccoud D, Huttner E, Kleinhofs A, Kilian A (June 2004). "Diversity Arrays Technology (DArT) for whole-genome profiling of barley". Proceedings of the National Academy of Sciences of the United States of America. 101 (26): 9915–9920. Bibcode:2004PNAS..101.9915W. doi:10.1073/pnas.0401076101. PMC 470773. PMID 15192146.
  8. Akbari M, Wenzl P, Caig V, Carling J, Xia L, Yang S, et al. (November 2006). "Diversity arrays technology (DArT) for high-throughput profiling of the hexaploid wheat genome". TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik. 113 (8): 1409–1420. doi:10.1007/s00122-006-0365-4. PMID 17033786. S2CID 12636193.
  9. Wittenberg AH, van der Lee T, Cayla C, Kilian A, Visser RG, Schouten HJ (August 2005). "Validation of the high-throughput marker technology DArT using the model plant Arabidopsis thaliana". Molecular Genetics and Genomics. 274 (1): 30–39. doi:10.1007/s00438-005-1145-6. PMID 15937704. S2CID 34817585.
  10. Bonin A, Paris M, Després L, Tetreau G, David JP, Kilian A (October 2008). "A MITE-based genotyping method to reveal hundreds of DNA polymorphisms in an animal genome after a few generations of artificial selection". BMC Genomics. 9 (1): 459. doi:10.1186/1471-2164-9-459. PMC 2579443. PMID 18837997.
  11. Paux E, Sourdille P, Salse J, Saintenac C, Choulet F, Leroy P, et al. (October 2008). "A physical map of the 1-gigabase bread wheat chromosome 3B". Science. 322 (5898): 101–104. Bibcode:2008Sci...322..101P. doi:10.1126/science.1161847. PMID 18832645. S2CID 27686615.