Barcode of Life Data System

The Barcode of Life Data System (commonly known as BOLD or BOLDSystems) is a sequence database specifically devoted to DNA barcoding. It also provides an online platform for analyzing DNA sequences. [1] As of 2017, BOLD included over 5.9 million DNA barcode sequences from over 542,000 species. [2]

In the field of bioinformatics, a sequence database is a type of biological database composed of a large collection of digital nucleic acid sequences, protein sequences, or other polymer sequences. The UniProt database is an example of a protein sequence database. As of 2013 it contained over 40 million sequences and was growing at an exponential rate. Historically, sequences were published in paper form, but as the number of sequences grew, this storage method became unsustainable.

DNA barcoding Method of species identification using a short section of DNA

DNA barcoding is a method of species identification using a short section of DNA from a specific gene or genes. The premise of DNA barcoding is that, by comparison with a reference library of such DNA sections, an individual sequence can be used to uniquely identify an organism to species, in the same way that a supermarket scanner uses the familiar black stripes of the UPC barcode to identify an item in its stock against its reference database. These "barcodes" are sometimes used in an effort to identify unknown species, parts of an organism, or simply to catalog as many taxa as possible, or to compare with traditional taxonomy in an effort to determine species boundaries.
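The comparison against a reference library described above can be sketched in a few lines of Python. This is a minimal illustration only, not how BOLD or any production tool works: the species names, the toy sequences, and the 0.97 identity threshold are all assumptions made for the example, and the sequences are assumed to be pre-aligned and of equal length.

```python
def percent_identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def identify(query: str, reference_library: dict, threshold: float = 0.97):
    """Return (species, score) for the best match, or (None, score) if below threshold.

    The threshold is an illustrative assumption, not a standard value.
    """
    best_species, best_score = None, 0.0
    for species, reference in reference_library.items():
        score = percent_identity(query, reference)
        if score > best_score:
            best_species, best_score = species, score
    if best_score >= threshold:
        return best_species, best_score
    return None, best_score


# Hypothetical reference library with made-up ten-base "barcodes".
TOY_LIBRARY = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTTCGAAC",
}

match_result = identify("ACGTACGTAC", TOY_LIBRARY)
no_match_result = identify("TTTTTTTTTT", TOY_LIBRARY)
```

A query identical to a reference is assigned to that species, while a query that resembles nothing in the library is left unidentified rather than forced onto the nearest entry.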

Nucleic acid sequence A succession of nucleotides in a nucleic acid

A nucleic acid sequence is a succession of letters that indicate the order of nucleotides forming alleles within a DNA (GACT) or RNA (GACU) molecule. By convention, sequences are usually presented from the 5' end to the 3' end. For DNA, the sense strand is used. Because nucleic acids are normally linear (unbranched) polymers, specifying the sequence is equivalent to defining the covalent structure of the entire molecule. For this reason, the nucleic acid sequence is also termed the primary structure.
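As a small illustration of the 5'-to-3' convention, the sketch below derives the complementary strand of a DNA sequence (also written 5' to 3', hence the reversal) and the RNA spelling of a sense strand. The function names and the example sequence are chosen for this example only, and only the four unambiguous DNA letters are handled.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3' like the input."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def transcribe(sense_strand: str) -> str:
    """Spell the sense strand with the RNA alphabet (T replaced by U)."""
    return sense_strand.upper().replace("T", "U")


antisense = reverse_complement("GATTACA")
rna = transcribe("GATTACA")
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check that both strands are being written in the same direction.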

Related Research Articles

Bioinformatics Software tools for understanding biological data

Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data. As an interdisciplinary field of science, bioinformatics combines biology, computer science, information engineering, mathematics and statistics to analyze and interpret biological data. Bioinformatics has been used for in silico analyses of biological queries using mathematical and statistical techniques.

Barcode optical machine-readable representation of data

A barcode is a method of representing data in a visual, machine-readable form. Initially, barcodes represented data by varying the widths and spacings of parallel lines. These barcodes, now commonly referred to as linear or one-dimensional (1D), can be scanned by special optical scanners, called barcode readers. Later, two-dimensional (2D) variants were developed, using rectangles, dots, hexagons and other geometric patterns, called matrix codes or 2D barcodes, although they do not use bars as such. 2D barcodes can be read or deconstructed using application software on mobile devices with inbuilt cameras, such as smartphones.

DNA sequencer

A DNA sequencer is a scientific instrument used to automate the DNA sequencing process. Given a sample of DNA, a DNA sequencer is used to determine the order of the four bases: G (guanine), C (cytosine), A (adenine) and T (thymine). This is then reported as a text string, called a read. Some DNA sequencers can be also considered optical instruments as they analyze light signals originating from fluorochromes attached to nucleotides.

National Center for Biotechnology Information database arm of the US National Library of Medicine

The National Center for Biotechnology Information (NCBI) is part of the United States National Library of Medicine (NLM), a branch of the National Institutes of Health (NIH). The NCBI is located in Bethesda, Maryland, and was founded in 1988 through legislation sponsored by Senator Claude Pepper.

Functional genomics

Functional genomics is a field of molecular biology that attempts to describe gene functions and interactions. Functional genomics makes use of the vast data generated by genomic and transcriptomic projects. It focuses on dynamic aspects such as gene transcription, translation, regulation of gene expression and protein–protein interactions, as opposed to the static aspects of genomic information such as DNA sequence or structures. A key characteristic of functional genomics studies is their genome-wide approach to these questions, generally involving high-throughput methods rather than a more traditional "gene-by-gene" approach.

DNA sequencing process of determining the nucleic acid sequence – the order of nucleotides in DNA

DNA sequencing is the process of determining the nucleic acid sequence – the order of nucleotides in DNA. It includes any method or technology that is used to determine the order of the four bases: adenine, guanine, cytosine, and thymine. The advent of rapid DNA sequencing methods has greatly accelerated biological and medical research and discovery.

Astraptes fulgerator species of insect

Astraptes fulgerator, the two-barred flasher, is a cryptic species complex in the spread-wing skipper butterfly genus Astraptes. It ranges all over the Americas, from the southern United States to northern Argentina.

The Consortium for the Barcode of Life (CBOL) is an international initiative dedicated to supporting the development of DNA barcoding as a global standard for species identification. CBOL's Secretariat Office is hosted by the National Museum of Natural History, Smithsonian Institution, in Washington, DC. Barcoding was proposed in 2003 by Prof. Paul Hebert of the University of Guelph in Ontario as a way of distinguishing and identifying species with a short standardized gene sequence. Hebert proposed the 648 bases of the Folmer region of the mitochondrial gene cytochrome-C oxidase-1 as the standard barcode region. Dr. Hebert is the Director of the Biodiversity Institute of Ontario, the Canadian Centre for DNA Barcoding, and the International Barcode of Life Project (iBOL), all headquartered at the University of Guelph. The Barcode of Life Data Systems (BOLD) is also located at the University of Guelph.

History of genetics

The history of genetics dates from the classical era with contributions by Pythagoras, Hippocrates, Aristotle, Epicurus, and others. Modern genetics began with the work of the Augustinian friar Gregor Johann Mendel. His work on pea plants, published in 1866, established the theory of Mendelian inheritance.

In biology, a species is the basic unit of classification and a taxonomic rank of an organism, as well as a unit of biodiversity. A species is often defined as the largest group of organisms in which any two individuals of the appropriate sexes or mating types can produce fertile offspring, typically by sexual reproduction. Other ways of defining species include their karyotype, DNA sequence, morphology, behaviour or ecological niche. In addition, paleontologists use the concept of the chronospecies since fossil reproduction cannot be examined.

Optical mapping is a technique for constructing ordered, genome-wide, high-resolution restriction maps from single, stained molecules of DNA, called "optical maps". By mapping the location of restriction enzyme sites along the unknown DNA of an organism, the spectrum of resulting DNA fragments collectively serves as a unique "fingerprint" or "barcode" for that sequence. Originally developed by Dr. David C. Schwartz and his lab at NYU in the 1990s, this method has since been integral to the assembly process of many large-scale sequencing projects for both microbial and eukaryotic genomes. Later technologies use DNA melting, DNA competitive binding or enzymatic labelling to create the optical maps.
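The idea that restriction-site positions yield an ordered fragment "fingerprint" can be illustrated in silico. The sketch below is not optical mapping itself, which measures fragments on imaged single molecules of unknown sequence. It only shows, for a known toy sequence, how cut positions translate into an ordered list of fragment lengths. The EcoRI recognition site GAATTC (cut after the first G) is used as the default, and the function name and example sequence are assumptions for this example.

```python
def restriction_fragments(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Return the ordered fragment lengths produced by cutting seq at every site.

    cut_offset is the number of bases into the site at which the cut falls
    (1 for EcoRI, which cuts between G and AATTC).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    # Fragment boundaries: sequence start, each cut position, sequence end.
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Toy sequence containing two EcoRI sites.
fingerprint = restriction_fragments("AAAGAATTCCCCCGAATTCTT")
```

The fragment lengths always sum to the length of the input, and a sequence without any site comes back as a single fragment.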

Lampruna rosea species of insect

Lampruna rosea is a moth of the subfamily Arctiinae. It was described by Schaus in 1894. It is found in Peru, Colombia, Venezuela, Costa Rica, Guatemala, Panama and Mexico.

Duplex sequencing

Duplex sequencing is a library preparation and analysis method for next-generation sequencing (NGS) platforms that employs random tagging of double-stranded DNA to detect mutations with higher accuracy and lower error rates. This method uses degenerate molecular tags in addition to sequencing adapters to recognize reads originating from each strand of DNA. The generated sequencing reads are then analyzed in two steps: assembly of single-strand consensus sequences (SSCSs) and of duplex consensus sequences (DCSs). In theory, duplex sequencing can detect mutations with frequencies as low as 5 × 10⁻⁸, more than 10,000-fold more accurate than conventional next-generation sequencing methods.
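The two consensus steps can be sketched as follows. This is a simplified illustration under several assumptions, not the published pipeline: reads are given as hypothetical `(tag, strand, sequence)` tuples, all reads in a family have the same length, both strands are already expressed in the same orientation, and the 0.6 majority cutoff is arbitrary. Reads sharing a tag and strand are collapsed into an SSCS by per-position majority vote, and a DCS keeps a base only where the two strands' SSCSs agree.

```python
from collections import Counter, defaultdict


def consensus(reads: list, min_fraction: float = 0.6) -> str:
    """Per-position majority vote; 'N' where no base reaches min_fraction."""
    out = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(column) >= min_fraction else "N")
    return "".join(out)


def duplex_consensus(tagged_reads: list, min_fraction: float = 0.6):
    """Return (sscs, dcs) dictionaries built from (tag, strand, sequence) reads."""
    # Group reads into families that share a molecular tag and a strand.
    families = defaultdict(list)
    for tag, strand, seq in tagged_reads:
        families[(tag, strand)].append(seq)

    # Single-strand consensus: one sequence per (tag, strand) family.
    sscs = {key: consensus(seqs, min_fraction) for key, seqs in families.items()}

    # Duplex consensus: keep a base only where both strands agree.
    dcs = {}
    for (tag, strand), top in sscs.items():
        if strand != "+":
            continue
        bottom = sscs.get((tag, "-"))
        if bottom is None:
            continue  # no partner strand observed, so no duplex call
        dcs[tag] = "".join(a if a == b else "N" for a, b in zip(top, bottom))
    return sscs, dcs


# Toy data: T1 has one stray read error, T2 lacks a partner strand,
# T3 carries a change seen on one strand only.
toy_reads = [
    ("T1", "+", "ACGT"), ("T1", "+", "ACGT"), ("T1", "+", "ACTT"),
    ("T1", "-", "ACGT"), ("T1", "-", "ACGT"),
    ("T2", "+", "GGGG"),
    ("T3", "+", "ACTT"), ("T3", "+", "ACTT"),
    ("T3", "-", "ACGT"), ("T3", "-", "ACGT"),
]
sscs, dcs = duplex_consensus(toy_reads)
```

In the toy data, the isolated error in family T1 is voted out at the SSCS step, while the change in T3 that appears on only one strand is masked at the DCS step. This is the intuition behind the lower error rate of the method.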

Braconinae subfamily of insects

The Braconinae are a large subfamily of braconid parasitoid wasps with more than 2,000 described species. Many species, including Bracon brevicornis, have been used in biocontrol programs.

The Cardiochilinae are a subfamily of braconid parasitoid wasps. This subfamily has been treated as a tribe of Microgastrinae in the past. Some species including Toxoneuron nigriceps have been used in biocontrol programs.

Hachimoji DNA Synthetic DNA

Hachimoji DNA is a synthetic nucleic acid analog that uses four synthetic nucleotides in addition to the four present in the natural nucleic acids, DNA and RNA. This leads to four allowed base pairs: two unnatural base pairs formed by the synthetic nucleobases in addition to the two normal pairs. Hachimoji bases have been demonstrated in both DNA and RNA analogs, using deoxyribose and ribose respectively as the backbone sugar.

Microbial DNA barcoding is the use of DNA metabarcoding to characterize a mixture of microorganisms.

DNA barcoding methods for fish are used to identify groups of fish based on DNA sequences within selected regions of a genome. These methods can be used to study fish, as genetic material, in the form of environmental DNA (eDNA) or cells, is freely diffused in the water. This allows researchers to identify which species are present in a body of water by collecting a water sample, extracting DNA from the sample and isolating DNA sequences that are specific for the species of interest. Barcoding methods can also be used for biomonitoring and food safety validation, animal diet assessment, assessment of food webs and species distribution, and for detection of invasive species.

References

  1. Ratnasingham, Sujeevan; Paul D. N. Hebert (2007). "BOLD: The Barcode of Life Data System (http://www.barcodinglife.org)". Molecular Ecology Notes. 7 (3): 355–364. doi:10.1111/j.1471-8286.2007.01678.x. PMC 1890991. PMID 18784790.
  2. Stoeckle, Mark (November–December 2013). "DNA Barcoding Ready for Breakout". GeneWatch. 26 (5).