Comparative Toxicogenomics Database

Comparative Toxicogenomics Database (CTD)
Developer(s): Department of Biological Sciences at North Carolina State University and the Department of Bioinformatics, MDI Biological Laboratory
Initial release: 12 November 2004
Available in: English
Type: Bioinformatics, data analysis
Website: ctdbase.org

The Comparative Toxicogenomics Database (CTD) is a public website and research tool launched in November 2004 that curates scientific data describing relationships between chemicals/drugs, genes/proteins, diseases, taxa, phenotypes, GO annotations, pathways, and interaction modules. The database is maintained by the Department of Biological Sciences at North Carolina State University.

Background

The Comparative Toxicogenomics Database (CTD), launched on November 12, 2004, is a public website and research tool that curates scientific data describing relationships between chemicals, genes/proteins, diseases, taxa, phenotypes, GO annotations, pathways, and interaction modules. [1] [2] [3] [4] The database is maintained by the Department of Biological Sciences at North Carolina State University.

Goals and objectives

One of the primary goals of CTD is to advance understanding of how environmental chemicals affect human health at the genetic level, a field called toxicogenomics.

The etiology of many chronic diseases involves interactions between environmental factors and genes that modulate important physiological processes. Chemicals are an important component of the environment. Conditions such as asthma, cancer, diabetes, hypertension, immunodeficiency, and Parkinson's disease are known to be influenced by the environment; however, the molecular mechanisms underlying these correlations are not well understood. CTD may help resolve these mechanisms. An up-to-date, extensive list of peer-reviewed scientific articles about CTD is available on the CTD publications page. [5]

Core data

CTD is a unique resource where biocurators [6] [7] read the scientific literature and manually curate four types of core data:

Data integration

By integrating the above four data sets, CTD automatically constructs putative chemical-gene-phenotype-disease networks to illuminate molecular mechanisms underlying environmentally-influenced diseases.

These inferred relationships are statistically scored and ranked and can be used by scientists and computational biologists to generate and verify testable hypotheses about toxicogenomic mechanisms and how they relate to human health.
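As a rough illustration of how such a network can be inferred, the sketch below links a chemical to a disease whenever both are curated against the same gene, then ranks each inferred pair by the number of shared genes. The toy data, the function names, and the shared-gene count used for ranking are all assumptions made for illustration; they are not CTD's actual data, code, or scoring statistic.

```python
from collections import defaultdict

# Toy curated facts (hypothetical, for illustration only).
chemical_gene = [
    ("chemical_A", "GENE1"),
    ("chemical_A", "GENE2"),
    ("chemical_B", "GENE2"),
]
gene_disease = [
    ("GENE1", "disease_X"),
    ("GENE2", "disease_X"),
    ("GENE2", "disease_Y"),
]


def infer_chemical_disease(chemical_gene, gene_disease):
    """Return {(chemical, disease): set of genes shared by both curated sets}."""
    diseases_by_gene = defaultdict(set)
    for gene, disease in gene_disease:
        diseases_by_gene[gene].add(disease)

    inferred = defaultdict(set)
    for chemical, gene in chemical_gene:
        # A chemical-disease link is inferred through every gene they share.
        for disease in diseases_by_gene.get(gene, ()):
            inferred[(chemical, disease)].add(gene)
    return dict(inferred)


def rank_inferences(inferred):
    """Rank pairs by number of supporting genes (a naive stand-in for a real statistic)."""
    return sorted(inferred.items(), key=lambda item: (-len(item[1]), item[0]))


inferred = infer_chemical_disease(chemical_gene, gene_disease)
ranked = rank_inferences(inferred)
for (chemical, disease), genes in ranked:
    print(chemical, disease, sorted(genes))
```

In this toy example, the pair supported by two shared genes ranks ahead of pairs supported by one. A real scoring scheme would also need to account for how frequently each gene is annotated overall, which a raw count ignores.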

Users can search CTD to explore scientific data for chemicals, genes, diseases, or interactions between any of these three concepts. CTD integrates toxicogenomic data for vertebrates and invertebrates.

CTD integrates data from or hyperlinks to these databases:

References

  1. Mattingly CJ, Rosenstein MC, Colby GT, Forrest JN, Boyer JL (Sep 2006). "The Comparative Toxicogenomics Database (CTD): A Resource for Comparative Toxicological Studies". Journal of Experimental Zoology Part A: Comparative Experimental Biology. 305 (9): 689–92. doi:10.1002/jez.a.307. PMC 1586110. PMID 16902965.
  2. Mattingly CJ, Rosenstein MC, Davis AP, Colby GT, Forrest JN, Boyer JL (Aug 2006). "The Comparative Toxicogenomics Database (CTD): A Cross-Species Resource for Building Chemical-Gene Interaction Networks". Toxicol. Sci. 92 (2): 587–95. doi:10.1093/toxsci/kfl008. PMC 1586111. PMID 16675512.
  3. Mattingly CJ, Colby GT, Rosenstein MC, Forrest JN, Boyer JL (2004). "Promoting comparative molecular studies in environmental health research: an overview of the comparative toxicogenomics database (CTD)". Pharmacogenomics J. 4 (1): 5–8. doi:10.1038/sj.tpj.6500225. PMID 14735110.
  4. Mattingly CJ, Colby GT, Forrest JN, Boyer JL (May 2003). "The Comparative Toxicogenomics Database (CTD)". Environ. Health Perspect. 111 (6): 793–5. doi:10.1289/ehp.6028. PMC 1241500. PMID 12760826. Archived from the original on 2010-06-06.
  5. CTD Publications page, ctdbase.org
  6. Bourne PE, McEntyre J (Oct 2006). "Biocurators: Contributors to the World of Science". PLOS Comput. Biol. 2 (10): e142. Bibcode:2006PLSCB...2..142B. doi:10.1371/journal.pcbi.0020142. PMC 1626157. PMID 17411327.
  7. Salimi N, Vita R (Oct 2006). "The Biocurator: Connecting and Enhancing Scientific Data". PLOS Comput. Biol. 2 (10): e125. Bibcode:2006PLSCB...2..125S. doi:10.1371/journal.pcbi.0020125. PMC 1626147. PMID 17069454.
  8. ChemIDplus US National Library of Medicine, n.d., retrieved 7 November 2015
  9. diXa Data Warehouse n.d., retrieved 7 November 2015
  10. Hendrickx, D. M.; Aerts, H. J. W. L.; Caiment, F.; Clark, D.; Ebbels, T. M. D.; Evelo, C. T.; Gmuender, H.; Hebels, D. G. A. J.; Herwig, R.; Hescheler, J.; Jennen, D. G. J.; Jetten, M. J. A.; Kanterakis, S.; Keun, H. C.; Matser, V.; Overington, J. P.; Pilicheva, E.; Sarkans, U.; Segura-Lepe, M. P.; Sotiriadou, I.; Wittenberger, T.; Wittwehr, C.; Zanzi, A.; Kleinjans, J. C. S. (12 December 2014). "diXa: a data infrastructure for chemical safety assessment". Bioinformatics. 31 (9): 1505–1507. doi:10.1093/bioinformatics/btu827. PMC   4410652 . PMID   25505093.
  11. Train online, Disease data European Molecular Biology Laboratory, n.d., retrieved 7 November 2015
  12. NCBI Taxonomy