Eukaryotic Pathogen Database

The Eukaryotic Pathogen Database, or EuPathDB, is a database of bioinformatic and experimental data related to a variety of eukaryotic pathogens. It was established in 2006 under a National Institutes of Health program to create Bioinformatics Resource Centers to facilitate research on pathogens that may pose biodefense threats.[citation needed] EuPathDB stores data related to its organisms of interest and provides tools for searching through and analyzing the data. It currently consists of 14 component databases, each dedicated to a certain research topic. EuPathDB includes:[citation needed]

History

EuPathDB was established under the NIH Bioinformatics Resource Centers program as ApiDB, a resource meant to cover apicomplexan parasites. [1] ApiDB originally consisted of the component sites CryptoDB (for Cryptosporidium), PlasmoDB (for Plasmodium), and ToxoDB (for Toxoplasma gondii). [2] As ApiDB grew to cover eukaryotic pathogens beyond the apicomplexans, the name was changed to EuPathDB to reflect its broadened scope. [3] EuPathDB was the result of collaboration between many parasitologists, including David Roos, Jessica Kissinger and Dyann Wirth. [4] [5]

Functions

EuPathDB is an integrated database covering eukaryotic pathogens in several genera, and it provides access to detailed genome information for these pathogens. It was formerly known as ApiDB, the integrated resource for the apicomplexans, which brought together the pathogen databases ToxoDB, PlasmoDB and CryptoDB. [2]

Presently EuPathDB covers 11 databases, the latest addition being that of Piroplasma, which supports Babesia and Theileria. This BRC is one of five centres being funded to provide support to research bodies. EuPathDB supports the investigation of eukaryotic pathogens, and the other four centres support the investigation of other disease pathogens. [6] It has developed a search system that helps researchers query and analyze its data. [3]

Component databases

EuPathDB consists of 14 component databases, each with a particular focus: [7]

References

  1. Greene JM, Collins F, Lefkowitz EJ, Roos D, Scheuermann RH, Sobral B, Stevens R, White O, Di Francesco V (2007). "National Institute of Allergy and Infectious Diseases Bioinformatics Resource Centers: New Assets for Pathogen Informatics". Infection and Immunity. 75 (7): 3212–3219. doi:10.1128/IAI.00105-07. PMC 1932942. PMID 17420237.
  2. Aurrecoechea C, Heiges M, Wang H, Wang Z, Fischer S, Rhodes P, Miller J, Kraemer E, Stoeckert CJ Jr, Roos DS, Kissinger JC (2007). "ApiDB: integrated resources for the apicomplexan bioinformatics resource center". Nucleic Acids Research. 35 (Database issue): D427–30. doi:10.1093/nar/gkl880. PMC 1669770. PMID 17098930.
  3. Aurrecoechea C, Brestelli J, Brunk BP, Fischer S, Gajria B, Gao X, Gingle A, Grant G, Harb OS, Heiges M, Innamorato F, Iodice J, Kissinger JC, Kraemer ET, Li W, Miller JA, Nayak V, Pennington C, Pinney DF, Roos DS, Ross C, Srinivasamoorthy G, Stoeckert CJ Jr, Thibodeau R, Treatman C, Wang H (2010). "EuPathDB: a portal to eukaryotic pathogen databases". Nucleic Acids Research. 38 (Database issue): D415–9. doi:10.1093/nar/gkp941. PMC 2808945. PMID 19914931.
  4. "Parasitologist, Reprogrammed: A Profile of David Roos". The Scientist Magazine. Retrieved 2019-08-02.
  5. "EuPathDB: About All". eupathdb.org. Retrieved 2019-08-02.
  6. Aurrecoechea C, Barreto A, et al. (2013). "EuPathDB: the eukaryotic pathogen database". Nucleic Acids Research. 41 (Database issue): D684–91. doi:10.1093/nar/gks1113. PMC 3531183. PMID 23175615.
  7. "The Eukaryotic Pathogen genome resource". EuPathDB. Retrieved 2013-11-11.