Global microbial identifier

The genomic epidemiological database for global identification of microorganisms, or global microbial identifier, [1] is a platform for storing whole genome sequencing data of microorganisms, identifying relevant genes, and comparing genomes in order to detect, track and trace infectious disease outbreaks and emerging pathogens. [2] The database holds two types of information: 1) genomic information about microorganisms, linked to 2) metadata about those microorganisms, such as epidemiological details. The database covers all types of microorganisms: bacteria, viruses, parasites and fungi. [citation needed]

Technology

To genotype microorganisms for medical diagnosis or other purposes, scientists may use a wide variety of DNA profiling techniques, such as polymerase chain reaction, pulsed-field gel electrophoresis or multilocus sequence typing. A complication of this variety is that results are difficult to standardize between techniques, laboratories and microorganisms. This may be overcome by using the complete DNA code of the genome, as generated by whole genome sequencing. [3] For straightforward diagnostic identification, the whole genome sequencing data of a microbiological sample is fed into a global genomic database and compared, using BLAST procedures, to the genomes already present in the database. [4] In addition, whole genome sequencing data may be used to back-calculate the results of earlier genotyping methods, so that valuable information collected previously is not lost. [5] [6] In the global microbial identifier, the genomic information is coupled to a wide spectrum of metadata about the specific microbial clone. This includes important clinical and epidemiological information, such as the locations where the clone has been found worldwide, treatment options and antimicrobial resistance, making the database a general microbiological identification tool. This makes personalized treatment of microbial disease possible, as well as real-time tracing systems for global surveillance of infectious diseases in support of food safety and human health. [citation needed]
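The two ideas in this section, identifying a sample by comparing its genome to database entries that carry metadata, and back-calculating a multilocus sequence type from a whole genome, can be illustrated with a minimal Python sketch. It is not the platform's actual implementation. The comparison uses a simple k-mer Jaccard similarity as a stand-in for BLAST, and the allele lookup uses exact forward-strand matching only. The names `GenomeRecord`, `identify` and `mlst_profile`, the toy sequences, the loci and all metadata values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GenomeRecord:
    """One hypothetical database entry: a genome linked to its metadata."""
    record_id: str
    sequence: str
    metadata: dict = field(default_factory=dict)


def kmers(sequence: str, k: int = 8) -> set:
    """Return the set of all overlapping k-mers in a sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def identify(query: str, database: list, k: int = 8):
    """Return (best_record, similarity) for the closest database entry.

    Stand-in for a BLAST search: ranks entries by shared k-mer content.
    """
    query_kmers = kmers(query, k)
    scored = [(jaccard(query_kmers, kmers(rec.sequence, k)), rec)
              for rec in database]
    best_score, best_record = max(scored, key=lambda pair: pair[0])
    return best_record, best_score


def mlst_profile(genome: str, alleles: dict) -> dict:
    """Back-calculate an MLST-style profile from a genome sequence.

    alleles maps locus -> {allele_number: allele_sequence}. Only exact
    forward-strand matches are found; a locus with no match maps to None.
    """
    genome = genome.upper()
    return {
        locus: next((number for number, seq in variants.items()
                     if seq.upper() in genome), None)
        for locus, variants in alleles.items()
    }


# Toy data: sequences and metadata values are invented for illustration.
DATABASE = [
    GenomeRecord("A", "AAAACCCCGGGGTTTTACGTACGTAAGGCCTT",
                 {"found_in": ["Denmark"], "resistance": ["ampicillin"]}),
    GenomeRecord("B", "TTGACTGACTGATTGCAGTCAGTCCATGCATG",
                 {"found_in": ["Belgium"], "resistance": []}),
]
ALLELES = {
    "locusX": {1: "CCCCGGGG", 2: "CCCCGGTG"},
    "locusY": {1: "ACGTACGTAAGG", 2: "ACGTACTTAAGG"},
}

# A sample that differs from record A by a single base near its end.
sample = "AAAACCCCGGGGTTTTACGTACGTAAGGCCGT"
match, similarity = identify(sample, DATABASE)
print(match.record_id, round(similarity, 2), match.metadata)
print(mlst_profile(sample, ALLELES))
```

Once the closest entry is found, its metadata (here, where it was found and its resistance profile) becomes available for the new sample, which is the coupling of genomic data and metadata described above. A production system would use alignment-based search, handle both strands and sequencing errors, and draw allele definitions from a curated typing scheme.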

The initiative

The initiative to build the database arose in 2011, when several preconditions had been met: 1) whole genome sequencing had become a mature and serious alternative to other genotyping techniques, [7] [8] 2) the price of whole genome sequencing had started to fall dramatically, in some cases below the price of traditional identification methods, 3) vast IT resources and fast Internet connections had become available, and 4) the idea had taken hold that infectious diseases may be better controlled through a cross-sectoral, One Health approach. [9] [10]

Around the turn of the millennium, many microbiological laboratories and national health institutes started genome sequencing projects to sequence the collections of infectious agents held in their biobanks, [11] [12] thereby generating private databases and sending model genomes to global nucleotide databases such as GenBank of the National Center for Biotechnology Information [13] or the nucleotide database of the EMBL. [14] This created a wealth of genomic information and independent databases for eukaryotic as well as prokaryotic genomes. [15] [16] [17] The scientific community generally recognized the need to integrate these databases further, to harmonize data collection, and to link the genomic data to metadata for optimal prevention of infectious diseases. [18] In 2011, several infectious disease control centers and other organizations initiated a series of international scientific and policy meetings to develop a common platform and to better understand the potential of an interactive microbiological genomic database. The first meeting was held in Brussels in September 2011, [19] [20] followed by meetings in Washington (March 2012) and Copenhagen [21] (February 2013). In addition to experts from around the globe, intergovernmental organizations have been included in the action, notably the World Health Organization and the World Organization for Animal Health. [citation needed]

Development plan

A detailed roadmap [22] for the development of the database was set up with the following general timeline:

2010–2012: Development of pilot systems. [4]
2011–2013: International structural start-up, with the formation of an international core group, analysis of the present and future landscape to build the database, and diplomacy efforts to bring the relevant groups together.
2012–2016: Development of a robust IT backbone for the database, and development of novel genome analysis algorithms and software.
2017–2020: Construction of a global solution, including the creation of networks and regional hubs.

Steering committee

Current members:

Former members:

References

  1. "Global Microbial Identifier". Archived from the original on 2013-04-15. Retrieved 2012-12-23.
  2. Schlundt, J (2011). "The time is right for a global genomic database for microorganisms" (PDF). Health Diplomacy Monitor. 3 (2): 2–3. Archived from the original (PDF) on 2016-03-04. Retrieved 2012-12-23.
  3. Shendure, J (2008). "Next-generation DNA sequencing". Nature Biotechnology. 26 (10): 1135–1145. doi:10.1038/nbt1486. PMID 18846087. S2CID 6384349.
  4. "Center for Genomic Epidemiology". www.genomicepidemiology.org.
  5. Inouye, M; et al. (2012). "Short read sequence typing (SRST): multi-locus sequence types from short reads". BMC Genomics. 13: 338. doi:10.1186/1471-2164-13-338. PMC 3460743. PMID 22827703.
  6. Larsen, MV; et al. (2012). "Multilocus sequence typing of total-genome-sequenced bacteria". Journal of Clinical Microbiology. 50 (4): 1355–1366. doi:10.1128/JCM.06094-11. PMC 3318499. PMID 22238442.
  7. Zankari, E; et al. (2013). "Genotyping using whole-genome sequencing is a realistic alternative to surveillance based on phenotypic antimicrobial susceptibility testing". Journal of Antimicrobial Chemotherapy. 68 (4): 771–777. doi:10.1093/jac/dks496. PMID 23233485.
  8. Dunne, WM; et al. (2012). "Next-generation and whole-genome sequencing in the diagnostic clinical microbiology laboratory". European Journal of Clinical Microbiology and Infectious Diseases. 31 (8): 1719–1726. doi:10.1007/s10096-012-1641-7. PMID 22678348. S2CID 11511739.
  9. Current Topics in Microbiology and Immunology, Vol 366 (2013). Mackenzie, J.S.; Jeggo, M.; Daszak, P.; Richt, J (eds.). One Health: The Human-Animal-Environment Interfaces in Emerging Infectious Diseases. Springer. p. 280. ISBN 978-3-540-70961-9.
  10. Wielinga, PR; Schlundt, J (2013). Food Safety: At the Center of a One Health Approach for Combating Zoonoses. Current Topics in Microbiology and Immunology. Vol. 366. pp. 3–17. doi:10.1007/82_2012_238. ISBN 978-3-642-35845-6. PMC 7121890. PMID 22763857.
  11. A summary of genomic databases. "Bacterial genome databases".
  12. WGS projects info by EBI. "WGS projects".
  13. Genome Browser NCBI. "Genome information by organism".
  14. Genome Browser EMBL. "Access to Completed Genomes".
  15. Microbial Genomes Database. "MBGD".
  16. "2Can Support Portal < EMBL-EBI". www.ebi.ac.uk.
  17. DOE's Joint Genome Institute Integrated Microbial Genomes (IMG). "IMG DOEs JGI".
  18. Aarestrup, F; et al. (2012). "Integrating Genome-based Informatics to Modernize Global Disease Monitoring, Information Sharing, and Response". Emerging Infectious Diseases. 18 (11): e1. doi:10.3201/eid/1811.120453. PMC 3559169. PMID 23092707.
  19. Kupferschmidt, K (2011). "Epidemiology. Outbreak detectives embrace the genome era". Science. 333 (6051): 1818–1819. doi:10.1126/science.333.6051.1818. PMID 21960605.
  20. "Consensus report of an expert meeting 1-2 September 2011, Brussels, Belgium" (PDF).
  21. "News & Events - Global Microbial Identifier". www.globalmicrobialidentifier.org.
  22. "GMI development plan".