Mouse Genome Informatics

Mouse Genome Informatics (MGI) is a free, online database and bioinformatics resource hosted by The Jackson Laboratory, with funding from the National Human Genome Research Institute (NHGRI), the National Cancer Institute (NCI), and the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD). [1] MGI provides access to data on the genetics, genomics and biology of the laboratory mouse to facilitate the study of human health and disease. [2] [3] The database integrates multiple projects, with the two largest contributions coming from the Mouse Genome Database (MGD) and the Mouse Gene Expression Database (GXD). [4] As of 2018, MGI contained data curated from over 230,000 publications. [5]

The MGI resource was first published online in 1994 [5] and is a collection of data, tools, and analyses created and tailored for research on the laboratory mouse, a widely used model organism. It is "the authoritative source of official names for mouse genes, alleles, and strains", which follow the guidelines established by the International Committee on Standardized Genetic Nomenclature for Mice. [6] The resource draws on the Jackson Laboratory's long history of mouse research and production, and a worldwide community of mouse researchers also contributes to it. It serves researchers who use the mouse as a model organism, as well as those interested in genes that share homology with mouse genes. Various mouse research support resources, including animal collections and free colony management software, are also available at the MGI site. [7]

Mouse Genome Database

The Mouse Genome Database collects and curates comprehensive phenotype and functional annotations for mouse genes and alleles. [8] This is an NHGRI-funded project which contributes to the Mouse Genome Informatics database.

Mouse Gene Expression Database

The Gene Expression Database is a community resource of mouse developmental expression information. [9]

History

The Mouse Genome Informatics homepage as it appeared in 1994

MGI evolved from a project funded by the National Center for Human Genome Research in 1989 to combine the databases of several Jackson Laboratory scientists and create a tool for visualizing data on the mouse genome. [10] The result of that project, led by Joseph H. Nadeau, Larry E. Mobraaten, and Janan T. Eppig, was called the "Encyclopedia of the Mouse Genome" and distributed via floppy disk semi-annually to around 300 scientists around the world. [10] In 1992, that group joined with the team responsible for developing the "Genomic Database for Mouse", led by Muriel T. Davisson and Thomas H. Roderick, to start the "Mouse Genome Informatics" project. [10] That project resulted in the first online release of the "Mouse Genome Database" in 1994. [10]



References

  1. "MGI-MGI Current Funding Support". jax.org. Retrieved 2 June 2015.
  2. Shaw D (May 2004). "Searching the Mouse Genome Informatics (MGI) resources for information on mouse biology from genotype to phenotype". Current Protocols in Bioinformatics. Chapter 1: Unit 1.7. doi:10.1002/0471250953.bi0107s05. ISBN 0471250953. PMC 5147750. PMID 18428715.
  3. Qi D, Blake JA, Kadin JA, Richardson JE, Ringwald M, Eppig JT, Bult CJ (8–11 August 2005). Data integration in the mouse genome informatics (MGI) database. Computational Systems Bioinformatics Conference, 2005. Workshops and Poster Abstracts. IEEE. pp. 37–8. doi:10.1109/CSBW.2005.48. ISBN 0-7695-2442-7.
  4. "MGI-About the Mouse Genome Informatics database resource". jax.org. Retrieved 2 June 2015.
  5. Law, M; Shaw, DR (2018). Mouse Genome Informatics (MGI) Is the International Resource for Information on the Laboratory Mouse. Methods in Molecular Biology. Vol. 1757. pp. 141–161. doi:10.1007/978-1-4939-7737-6_7. ISBN 978-1-4939-7736-9. PMID 29761459.
  6. Mouse Nomenclature Home Page, retrieved 28 August 2016.
  7. Ivica Letunic. "OpenHelix: Mouse Genome Informatics (MGI)". openhelix.com. Archived from the original on 26 December 2017. Retrieved 2 June 2015.
  8. Blake, Judith A.; Bult, Carol J.; Kadin, James A.; Richardson, Joel E.; Eppig, Janan T.; The Mouse Genome Database Group (Jan 2011). "The Mouse Genome Database (MGD): premier model organism resource for mammalian genomics and genetics". Nucleic Acids Res. England. 39 (Database issue): D842-8. doi:10.1093/nar/gkq1008. PMC 3013640. PMID 21051359.
  9. Finger, Jacqueline H; Smith Constance M; Hayamizu Terry F; McCright Ingeborg J; Eppig Janan T; Kadin James A; Richardson Joel E; Ringwald Martin (Jan 2011). "The mouse Gene Expression Database (GXD): 2011 update". Nucleic Acids Res. England. 39 (Database issue): D835-41. doi:10.1093/nar/gkq1132. PMC 3013713. PMID 21062809.
  10. Eppig JT, Richardson JE, Kadin JA, Ringwald M, Blake JA, Bult CJ (August 2015). "Mouse Genome Informatics (MGI): reflecting on 25 years". Mamm Genome. 26 (7–8): 272–84. doi:10.1007/s00335-015-9589-4. PMC 4534491. PMID 26238262.