GenomeSpace

GenomeSpace
Developer(s): Broad Institute
Stable release: beta 5.0 / April 2012
Operating system: Cross-platform
Platform: Web browser
Available in: English
License: LGPL 2.1
Website: www.genomespace.org

GenomeSpace is an environment for genomics software tools and applications. It helps users manage analysis workflows that involve multiple diverse tools, including web applications and desktop tools, and it facilitates the transfer of data between tools via automatic format conversion. Analyses can use data from local or cloud-based stores.
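
The idea of automatic format conversion between tools can be illustrated with a minimal Python sketch. It is not GenomeSpace's implementation: the format names (`tsv`, `csv`, `ssv`), the `CONVERTERS` registry, and both functions are hypothetical, chosen only to show how a chain of converters might be found and applied when no direct conversion exists.

```python
from collections import deque

# Hypothetical registry: (source format, target format) -> converter function.
# These toy converters only swap delimiters; real genomics formats need more work.
CONVERTERS = {
    ("tsv", "csv"): lambda text: text.replace("\t", ","),
    ("csv", "ssv"): lambda text: text.replace(",", ";"),
}


def find_conversion_path(source, target, converters):
    """Breadth-first search for the shortest chain of formats from source to target."""
    if source == target:
        return [source]
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        for (src, dst) in converters:
            if src == path[-1] and dst not in seen:
                new_path = path + [dst]
                if dst == target:
                    return new_path
                seen.add(dst)
                queue.append(new_path)
    return None  # no chain of converters connects the two formats


def convert(data, source, target, converters):
    """Apply each converter along the discovered path in turn."""
    path = find_conversion_path(source, target, converters)
    if path is None:
        raise ValueError(f"no conversion from {source} to {target}")
    for src, dst in zip(path, path[1:]):
        data = converters[(src, dst)](data)
    return data
```

In this sketch, a tool that emits `tsv` can feed a tool that expects `ssv` even though only `tsv`→`csv` and `csv`→`ssv` converters are registered, because `convert` chains the two steps.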


GenomeSpace consists of a web-based user interface (UI) for users, and both a representational state transfer (RESTful) application programming interface (API) and a Java-based client development kit (CDK) for developers integrating their applications with GenomeSpace.

GenomeSpace tools

GenomeSpace is linked with several tools and data sources for genomics analysis: Cytoscape [1], Galaxy [2][3], GenePattern [4], Genomica [5], geWorkbench [6], InSilico DB [7], the Integrative Genomics Viewer (IGV) [8], and the UCSC Genome Browser [9]. These programs provide a wide variety of genomic analyses, including network analysis and visualization, sequence analysis, whole-genome analysis, general statistical methods, gene expression analysis, proteomics, flow cytometry, and next-generation sequence analysis, as well as access to genomic datasets. Developers of other genomics software can use the GenomeSpace API to add their tools.

Collaborators

The GenomeSpace project is a collaboration of the Mesirov and Regev laboratories at the Broad Institute; the Chang laboratory at Stanford University; the Ideker laboratory at the University of California, San Diego; the Nekrutenko laboratory at Pennsylvania State University; the Segal laboratory at the Weizmann Institute of Science; and the Haussler and Kent laboratories at the University of California, Santa Cruz. GenomeSpace is funded by the National Human Genome Research Institute of the National Institutes of Health.

Related Research Articles

In bioinformatics, sequence analysis is the process of subjecting a DNA, RNA or peptide sequence to any of a wide range of analytical methods to understand its features, function, structure, or evolution. It can be performed on the entire genome, transcriptome or proteome of an organism, and can also involve only selected segments or regions, like tandem repeats and transposable elements. Methodologies used include sequence alignment, searches against biological databases, and others.

Comparative genomics: Field of biological research

Comparative genomics is a branch of biological research that examines genome sequences across a spectrum of species, spanning from humans and mice to a diverse array of organisms from bacteria to chimpanzees. This large-scale holistic approach compares two or more genomes to discover the similarities and differences between the genomes and to study the biology of the individual genomes. Comparison of whole genome sequences provides a highly detailed view of how organisms are related to each other at the gene level. By comparing whole genome sequences, researchers gain insights into genetic relationships between organisms and study evolutionary changes. The major principle of comparative genomics is that common features of two organisms will often be encoded within the DNA that is evolutionarily conserved between them. Comparative genomics therefore provides a powerful tool for studying evolutionary changes among organisms, helping to identify genes that are conserved or common among species, as well as genes that give each organism its unique characteristics. Moreover, these studies can be performed at different levels of the genomes to obtain multiple perspectives about the organisms.

Ensembl genome database project: Scientific project at the European Bioinformatics Institute

The Ensembl genome database project is a scientific project at the European Bioinformatics Institute that provides a centralized resource for geneticists, molecular biologists and other researchers studying the genomes of humans, other vertebrates and model organisms. Ensembl is one of several well-known genome browsers for the retrieval of genomic information.

ENCODE: Research consortium investigating functional elements in human and model organism DNA

The Encyclopedia of DNA Elements (ENCODE) is a public research project which aims "to build a comprehensive parts list of functional elements in the human genome."

The Rat Genome Database (RGD) is a database of rat genomics, genetics, physiology and functional data, as well as data for comparative genomics between rat, human and mouse. RGD is responsible for attaching biological information to the rat genome via structured vocabulary, or ontology, annotations assigned to genes and quantitative trait loci (QTL), and for consolidating rat strain data and making it available to the research community. RGD is also developing a suite of tools for mining and analyzing genomic, physiologic and functional data for the rat, and comparative data for rat, mouse, human, and five other species.

Metabolic network modelling: Form of biological modelling

Metabolic network modelling, also known as metabolic network reconstruction or metabolic pathway analysis, allows for an in-depth insight into the molecular mechanisms of a particular organism. In particular, these models correlate the genome with molecular physiology. A reconstruction breaks down metabolic pathways into their respective reactions and enzymes, and analyzes them within the perspective of the entire network. In simplified terms, a reconstruction collects all of the relevant metabolic information of an organism and compiles it in a mathematical model. Validation and analysis of reconstructions can allow identification of key features of metabolism such as growth yield, resource distribution, network robustness, and gene essentiality. This knowledge can then be applied to create novel biotechnology.

The completion of human genome sequencing in the early 2000s was a turning point in genomics research. Scientists have since conducted a series of studies into the activities of genes and the genome as a whole. The human genome contains around 3 billion nucleotide base pairs, and the huge quantity of data created necessitates accessible tools to explore and interpret this information in order to investigate the genetic basis of disease, evolution, and biological processes. The field of genomics has continued to grow, with new sequencing technologies and computational tools making it easier to study the genome.

GenePattern is a freely available computational biology open-source software package originally created and developed at the Broad Institute for the analysis of genomic data. Designed to enable researchers to develop, capture, and reproduce genomic analysis methodologies, GenePattern was first released in 2004. GenePattern is currently developed at the University of California, San Diego.

Galaxy (computational biology)

Galaxy is a scientific workflow, data integration, and data and analysis persistence and publishing platform that aims to make computational biology accessible to research scientists who do not have computer programming or systems administration experience. Although it was initially developed for genomics research, it is largely domain agnostic and is now used as a general bioinformatics workflow management system.

MicrobesOnline

MicrobesOnline is a publicly and freely accessible website that hosts multiple comparative genomic tools for comparing microbial species at the genomic, transcriptomic and functional levels. MicrobesOnline was developed by the Virtual Institute for Microbial Stress and Survival, which is based at the Lawrence Berkeley National Laboratory in Berkeley, California. The site was launched in 2005, with regular updates until 2011.

The UCSC Genome Browser is an online and downloadable genome browser hosted by the University of California, Santa Cruz (UCSC). It is an interactive website offering access to genome sequence data from a variety of vertebrate and invertebrate species and major model organisms, integrated with a large collection of aligned annotations. The Browser is a graphical viewer optimized to support fast interactive performance and is an open-source, web-based tool suite built on top of a MySQL database for rapid visualization, examination, and querying of the data at many levels. The Genome Browser Database, browsing tools, downloadable data files, and documentation can all be found on the UCSC Genome Bioinformatics website.

The Genomic HyperBrowser is a web-based system for statistical analysis of genomic annotation data.

FAIRE-Seq is a method in molecular biology used for determining the sequences of DNA regions in the genome associated with regulatory activity. The technique was developed in the laboratory of Jason D. Lieb at the University of North Carolina, Chapel Hill. In contrast to DNase-Seq, the FAIRE-Seq protocol does not require the permeabilization of cells or isolation of nuclei, and can analyse any cell type. In a study of seven diverse human cell types, DNase-Seq and FAIRE-Seq produced strong cross-validation, with each cell type having 1-2% of the human genome as open chromatin.

BioMart

BioMart is a community-driven project to provide a single point of access to distributed research data. The BioMart project contributes open source software and data services to the international scientific community. Although the BioMart software is primarily used by the biomedical research community, it is designed in such a way that any type of data can be incorporated into the BioMart framework. The BioMart project originated at the European Bioinformatics Institute as a data management solution for the Human Genome Project. Since then, BioMart has grown to become a multi-institute collaboration involving various database projects on five continents.

In silico PCR: Computational tools

In silico PCR refers to computational tools used to calculate theoretical polymerase chain reaction (PCR) results using a given set of primers (probes) to amplify DNA sequences from a sequenced genome or transcriptome.
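
A minimal Python sketch of the idea follows. It is a simplified illustration under stated assumptions, not the algorithm of any particular in silico PCR tool: the function names and the `max_len` parameter are hypothetical, primers must match exactly, only one template strand orientation is searched, and melting temperature and mismatches are ignored.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def in_silico_pcr(template, forward, reverse, max_len=3000):
    """Return predicted amplicons for an exact-match primer pair.

    The forward primer is searched on the given strand. The reverse primer
    anneals to that strand, so its binding site appears in the template as
    the reverse complement of the primer sequence.
    """
    template = template.upper()
    forward = forward.upper()
    reverse_site = reverse_complement(reverse)
    products = []
    start = template.find(forward)
    while start != -1:
        # Look for reverse-primer sites downstream of this forward-primer hit.
        end = template.find(reverse_site, start + len(forward))
        while end != -1:
            stop = end + len(reverse_site)
            if stop - start <= max_len:
                products.append(template[start:stop])
            end = template.find(reverse_site, end + 1)
        start = template.find(forward, start + 1)
    return products


# Toy example with made-up sequences.
amplicons = in_silico_pcr("TTATGCAAAAACGGTT", forward="ATGC", reverse="CCGT")
```

Here the reverse primer `CCGT` corresponds to the site `ACGG` on the template strand, so the single predicted product runs from the start of `ATGC` to the end of `ACGG`.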

A bioinformatics workflow management system is a specialized form of workflow management system designed specifically to compose and execute a series of computational or data manipulation steps, or a workflow, that relate to bioinformatics.
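
As a rough illustration of composing and executing steps, the Python sketch below runs a small workflow in dependency order using the standard-library `graphlib` module (Python 3.9 or later). The `run_workflow` helper, the step names, and the toy sequences are hypothetical and are not taken from any specific workflow system.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, dependencies):
    """Run each step after the steps it depends on.

    steps: step name -> callable taking the dict of results so far.
    dependencies: step name -> set of step names that must run first.
    """
    results = {}
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        results[name] = steps[name](results)
        order.append(name)
    return results, order


# Toy workflow: load sequences, compute GC fraction, keep GC-rich sequences.
steps = {
    "load": lambda r: ["ACGT", "GGCC", "ATAT"],
    "gc": lambda r: [sum(b in "GC" for b in s) / len(s) for s in r["load"]],
    "filter": lambda r: [s for s, g in zip(r["load"], r["gc"]) if g >= 0.5],
}
dependencies = {"load": set(), "gc": {"load"}, "filter": {"load", "gc"}}

results, order = run_workflow(steps, dependencies)
```

Declaring dependencies separately from the step functions lets the runner decide the execution order, which is the core idea a full workflow system builds on with scheduling, provenance tracking, and error recovery.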

geWorkbench: Genomic data analysis software

geWorkbench is an open-source software platform for integrated genomic data analysis. It is a desktop application written in the programming language Java. geWorkbench uses a component architecture. As of 2016, there are more than 70 plug-ins available, providing for the visualization and analysis of gene expression, sequence, and structure data.

MG-RAST (Metagenomic Rapid Annotations using Subsystems Technology) is an open-source web application server that facilitates automatic phylogenetic and functional analysis of metagenomes. It is one of the largest repositories for metagenomic data. The platform uses a pipeline that automatically assigns functions to metagenomic sequences by comparing sequences at both the nucleotide and amino acid levels. Users obtain phylogenetic and functional summaries of the analyzed metagenomes, along with tools for comparing different datasets. MG-RAST also offers a RESTful API for programmatic access.

DisGeNET is a discovery platform designed to address a variety of questions concerning the genetic underpinning of human diseases. DisGeNET is one of the largest and most comprehensive repositories of human gene-disease associations (GDAs) currently available. It also offers a set of bioinformatic tools to facilitate the analysis of these data by different user profiles. It is maintained by the Integrative Biomedical Informatics (IBI) Group of the (GRIB)-IMIM/UPF, based at the Barcelona Biomedical Research Park (PRBB), Barcelona, Spain.

References

  1. "Home". cytoscape.org.
  2. Goecks, J.; Nekrutenko, A.; Taylor, J.; Galaxy Team, T. (2010). "Galaxy: A comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences". Genome Biology. 11 (8): R86. doi: 10.1186/gb-2010-11-8-r86. PMC 2945788. PMID 20738864.
  3. Blankenberg, D.; Kuster, G. V.; Coraor, N.; Ananda, G.; Lazarus, R.; Mangan, M.; Nekrutenko, A.; Taylor, J. (2010). Frederick M. Ausubel (ed.). Galaxy: A Web-Based Genome Analysis Tool for Experimentalists. Vol. Chapter 19. pp. Unit 19.10.1–21. doi: 10.1002/0471142727.mb1910s89. ISBN 978-0471142720. PMC 4264107. PMID 20069535.
  4. "Home". genepattern.org.
  5. "Segal Lab: Genomica". Archived from the original on 2012-02-02. Retrieved 2012-04-24.
  6. "Home". geworkbench.org. Archived from the original on 2008-08-27. Retrieved 2022-08-07.
  7. Coletta, A.; Molter, C.; Duqué, R.; Steenhoff, D.; Taminau, J.; De Schaetzen, V.; Meganck, S.; Lazar, C.; Venet, D.; Detours, V.; Nowé, A.; Bersini, H.; Weiss Solís, D. Y. (2012). "InSilico DB genomic datasets hub: An efficient starting point for analyzing genome-wide studies in GenePattern, Integrative Genomics Viewer, and R/Bioconductor". Genome Biology. 13 (11): R104. doi: 10.1186/gb-2012-13-11-r104. PMC 3580496. PMID 23158523.
  8. "Home | Integrative Genomics Viewer".
  9. http://hgwdev-gs.cse.ucsc.edu/cgi-bin/hgTables
