List of databases for oncogenomic research

Databases for oncogenomic research are biological databases dedicated to cancer data and oncogenomic research. They can be a primary source of cancer data, offer a certain level of analysis (processed data) or even offer online data mining.

List

The table below gives an overview of databases that specifically serve oncogenomic research. The list is not comprehensive and excludes databases with a generic focus. Further databases containing cancer data can be found in the List of biological databases or under Microarray databases.

| Database | Institute / Organization | Alteration Types | Primary Source [t 1] | Processed Data [t 2] | Organisms | Cell lines [t 3] | Public Data [t 4] | Restricted Data [t 5] |
|---|---|---|---|---|---|---|---|---|
| BioExpress® Oncology Suite: contains gene expression data from primary, metastatic, and benign tumor samples, and normal samples, including matched adjacent controls | Ocimum Bio Solutions, United States | Gene Expression | Yes | Yes | Human, Rat and Mouse | Yes | No | Yes |
| ClinicalTrials.gov: contains descriptions and some results from clinical trials, many of which are genomically targeted | National Institutes of Health, United States | Various | Yes | Yes | Human | No | Yes | No |
| Project Data Sphere: allows researchers to share, integrate, and analyze de-identified patient-level, comparator arm, phase III cancer data | The CEO Life Sciences Consortium, United States | Various | No | Yes | Human | No | Yes | Yes |
| Catalogue Of Somatic Mutations In Cancer (COSMIC) | Wellcome Trust Sanger Institute, UK | Mutation | No | Yes | Human | Yes | Yes | Yes |
| cBio Cancer Genomics Portal | Memorial Sloan-Kettering Cancer Center, United States | Copy number, Mutation, Methylation, Gene Expression, miRNA Expression, Protein, Phosphorylation | No | Yes | Human | No | Yes | No |
| International Cancer Genome Consortium | Worldwide | Mutation | Yes | Yes | Human | No | Yes | Yes |
| Integrative Oncogenomics Cancer Browser (IntOGen) | Universitat Pompeu Fabra, Spain | Copy number, Mutation, Gene Expression | No | Yes | Human | No | Yes | No |
| Mouse Retrovirus Tagged Cancer Gene Database | Institute of Molecular and Cell Biology, Singapore | Mutations | Yes | Yes | Mouse | No | Yes | No |
| Mouse Tumor Biology Database [note 1] | The Jackson Laboratory, United States | Copy number, Mutation, Methylation, Gene Expression | No | No | Mouse | No | No | No |
| OncoDB.HCC | Academia Sinica, Taiwan | Copy number, Gene Expression, QTL | No | Yes | Human, Mouse, Rat | No | Yes | No |
| Genevestigator: contains data from numerous public repositories, including GEO, and from renowned cancer research projects such as TCGA | Nebion AG, Switzerland | Gene Expression | No | Yes | Human, Mouse, Rat, Monkey, Dog and others | Yes | Yes | Yes |
| OncoLand: contains data from large-scale genomic projects, including TCGA, ICGC and others | Omicsoft Corporation, United States | Copy number, Mutation, Methylation, Gene Expression, miRNA Expression, Protein, Phosphorylation | Yes | Yes | Human, Rat and Mouse | Yes | Yes | Yes |
| Oncomine | Compendia Bioscience, Inc., United States | Gene Expression | No | Yes | Human | Yes | No | Yes |
| Oncoreveal | Boğaziçi University, Turkey | Gene Expression | No | Yes | Human | No | Yes | No |
| Progenetix | Universität Zürich, Switzerland | Copy number | No | Yes | Human | Yes | Yes | No |
| The Cancer Genome Atlas | National Cancer Institute, United States | Copy number, Mutation, Methylation, Gene Expression, miRNA expression | Yes | Yes | Human | No | Yes | Yes |
| CancerResource | University Medicine Berlin, Germany | | | | | | | |
| Roche Cancer Genome Database (RCGDB) | Roche Diagnostics, Penzberg, Germany | | | | | | | |
| Network of Cancer Genes | King's College London, UK | Mutation | No | Yes | Human | No | Yes | No |
| MutaGene | NCBI, NIH, USA | Mutation | No | Yes | Human | No | Yes | No |
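
The table's columns can be read as a simple record structure, which makes it easy to narrow the list by alteration type, organism, or data availability. The following is a minimal sketch in Python, not part of any of the listed databases: the `DatabaseEntry` class, its field names, and the `select` helper are hypothetical names chosen for illustration, and only a handful of rows are transcribed from the table above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DatabaseEntry:
    """One row of the overview table (field names are illustrative)."""
    name: str
    alteration_types: Tuple[str, ...]
    primary_source: bool    # column "Primary Source"
    processed_data: bool    # column "Processed Data"
    organisms: Tuple[str, ...]
    cell_lines: bool        # column "Cell lines"
    public_data: bool       # column "Public Data"
    restricted_data: bool   # column "Restricted Data"


# A few rows transcribed from the table; this is not the full list.
ENTRIES = [
    DatabaseEntry("COSMIC", ("Mutation",),
                  False, True, ("Human",), True, True, True),
    DatabaseEntry("International Cancer Genome Consortium", ("Mutation",),
                  True, True, ("Human",), False, True, True),
    DatabaseEntry("OncoDB.HCC", ("Copy number", "Gene Expression", "QTL"),
                  False, True, ("Human", "Mouse", "Rat"), False, True, False),
    DatabaseEntry("Oncomine", ("Gene Expression",),
                  False, True, ("Human",), True, False, True),
    DatabaseEntry("Progenetix", ("Copy number",),
                  False, True, ("Human",), True, True, False),
]


def select(entries: List[DatabaseEntry],
           alteration: Optional[str] = None,
           organism: Optional[str] = None,
           public_only: bool = False) -> List[str]:
    """Return the names of entries matching every criterion that is given."""
    names = []
    for entry in entries:
        if alteration is not None and alteration not in entry.alteration_types:
            continue
        if organism is not None and organism not in entry.organisms:
            continue
        if public_only and not entry.public_data:
            continue
        names.append(entry.name)
    return names


if __name__ == "__main__":
    # Databases in the sample that cover mutations and offer public data.
    print(select(ENTRIES, alteration="Mutation", public_only=True))
```

Matching here is by exact string, so an entry labelled "Mutations" in the table would need to be normalized before it matches a query for "Mutation".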

Notes

  1. Only contains references to biological data

References

  1. The database is the publication site for (some of) its cancer raw data
  2. The database contains cancer data at a certain level of analysis (non-raw data)
  3. The database also contains cell line data
  4. The database contains cancer data that is available for everyone
  5. The database contains cancer data that is only available under some restriction