Developer(s) | U.S. Food and Drug Administration |
---|---|
Stable release | 3.5.0 / March 3, 2010 |
Operating system | Linux, Mac OS X, Windows |
Platform | Java |
Type | Bioinformatics data management, analysis, and interpretation |
Website |
ArrayTrack is a multi-purpose bioinformatics tool primarily used for microarray data management, analysis, and interpretation. ArrayTrack was developed to support in-house filter array research for the U.S. Food and Drug Administration in 2001, and was made freely available to the public as an integrated research tool for microarrays in 2003. [1] Since then, ArrayTrack has averaged about 5,000 users per year. It is regularly updated by the National Center for Toxicological Research.
ArrayTrack is composed of three major components: Study Database, Tools, and Libraries, which primarily handle data management, analysis, and interpretation, respectively. Each of these components can be directly accessed from the other two, e.g., analysis Tools can be used directly on experimental data stored in the Study Database, and significant genes discovered from the results can be queried in the Libraries to view additional annotations and associated proteins, pathways, Gene Ontology terms, etc. [2]
ArrayTrack is directly integrated with a variety of other bioinformatics software, such as pathway analysis tools GeneGo MetaCore and Ingenuity Pathway Analysis. [3] [4]
ArrayTrack is freely available to the public and can be accessed online. It is run on the client's computer using a Java-based interface that connects to an Oracle database hosted by the FDA. As a Java-based application, ArrayTrack is compatible with Windows, Mac, and Linux machines.
Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, chemistry, physics, computer science, computer programming, information engineering, mathematics and statistics to analyze and interpret biological data. The subsequent process of analyzing and interpreting data is referred to as computational biology.
A DNA microarray is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome. Each DNA spot contains picomoles of a specific DNA sequence, known as probes. These can be a short section of a gene or other DNA element that are used to hybridize a cDNA or cRNA sample under high-stringency conditions. Probe-target hybridization is usually detected and quantified by detection of fluorophore-, silver-, or chemiluminescence-labeled targets to determine relative abundance of nucleic acid sequences in the target. The original nucleic acid arrays were macro arrays approximately 9 cm × 12 cm and the first computerized image based analysis was published in 1981. It was invented by Patrick O. Brown. An example of its application is in SNPs arrays for polymorphisms in cardiovascular diseases, cancer, pathogens and GWAS analysis. It is also used for the identification of structural variations and the measurement of gene expression.
The Rat Genome Database (RGD) is a database of rat genomics, genetics, physiology and functional data, as well as data for comparative genomics between rat, human and mouse. RGD is responsible for attaching biological information to the rat genome via structured vocabulary, or ontology, annotations assigned to genes and quantitative trait loci (QTL), and for consolidating rat strain data and making it available to the research community. They are also developing a suite of tools for mining and analyzing genomic, physiologic and functional data for the rat, and comparative data for rat, mouse, human, and five other species.
Bioconductor is a free, open source and open development software project for the analysis and comprehension of genomic data generated by wet lab experiments in molecular biology.
KEGG is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances. KEGG is utilized for bioinformatics research and education, including data analysis in genomics, metagenomics, metabolomics and other omics studies, modeling and simulation in systems biology, and translational research in drug development.
In the field of molecular biology, gene expression profiling is the measurement of the activity of thousands of genes at once, to create a global picture of cellular function. These profiles can, for example, distinguish between cells that are actively dividing, or show how the cells react to a particular treatment. Many experiments of this sort measure an entire genome simultaneously, that is, every gene present in a particular cell.
Microarray analysis techniques are used in interpreting the data generated from experiments on DNA, RNA, and protein microarrays, which allow researchers to investigate the expression state of a large number of genes - in many cases, an organism's entire genome - in a single experiment. Such experiments can generate very large amounts of data, allowing researchers to assess the overall state of a cell or organism. Data in such large quantities is difficult - if not impossible - to analyze without the help of computer programs.
GenMAPP is a free, open-source bioinformatics software tool designed to visualize and analyze genomic data in the context of pathways, connecting gene-level datasets to biological processes and disease. First created in 2000, GenMAPP is developed by an open-source team based in an academic research laboratory. GenMAPP maintains databases of gene identifiers and collections of pathway maps in addition to visualization and analysis tools. Together with other public resources, GenMAPP aims to provide the research community with tools to gain insight into biology through the integration of data types ranging from genes to proteins to pathways to disease.
Biomax Informatics is a Munich-based software company specializing in research software for bioinformatics. Biomax was founded in 1997 and has its roots in the Munich Information Center for Protein Sequences (MIPS). The company's customer base consists of companies and research organizations in the areas of drug discovery, diagnostics, fine chemicals, food and plant production. In addition to exclusive software tools, Biomax Informatics provides services and curated knowledge bases.
MicrobesOnline is a publicly and freely accessible website that hosts multiple comparative genomic tools for comparing microbial species at the genomic, transcriptomic and functional levels. MicrobesOnline was developed by the Virtual Institute for Microbial Stress and Survival, which is based at the Lawrence Berkeley National Laboratory in Berkeley, California. The site was launched in 2005, with regular updates until 2011.
Interferome is an online bioinformatics database of interferon-regulated genes (IRGs). These Interferon Regulated Genes are also known as Interferon Stimulated Genes (ISGs). The database contains information on type I, type II and type III regulated genes and is regularly updated. It is used by the interferon and cytokine research community both as an analysis tool and an information resource. Interferons were identified as antiviral proteins more than 50 years ago. However, their involvement in immunomodulation, cell proliferation, inflammation and other homeostatic processes has been since identified. These cytokines are used as therapeutics in many diseases such as chronic viral infections, cancer and multiple sclerosis. These interferons regulate the transcription of approximately 2000 genes in an interferon subtype, dose, cell type and stimulus dependent manner. This database of interferon regulated genes is an attempt at integrating information from high-throughput experiments and molecular biology databases to gain a detailed understanding of interferon biology.
GeneCards is a database of human genes that provides genomic, proteomic, transcriptomic, genetic and functional information on all known and predicted human genes. It is being developed and maintained by the Crown Human Genome Center at the Weizmann Institute of Science, in collaboration with LifeMap Sciences.
QIAGEN Silicon Valley is a company based in Redwood City, California, USA, that develops software to analyze complex biological systems. QIAGEN Silicon Valley's first product, IPA, was introduced in 2003, and is used to help researchers analyze omics data and model biological systems. The software has been cited in thousands of scientific molecular biology publications and is one of several tools for systems biology researchers and bioinformaticians in drug discovery and institutional research.
In molecular biology and genetics, DNA annotation or genome annotation is the process of describing the structure and function of the components of a genome, by analyzing and interpreting them in order to extract their biological significance and understand the biological processes in which they participate. Among other things, it identifies the locations of genes and all the coding regions in a genome and determines what those genes do.
geWorkbench is an open-source software platform for integrated genomic data analysis. It is a desktop application written in the programming language Java. geWorkbench uses a component architecture. As of 2016, there are more than 70 plug-ins available, providing for the visualization and analysis of gene expression, sequence, and structure data.
In bioinformatics, the PANTHER classification system is a large curated biological database of gene/protein families and their functionally related subfamilies that can be used to classify and identify the function of gene products. PANTHER is part of the Gene Ontology Reference Genome Project designed to classify proteins and their genes for high-throughput analysis.
Gene set enrichment analysis (GSEA) (also called functional enrichment analysis or pathway enrichment analysis) is a method to identify classes of genes or proteins that are over-represented in a large set of genes or proteins, and may have an association with different phenotypes (e.g. different organism growth patterns or diseases). The method uses statistical approaches to identify significantly enriched or depleted groups of genes. Transcriptomics technologies and proteomics results often identify thousands of genes which are used for the analysis.
In bioinformatics, a Gene Disease Database is a systematized collection of data, typically structured to model aspects of reality, in a way to comprehend the underlying mechanisms of complex diseases, by understanding multiple composite interactions between phenotype-genotype relationships and gene-disease mechanisms. Gene Disease Databases integrate human gene-disease associations from various expert curated databases and text mining derived associations including Mendelian, complex and environmental diseases.
Pathway is the term from molecular biology for a curated schematic representation of a well characterized segment of the molecular physiological machinery, such as a metabolic pathway describing an enzymatic process within a cell or tissue or a signaling pathway model representing a regulatory process that might, in its turn, enable a metabolic or another regulatory process downstream. A typical pathway model starts with an extracellular signaling molecule that activates a specific receptor, thus triggering a chain of molecular interactions. A pathway is most often represented as a relatively small graph with gene, protein, and/or small molecule nodes connected by edges of known functional relations. While a simpler pathway might appear as a chain, complex pathway topologies with loops and alternative routes are much more common. Computational analyses employ special formats of pathway representation. In the simplest form, however, a pathway might be represented as a list of member molecules with order and relations unspecified. Such a representation, generally called Functional Gene Set (FGS), can also refer to other functionally characterised groups such as protein families, Gene Ontology (GO) and Disease Ontology (DO) terms etc. In bioinformatics, methods of pathway analysis might be used to identify key genes/ proteins within a previously known pathway in relation to a particular experiment / pathological condition or building a pathway de novo from proteins that have been identified as key affected elements. By examining changes in e.g. gene expression in a pathway, its biological activity can be explored. However most frequently, pathway analysis refers to a method of initial characterization and interpretation of an experimental condition that was studied with omics tools or genome-wide association study. Such studies might identify long lists of altered genes. A visual inspection is then challenging and the information is hard to summarize, since the altered genes map to a broad range of pathways, processes, and molecular functions. In such situations, the most productive way of exploring the list is to identify enrichment of specific FGSs in it. The general approach of enrichment analyses is to identify FGSs, members of which were most frequently or most strongly altered in the given condition, in comparison to a gene set sampled by chance. In other words, enrichment can map canonical prior knowledge structured in the form of FGSs to the condition represented by altered genes.
PomBase is a model organism database that provides online access to the fission yeast Schizosaccharomyces pombe genome sequence and annotated features, together with a wide range of manually curated functional gene-specific data. The PomBase website was redeveloped in 2016 to provide users with a more fully integrated, better-performing service.