VISTA (comparative genomics)

VISTA Enhancer Browser
Description: a database of tissue-specific human enhancers
Primary citation: Visel et al. (2007) [1]
Release date: 2006
Website: http://enhancer.lbl.gov

VISTA is a collection of databases, tools, and servers that permit extensive comparative genomics analyses.

Background

The VISTA family of tools is developed and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. This collaborative effort is supported by the Programs for Genomic Applications grant from the NHLBI/NIH and by the Office of Biological and Environmental Research, Office of Science, US Department of Energy.

VISTA was developed from modules supplied by developers at UC Berkeley, Stanford, and UC Davis, and is based partly on the AVID global alignment program.

Usage

There are multiple VISTA servers, each allowing different types of searches.

Researchers can use the VISTA Browser:

Genomes

More than 28 genomes are searchable, including those of vertebrates, non-vertebrates, plants, fungi, algae, bacteria, and other organisms. More are continually being added.

Collaboration with other projects

Pre-computed full scaffold alignments for microbial genomes are available as the VISTA component of IMG (the Integrated Microbial Genomes system), developed at the Department of Energy (DOE) Joint Genome Institute.


References

  1. Visel, Axel; Minovitsky, Simon; Dubchak, Inna; Pennacchio, Len A. (January 2007). "VISTA Enhancer Browser--a database of tissue-specific human enhancers". Nucleic Acids Research. 35 (Database issue): D88–D92. doi:10.1093/nar/gkl822. PMC 1716724. PMID 17130149.