Ensembl Genomes

Content
Description: An integrative resource for genome-scale data from non-vertebrate species.
Data types captured: Genomic database
Organisms: pan
Contact
Research center: European Bioinformatics Institute
Primary citation: Kersey et al. (2012) [1], Howe et al. (2020) [2]
Release date: 2009
Access
Website: https://ensemblgenomes.org/
Download URL: ftp://ftp.ensemblgenomes.org/pub/current
Web service URL: https://rest.ensembl.org/
Public SQL access: anonymous@mysql-eg-publicsql.ebi.ac.uk:4157
Miscellaneous
License: Apache 2.0
Data release frequency: 4 times per year
Version: Release 52 (December 2021)

Ensembl Genomes is a scientific project to provide genome-scale data from non-vertebrate species. [1] [2]

The project is run by the European Bioinformatics Institute and was launched in 2009 using the Ensembl technology. [3] The main objective of the Ensembl Genomes database is to complement the main Ensembl database by introducing five additional web pages that cover genome data for bacteria, fungi, invertebrate metazoa, plants, and protists. [4] For each of these domains, the Ensembl tools are available for manipulation, analysis and visualization of genome data. Most Ensembl Genomes data is stored in MySQL relational databases and can be accessed through the Ensembl REST interface, the Perl API, BioMart or online. [5]
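As a minimal sketch of the MySQL route, the snippet below splits a public-access string of the form user@host:port into arguments for the standard mysql command-line client. The string used is the public SQL access address listed for Ensembl Genomes; the helper name `mysql_cli_args` is purely illustrative, and the server details may change between releases.

```python
def mysql_cli_args(access: str) -> list[str]:
    """Turn 'user@host:port' into an argument list for the mysql client."""
    user, _, location = access.partition("@")
    host, _, port = location.rpartition(":")
    if not (user and host and port.isdigit()):
        raise ValueError(f"expected user@host:port, got {access!r}")
    return ["mysql", f"--user={user}", f"--host={host}", f"--port={port}"]


# Public access string as listed for Ensembl Genomes (may change over time).
PUBLIC_ACCESS = "anonymous@mysql-eg-publicsql.ebi.ac.uk:4157"

# Print the command line rather than running it, so no client is required.
print(" ".join(mysql_cli_args(PUBLIC_ACCESS)))
```

The resulting list can be passed to `subprocess.run` on a machine where the mysql client is installed.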

Ensembl Genomes is an open project, and most of the code, tools, and data are available to the public. [6] Ensembl and Ensembl Genomes software is released under an Apache 2.0 license. [7]

Displaying genomic data

Figure: Karyotype visualisation in Ensembl Genomes

The key feature of Ensembl Genomes is its graphical interface, which allows users to scroll through a genome and observe the relative location of features such as conceptual annotation (e.g. genes, SNP loci), sequence patterns (e.g. repeats) and experimental data (e.g. sequences and external sequence features mapped onto the genome). [1] Graphical views are available at varying levels of resolution, from an entire karyotype down to the sequence of a single exon. Information for a genome is spread over four tabs: a species page, a ‘Location’ tab, a ‘Gene’ tab and a ‘Transcript’ tab, each providing information at a higher resolution than the last.

Searching for a particular species using Ensembl Genomes redirects to the species page. Often, a brief description of the species is provided, as well as links to further information and statistics about the genome, the graphical interface and some of the tools available.

A karyotype is available for some species in Ensembl Genomes. [8] If the karyotype is available, there will be a link to it in the Gene Assembly section of the species page. Alternatively, users who are in the ‘Location’ tab can view the karyotype by selecting ‘Whole genome’ in the left-hand menu. Users can click on a location within the karyotype to zoom in to one specific chromosome or a genomic region. [8] This will open the ‘Location’ tab.

In the 'Location' tab, users can browse genes, variations, sequence conservation, and other types of annotation along the genome. [9] The 'Region in detail' is highly configurable and scalable, and users can choose what they want to see by clicking on the 'Configure this page' button at the bottom of the left-hand menu. By adding and removing tracks users will be able to select the type of data they want to have included in the displays. [9] Data from the following categories can be easily added or removed from this 'Location' tab view: 'Sequence and assembly', 'Genes and transcripts', 'mRNA and protein alignments', 'Other DNA alignments', 'Germline variation', 'Comparative genomics', among others. [9] Users can also change the display options such as the width. [9] A further option allows users to reset the configuration back to the default settings. [9]

More specific information about a selected gene can be found in the ‘Gene’ tab. Users can get to this page by searching for the desired gene in the search bar and clicking on the gene ID, or by clicking on one of the genes shown in the ‘Location’ tab view. The ‘Gene’ tab contains gene-specific information such as gene structure, number of transcripts, position on the chromosome and homology information in the form of gene trees. [10] This information can be accessed via the menu on the left-hand side.

A 'Transcript' tab will also appear when a user chooses to view a gene. The 'Transcript' tab contains much of the same information as the 'Gene' tab; however, it focuses on a single transcript. [10]

Tools

Adding Custom tracks to Ensembl Genomes

Ensembl Genomes allows comparing and visualising user data while browsing karyotypes and genes. Most Ensembl Genomes views include an ‘Add your data’ or ‘Manage your data’ button that will allow the user to upload new tracks containing reads or sequences to Ensembl Genomes or to modify data that has been previously uploaded. [11] The uploaded data can be visualised in region views or over the whole karyotype. The uploaded data can be localised using Chromosome Coordinates or BAC Clone Coordinates. [12] The following methods can be used to upload a data file to any Ensembl Genomes page: [13]

  1. Files smaller than 5 MB can be either uploaded directly from any computer or from a web location (URL) to the Ensembl servers.
  2. Larger files can only be uploaded from web locations (URL).
  3. BAM files can only be uploaded using the URL-based approach. The index file (.bam.bai) should be located on the same web server.
  4. A Distributed Annotation System source can be attached from web locations.
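The upload rules above can be sketched as a small decision helper. This is only an illustration of the listed rules, not Ensembl code: the function names are hypothetical, "5 MB" is assumed to mean 5 × 1024 × 1024 bytes, and the index URL is assumed to be the BAM URL with ".bai" appended, following the ".bam.bai" naming mentioned in the list.

```python
from urllib.parse import urlsplit

# Assumption: the 5 MB limit is interpreted as 5 MiB.
DIRECT_UPLOAD_LIMIT = 5 * 1024 * 1024


def allowed_upload_methods(filename: str, size_bytes: int) -> list[str]:
    """Return the upload routes the listed rules permit for a data file."""
    if filename.lower().endswith(".bam"):
        return ["url"]  # BAM files are attached by URL only
    if size_bytes < DIRECT_UPLOAD_LIMIT:
        return ["direct", "url"]  # small files: either route
    return ["url"]  # larger files: URL only


def expected_index_url(bam_url: str) -> str:
    """Assumed location of the index that accompanies a BAM file."""
    return bam_url + ".bai"


def same_server(data_url: str, index_url: str) -> bool:
    """Check that a BAM file and its index are served from the same host."""
    return urlsplit(data_url).netloc == urlsplit(index_url).netloc


# example.org is a placeholder host, not a real data source.
bam = "https://example.org/tracks/reads.bam"
print(allowed_upload_methods("reads.bam", 1024))
print(expected_index_url(bam), same_server(bam, expected_index_url(bam)))
```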

The following file types are supported by Ensembl Genomes: [14]

Figure: Visualisation of a custom track labelled "Reads" in Ensembl Genomes

The data is uploaded temporarily to the servers. Registered users can log in and save their data for future reference. It is possible to share and access the uploaded data using an assigned URL. [15] Users can also delete their custom tracks from Ensembl Genomes.

BioMart

BioMart is a programming-free search engine incorporated in Ensembl and Ensembl Genomes (except for Ensembl Bacteria) for mining and extracting genomic data from the Ensembl databases in table formats such as HTML, TSV, CSV or XLS. [16] Release 45 (2019) of Ensembl Genomes has the following data available at the BioMarts:

Figure: BioMart view in Ensembl Plants.

The purpose of the BioMarts in Ensembl Genomes is to allow the user to mine and download tables containing all the genes for a single species, genes in a specific region of a chromosome, or genes in one region of a chromosome associated with an InterPro domain. [21] The BioMarts also include filters to refine the data to be extracted, and the user can select the attributes (Variant ID, Chromosome name, Ensembl ID, location, etc.) that will appear in the final table file.
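The filter-and-attribute model described above can be illustrated without any BioMart software. The sketch below applies a region filter and an InterPro filter to a few in-memory records and writes the selected attributes as tab-separated rows. It is not the BioMart API; the `Gene` record, the `query` function, the gene IDs and the InterPro assignments are all made-up placeholders.

```python
from dataclasses import dataclass, asdict


@dataclass
class Gene:
    ensembl_id: str
    chromosome: str
    start: int
    end: int
    interpro: tuple[str, ...]


def query(genes, chromosome, region_start, region_end, interpro_id, attributes):
    """Keep genes overlapping a region that carry a domain; project attributes."""
    rows = []
    for gene in genes:
        in_region = (
            gene.chromosome == chromosome
            and gene.start <= region_end
            and gene.end >= region_start
        )
        if in_region and interpro_id in gene.interpro:
            record = asdict(gene)
            # Only the requested attributes appear in the output table.
            rows.append("\t".join(str(record[name]) for name in attributes))
    return rows


# Placeholder records for illustration only.
GENES = [
    Gene("GENE_A", "1", 100, 500, ("IPR000001",)),
    Gene("GENE_B", "1", 5000, 6000, ("IPR000001",)),
    Gene("GENE_C", "1", 200, 300, ("IPR000002",)),
    Gene("GENE_D", "2", 100, 500, ("IPR000001",)),
]

for row in query(GENES, "1", 1, 1000, "IPR000001", ["ensembl_id", "chromosome", "start", "end"]):
    print(row)
```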

The BioMarts can be accessed online in each corresponding domain of Ensembl Genomes, or the source code can be installed in a UNIX environment from the BioMart git repository. [22]

BLAST

A BLAST interface is provided to allow users to search for DNA or protein sequences against Ensembl Genomes. It can be accessed from the BLAST link in the header at the top of all Ensembl Genomes pages. The BLAST search can be configured to search against individual species or collections of species (maximum of 25). There is a taxonomic browser to allow the selection of taxonomically related species. [23]

Ensembl Genomes provides a second sequence search tool, supplied by the European Nucleotide Archive, which uses an algorithm based on Exonerate. [23] This tool can be accessed from the Sequence Search link in the header at the top of all Ensembl Genomes pages. Users can then choose whether they would like Exonerate to search against all species in the Ensembl Genomes division or against all species in Ensembl Genomes. They can also choose the 'Maximum E-value', which limits the results to those with E-values below the maximum. Finally, users can choose an alternative search mode by selecting 'Use spliced query'.
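The effect of the 'Maximum E-value' setting can be sketched as a simple threshold filter. The hit records below, and their `species` and `evalue` keys, are hypothetical and do not reflect the tool's actual output format.

```python
def filter_by_evalue(hits, max_evalue):
    """Keep hits whose E-value is below max_evalue, lowest (best) first."""
    kept = [hit for hit in hits if hit["evalue"] < max_evalue]
    return sorted(kept, key=lambda hit: hit["evalue"])


# Made-up hits for illustration.
hits = [
    {"species": "species_a", "evalue": 1e-30},
    {"species": "species_b", "evalue": 0.5},
    {"species": "species_c", "evalue": 1e-5},
]

for hit in filter_by_evalue(hits, max_evalue=1e-3):
    print(hit["species"], hit["evalue"])
```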

Variant Effect Predictor

The Variant Effect Predictor (VEP) is one of the most used tools in Ensembl and Ensembl Genomes. It allows users to explore and analyse the effect that variants (SNPs, CNVs, indels or structural variations) have on a particular gene, sequence, protein, transcript or transcription factor. [24] To use VEP, users must input the location of their variants and the nucleotide variations to generate the following results: [25]

There are two ways in which users can access VEP. The first is online. On this page, the user generates an input by selecting the following parameters: [26]

  1. Species to be compared. The default database for comparison is Ensembl Transcripts, but for some species, other sources can be selected.
  2. Name for the uploaded data (this is optional, but it makes the data easier to identify if many VEP jobs have been performed).
  3. Selection of the input format for the data. If an incorrect file format is selected, VEP will throw an error when running.
  4. Fields for data upload. Users can upload data from their computers, from a URL-based location or by copying the contents directly into a text box.

Data upload to VEP supports VCF, pileup, HGVS notations and a default format. [27] The default format is a whitespace-separated file that contains the data in columns. The first five columns indicate the chromosome, start location, end location, allele (pair of alleles separated by a '/', with the reference allele first) and the strand (+ for forward or - for reverse). [28] The sixth column is a variation identifier and is optional. If it is left blank, VEP will assign an identifier in the output file.
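A minimal sketch of writing and reading one line in the default format, following the column order described above. The function names are illustrative, the example coordinates and identifier are placeholders, and the reverse strand is written here with an ASCII hyphen.

```python
def format_default_line(chrom, start, end, ref, alt, strand="+", identifier=None):
    """Build one whitespace-separated line in the default input layout."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # Alleles are written as reference/variant, reference first.
    fields = [str(chrom), str(start), str(end), f"{ref}/{alt}", strand]
    if identifier is not None:
        fields.append(identifier)  # optional sixth column
    return " ".join(fields)


def parse_default_line(line):
    """Split a default-format line back into named fields."""
    fields = line.split()
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 columns, got {len(fields)}")
    chrom, start, end, alleles, strand = fields[:5]
    ref, _, alt = alleles.partition("/")
    return {
        "chromosome": chrom,
        "start": int(start),
        "end": int(end),
        "reference": ref,
        "variant": alt,
        "strand": strand,
        "identifier": fields[5] if len(fields) == 6 else None,
    }


# Placeholder variant for illustration.
line = format_default_line("1", 881907, 881907, "T", "C", identifier="var_1")
print(line)
print(parse_default_line(line))
```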

VEP also provides additional identifier options to the users, extra options to complement the output and filtering. [29] The filtering options allow features like removal of known variants from results, returning variants in exons only, and restriction of results to specific consequences of the variants. [30]

VEP users also have the possibility of viewing and manipulating all the jobs associated with their session by browsing the "Recent Tickets" tab. In this tab the users can view the status of their search (success, queued, running or failed) and save, delete or resubmit jobs. [31]

The second option is to download the VEP source code for use in UNIX environments. [32] The online and script versions offer the same features. VEP can also be used through online instances such as Galaxy.

When a VEP job is completed the output is a tabular file that contains the following columns: [33]

  1. Uploaded variation - as chromosome_start_alleles
  2. Location - in standard coordinate format (chr:start or chr:start-end)
  3. Allele - the variant allele used to calculate the consequence
  4. Gene - Ensembl stable ID of affected gene
  5. Feature - Ensembl stable ID of feature
  6. Feature type - type of feature. Currently one of Transcript, RegulatoryFeature, MotifFeature.
  7. Consequence - consequence type of this variation
  8. Position in cDNA - relative position of base pair in cDNA sequence
  9. Position in CDS - relative position of base pair in coding sequence
  10. Position in protein - relative position of amino acid in protein
  11. Amino acid change - only given if the variation affects the protein-coding sequence
  12. Codon change - the alternative codons with the variant base in upper case
  13. Co-located variation - known identifier of existing variation
  14. Extra - this column contains extra information as key=value pairs separated by ";". Displays extra identifiers.
Figure: Variant Effect Predictor output file
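The fourteen-column layout listed above can be read into a dictionary per row, with the Location and Extra columns broken down further. This is a sketch under assumptions: columns are taken to be tab-separated, "-" is assumed to mark an empty value, and the example row is invented for illustration.

```python
VEP_COLUMNS = [
    "Uploaded variation", "Location", "Allele", "Gene", "Feature",
    "Feature type", "Consequence", "Position in cDNA", "Position in CDS",
    "Position in protein", "Amino acid change", "Codon change",
    "Co-located variation", "Extra",
]


def parse_location(location):
    """Parse 'chr:start' or 'chr:start-end' into (chr, start, end)."""
    chrom, _, span = location.rpartition(":")
    start, _, end = span.partition("-")
    return chrom, int(start), (int(end) if end else int(start))


def parse_extra(extra):
    """Parse 'key=value;key=value' into a dict (assumes '-' means empty)."""
    if extra in ("", "-"):
        return {}
    return dict(pair.split("=", 1) for pair in extra.split(";") if pair)


def parse_output_line(line):
    """Map one tab-separated output row onto the fourteen column names."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(VEP_COLUMNS):
        raise ValueError(f"expected {len(VEP_COLUMNS)} columns, got {len(values)}")
    row = dict(zip(VEP_COLUMNS, values))
    row["Location"] = parse_location(row["Location"])
    row["Extra"] = parse_extra(row["Extra"])
    return row


# Invented row; the IDs and values are placeholders, not real annotation.
example = "\t".join([
    "1_881907_T/C", "1:881907", "C", "GENE0001", "TRANSCRIPT0001",
    "Transcript", "missense_variant", "100", "50", "17",
    "V/A", "gTg/gCg", "-", "STRAND=1;IMPACT=MODERATE",
])
row = parse_output_line(example)
print(row["Location"], row["Consequence"], row["Extra"])
```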

Other common output formats for VEP include JSON and VCF. [34]

Programmatic data access

The Ensembl Genomes REST interface allows access to the data from any programming language.

Data can also be accessed using the Perl API and BioMart.
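As a hedged sketch of REST access, the snippet below builds a request URL for the symbol lookup endpoint of the Ensembl REST server and defines a helper to fetch the JSON response using only the standard library. The species and gene symbol are illustrative placeholders, the helper names are not part of any Ensembl client, and `fetch_json` is defined but not called here because it needs network access.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

REST_SERVER = "https://rest.ensembl.org"


def lookup_symbol_url(species: str, symbol: str) -> str:
    """Build a lookup/symbol URL that asks for a JSON response."""
    path = f"/lookup/symbol/{quote(species, safe='')}/{quote(symbol, safe='')}"
    return f"{REST_SERVER}{path}?content-type=application/json"


def fetch_json(url: str, timeout: float = 30.0):
    """Fetch a URL and decode the JSON body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Illustrative species and symbol; substitute values of interest.
url = lookup_symbol_url("arabidopsis_thaliana", "PAD4")
print(url)
```

Calling `fetch_json(url)` on a networked machine should return the decoded record if the server recognises the species and symbol.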

Current species

Ensembl Genomes makes no attempt to include all possible genomes; rather, the genomes included on the site are those deemed to be scientifically important. [35] Each site contains the following number of species:

Collaborations

Ensembl Genomes continuously expands the annotation data through collaboration with other organisations involved in genome annotation projects and research. The following organisations are collaborators of Ensembl Genomes: [42]

References

  1. Kersey, P. J.; Staines, D. M.; Lawson, D.; Kulesha, E.; Derwent, P.; Humphrey, J. C.; Hughes, D. S. T.; Keenan, S.; Kerhornou, A.; Koscielny, G.; Langridge, N.; McDowall, M. D.; Megy, K.; Maheswari, U.; Nuhn, M.; Paulini, M.; Pedro, H.; Toneva, I.; Wilson, D.; Yates, A.; Birney, E. (2011). "Ensembl Genomes: An integrative resource for genome-scale data from non-vertebrate species". Nucleic Acids Research. 40 (Database issue): D91–D97. doi:10.1093/nar/gkr895. PMC 3245118. PMID 22067447.
  2. Howe KL, Contreras-Moreira B, De Silva N, Maslen G, Akanni W, Allen J, Alvarez-Jarreta J, Barba M, Bolser DM, Cambel L, Carbajo M, Chakiachvili M, Christensen M, Cummins C, Cuzick A, Davis P, Fexova S, Gall A, George N, Gil L, Gupta P, Hammond-Kosack KE, Haskell E, Hunt S, Jaiswal P, Janacek S, Kersey PJ, Langridge N, Maheswari U, Maurel T, McDowall MD, Moore B, Muffato M, Naamati G, Naithani S, Olson A, Papatheodorou I, Patricio M, Paulini M, Pedro H, Perry E, Preece J, Rosello M, Russell M, Sitnik V, Staines DM, Stein J, Tello-Ruiz MK, Trevanion SJ, Urban M, Wei S, Ware D, Williams G, Yates AD, Flicek P (January 2020). "Ensembl Genomes 2020—enabling non-vertebrate genomic research". Nucleic Acids Research. 48 (D1): D689–D695. doi:10.1093/nar/gkz890. PMC 6943047. PMID 31598706.
  3. Hubbard, T. J. P.; Aken, B. L.; Ayling, S.; Ballester, B.; Beal, K.; Bragin, E.; Brent, S.; Chen, Y.; Clapham, P.; Clarke, L.; Coates, G.; Fairley, S.; Fitzgerald, S.; Fernandez-Banet, J.; Gordon, L.; Graf, S.; Haider, S.; Hammond, M.; Holland, R.; Howe, K.; Jenkinson, A.; Johnson, N.; Kahari, A.; Keefe, D.; Keenan, S.; Kinsella, R.; Kokocinski, F.; Kulesha, E.; Lawson, D.; Longden, I. (2009). "Ensembl 2009". Nucleic Acids Research. 37 (Database issue): D690–D697. doi:10.1093/nar/gkn828. PMC 2686571. PMID 19033362.
  4. "About Ensembl Genomes". Ensembl Genomes. Ensembl. Retrieved 2 September 2014.
  5. "Ensembl Genomes MySQL". ensemblgenomes.org. Ensembl Genomes. Retrieved 11 September 2014.
  6. Kinsella, Rhoda J.; Kähäri, Andreas; Syed, Haider; Zamora, Jorge; Proctor, Glenn; Spudich, Giulietta; Almeida-King, Jeff; Staines, Daniel; Derwent, Paul; Kerhournou, Arnaud; Kersey, Paul; Flicek, Paul (2011). "Ensembl BioMarts: a hub for data retrieval across taxonomic space". Database. 2011 (2011): 2. doi:10.1093/database/bar030. PMC 3170168. PMID 21785142.
  7. "Software License". Ensembl. Retrieved 9 June 2020.
  8. "Whole Genome". Ensembl Genomes. Retrieved 7 September 2014.
  9. "Frequently Asked Questions". Ensembl Genomes. Retrieved 7 September 2014.
  10. Spudich, G; Fernández-Suárez, X. M.; Birney, E (2007). "Genome browsing with Ensembl: A practical overview". Briefings in Functional Genomics and Proteomics. 6 (3): 202–19. doi:10.1093/bfgp/elm025. PMID 17967807.
  11. "Uploading your data to Ensembl". Ensembl Genomes. Ensembl Genomes. Retrieved 9 September 2014.
  12. "Coordinates for data location in Ensembl Genomes". Ensembl Genomes. Ensembl Genomes. Retrieved 9 September 2014.
  13. "Methods for data upload". Ensembl Plants. Ensembl Genomes. Retrieved 9 September 2014.
  14. "Supported data files". Ensembl Plants. Ensembl Genomes. Retrieved 9 September 2014.
  15. "Saving and Sharing data in Ensembl Genomes". Ensembl Plants. Ensembl Genomes.
  16. "Data Mining in Ensembl with BioMart" (PDF). Ensembl: 2. 2014. Retrieved 11 September 2014.
  17. "Ensembl Protists". Ensembl Protists. Ensembl Genomes. Retrieved 1 October 2019.
  18. "Ensembl Fungi". Ensembl Fungi. Ensembl Genomes. Retrieved 1 October 2019.
  19. "Ensembl Metazoa". Ensembl Metazoa. Ensembl Genomes. Retrieved 1 October 2019.
  20. "Ensembl Plants". Ensembl Plants. Ensembl Genomes. Retrieved 1 October 2019.
  21. "Data Mining in Ensembl with BioMart" (PDF). Ensembl: 3. 2014. Retrieved 11 September 2014.
  22. "BioMart 0.9.0 user manual" (PDF). May 2014. p. 5. Retrieved 11 September 2014.
  23. "Frequently Asked Questions". Ensembl Genomes. Archived from the original on 10 September 2014. Retrieved 11 September 2014.
  24. "Variant Effect Predictor". ensembl.org. Ensembl. Retrieved 11 September 2014.
  25. "Variant Effect Predictor results overview". ensembl.org. Ensembl. Retrieved 11 September 2014.
  26. "Data input to VEP". ensembl.org. Ensembl. Retrieved 11 September 2014.
  27. "VEP supported file formats". ensembl.org. Ensembl. Retrieved 11 September 2014.
  28. "VEP default file". ensembl.org. Ensembl. Retrieved 11 September 2014.
  29. "VEP options and extras". ensembl.org. Ensembl. Retrieved 11 September 2014.
  30. "VEP filtering". ensembl.org. Ensembl. Retrieved 11 September 2014.
  31. "VEP jobs". ensembl.org. Ensembl. Retrieved 11 September 2014.
  32. "VEP script download". ensembl.org. Ensembl. Retrieved 11 September 2014.
  33. "VEP Output". ensembl.org. Ensembl Genomes. Retrieved 11 September 2014.
  34. "VEP Output formats". ensembl.org. Ensembl Genomes. Retrieved 11 September 2014.
  35. Kersey, P. J.; Allen, J. E.; Christensen, M; Davis, P; Falin, L. J.; Grabmueller, C; Hughes, D. S.; Humphrey, J; Kerhornou, A; Khobova, J; Langridge, N; McDowall, M. D.; Maheswari, U; Maslen, G; Nuhn, M; Ong, C. K.; Paulini, M; Pedro, H; Toneva, I; Tuli, M. A.; Walts, B; Williams, G; Wilson, D; Youens-Clark, K; Monaco, M. K.; Stein, J; Wei, X; Ware, D; Bolser, D. M.; et al. (2014). "Ensembl Genomes 2013: Scaling up access to genome-wide data". Nucleic Acids Research. 42 (Database issue): D546–52. doi:10.1093/nar/gkt979. PMC 3965094. PMID 24163254.
  36. "Species List". Ensembl Genomes. Retrieved 1 July 2024.
  37. "Species List". Ensembl Genomes. Retrieved 1 July 2024.
  38. "Species List". Ensembl Genomes. Retrieved 1 July 2024.
  39. "Species List". Ensembl Genomes. Retrieved 1 July 2024.
  40. "Species List". Ensembl Genomes. Retrieved 1 July 2024.
  41. "Species List". Ensembl Genomes. Retrieved 1 July 2024.
  42. "Collaborators - Ensembl Genomes". Ensembl Genomes. Ensembl Genomes. Retrieved 3 September 2014.