EB-eye

EBI Search
Native name	EBI Search (formerly EB-eye)
Type of site	Data search engine
Available in	English language
Owner	European Bioinformatics Institute
Services	Research and services in bioinformatics
URL	ebi.ac.uk
Registration	Optional
Launched	2006
Current status	Online

Last updated February 17, 2025

The EBI Search is a scalable text search engine that provides easy and uniform access to the biological data resources and services hosted at the European Bioinformatics Institute (EBI).^[1]^[2]

The original and primary purpose of EBI Search is to provide search and indexing capabilities for publicly available biological data. By making these data easy to find and access, it supports research in bioinformatics and the life sciences as well as the broader scientific community.^[3]

In addition to the EBI Search website, a RESTful API is available for programmatic queries. This allows its search and retrieval capabilities to be used in workflows and analytical pipelines.

History

The EBI Search project began in August 2006 at the European Bioinformatics Institute as software named EB-eye, built on top of the existing Apache Lucene open-source search engine.^[3] The project was soon expanded to include more than 62 distinct datasets covering about 400 million entries, and was renamed EBI Search.^[2]^[1]

In 2017, EBI Search was improved by implementing "search as a service" through a RESTful API that let other websites integrate its search capabilities into their platforms, eliminating the need to build separate search systems. The service was also enhanced with features like hierarchical taxonomy navigation and similar-entry suggestions, while scaling to handle over 300 million searches and 1.3 billion records that could be re-indexed in under 24 hours.^[4]

In 2019, EBI Search was further developed to include a new HTTP cache mechanism improving response times, unlimited cross-references retrieval, support for Cross-Origin Resource Sharing (CORS), and integration of new data resources like Europe PMC, BioSamples, Rfam, and reviewed ChEMBL.^[5]

During the COVID-19 pandemic, the project was updated to handle increased data needs.^[6] At present, the EBI Search engine indexes more than 140 different data resources, making it one of the most comprehensive search tools for biological and biomedical data.

Data Resources

EMBL-EBI hosts a large volume of molecular data and related information, drawn from many data resources and indexed by EBI Search. All of these resources are freely available and are regularly updated through EMBL-EBI's data management pipeline.

EBI Search can only search information that has been indexed, so other search engines operating on the same biological data may return different results. As a rule of thumb, the EBI Search engine indexes identifiers, names, descriptions, keywords and cross-references.

The indexed data includes nucleotide sequences and protein sequences, protein families, structural data, gene expression profiles, protein interactions, biological pathways, and small molecules. Additionally, EBI Search indexes academic literature, patents, and institutional information.

Search Interface

When users enter text into EBI Search interfaces - whether through the search boxes or by specifying the query parameter in RESTful API calls - their input gets converted into a standardized search query format. This converted query is what actually retrieves the search results.

Searching using the website

Users can search globally across all data resources indexed by EBI Search using the EBI search box: type one or more query terms into the text box and press the search button (or press Enter). The system then displays a summary page listing the data sets and the number of matches found in each of them.

Any meaningful term can be entered in the EBI Search boxes, for example accession numbers or identifiers (such as VAV_HUMAN), gene symbols (for instance tpi1), species names or keywords.

Simple search examples

Search for insulin receptor
Search for p53
Search for web production team
Search for escherichia NOT coli
Search for C2H2 zinc finger family
Search for DNA binding

Advanced search examples

For more complex queries, you can use the EBI Search query syntax:

Search for description:azurin
Search for paired box protein BUT NOT fragment OR paxillin
Search for (bacterium OR organism) AND (unidentified OR uncultured) (environmental samples)
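
The advanced examples above combine fielded terms with Boolean operators. As a minimal sketch of how such query strings might be assembled programmatically, the following Python helpers build them from parts. The function names (field_term, any_of, all_of) are hypothetical and are not part of EBI Search; they only concatenate text.

```python
def field_term(field: str, value: str) -> str:
    """Return a fielded term such as 'description:azurin'."""
    return f"{field}:{value}"


def any_of(*terms: str) -> str:
    """Join terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms (or parenthesised groups) with AND."""
    return " AND ".join(terms)


# Rebuild two of the example queries listed above.
fielded = field_term("description", "azurin")
query = all_of(
    any_of("bacterium", "organism"),
    any_of("unidentified", "uncultured"),
)

print(fielded)
print(query)
```

The resulting strings can be pasted into the search box or passed as the query parameter of an API call.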

The query builder allows users to create and save complex queries on the available data to get specific search results. It is also possible to search using cross-references.

Search Results

The EBI Search website presents results in a three-column layout designed for efficient data exploration. The left column displays a summary of hits per category/domain with customizable facets for filtering results. The central column lists the primary search results with direct URLs to original data entries. The right column shows related data and alternative views. For gene and protein queries, specialized "Gene & protein summaries" appear above the main results, collating data from multiple EMBL-EBI resources according to molecular biology's central dogma.

Features and Tools

Users can interact with search results in several ways:

Data Export: Results can be downloaded in multiple formats (XML, JSON, TSV, CSV) using the 'Save result' button, with a current limit of 100 entries per download

Analysis Tools: Direct launching of domain-specific tools (e.g., BLAST for sequence analysis, Clustal Omega for multiple sequence alignment) from selected search results

RSS Alerts: Users can create RSS feeds to monitor updates to their search queries, particularly useful for tracking new publications, protein entries, or structural data

Cross-References: Results include links to related entries across different EMBL-EBI databases, facilitating comprehensive data exploration

Result Relevance

Search result ordering primarily follows Apache Lucene's scoring system, where closer matches receive higher relevance scores. Users can influence result ranking using the caret symbol (^) followed by a boost factor: for example, "prostate^4 AND cancer" gives greater weight to entries matching "prostate". While EBI Search can be configured to boost specific domains or fields, runtime boosting is recommended for the most precise control over result ordering.
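
Runtime boosting as described above amounts to appending a caret and a factor to a term. The short Python sketch below illustrates this; the boost helper is hypothetical and simply formats text. Quoting of multi-word terms follows general Lucene phrase syntax and is assumed, not confirmed, to behave the same way in EBI Search.

```python
def boost(term: str, factor: int) -> str:
    """Append a Lucene-style boost (term^factor) to a term or phrase."""
    if factor <= 0:
        raise ValueError("boost factor must be positive")
    # Quote multi-word terms so the boost applies to the whole phrase.
    if " " in term:
        term = f'"{term}"'
    return f"{term}^{factor}"


# Reproduce the example from the text: weight "prostate" four times higher.
query = f"{boost('prostate', 4)} AND cancer"
print(query)
```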

Searching using the API

EBI Search provides RESTful web services that allow programmatic access to biological data from the EBI Search data resources. The service is particularly useful for researchers and developers who wish to incorporate EBI Search results into their own pipelines or to use the search engine behind a custom-built interface.

Users can interact with the API through various endpoints supporting different response formats including XML, JSON, RSS, and CSV. The service enables faceted searching, cross-reference searching, and auto-completion functionality across multiple databases.

Developers can access the API using sample clients available in several programming languages including Perl, Python, and Java, with each client requiring specific libraries such as LWP for Perl or Requests for Python.

The API currently follows Apache Lucene query syntax and returns appropriate HTTP status codes to indicate the success or failure of requests.
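
As an illustration of programmatic access, the following Python sketch builds a search URL and checks the HTTP status of the response using only the standard library. The base URL, the "uniprot" domain name and the format and size parameters are assumptions not stated in this article and should be checked against the current EBI Search documentation; only the query parameter is mentioned above.

```python
import json
from urllib.error import HTTPError
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumed base URL of the REST service; verify before use.
BASE_URL = "https://www.ebi.ac.uk/ebisearch/ws/rest"


def build_search_url(domain: str, query: str, fmt: str = "json", size: int = 10) -> str:
    """Build a search URL for one domain; special characters are percent-encoded."""
    params = urlencode({"query": query, "format": fmt, "size": size})
    return f"{BASE_URL}/{quote(domain)}?{params}"


def search_json(domain: str, query: str, size: int = 10) -> dict:
    """Run a query and return the parsed JSON body, or raise on an HTTP error status."""
    url = build_search_url(domain, query, fmt="json", size=size)
    try:
        with urlopen(url, timeout=30) as response:
            return json.load(response)
    except HTTPError as err:
        # The API signals failures through HTTP status codes.
        raise RuntimeError(f"EBI Search request failed with HTTP {err.code}") from err


# Building a URL needs no network access; search_json() performs the actual request.
url = build_search_url("uniprot", "description:azurin")
print(url)
```

Calling search_json("uniprot", "description:azurin") would then send the request. The structure of the returned JSON is not described here and should be inspected before relying on specific keys.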

References

1. Squizzato S.; Park Y.M.; Buso N.; Gur T.; Cowley A.; Li W.; Uludag M.; Pundir S.; Cham J.A.; McWilliam H.; Lopez R. (2015). "The EBI Search engine: providing search and retrieval functionality for biological data from EMBL-EBI". Nucleic Acids Res. 43 (W1): W585–W588. doi:10.1093/nar/gkv316. PMC 4489232. PMID 25855807.
2. Valentin F.; Squizzato S.; Goujon M.; McWilliam H.; Paern J.; Lopez R. (2010). "Fast and efficient searching of biological data resources: using EB-eye". Brief Bioinform. 11 (4): 375–384. doi:10.1093/bib/bbp065. PMC 2905521. PMID 20150321.
3. Goujon M.; Valentin F.; Miyar T.; McWilliam H.; Lopez R. (December 2007). "The EB-eye". EMBnet.news. No. 13.4. pp. 18–21.
4. Park Y.M.; Squizzato S.; Buso N.; Gur T.; Lopez R. (May 2017). "The EBI search engine: EBI search as a service-making biological data accessible for all". Nucleic Acids Research. 45 (W1): W545–W549. doi:10.1093/nar/gkx359.
5. Madeira F.; Park Y.M.; Lee J.; Buso N.; Gur T.; Madhusoodanan N.; Basutkar P.; Tivey A.R.N.; Potter S.C.; Finn R.D.; Lopez R. (12 April 2019). "The EMBL-EBI search and sequence analysis tools APIs in 2019". Nucleic Acids Research. 47 (W1): W636–W641. doi:10.1093/nar/gkz268.
6. Madeira F.; Pearce M.; Basutkar P.; Lee J.; Edbali O.; Madhusoodanan N.; Kolesnikov A.; Lopez R. (July 2022). "Search and sequence analysis tools services from EMBL-EBI in 2022". Nucleic Acids Research. 50 (W1): W276–W279. doi:10.1093/nar/gkac240.

"EMBnet.News (Volume 14, Nr. 1, December 2007)". EMBnetNews. December 2007. Retrieved 1 April 2009.

