Search aggregator

A search aggregator is a type of metasearch engine that gathers results from multiple search engines simultaneously, typically through RSS search results. It combines user-specified search feeds (parameterized RSS feeds that return search results) to give the user the same level of control over content as a general aggregator.[citation needed]

Soon after the introduction of RSS, sites began publishing their search results in parameterized RSS feeds. Search aggregators are an increasingly popular way to take advantage of the power of multiple search engines with a flexibility not seen in traditional metasearch engines. To the end user, a search aggregator may appear to be just a customizable search engine, and the use of RSS may be completely hidden. However, RSS is directly responsible for the existence of search aggregators and is a critical component of the behind-the-scenes technology.

History

Search aggregation is a relatively recent concept; the first search aggregators became available in 2006. In 2005, Amazon published the OpenSearch specification for making search results available in a generic XML format. Many sites now publish results in OpenSearch, but many others simply publish in generic RSS format. OpenSearch syndication gives search aggregators greater flexibility in how they display results, but it is generally not required.

Functional overview

A search aggregator typically allows users to select specific search engines ad hoc to perform a specified query. When the user enters the query, the search aggregator generates the required URL "on the fly" by inserting the search query into the parameterized URL for the search feed. A parameterized URL looks something like this:

https://news.google.com/news?hl=en&ned=us&q={SEARCH_TERMS}&ie=UTF-8&output=rss

In this case, the {SEARCH_TERMS} parameter would be replaced with the user's search terms, and the query would be sent to the host. The search aggregator would then parse the results and display them in a user-friendly way.
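The substitution and parsing steps can be sketched in Python as follows. This is a minimal illustration rather than the method of any particular search aggregator: the function names `build_search_url` and `parse_rss_items` and the sample feed `SAMPLE_RSS` are hypothetical, and the sketch assumes the feed is plain RSS 2.0 with `item`, `title`, and `link` elements. No request is sent; the sample feed stands in for the host's response.

```python
from urllib.parse import quote_plus
import xml.etree.ElementTree as ET

# Placeholder used in the parameterized URL shown above.
PLACEHOLDER = "{SEARCH_TERMS}"


def build_search_url(template: str, terms: str) -> str:
    """Insert URL-encoded search terms into a parameterized feed URL."""
    return template.replace(PLACEHOLDER, quote_plus(terms))


def parse_rss_items(rss_text: str) -> list[dict[str, str]]:
    """Extract the title and link of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


# Hypothetical feed content, standing in for what a host might return.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Search results</title>
    <item>
      <title>First result</title>
      <link>https://example.com/first</link>
    </item>
    <item>
      <title>Second result</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    template = (
        "https://news.google.com/news"
        "?hl=en&ned=us&q={SEARCH_TERMS}&ie=UTF-8&output=rss"
    )
    print(build_search_url(template, "open source search"))
    for result in parse_rss_items(SAMPLE_RSS):
        print(result["title"], result["link"])
```

Encoding the terms before substitution keeps spaces and reserved characters such as `&` from corrupting the rest of the query string.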

Advantages

This system has several advantages over traditional metasearch engines. Primarily, it allows the user greater flexibility in deciding which engines should be used to perform the query.[1] It also allows for easy addition of new engines to the user's personal collection (similar to the way a user adds a new news feed to a news aggregator).

Patents

Apple patent US6847959B1,[2] filed January 5, 2000, covers universal search aggregation. This resulted in the removal[3] of this feature from Samsung Android smartphones in July 2012.

References

  1. "Use of 'anonymous' search engine aggregator DuckDuckGo rockets following PRISM scandal". Belfast Telegraph. Retrieved 2015-08-12.
  2. "Patent US6847959 – Universal interface for retrieval of information in a computer system". Google Patents. Retrieved 2012-08-16.
  3. Florian Mueller (2012-02-15). "Last week's Apple-Samsung lawsuit involves eight patents, 17 products – bid for Nexus ban is based on only a subset". FOSS Patents. Retrieved 2012-08-16.