MetaCrawler

Type of site: Metasearch engine
Available in: English
Owner: System1
URL: metacrawler.com
Registration: No
Launched: July 7, 1995

MetaCrawler is a search engine. It is a registered trademark of InfoSpace and was created by Erik Selberg.


It was originally a metasearch engine, as its name suggests. Throughout its lifetime it combined web search results from sources including Google, Yahoo!, Bing (formerly Live Search), Ask.com, About.com, MIVA, LookSmart and other search engines. MetaCrawler also let users search for images, video, news, business and personal telephone directories, and, for a while, audio.
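The general idea of combining results from several sources can be illustrated with a minimal sketch. It is not MetaCrawler's actual implementation: the two back ends (`engine_a`, `engine_b`), the `metasearch` function and the reciprocal-rank scoring are all assumptions chosen for illustration, and the back ends return fixed lists rather than contacting any real search service.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical stand-in for a search back end: takes a query and
# returns result URLs in ranked order (best first).
SearchFn = Callable[[str], List[str]]


def metasearch(query: str, sources: Dict[str, SearchFn]) -> List[str]:
    """Forward one query to every source and merge the ranked lists.

    The scoring is an illustrative reciprocal-rank sum (an assumption,
    not MetaCrawler's documented method): a URL earns 1/rank from each
    source that returns it, so URLs reported by several sources rise.
    """
    scores: Dict[str, float] = defaultdict(float)
    for search in sources.values():
        seen = set()
        for rank, url in enumerate(search(query), start=1):
            if url in seen:  # count each URL once per source
                continue
            seen.add(url)
            scores[url] += 1.0 / rank
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda u: (-scores[u], u))


# Two made-up back ends with fixed results, for demonstration only.
def engine_a(query: str) -> List[str]:
    return ["http://a.example/1", "http://shared.example/", "http://a.example/2"]


def engine_b(query: str) -> List[str]:
    return ["http://shared.example/", "http://b.example/1"]


merged = metasearch("metasearch", {"A": engine_a, "B": engine_b})
```

In this sketch the URL returned by both back ends ends up first in `merged`, because it collects a score from each source, while duplicates are collapsed into a single entry.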

History

MetaCrawler was the first metasearch engine, originally developed in 1994 at the University of Washington (UW) by graduate student Erik Selberg and Professor Oren Etzioni as Selberg's Ph.D. qualifying project. [1] It was created to provide a reliable abstraction layer over web search engines so that semantic structure on the World Wide Web could be studied. However, it proved to be a useful service in its own right and posed a number of research challenges.

MetaCrawler originally ran on four Digital Equipment Corporation AlphaStations [2] and processed several hundred thousand queries per day, which began to place a significant bandwidth load on UW. It became clear that MetaCrawler needed some way of paying for the queries it forwarded to the primary search engines. Some time after the search engine launched, NetBot, Inc., cofounded by Etzioni, [3] was established to commercialize MetaCrawler [4] and three other UW programs: Ahoy! The HomePage Finder, Occam, and ShopBot. Ahoy! and Occam were never commercialized. NetBot then combined the core of MetaCrawler with ShopBot to create a meta-shopping website, Jango. [5]

MetaCrawler launched on July 7, 1995. [6]

MetaCrawler site in 1996

As of late 1995, MetaCrawler logged over 7,000 search queries per week, and accessed six services: Galaxy, InfoSeek, Lycos, Open Text, WebCrawler and Yahoo. [7] By late 1996, there were over 150,000 queries per day. [8]

MetaCrawler's owners were unable to settle on a workable business model, so in January 1997 they sold it to another Internet startup company, Go2Net, [9] in which Microsoft co-founder Paul Allen later acquired a 54 percent stake. [10] Go2Net went public in April that year, listing on Nasdaq. [11] MetaCrawler had about 30,000 daily visitors at the start of 1997; by mid-1998 that figure had jumped to 275,000. [12]

Old MetaCrawler logo, used c. 1997 to 2003

NetBot was purchased by Excite in October 1997 for $35 million, and Jango became part of the Excite Network Shopping Channel. [13] Both Selberg and Etzioni returned to UW until 1999, when they joined Go2Net for a year, leaving just before Go2Net's acquisition by InfoSpace, Inc. in July 2000 for $4.2 billion. [14] By that time, Go2Net had purchased another metasearch engine, Dogpile. [15]

In 2014, MetaCrawler was merged into Zoo.com, [16] another of InfoSpace's search engines, which had launched in 2006. [17] The MetaCrawler domain at first redirected to Zoo.com, [18] [19] but was later changed to redirect to msxml.excite.com, the search page for Excite, also operated by InfoSpace. [20] [21]

In July 2016, InfoSpace was sold by parent company Blucora to OpenMail for $45 million, putting MetaCrawler under the ownership of OpenMail. [22] OpenMail was later renamed System1. [23]

In 2017, MetaCrawler relaunched as its own search engine. [24]


References

  1. "Erik Selberg IoT Conference Bio". Retrieved 2019-09-25.
  2. "Federated Search". federatedsearchblog.com. Retrieved 2019-01-29.
  3. "Robots Mailing List Archive: The Metacrawler, Reborn". webdoc.gwdg.de. Retrieved 2019-01-29.
  4. "ASEE PRISM - Apr 1999 - Cover Story - Building Strategic Partnerships". www.prism-magazine.org. Archived from the original on 2015-09-09. Retrieved 2019-01-29.
  5. "Shopping Engine History - SingleFeed, Shopping Engines made easy". www.singlefeed.com. Retrieved 2019-01-29.
  6. "Multi-Service Search and Comparison Using the MetaCrawler". www.w3.org. Retrieved 2019-02-12.
  7. "Multi-Service Search and Comparison Using the MetaCrawler". homes.cs.washington.edu. Retrieved 2019-01-29.
  8. "The MetaCrawler Architecture for Resource Aggregation on the Web". homes.cs.washington.edu. Retrieved 2019-01-29.
  9. "Robots Mailing List Archive: The Metacrawler, Reborn". webdoc.gwdg.de. Retrieved 2019-01-29.
  10. "Paul Allen sets 54-percent stake in Go2Net - Mar. 15, 1999". money.cnn.com. Retrieved 2019-01-29.
  11. Court, Randolph (1998-04-23). "Go2net's Low-Overhead Plan". Wired. ISSN 1059-1028. Retrieved 2019-01-29.
  12. King, Suzanne. "O.P. Web site developer is Seattle-bound". American City Business Journals. Retrieved 2019-01-29.
  13. "Excite to buy NetBot". CNET. Retrieved 2019-01-29.
  14. Cole, August; Francisco, Bambi. "InfoSpace to acquire Go2Net". MarketWatch. Retrieved 2019-01-29.
  15. "Google Added To Go2Net's MetaCrawler and Dogpile Metasearch Services – News announcements – News from Google – Google". googlepress.blogspot.com. Retrieved 2019-01-29.
  16. "MetaCrawler". 2014-01-01. Archived from the original on 2014-01-01. Retrieved 2019-01-29.
  17. "New Kid-Friendly Search Engine Zoo.com Offers Wealth of Information Without the Worry". www.businesswire.com. 2006-11-14. Retrieved 2019-01-29.
  18. "What is Zoo?". www.computerhope.com. Retrieved 2019-01-29.
  19. "Module 1". mason.gmu.edu. Archived from the original on 2019-01-26. Retrieved 2019-01-29.
  20. "Søgemaskiner - Search-on.dk JR Corp". www.search-on.dk. Retrieved 2019-01-29.
  21. "MetaCrawler". Archived from the original on 2016-01-01. Retrieved 2019-01-29.
  22. "Blucora to sell InfoSpace business for $45 million". Seattle Times. July 5, 2016.
  23. "System1 raises $270 million for 'consumer intent' advertising". L.A. Biz. Retrieved 2017-12-01.
  24. "MetaCrawler". 2017-04-03. Archived from the original on 2017-04-03. Retrieved 2019-01-29.