Googlewhack

A Googlewhack was a contest to find a Google Search query that returned exactly one result. A Googlewhack had to consist of two words found in a dictionary, and it was considered legitimate only if both search terms appeared in the result. Published Googlewhacks were short-lived: once one was posted on a website, the query would return at least two hits, the original page and the page publishing it, unless only a screenshot was provided. [1] Googlewhacks generally no longer exist due to changes in Google search indexing.
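The rules above can be expressed as a small check. The following Python sketch is illustrative only: the function name, the word list, the hit count and the page text are all hypothetical inputs supplied by the caller, and no real search service is queried.

```python
def is_googlewhack(query, dictionary, hit_count, result_text):
    """Return True if a candidate satisfies the rules described above.

    All arguments are supplied by the caller (assumptions for illustration):
    `dictionary` is a set of accepted lowercase words, `hit_count` is the
    number of results the query returned, and `result_text` is the text of
    the single result page.
    """
    words = query.lower().split()
    # Rule 1: the query consists of exactly two words.
    if len(words) != 2:
        return False
    # Rule 2: both words are found in the dictionary.
    if any(word not in dictionary for word in words):
        return False
    # Rule 3: the search returns a single result.
    if hit_count != 1:
        return False
    # Rule 4: both search terms appear in the result.
    page_words = set(result_text.lower().split())
    return all(word in page_words for word in words)
```

The page text is split on whitespace only, so a real checker would also need to strip punctuation before comparing words.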

History

The term googlewhack, coined by Gary Stock, first appeared on the web at UnBlinking on 8 January 2002. [2] Subsequently, Stock created The Whack Stack, at googlewhack.com, to allow the verification and collection of user-submitted Googlewhacks.

Googlewhacks were the basis of British comedian Dave Gorman's comedy tour Dave Gorman's Googlewhack Adventure and the book of the same name. [3] In these, Gorman tells the true story of how, while attempting to write a novel for his publisher, he became obsessed with Googlewhacks and travelled across the world to find people who had authored them. Although he never completed his original novel, Dave Gorman's Googlewhack Adventure went on to be a Sunday Times No. 1 best seller in the UK.

Participants at Googlewhack.com discovered the sporadic "cleaner girl" bug in Google's search algorithm, in which "results 1–1 of thousands" were returned for two relatively common words, [4] such as Anxiousness Scheduler [5] or Italianate Tablesides. [6]

Googlewhack went offline in November 2009 after Google stopped providing definition links. [definition needed] Gary Stock stated on the game's web page soon afterward that he was pursuing solutions for Googlewhack to remain viable. [citation needed]

Score

Some people have proposed a Googlewhack "score", defined as the product of the hit counts of the two individual words. [7] A Googlewhack therefore scores highest when each of its words, searched on its own, produces a large number of hits.
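As a worked illustration of this scoring rule, the short Python sketch below multiplies the two individual hit counts. The function name and the hit counts are invented for the example and do not come from any real search.

```python
def googlewhack_score(hits_first_word, hits_second_word):
    """Score a Googlewhack as the product of each word's own hit count."""
    return hits_first_word * hits_second_word


# Hypothetical hit counts for two individually common words.
score = googlewhack_score(1_200_000, 350_000)
print(score)  # 420000000000
```

Under this rule, a pair of common words that still yields a single combined result outscores a pair of rare words.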

Examples

Variations

New Scientist has discussed the idea of a Googlewhackblatt, which is similar to a Googlewhack except that it involves finding a single word that produces only one Google result. Lists of these have become available, but as with Googlewhacks, listing a word destroys its Googlewhackblatt status, unless the list is blocked by robots.txt or the word produced no Google results before it was added to the list; the latter case forms the Googlewhackblatt Paradox. Words that produce no Google search results at all are known as Antegooglewhackblatts before they are listed, and they are elevated to Googlewhackblatt status once listed, provided the list is not blocked by robots.txt.
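The role of robots.txt here can be illustrated with Python's standard `urllib.robotparser` module. This is a minimal sketch: the rules, the `/whackblatts/` path, the `example.com` URLs and the helper name `is_crawlable` are assumptions made up for the example, and a real crawler's behaviour may differ in detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to stay out of the list pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /whackblatts/
"""


def is_crawlable(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A list published under the blocked path is off limits to compliant crawlers,
# so the listed words would not gain a second indexed page.
print(is_crawlable(ROBOTS_TXT, "Googlebot", "https://example.com/whackblatts/list.html"))
print(is_crawlable(ROBOTS_TXT, "Googlebot", "https://example.com/index.html"))
```

The first call reports that the list page is disallowed, and the second that the rest of the site remains crawlable.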

Feedback stories are also available on the New Scientist website, thus resulting in the destruction of any existing Googlewhackblatts that are ever printed in the magazine. Antegooglewhackblatts that are posted on the Feedback website become known as Feedbackgooglewhackblatts as their Googlewhackblatt status is created. In addition, New Scientist has more recently discovered another way to obtain a Googlewhackblatt without falling into the Googlewhackblatt Paradox. One can write the Googlewhackblatt on a website, but backward, and then search on elgooG to view the list properly while still keeping the Googlewhackblatt's status as a Googlewhackblatt.
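The backward-writing trick amounts to reversing the characters of the word before publishing it. A minimal Python sketch follows; the helper name `mirror` is hypothetical.

```python
def mirror(text):
    """Return text with its characters in reverse order."""
    return text[::-1]


# Reversing twice restores the original, so a mirrored view of the
# published (reversed) word shows it the right way round.
print(mirror("Google"))  # elgooG
```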

In contrast to Googlewhacks, many Googlewhackblatts and Antegooglewhackblatts are nonsense words or uncommon misspellings that are not in dictionaries and probably never will be.

Practical use of specially constructed Googlewhackblatts was proposed by Leslie Lamport (although he did not use the term). [9]

Research applications

The probabilities of internet search result values for multi-word queries were studied in 2008 with the help of Googlewhacks. [10] [11] [12] Based on data from 351 Googlewhacks from the "WhackStack", a list of previously documented Googlewhacks, [13] the Heaps' law coefficient for the indexed World Wide Web (about 8 billion pages in 2008) was measured to be . This result is in line with previous studies, which used under 20,000 pages. [14] The Googlewhacks were key to calibrating the model so that it could be extended automatically to analyse the relatedness of word pairs.
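For context, Heaps' law models the number of distinct words V in a collection of n words as V(n) = K·n^β. The Python sketch below shows this relation and a simple two-point estimate of the exponent β. The function names and all numbers are illustrative assumptions; this is not the calibration method or the result of the cited study.

```python
import math


def heaps_vocabulary(n_tokens, k, beta):
    """Predicted number of distinct words under Heaps' law: V(n) = K * n**beta."""
    return k * n_tokens ** beta


def estimate_beta(n1, v1, n2, v2):
    """Estimate beta from two (collection size, vocabulary size) measurements.

    Taking logarithms of V = K * n**beta for both points and subtracting
    cancels K, leaving beta = log(v2 / v1) / log(n2 / n1).
    """
    return math.log(v2 / v1) / math.log(n2 / n1)


# Illustrative values only: K = 10 and beta = 0.5 are made up for the example.
v_small = heaps_vocabulary(10_000, 10, 0.5)
v_large = heaps_vocabulary(1_000_000, 10, 0.5)
beta = estimate_beta(10_000, v_small, 1_000_000, v_large)
```

With these made-up inputs, the estimate recovers the exponent used to generate them, up to floating-point rounding.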

See also

References

  1. "Googlewhack official rules". Googlewhack.com. Archived from the original on 18 August 2017. Retrieved 28 March 2014.
  2. "Googlewhacking: The Search for The One True Googlewhack". Unblinking.com. Retrieved 28 March 2014.
  3. Gorman, Dave (2005). Dave Gorman's Googlewhack! adventure. London: Ebury. ISBN 0091897424.
  4. "Googlewhack NACK!". Googlewhack.com. Archived from the original on 6 August 2017. Retrieved 28 March 2014.
  5. Essex, Mike (13 February 2012). "Anxiousness Scheduler". Blog.blagman.co.uk. Archived from the original on 29 March 2014. Retrieved 28 March 2014.
  6. "italianate tablesides". Googlewhack.com. Archived from the original on 21 January 2013. Retrieved 28 March 2014.
  7. Googlewhack scoring is discussed in numerous places, e.g. two pages archived 3 December 2013 at the Wayback Machine.
  8. "What is Googlewhacking? - Definition from WhatIs.com". WhatIs.com. Retrieved 17 December 2021.
  9. Archival References to Web Pages, Ninth International World Wide Web Conference: Poster Proceedings (May 2000)
  10. Lansey JC, Bukiet B (January 2009). "Internet Search Result Probabilities, Heaps' Law and Word Associativity". Journal of Quantitative Linguistics. 16 (1): 40–66. doi:10.1080/09296170802514153. S2CID 1808897.
  11. Googlewhacks for Fun and Profit on YouTube Google Tech Talk 2008
  12. "Poster Presentation" (PDF). Retrieved 28 March 2014.
  13. "The Whack Stack". Googlewhack. 13 February 2010. Archived from the original on 21 January 2013. Retrieved 17 December 2021.
  14. Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval, ACM Press, 1999.

Further reading