Type of site | Search engine
---|---
Available in | English
Owner | IBM (International Business Machines)
URL | www
Launched | November 1, 2010
Current status | Defunct (March 27, 2015)
Blekko, trademarked as blekko (lowercase), [1] was a company that provided a web search engine with the stated goal of providing better search results than those offered by Google Search, with results gathered from a set of 3 billion trusted webpages and excluding such sites as content farms. The company's site, launched to the public on November 1, 2010, used slashtags to provide results for common searches. Blekko also offered a downloadable search bar. It was acquired by IBM in March 2015, and the service was discontinued.
The company was co-founded in 2007 by Rich Skrenta, who had created Newhoo, which was acquired by Netscape and renamed the Open Directory Project. [2] Blekko raised $24 million in venture capital from individuals such as Netscape founder Marc Andreessen and Ron Conway, as well as from U.S. Venture Partners and CMEA Capital. [3] The company's goal was to provide useful search results without the extraneous links often returned by Google. Users searching frequently queried categories such as cars, finance, health and hotels received results prescreened by Blekko editors, who used what The New York Times described as "Wikipedia-style policing" to weed out pages created by content farms and focus on results from professionals. [4] A slashtag restricted the set of search results to those matching the specified characteristic, and one was added automatically for search categories with prescreened results. [5] Slashtags were also used to reach videos and images, since Blekko did not offer separate search databases for those content types. [6]
Queries related to personal health were limited to a prescreened list of sites that Blekko editors had determined to be trustworthy, excluding many sites that rank highly in Google searches. [2] As of Blekko's launch date, its 8,000 beta editors had developed 3,000 slashtags corresponding to the site's most frequent searches. [5] The company hoped to use editors to prepare lists of the 50 sites that best matched each of its 100,000 most frequent search targets. [2] Additional tools let users see the IP address a website was running on, and let registered users label a site as spam. [7]
Blekko also differentiated itself by offering richer data than its competitors. For instance, appending /seo to a domain name directed the user to a page of statistics for that URL. [8] For this reason, commentators cited Blekko as a good fit for the Big Data paradigm: it gathered multiple data sets and presented them visually, giving the user quick, meaningful, and actionable information. [9]
At the time, Blekko announced plans to earn revenue by selling ads based on slashtags and search results. The company also planned to provide data on its algorithm for ranking search results, including details for inbound links to specific sites. [4]
A permanent post in Blekko's help section contained the following "Web search bill of rights": [10]
One writer referred to the bill of rights as "what we assume is a poke at Google". [11] [12]
In 2011, Blekko announced that it was blocking "content farmy sites" to reduce spam, in line with its bill of rights. [13]
In May 2012 Mozilla announced an "instant search" browser plugin for Firefox designed to cache repetitive search requests, in partnership with Blekko. [14]
In August 2012 Blekko put all its SEO statistics behind a paywall, [15] despite previously declaring that "ranking data shall not be kept secret" [12] in its bill of rights. [15]
IBM bought Blekko and closed the search service on 27 March 2015. Searches were redirected to a page announcing "The blekko technology and team have joined IBM Watson!" and linking to a blog post confirming that the blekko service was closed, with blekko's web crawling, categorization, and intelligent filtering technology to be integrated into IBM Watson. [16]
Blekko's signature feature was the slashtag, [1] a text tag preceded by a "/" slash character, used to narrow and categorize searches. Built-in and predefined slashtags let users start searching right away, and registered users could create their own slashtags to perform custom-sorted searches and reduce spam. [4]
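The slashtag mechanism described above can be illustrated with a minimal sketch. This is not Blekko's actual implementation; the curated domain lists and function names here are hypothetical, chosen only to show the idea of splitting a query like "headache /health" into plain terms and slashtags, then restricting results to a slashtag's prescreened domains.

```python
# Illustrative sketch of slashtag-style filtering (hypothetical; not
# Blekko's real code or data). Tokens beginning with "/" select a
# curated domain list; results outside that list are dropped.
from urllib.parse import urlparse

# Hypothetical curated lists keyed by slashtag name.
SLASHTAGS = {
    "health": {"mayoclinic.com", "nih.gov", "webmd.com"},
}

def parse_query(query):
    """Split a raw query into plain search terms and slashtag names."""
    terms, tags = [], []
    for token in query.split():
        if token.startswith("/"):
            tags.append(token[1:])
        else:
            terms.append(token)
    return " ".join(terms), tags

def filter_results(urls, tags):
    """Keep only URLs whose domain appears on every requested slashtag list."""
    allowed = [SLASHTAGS[t] for t in tags if t in SLASHTAGS]
    def ok(url):
        host = urlparse(url).netloc.lower().removeprefix("www.")
        return all(host in domains for domains in allowed)
    return [u for u in urls if ok(u)]

terms, tags = parse_query("headache /health")
results = filter_results(
    ["https://www.nih.gov/a", "https://contentfarm.example/b"], tags)
```

Here a query for "headache /health" would retain the nih.gov result and drop the unlisted domain, mirroring how prescreened slashtags excluded content farms from category searches.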
The following features were available to all users:
Blekko offered a downloadable browser toolbar, or search bar, which changed the default search and home page URLs of the user's web browser. [17]
In 2010, John Dvorak described the site as adding "so much weird dimensionality" to search, and recommended it as "the best out-of-the-chute new engine I've seen in the last 10 years". [7] Matthew Rogers, in his review of the site, found it "slow and cumbersome" and said he did not understand the necessity or utility of slashtags. [18] In his PCMag.com review, Jeffrey L. Wilson expressed approval of some search results but criticized the site's social features, which "bog down the search experience." [19]