Doorway page

Doorway pages (bridge pages, portal pages, jump pages, gateway pages or entry pages) are web pages that are created for the deliberate manipulation of search engine indexes (spamdexing). A doorway page will affect the index of a search engine by inserting results for particular phrases while sending visitors to a different page. Doorway pages that redirect visitors without their knowledge use some form of cloaking. This usually falls under Black Hat SEO.

If a visitor clicks through to a typical doorway page from a search engine results page, in most cases they will be redirected with a fast Meta refresh command to another page. Other forms of redirection include use of JavaScript and server side redirection, from the server configuration file. Some doorway pages may be dynamic pages generated by scripting languages such as Perl and PHP.
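
As a rough illustration of the meta refresh mechanism described above, the following Python sketch scans a page's HTML for `<meta http-equiv="refresh">` tags and reports those that redirect almost immediately. The class name `MetaRefreshFinder`, the function `find_fast_refresh`, and the one-second cutoff are assumptions made for this example, not part of any search engine's actual detection logic.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects (delay, url) pairs from <meta http-equiv="refresh"> tags."""

    def __init__(self):
        super().__init__()
        self.refreshes = []  # list of (delay_seconds, target_url or None)

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute typically looks like "0; url=https://example.com/".
        delay_part, _, rest = attr.get("content", "").partition(";")
        try:
            delay = float(delay_part.strip())
        except ValueError:
            return  # malformed delay: ignore this tag
        rest = rest.strip()
        url = rest[4:].strip(" '\"") if rest.lower().startswith("url=") else None
        self.refreshes.append((delay, url or None))


def find_fast_refresh(html, max_delay=1.0):
    """Return redirect targets whose delay is at most max_delay seconds (assumed cutoff)."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return [url for delay, url in finder.refreshes if delay <= max_delay and url]


page = '<head><meta http-equiv="refresh" content="0; url=https://example.com/real"></head>'
print(find_fast_refresh(page))
```

A refresh with a long delay, or one without a target URL (a plain page reload), is deliberately not reported, since neither sends the visitor straight to another page.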

Identification

Doorway pages are often easy to identify in that they have been designed primarily for search engines, not for human beings. Sometimes a doorway page is copied from another high ranking page, but this is likely to cause the search engine to detect the page as a duplicate and exclude it from the search engine listings.

Because many search engines penalize use of the meta refresh command, [1] some doorway pages instead trick the visitor into clicking a link that leads to the desired destination page, or they use JavaScript for redirection.

More sophisticated doorway pages, called Content Rich Doorways, are designed to gain high placement in search results without using redirection. They incorporate at least a minimum amount of design and navigation similar to the rest of the site to provide a more human-friendly and natural appearance. Visitors are offered standard links as calls to action.

Landing pages are often mistakenly equated with doorway pages in the literature. Landing pages are content-rich pages to which traffic is directed within the context of pay-per-click campaigns and to maximize SEO campaigns.

Doorway pages are also typically used for sites that maintain a blacklist of URLs known to harbor spam, such as Facebook, Tumblr and DeviantArt.

Cloaking

Doorway pages often also employ cloaking techniques for misdirection. A cloaked page shows human visitors a version of the page that differs from the one provided to crawlers, usually implemented via server-side scripts. The server can differentiate between bots, crawlers and human visitors based on various flags, including source IP address or user-agent. Cloaking simultaneously tricks search engines into ranking sites higher for irrelevant keywords while monetizing any human traffic by showing visitors spammy, often irrelevant, content. The practice of cloaking is considered highly manipulative and is condemned within the SEO industry and by search engines, and its use can result in a significant penalty or the complete removal of a site from the index. [2]
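
One way to reason about cloaking from the outside is to compare what a page returns to a crawler with what it returns to an ordinary browser. The Python sketch below assumes the two versions have already been fetched and reduced to visible text, and it measures their word overlap with a Jaccard similarity. The function names and the 0.3 threshold are assumptions chosen for illustration; this is not a description of how any search engine actually detects cloaking.

```python
import re


def word_set(text):
    """Lowercased set of alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(text_a, text_b):
    """Shared words divided by all words; 1.0 means identical vocabularies."""
    a, b = word_set(text_a), word_set(text_b)
    if not a and not b:
        return 1.0  # two empty pages are treated as identical
    return len(a & b) / len(a | b)


def looks_cloaked(crawler_text, browser_text, threshold=0.3):
    """Flag the page when the two versions share few words (threshold is an assumption)."""
    return jaccard_similarity(crawler_text, browser_text) < threshold


crawler_version = "cheap flights to paris book cheap flights"
browser_version = "win a free phone click here now"
print(looks_cloaked(crawler_version, browser_version))
```

Legitimate pages also vary between requests (rotating ads, personalization), so a low score is only a signal for closer inspection, not proof of cloaking.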

Redirection

Webmasters who use doorway pages would generally prefer that users never actually see these pages and instead be delivered to a "real" page within their sites. To achieve this goal, redirection is sometimes used. This may be as simple as installing a meta refresh tag on the doorway pages. An advanced system might make use of cloaking. In either case, such redirection may make the doorway pages unacceptable to search engines.

Construction

A content-rich doorway page must be constructed in a search-engine-friendly manner, or it may be construed as search engine spam, possibly resulting in the page being banned from the index for an undisclosed amount of time.

These types of doorways utilize (but are not limited to) the following:

In culture

Doorway pages were examined as a cultural and political phenomenon along with spam poetry and flarf. [3]

Related Research Articles

Meta elements are tags used in HTML and XHTML documents to provide structured metadata about a Web page. They are part of a web page's head section. Multiple Meta elements with different attributes can be used on the same page. Meta elements can be used to specify page description, keywords and any other metadata not provided through the other head elements and attributes.

Spamdexing is the deliberate manipulation of search engine indexes. It involves a number of methods, such as link building and repeating unrelated phrases, to manipulate the relevance or prominence of resources indexed in a manner inconsistent with the purpose of the indexing system.

The Robots Exclusion Protocol, commonly known by its filename robots.txt, is a standard used by websites to indicate to visiting web crawlers and other web robots which portions of the website they are allowed to visit.
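
As a small illustration of how a crawler might consult these rules, the sketch below uses Python's standard `urllib.robotparser` module. The rules, the bot name `ExampleBot`, and the `example.com` URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Check a URL against robots.txt rules supplied as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: every crawler is asked to stay out of /private/.
example_rules = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(example_rules, "ExampleBot", "https://example.com/private/page.html"))
print(is_allowed(example_rules, "ExampleBot", "https://example.com/index.html"))
```

The protocol is advisory: the check only tells a well-behaved crawler what the site has requested, and nothing prevents a robot from ignoring it.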

Search engine optimization (SEO) is the process of improving the quality and quantity of website traffic to a website or a web page from search engines. SEO targets unpaid traffic rather than direct traffic or paid traffic. Unpaid traffic may originate from different kinds of searches, including image search, video search, academic search, news search, and industry-specific vertical search engines.

Link farm: Group of websites that link to each other

On the World Wide Web, a link farm is any group of websites that all hyperlink to other sites in the group for the purpose of increasing SEO rankings. In graph theoretic terms, a link farm is a clique. Although some link farms can be created by hand, most are created through automated programs and services. A link farm is a form of spamming the index of a web search engine. Other link exchange systems are designed to allow individual websites to selectively exchange links with other relevant websites and are not considered a form of spamdexing.
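
The clique description can be made concrete with a short Python sketch. It assumes the link graph is given as a dictionary that maps each site to the set of sites it links to; the function name and the sample sites are hypothetical.

```python
from itertools import permutations


def is_clique(links, sites):
    """True if every site in `sites` links to every other site in `sites`."""
    return all(dst in links.get(src, set()) for src, dst in permutations(sites, 2))


# Hypothetical link graph: three sites that all link to one another.
links = {
    "a.example": {"b.example", "c.example"},
    "b.example": {"a.example", "c.example"},
    "c.example": {"a.example", "b.example"},
}

print(is_clique(links, ["a.example", "b.example", "c.example"]))
```

The check treats links as directed, so a group only counts as a clique here when the links run in both directions between every pair of sites.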

Googlebot: Web crawler used by Google

Googlebot is the web crawler software used by Google that collects documents from the web to build a searchable index for the Google Search engine. This name is actually used to refer to two different types of web crawlers: a desktop crawler and a mobile crawler.

Cloaking is a search engine optimization (SEO) technique in which the content presented to the search engine spider is different from that presented to the user's browser. This is done by delivering content based on the IP addresses or the User-Agent HTTP header of the user requesting the page. When a user is identified as a search engine spider, a server-side script delivers a different version of the web page, one that contains content not present on the visible page, or that is present but not searchable. The purpose of cloaking is sometimes to deceive search engines so they display the page when it would not otherwise be displayed. However, it can also be a functional technique for informing search engines of content they would not otherwise be able to locate because it is embedded in non-textual containers, such as video or certain Adobe Flash components. Since 2006, better methods of accessibility, including progressive enhancement, have been available, so cloaking is no longer necessary for regular SEO.

Metasearch engine: Online information retrieval tool

A metasearch engine is an online information retrieval tool that uses the data of a web search engine to produce its own results. Metasearch engines take input from a user and immediately query search engines for results. Sufficient data is gathered, ranked, and presented to the users.

URL redirection, also called URL forwarding, is a World Wide Web technique for making a web page available under more than one URL address. When a web browser attempts to open a URL that has been redirected, a page with a different URL is opened. Similarly, domain redirection or domain forwarding is when all pages in a URL domain are redirected to a different domain, as when wikipedia.com and wikipedia.net are automatically redirected to wikipedia.org.

Google AdSense is a program run by Google through which website publishers in the Google Network of content sites serve text, images, video, or interactive media advertisements that are targeted to the site content and audience. These advertisements are administered, sorted, and maintained by Google. They can generate revenue on either a per-click or per-impression basis. Google beta-tested a cost-per-action service, but discontinued it in October 2008 in favor of a DoubleClick offering. In Q1 2014, Google earned US$3.4 billion, or 22% of total revenue, through Google AdSense. AdSense is a participant in the AdChoices program, so AdSense ads typically include the triangle-shaped AdChoices icon. This program also operates on HTTP cookies. In 2021, over 38.3 million websites used AdSense.

Keyword stuffing is a search engine optimization (SEO) technique, considered webspam or spamdexing, in which keywords are loaded into a web page's meta tags, visible content, or backlink anchor text in an attempt to gain an unfair rank advantage in search engines. Keyword stuffing may lead to a website being temporarily or permanently banned or penalized on major search engines. The repetition of words in meta tags may explain why many search engines no longer use these tags. Search engines now focus more on content that is unique, comprehensive, relevant, and helpful, which makes keyword stuffing ineffective, although it is still practiced by many webmasters.

Search engine marketing (SEM) is a form of Internet marketing that involves the promotion of websites by increasing their visibility in search engine results pages (SERPs) primarily through paid advertising. SEM may incorporate search engine optimization (SEO), which adjusts or rewrites website content and site architecture to achieve a higher ranking in search engine results pages to enhance pay per click (PPC) listings and increase the Call to action (CTA) on the website.

In blogging, a ping is an XML-RPC-based push mechanism by which a weblog notifies a server that its content has been updated. An XML-RPC signal is sent from the weblog to one or more ping servers (as specified by the originating weblog) to notify a list of their "Services" of new content on the weblog.

A scraper site is a website that copies content from other websites using web scraping. The content is then mirrored with the goal of creating revenue, usually through advertising and sometimes by selling user data. Scraper sites come in various forms. Some provide little, if any material or information, and are intended to obtain user information such as e-mail addresses, to be targeted for spam e-mail. Price aggregation and shopping sites access multiple listings of a product and allow a user to rapidly compare the prices.

nofollow is a setting on a web page hyperlink that directs search engines not to use the link for page ranking calculations. It is specified in the page as a type of link relation; that is: <a rel="nofollow" ...>. Because search engines often calculate a site's importance according to the number of hyperlinks from other sites, the nofollow setting allows website authors to indicate that the presence of a link is not an endorsement of the target site's importance.
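
To illustrate how a crawler might honor this link relation, the Python sketch below separates a page's hyperlinks into those marked `nofollow` and those that are not. The class and function names are assumptions for the example. The `rel` attribute is treated as a space-separated list of tokens, compared case-insensitively.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Sorts <a href="..."> targets by whether rel contains "nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return  # anchors without a target are ignored
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed, nofollow) lists of link targets found in the HTML."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.followed, collector.nofollow


sample = '<a href="/a">A</a> <a rel="nofollow" href="/b">B</a>'
print(split_links(sample))
```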

Search engine: Software system that is designed to search for information on the World Wide Web

A search engine is a software system that finds web pages that match a web search. They search the World Wide Web in a systematic way for particular information specified in a textual web search query. The search results are generally presented in a line of results, often referred to as search engine results pages (SERPs). The information may be a mix of links to web pages, images, videos, infographics, articles, research papers, and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories and social bookmarking sites, which are maintained by human editors, search engines also maintain real-time information by running an algorithm on a web crawler. Any internet-based content that cannot be indexed and searched by a web search engine falls under the category of deep web.

Geotargeting: Website content based on a visitor's location

In geomarketing and internet marketing, geotargeting is the method of delivering different content to visitors based on their geolocation. This includes country, region/state, city, metro code/zip code, organization, IP address, ISP, or other criteria. A common usage of geotargeting is found in online advertising, as well as internet television with sites such as iPlayer and Hulu. In these circumstances, content is often restricted to users geolocated in specific countries; this approach serves as a means of implementing digital rights management. Use of proxy servers and virtual private networks may give a false location.

In the field of search engine optimization (SEO), link building describes actions aimed at increasing the number and quality of inbound links to a webpage with the goal of increasing the search engine rankings of that page or website. Briefly, link building is the process of establishing relevant hyperlinks to a website from external sites. Link building can increase the number of high-quality links pointing to a website, in turn increasing the likelihood of the website ranking highly in search engine results. Link building is also a proven marketing tactic for increasing brand awareness.

White fonting is the practice of inserting hidden keywords into the body of an electronic document, in order to influence the actions of a search program reviewing that document. The name white fonting comes from the practice of adding keywords to a webpage, using a white font on a white background, in an effort to hide the additional keywords from sight.

Google Penguin was a codename for a Google algorithm update that was first announced on April 24, 2012. The update was aimed at decreasing search engine rankings of websites that violate Google's Webmaster Guidelines by using what are now declared Grey Hat SEM techniques, which artificially increase the ranking of a webpage by manipulating the number of links pointing to the page. Such tactics are commonly described as link schemes. According to Google's John Mueller, as of 2013, Google announced all updates to the Penguin filter to the public.

References

  1. "Google Groups". www.google.com. Retrieved 2018-04-24.
  2. "Webmaster Guidelines - Search Console Help". support.google.com. Retrieved 2018-04-24.
  3. Kurtov, Michael. "Doorway into Weak Beauty. On the Cultural and Political Value of Spamdexing".