Keyword density

Keyword density is the percentage of times a keyword or phrase appears on a web page compared to the total number of words on the page. In the context of search engine optimization, keyword density can be used to determine whether a web page is relevant to a specified keyword or keyword phrase.

In the late 1990s, the early days of search engines, keyword density was an important factor in page ranking. However, as webmasters discovered how to implement optimum keyword density, search engines began giving priority to other factors beyond the direct control of webmasters. Today, the overuse of keywords, a practice called keyword stuffing, will cause a web page to be penalized.

The formula to calculate keyword density on a web page for SEO purposes is ( Nkr / Tkn ) * 100, where Nkr is how many times a specific keyword is repeated and Tkn is the total number of words in the analyzed text. The result is a keyword density value. When calculating keyword density, ignore HTML tags and other embedded tags that will not appear in the text of the page once published.
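
For illustration, the calculation can be sketched in a few lines of Python; the keyword_density helper and the simple tag-stripping regular expression below are assumptions made for this example, not part of any standard SEO tool.

```python
import re

def keyword_density(html: str, keyword: str) -> float:
    """Keyword density as a percentage: ( Nkr / Tkn ) * 100."""
    # Ignore HTML and other embedded tags, which will not appear in the published text.
    text = re.sub(r"<[^>]+>", " ", html).lower()
    words = re.findall(r"[a-z0-9']+", text)
    tkn = len(words)                    # Tkn: total words in the analyzed text
    nkr = words.count(keyword.lower())  # Nkr: how many times the keyword is repeated
    return (nkr / tkn) * 100 if tkn else 0.0

# A ten-word snippet in which "optimization" appears once -> 10.0 percent
print(keyword_density(
    "<p>Search engine optimization is about more than counting words today.</p>",
    "optimization",
))
```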

When calculating the density of a keyword phrase, the formula is ( Nkr * Nwp / Tkn ) * 100, [1] where Nwp is the number of words in the phrase. So, for example, for a four-hundred-word page about search engine optimization where "search engine optimization" is used four times, the keyword phrase density is (4*3/400)*100, or 3 percent.

From a mathematical viewpoint, the original concept of keyword density refers to the frequency (Nkr) with which a keyword appears in a dissertation. A "keyword" consisting of multiple terms, e.g. "blue suede shoes", is an entity in itself. The frequency of the phrase "blue suede shoes" within a dissertation drives the key(phrase) density. It is more mathematically correct for a "keyphrase" to be calculated just like the original calculation, but counting the word group "blue suede shoes" as a single appearance, not three:

Density = ( Nkr / Tkn ) * 100.

'Keywords' (kr) that consist of several words artificially inflate the total word count of the dissertation. The purest mathematical representation should adjust the total word count (Tkn) lower by removing the excess key(phrase) word counts from the total:

Density = ( Nkr / ( Tkn - ( Nkr * ( Nwp - 1 ) ) ) ) * 100, where Nwp is the number of terms in the keyphrase.
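
For comparison, here is a minimal Python sketch (illustrative variable names only) that applies all three conventions to the earlier example of a three-word phrase used four times on a four-hundred-word page:

```python
# Nkr = 4 repetitions of the keyphrase, Nwp = 3 words per phrase, Tkn = 400 total words
NKR, NWP, TKN = 4, 3, 400

word_based        = (NKR * NWP / TKN) * 100                # counts every word of the phrase
single_appearance = (NKR / TKN) * 100                      # phrase counted as one appearance
adjusted          = (NKR / (TKN - NKR * (NWP - 1))) * 100  # excess phrase words removed from total

print(word_based)          # 3.0
print(single_appearance)   # 1.0
print(round(adjusted, 2))  # 1.02
```

Removing the excess phrase words from the denominator makes the adjusted value (about 1.02 percent) slightly higher than the single-appearance value of 1 percent.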

Related Research Articles

Meta elements are tags used in HTML and XHTML documents to provide structured metadata about a Web page. They are part of a web page's head section. Multiple Meta elements with different attributes can be used on the same page. Meta elements can be used to specify page description, keywords and any other metadata not provided through the other head elements and attributes.

Web indexing, or Internet indexing, comprises methods for indexing the contents of a website or of the Internet as a whole. Individual websites or intranets may use a back-of-the-book index, while search engines usually use keywords and metadata to provide a more useful vocabulary for Internet or onsite searching. With the increase in the number of periodicals that have articles online, web indexing is also becoming important for periodical websites.

Spamdexing is the deliberate manipulation of search engine indexes. It involves a number of methods, such as link building and repeating unrelated phrases, to manipulate the relevance or prominence of resources indexed in a manner inconsistent with the purpose of the indexing system.

Search engine optimization (SEO) is the process of improving the quality and quantity of website traffic to a website or a web page from search engines. SEO targets unpaid traffic rather than direct traffic or paid traffic. Unpaid traffic may originate from different kinds of searches, including image search, video search, academic search, news search, and industry-specific vertical search engines.

Automatic summarization is the process of shortening a set of data computationally, to create a subset that represents the most important or relevant information within the original content. Artificial intelligence algorithms are commonly developed and employed to achieve this, specialized for different types of data.

The anchor text, link label or link text is the visible, clickable text in an HTML hyperlink. The term "anchor" was used in older versions of the HTML specification for what is currently referred to as the a element, or <a>. The HTML specification does not have a specific term for anchor text, but refers to it as "text that the a element wraps around". In XML terms, the anchor text is the content of the element, provided that the content is text.

In text retrieval, full-text search refers to techniques for searching a single computer-stored document or a collection in a full-text database. Full-text search is distinguished from searches based on metadata or on parts of the original texts represented in databases.

Keyword stuffing is a search engine optimization (SEO) technique, considered webspam or spamdexing, in which keywords are loaded into a web page's meta tags, visible content, or backlink anchor text in an attempt to gain an unfair rank advantage in search engines. Keyword stuffing may lead to a website being temporarily or permanently banned or penalized on major search engines. The repetition of words in meta tags may explain why many search engines no longer use these tags. Nowadays, search engines focus more on content that is unique, comprehensive, relevant, and helpful, which makes keyword stuffing ineffective, but it is still practiced by many webmasters.

Search engine marketing (SEM) is a form of Internet marketing that involves the promotion of websites by increasing their visibility in search engine results pages (SERPs) primarily through paid advertising. SEM may incorporate search engine optimization (SEO), which adjusts or rewrites website content and site architecture to achieve a higher ranking in search engine results pages to enhance pay per click (PPC) listings and increase the Call to action (CTA) on the website.

A video search engine is a web-based search engine which crawls the web for video content. Some video search engines parse externally hosted content while others allow content to be uploaded and hosted on their own servers. Some engines also allow users to search by video format type and by length of the clip. The video search results are usually accompanied by a thumbnail view of the video.

An SEO contest is a prize activity that challenges search engine optimization (SEO) practitioners to achieve high ranking under major search engines such as Google, Yahoo, and MSN using certain keyword(s). This type of contest is controversial because it often leads to massive amounts of link spamming as participants try to boost the rankings of their pages by any means available. Competitors may hold the contest without any product or service to promote, or they may organize it in order to market something on the Internet. Participants can showcase their skills and potentially discover and share new techniques for promoting websites.

nofollow is a setting on a web page hyperlink that directs search engines not to use the link for page ranking calculations. It is specified in the page as a type of link relation; that is: <a rel="nofollow" ...>. Because search engines often calculate a site's importance according to the number of hyperlinks from other sites, the nofollow setting allows website authors to indicate that the presence of a link is not an endorsement of the target site's importance.

In information retrieval, an index term is a term that captures the essence of the topic of a document. Index terms make up a controlled vocabulary for use in bibliographic records. They are an integral part of bibliographic control, which is the function by which libraries collect, organize and disseminate documents. They are used as keywords to retrieve documents in an information system, for instance, a catalog or a search engine. A popular form of keywords on the web are tags, which are directly visible and can be assigned by non-experts. Index terms can consist of a word, phrase, or alphanumerical term. They are created by analyzing the document either manually with subject indexing or automatically with automatic indexing or more sophisticated methods of keyword extraction. Index terms can either come from a controlled vocabulary or be freely assigned.

Search engine indexing is the collecting, parsing, and storing of data to facilitate fast and accurate information retrieval. Index design incorporates interdisciplinary concepts from linguistics, cognitive psychology, mathematics, informatics, and computer science. An alternate name for the process, in the context of search engines designed to find web pages on the Internet, is web indexing.

Keyword research is a practice search engine optimization (SEO) professionals use to find and analyze search terms that users enter into search engines when looking for products, services, or general information. Keywords are related to search queries.

Vector space model or term vector model is an algebraic model for representing text documents as vectors such that the distance between vectors represents the relevance between the documents. It is used in information filtering, information retrieval, indexing and relevancy rankings. Its first use was in the SMART Information Retrieval System.

<span class="mw-page-title-main">PageRank</span> Algorithm used by Google Search to rank web pages

PageRank (PR) is an algorithm used by Google Search to rank web pages in their search engine results. It is named after both the term "web page" and co-founder Larry Page. PageRank is a way of measuring the importance of website pages. According to Google:

PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.
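
As a rough illustration of that counting idea, here is a toy power-iteration sketch in Python; the three-page link graph, the damping factor of 0.85, and the fixed iteration count are assumptions for the example, not Google's actual implementation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank: a page's score is built from the scores of the pages linking to it."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            targets = outlinks or pages  # a page with no outlinks spreads its score evenly
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank

# Three pages: B is linked to by both A and C, so B ends up with the highest score.
print(pagerank({"A": ["B"], "B": ["C"], "C": ["A", "B"]}))
```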

Trademark stuffing is a form of keyword stuffing, an unethical search engine optimization method used by webmasters and Internet marketers in order to manipulate search engine ranking results served by websites such as Google, Yahoo! and Microsoft Bing. A key characteristic of trademark stuffing is the intent of the infringer to confuse search engines and Internet users into thinking a website or web page is owned or otherwise authorized by the trademark owner. Trademark stuffing does not include using trademarks on third-party website pages within the boundaries of fair use. When used effectively, trademark stuffing enables infringing websites to capture search engine traffic that may have otherwise been received by an authorized website or trademark owner.

Hummingbird is the codename given to a significant algorithm change in Google Search in 2013. Its name was derived from the speed and accuracy of the hummingbird. The change was announced on September 26, 2013, having already been in use for a month. "Hummingbird" places greater emphasis on natural language queries, considering context and meaning over individual keywords. It also looks deeper at content on individual pages of a website, with improved ability to lead users directly to the most appropriate page rather than just a website's homepage.

<span class="mw-page-title-main">Yoast SEO</span> Search engine optimization plugin for WordPress

Yoast SEO is a search engine optimization (SEO) tool plug-in for WordPress.

References

  1. Taniar, David; Gervasi, Osvaldo; Murgante, Beniamino; Apduhan, Bernady O.; Pardede, Eric (2010-03-16). Computational Science and Its Applications - ICCSA 2010: International Conference, Fukuoka, Japan, March 23-26, 2010, Proceedings. Springer Science & Business Media. p. 212. ISBN 9783642121883. (Nkr * Nwp / Tkn) * 100.