Hreflang

The rel="alternate" hreflang="x" link attribute is used on the HTML link element and in the HTTP Link header described in RFC 8288. Hreflang specifies the language and, optionally, the geographic targeting of a document. It is interpreted by search engines, and webmasters can use it to clarify the language and regional targeting of a website.

Purpose

Many websites target audiences who speak different languages and are localized for different countries. This can produce a lot of duplicate or near-duplicate content, and can lead search engines to send users to a version that was not meant for them. [1]

Search engines use hreflang to understand the language and geographical targeting of a website, and use that information to show the right URL in search results, depending on the user's language and region preferences.

There are 3 basic scenarios that can be covered with hreflang:

The hreflang attribute helps a website deliver multiple variations of its pages in different languages. [2]

Implementation

Hreflang can be implemented in three different ways: as link elements in the HTML head, as HTTP Link headers, or as entries in an XML sitemap.

An hreflang definition consists of the full set of language- and region-specific versions of the same document. Every URL in the set must reference the full URL set. A self-reference is required, so the referencing document always has to be part of its own URL set. [citation needed]
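
The two rules above (every URL references the full set, and every document references itself) can be checked mechanically. The following is a minimal sketch, assuming the annotations have already been collected into a dictionary that maps each page URL to its own {hreflang value: URL} mapping; the function name and the data layout are illustrative, not part of any standard.

```python
def find_hreflang_errors(annotations):
    """Check a {page URL: {hreflang value: URL}} mapping for set errors."""
    errors = []
    for page, alternates in annotations.items():
        targets = set(alternates.values())
        # Rule 1: the page must appear in its own URL set.
        if page not in targets:
            errors.append(f"{page}: missing self-reference")
        # Rule 2: every other URL in the set must carry the same full set.
        for target in sorted(targets - {page}):
            if target not in annotations:
                errors.append(f"{page}: {target} has no annotations")
            elif annotations[target] != alternates:
                errors.append(f"{page}: {target} lists a different URL set")
    return errors
```

An empty result means that every page in the mapping lists itself and that all pages agree on the same URL set.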

Language and country codes

Hreflang accepts values that define languages (ISO 639-1) and countries (ISO 3166-1). A language, or a combination of language and region, can be used as a value. A country-only value is not allowed.

Language Example
en
fr
be

Language and Region Example
fr-CA
en-CA
en-US

The hreflang value has to follow the standard in order to be used by search engines.
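
As a rough illustration of these rules, the sketch below accepts a language or a language-region pair and rejects everything else, including country-only values. The two code sets are small illustrative subsets chosen for this example; a real check would load the complete ISO 639-1 and ISO 3166-1 lists. The function name is hypothetical, and the check covers only the two forms described in this section.

```python
# Illustrative subsets only (an assumption of this sketch); a real check
# would use the complete ISO 639-1 and ISO 3166-1 code lists.
KNOWN_LANGUAGES = {"be", "de", "en", "es", "fr", "zh"}
KNOWN_REGIONS = {"BE", "CA", "DE", "FR", "GB", "US"}


def is_valid_hreflang(value):
    """Accept 'language' or 'language-REGION'; reject anything else."""
    parts = value.split("-")
    if len(parts) == 1:
        # A single subtag must be a language, never a country.
        return parts[0].lower() in KNOWN_LANGUAGES
    if len(parts) == 2:
        language, region = parts
        return (language.lower() in KNOWN_LANGUAGES
                and region.upper() in KNOWN_REGIONS)
    return False
```

Under these assumptions, "be" is accepted because a single subtag is always read as a language (Belarusian), while "US" is rejected because no language carries that code.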

Language script variations

RFC 5646 allows language script variations as values for hreflang. Script variations can be addressed directly using ISO 15924 script codes.

Examples
zh-Hant: Chinese (Traditional)
zh-Hans: Chinese (Simplified)
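
A value that carries a script code can be split into its parts by subtag length. This is a simplified sketch rather than a full RFC 5646 parser: it assumes the order language, then an optional four-letter script, then an optional two-letter region, and the names HreflangTag and split_hreflang are made up for the example.

```python
from typing import NamedTuple, Optional


class HreflangTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


def split_hreflang(value):
    """Split 'zh-Hant' or 'zh-Hant-TW' style values into their subtags."""
    language, *rest = value.split("-")
    script = region = None
    for subtag in rest:
        letters = subtag.isascii() and subtag.isalpha()
        if letters and len(subtag) == 4 and script is None and region is None:
            script = subtag.title()   # ISO 15924 codes are written like 'Hant'
        elif letters and len(subtag) == 2 and region is None:
            region = subtag.upper()   # ISO 3166-1 codes are written like 'TW'
        else:
            raise ValueError(f"unexpected subtag {subtag!r} in {value!r}")
    return HreflangTag(language.lower(), script, region)
```

The function normalizes case on the way out, so "zh-hans-tw" and "zh-Hans-TW" yield the same result.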

x-default

x-default is a reserved hreflang value that can be used to specify a default version of a document. The x-default URL is not targeted at a specific region or language and is meant to be shown to users who match none of the other values. Google suggests defining an x-default version in each URL set, which will be shown in search results to users from unspecified regions or languages. [3] [4] Typically, on multilingual websites, the root domain (https://www.example.com) gets the x-default value in each URL set, and the language folders or subdomains are assigned language or language-region hreflang values.

The URL that is defined as the x-default for a document can also be assigned to a specific language, or language and region, at the same time.
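
The fallback role of x-default can be illustrated with a small lookup. This sketch is not how any search engine is documented to choose a URL; the order it tries (language-region, then language, then x-default) and the name pick_url are assumptions made for the example.

```python
def pick_url(alternates, language, region=None):
    """Return the best-matching URL from a {hreflang value: URL} mapping."""
    # Language tags are not case-sensitive, so compare in lower case.
    lookup = {key.lower(): url for key, url in alternates.items()}
    candidates = []
    if region:
        candidates.append(f"{language}-{region}")
    candidates.append(language)
    candidates.append("x-default")  # used only when nothing else matches
    for key in candidates:
        url = lookup.get(key.lower())
        if url is not None:
            return url
    return None
```

With a URL set that contains en-US, fr-CA and x-default, a French-speaking user in Canada would get the fr-CA URL, while a German-speaking user would fall through to the x-default URL.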

Markup examples

HTML

<link rel="alternate" hreflang="en-US" href="http://example.com/page.html">

<html>
  <head>
    <link rel="alternate" hreflang="en-US" href="http://example.com/page.html">
    <link rel="alternate" hreflang="en-CA" href="http://example.com/en-ca/page.html">
    <link rel="alternate" hreflang="en-GB" href="http://example.com/en-gb/page.html">
    <link rel="alternate" hreflang="fr-CA" href="http://example.com/fr-ca/page.html">
    <link rel="alternate" hreflang="x-default" href="http://example.com/page.html">
  </head>
  <body>
    ...
  </body>
</html>
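
Because every page in a URL set carries the same list of link elements, the markup is usually generated rather than written by hand. A minimal sketch, assuming the set is held in a {hreflang value: URL} dictionary; the function name is hypothetical.

```python
from html import escape


def hreflang_link_tags(alternates):
    """Render one <link> element per entry of a {hreflang value: URL} mapping."""
    return "\n".join(
        # escape() protects attribute values that contain & or quotes.
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}">'
        for code, url in alternates.items()
    )
```

The same output can be placed in the head of every page in the set, which also takes care of the required self-reference.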

HTTP

HTTP/1.1 200 OK
Content-Type: application/pdf
Link: <http://example.com/page.pdf>; rel="alternate"; hreflang="x-default", <http://uk.example.com/page.pdf>; rel="alternate"; hreflang="en-GB", <http://us.example.com/page.pdf>; rel="alternate"; hreflang="en-US"
...
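
The Link header value follows a regular pattern, so it can be assembled from the same kind of mapping. This is a sketch under the assumption that the URLs contain no characters that would need escaping inside the angle brackets; the function name is hypothetical.

```python
def hreflang_link_header(alternates):
    """Build the value of an HTTP Link header from {hreflang value: URL}."""
    return ", ".join(
        f'<{url}>; rel="alternate"; hreflang="{code}"'
        for code, url in alternates.items()
    )
```

The returned string is only the header value; the server or framework in use is responsible for sending it under the Link header name.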

XML sitemaps

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://example.com/page.html</loc>
    <xhtml:link rel="alternate" hreflang="en-US" href="http://us.example.com/page.html"/>
    <xhtml:link rel="alternate" hreflang="en-GB" href="http://uk.example.com/page.html"/>
    <xhtml:link rel="alternate" hreflang="x-default" href="http://example.com/page.html"/>
  </url>
  <url>
    <loc>http://us.example.com/page.html</loc>
    <xhtml:link rel="alternate" hreflang="en-GB" href="http://uk.example.com/page.html"/>
    <xhtml:link rel="alternate" hreflang="x-default" href="http://example.com/page.html"/>
    <xhtml:link rel="alternate" hreflang="en-US" href="http://us.example.com/page.html"/>
  </url>
  <url>
    <loc>http://uk.example.com/page.html</loc>
    <xhtml:link rel="alternate" hreflang="en-US" href="http://us.example.com/page.html"/>
    <xhtml:link rel="alternate" hreflang="x-default" href="http://example.com/page.html"/>
    <xhtml:link rel="alternate" hreflang="en-GB" href="http://uk.example.com/page.html"/>
  </url>
</urlset>
...
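
In the sitemap form, each URL in the set gets its own url entry that repeats the full list of alternates. The sketch below generates that structure with Python's standard xml.etree.ElementTree module; the function name and the input layout are assumptions for the example, and the output omits the XML declaration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialize sitemap elements without a prefix and XHTML ones as xhtml:...
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_hreflang_sitemap(alternates):
    """Return sitemap XML for one URL set given as {hreflang value: URL}."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    # One <url> entry per distinct URL, in first-seen order.
    for page in dict.fromkeys(alternates.values()):
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page
        # Every entry repeats the full set, including the page itself.
        for code, href in alternates.items():
            ET.SubElement(url, f"{{{XHTML_NS}}}link",
                          rel="alternate", hreflang=code, href=href)
    return ET.tostring(urlset, encoding="unicode")
```

Deduplicating the URLs with dict.fromkeys keeps a page that serves two hreflang values, for example en-US and x-default, from being listed twice.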

References

  1. "Versions localisées de vos pages" [Localized versions of your pages] | Google Search Central | Documentation. Google for Developers. Retrieved 2023-12-11.
  2. Shaikh, Gulammohiyuddin (29 November 2021). "A Guide To Hreflang Tag Best Practices For SEO".
  3. "Use hreflang for language and regional URLs". Google Inc. Retrieved 2015-10-08.
  4. "Introducing "x-default hreflang" for international landing pages". Google Webmaster Central Blog. Retrieved 2015-10-08.