Abbreviation | schema |
---|---|
Year started | 2011 |
Latest version | 15.0 (2022-10-25) [1] |
Organization | Google, Yahoo!, Microsoft, Yandex |
Base standards | URI, HTML5, RDF, Microdata, ISO 8601 |
Related standards | RDFa, Microformat, RDFS, OWL, N-Triples, Turtle, JSON, JSON-LD, CSV |
Domain | Semantic Web |
License | CC-BY-SA 3.0 |
Website | schema |
Schema.org is a reference website that publishes documentation and guidelines for using structured data markup on web pages (called microdata). Its main objective is to standardize the HTML tags that webmasters use to create rich results (displayed as visual data or infographic tables on search engine results pages) about a certain topic of interest. [2] It is part of the Semantic Web project, which aims to make document markup more readable and meaningful to both humans and machines.
Schema.org is an initiative launched on June 2, 2011, by Bing, Google and Yahoo! [3] [4] [5] (operators of the world's largest search engines at that time) [6] to create and support a common set of schemas for structured data markup on web pages. In November 2011, Yandex (whose search engine is the largest in Russia) joined the initiative. [7] [8] They propose using the schema.org vocabulary along with the Microdata, RDFa, or JSON-LD formats [9] to mark up website content with metadata about itself. Such markup can be recognized by search engine spiders and other parsers, thus granting access to the meaning of the sites (see Semantic Web). The initiative also describes an extension mechanism for adding additional properties. [10] In 2012, the GoodRelations ontology was integrated into Schema.org. [11] Public discussion of the initiative largely takes place on the W3C public vocabularies mailing list. [12]
Much of the vocabulary on Schema.org was inspired by earlier formats, such as microformats, FOAF, and OpenCyc. [13] Microformats, with hCard as their most dominant representative, continued (as of 2015) to be published widely on the web, while the deployment of Schema.org increased strongly between 2012 and 2014. [14] In 2015, [15] Google began supporting the JSON-LD format, and as of September 2017 it recommended using JSON-LD for structured data whenever possible. [16] [17]
Despite the advantages of using Schema.org, adoption remained limited as of 2016. A 2016 survey of 300 US-based marketing agencies and B2C advertisers across industries showed only 17% uptake. [18]
Validators such as the soon-to-be-deprecated Google Structured Data Testing Tool, [19] the more recent [20] Google Rich Results Test Tool, [21] the Yandex Microformat validator, [22] and the Bing Markup Validator [23] can be used to test the validity of data marked up with the schemas and Microdata. More recently, Google Search Console (formerly Webmaster Tools) has provided a report section for unparsable structured data. If any Schema.org markup on a website is incorrect, it will show in this report. [24] Some schema types, such as Organization and Person, are commonly used to influence search results returned by Google's Knowledge Graph. [25]
There are a number of items that a web page can be marked up with using a Schema, with examples including:
The following is an example [26] of how to mark up information about a movie and its director using the Schema.org schemas and microdata. In order to mark up the data, the attribute itemtype along with the URL of the schema is used. The attribute itemscope defines the scope of the itemtype. The kind of the current item can be defined by using the attribute itemprop.
Microdata:

<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
    Director: <span itemprop="name">James Cameron</span>
    (born <time itemprop="birthDate" datetime="1954-08-16">August 16, 1954</time>)
  </div>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>
RDFa:

<div vocab="http://schema.org/" typeof="Movie">
  <h1 property="name">Avatar</h1>
  <div property="director" typeof="Person">
    Director: <span property="name">James Cameron</span>
    (born <time property="birthDate" datetime="1954-08-16">August 16, 1954</time>)
  </div>
  <span property="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" property="trailer">Trailer</a>
</div>
JSON-LD:

<script type="application/ld+json">
{
  "@context": "http://schema.org/",
  "@type": "Movie",
  "name": "Avatar",
  "director": {
    "@type": "Person",
    "name": "James Cameron",
    "birthDate": "1954-08-16"
  },
  "genre": "Science fiction",
  "trailer": "../movies/avatar-theatrical-trailer.html"
}
</script>
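To illustrate how a parser might recognize such markup, the following is a minimal sketch in Python that pulls JSON-LD blocks out of a page using only the standard library. The class name `JsonLdExtractor`, the helper `extract_jsonld`, and the sample `PAGE` string are hypothetical names chosen for this illustration; real search engine crawlers are far more elaborate, and nothing here describes how any particular one works.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.items.append(json.loads("".join(self._buffer)))


def extract_jsonld(html):
    """Return a list with one parsed JSON value per JSON-LD script block."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Sample page (an assumption for this sketch) reusing the movie data above.
PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "http://schema.org/", "@type": "Movie", "name": "Avatar",
 "director": {"@type": "Person", "name": "James Cameron",
              "birthDate": "1954-08-16"},
 "genre": "Science fiction"}
</script>
</head><body></body></html>
"""

movie = extract_jsonld(PAGE)[0]
print(movie["@type"], movie["name"], movie["director"]["name"])
```

Because JSON-LD lives in its own script element, the extraction step does not need to understand the visible HTML at all, which is one practical difference from the Microdata and RDFa variants.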
HyperText Markup Language or HTML is the standard markup language for documents designed to be displayed in a web browser. It defines the content and structure of web content. It is often assisted by technologies such as Cascading Style Sheets (CSS) and scripting languages such as JavaScript.
The Semantic Web, sometimes known as Web 3.0, is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-readable.
The Resource Description Framework (RDF) is a World Wide Web Consortium (W3C) standard originally designed as a data model for metadata. It has come to be used as a general method for description and exchange of graph data. RDF provides a variety of syntax notations and data serialization formats, with Turtle currently being the most widely used notation.
XSD, a recommendation of the World Wide Web Consortium (W3C), specifies how to formally describe the elements in an Extensible Markup Language (XML) document. It can be used by programmers to verify each piece of item content in a document, to assure it adheres to the description of the element it is placed in.
The noindex value of an HTML robots meta tag requests that automated Internet bots avoid indexing a web page. Reasons why one might want to use this meta tag include advising robots not to index a very large database, web pages that are very transitory, web pages that are under development, web pages that one wishes to keep slightly more private, or the printer and mobile-friendly versions of pages. Since the burden of honoring a website's noindex tag lies with the author of the search robot, sometimes these tags are ignored. Also the interpretation of the noindex tag is sometimes slightly different from one search engine company to the next.
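As a rough illustration of what honoring the tag involves on the robot's side, the sketch below checks a page for a robots meta tag containing noindex. The names `RobotsMetaParser` and `is_noindex` are hypothetical, and the comma-separated, case-insensitive reading of the content attribute is a simplifying assumption; as noted above, individual search engines may interpret the tag differently.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Gathers the directives listed in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are treated as a comma-separated, case-insensitive list.
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def is_noindex(html):
    """Return True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return "noindex" in parser.directives


print(is_noindex('<head><meta name="robots" content="noindex, nofollow"></head>'))
print(is_noindex('<head><meta name="robots" content="index, follow"></head>'))
```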
Microformats (μF) are a set of defined HTML classes created to serve as consistent and descriptive metadata about an element, designating it as representing a certain type of data. They allow software to process the information reliably by having set classes refer to a specific type of data rather than being arbitrary. Microformats emerged around 2005 and were predominantly designed for use by search engines, web syndication and aggregators such as RSS.
Sitemaps is a protocol in XML format meant for a webmaster to inform search engines about URLs on a website that are available for web crawling. It allows webmasters to include additional information about each URL: when it was last updated, how often it changes, and how important it is in relation to other URLs of the site. This allows search engines to crawl the site more efficiently and to find URLs that may be isolated from the rest of the site's content. The Sitemaps protocol is a URL inclusion protocol and complements robots.txt, a URL exclusion protocol.
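The per-URL information described above (last update, change frequency, relative importance) corresponds to the lastmod, changefreq, and priority elements of a sitemap file. The following sketch builds such a file with Python's standard library; the function name `build_sitemap` and the example.com URLs are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the Sitemaps protocol, version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
FIELDS = ("loc", "lastmod", "changefreq", "priority")


def build_sitemap(entries):
    """Serialize a list of dicts (keys taken from FIELDS) as sitemap XML."""
    # Use the sitemap namespace as the default namespace in the output.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        for field in FIELDS:
            if field in entry:
                child = ET.SubElement(url, f"{{{SITEMAP_NS}}}{field}")
                child.text = str(entry[field])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs for illustration only.
sitemap_xml = build_sitemap([
    {"loc": "https://example.com/", "lastmod": "2022-10-25",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "https://example.com/movies/avatar.html", "priority": 0.5},
])
print(sitemap_xml)
```

Only loc is required for each URL in the protocol; the other three elements are optional hints, which is why the sketch emits them only when present.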
hCard is a microformat for publishing the contact details of people, companies, organizations, and places, in HTML, Atom, RSS, or arbitrary XML. The hCard microformat does this using a 1:1 representation of vCard properties and values, identified using HTML classes and rel attributes.
nofollow is a setting on a web page hyperlink that directs search engines not to use the link for page ranking calculations. It is specified in the page as a type of link relation; that is: <a rel="nofollow" ...>. Because search engines often calculate a site's importance according to the number of hyperlinks from other sites, the nofollow setting allows website authors to indicate that the presence of a link is not an endorsement of the target site's importance.
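A link analyzer that respects this setting has to separate the two kinds of links before counting them. The sketch below shows one way to do that in Python; `LinkCollector` and `split_links` are hypothetical names, and the code says nothing about how any actual search engine weighs links.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Sorts the hrefs of <a> elements by whether rel contains "nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel holds a space-separated list of link relation tokens.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed, nofollow) lists of link targets found in html."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.followed, collector.nofollow


followed, skipped = split_links(
    '<a href="/about">About</a> '
    '<a rel="nofollow noopener" href="https://example.com/ad">Ad</a>'
)
print(followed, skipped)
```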
RDFa or Resource Description Framework in Attributes is a W3C Recommendation that adds a set of attribute-level extensions to HTML, XHTML and various XML-based document types for embedding rich metadata within Web documents. The Resource Description Framework (RDF) data-model mapping enables its use for embedding RDF subject-predicate-object expressions within XHTML documents. It also enables the extraction of RDF model triples by compliant user agents.
Geo is a microformat used for marking up geographical coordinates in HTML. Coordinates are expected in angular units of degrees and geodetic datum WGS84. Although termed a "draft" specification, the format is a de facto standard, stable and in widespread use; not least as a sub-set of the published hCalendar and hCard microformat specifications, neither of which is still a draft.
Embedded RDF (eRDF) is a syntax for writing HTML in such a way that the information in the HTML document can be extracted into Resource Description Framework (RDF). This can be of great use for searching within data.
Semantic HTML is the use of HTML markup to reinforce the semantics, or meaning, of the information in web pages and web applications rather than merely to define its presentation or look. Semantic HTML is processed by traditional web browsers as well as by many other user agents. CSS is used to suggest its presentation to human users.
Bing Webmaster Tools is a free service as part of Microsoft's Bing search engine which allows webmasters to add their websites to the Bing index crawler, see their site's performance in Bing and a lot more. The service also offers tools for webmasters to troubleshoot the crawling and indexing of their website, submission of new URLs, Sitemap creation, submission and ping tools, website statistics, consolidation of content submission, and new content and community resources.
Microdata is a WHATWG HTML specification used to nest metadata within existing content on web pages. Search engines, web crawlers, and browsers can extract and process Microdata from a web page and use it to provide a richer browsing experience for users. Search engines benefit greatly from direct access to Microdata because it allows them to understand the information on web pages and provide more relevant results to users. Microdata uses a supporting vocabulary to describe an item and name-value pairs to assign values to its properties. Microdata is an attempt to provide a simpler way of annotating HTML elements with machine-readable tags than the similar approaches of using RDFa and microformats.
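The item and name-value model can be made concrete with a small extractor. The following Python sketch walks an HTML fragment and collects items from the itemscope, itemtype, and itemprop attributes. It is a deliberately simplified illustration, not the full extraction algorithm of the HTML specification: it ignores itemref and itemid, does not resolve relative URLs, and the names `MicrodataParser` and `parse_microdata` are assumptions.

```python
from html.parser import HTMLParser

VOID_TAGS = {"meta", "img", "link", "br", "hr", "input"}
# Elements whose property value comes from an attribute (simplified subset).
VALUE_ATTRS = {"a": "href", "link": "href", "img": "src",
               "meta": "content", "time": "datetime"}


class MicrodataParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.items = []    # top-level items
        self._stack = []   # one frame per open element

    def _current_item(self):
        for frame in reversed(self._stack):
            if frame["item"] is not None:
                return frame["item"]
        return None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        parent = self._current_item()
        frame = {"tag": tag, "item": None, "prop": None,
                 "owner": parent, "text": None}
        if "itemscope" in attributes:
            frame["item"] = {"type": attributes.get("itemtype"),
                             "properties": {}}
        prop = attributes.get("itemprop")
        if prop and parent is not None:
            values = parent["properties"].setdefault(prop, [])
            if frame["item"] is not None:
                values.append(frame["item"])          # nested item
            elif attributes.get(VALUE_ATTRS.get(tag)) is not None:
                values.append(attributes[VALUE_ATTRS[tag]])
            else:
                frame["prop"], frame["text"] = prop, []  # value = text content
        elif frame["item"] is not None:
            self.items.append(frame["item"])          # top-level item
        if tag not in VOID_TAGS:
            self._stack.append(frame)

    def handle_data(self, data):
        for frame in self._stack:
            if frame["text"] is not None:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if not any(frame["tag"] == tag for frame in self._stack):
            return  # stray end tag; ignore it
        while self._stack:
            frame = self._stack.pop()
            if frame["text"] is not None:
                text = "".join(frame["text"]).strip()
                frame["owner"]["properties"].setdefault(frame["prop"], []).append(text)
            if frame["tag"] == tag:
                break


def parse_microdata(html):
    parser = MicrodataParser()
    parser.feed(html)
    parser.close()
    return parser.items


MOVIE_HTML = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
    Director: <span itemprop="name">James Cameron</span>
    (born <time itemprop="birthDate" datetime="1954-08-16">August 16, 1954</time>)
  </div>
  <span itemprop="genre">Science fiction</span>
</div>
"""

movie = parse_microdata(MOVIE_HTML)[0]
print(movie["type"], movie["properties"]["name"])
```

Each property maps to a list because Microdata allows a property to appear more than once on the same item.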
XHTML+RDFa is an extended version of the XHTML markup language for supporting RDF through a collection of attributes and processing rules in the form of well-formed XML documents. XHTML+RDFa is one of the techniques used to develop Semantic Web content by embedding rich semantic markup. Version 1.1 of the language is a superset of XHTML 1.1, integrating the attributes according to RDFa Core 1.1. In other words, it is an RDFa support through XHTML Modularization.
JSON-LD is a method of encoding linked data using JSON. One goal for JSON-LD was to require as little effort as possible from developers to transform their existing JSON to JSON-LD. JSON-LD allows data to be serialized in a way that is similar to traditional JSON. It was initially developed by the JSON for Linking Data Community Group before being transferred to the RDF Working Group for review, improvement, and standardization, and is currently maintained by the JSON-LD Working Group. JSON-LD is a World Wide Web Consortium Recommendation.
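The low-effort transformation goal can be illustrated with a few lines of Python: in the simplest case, an existing JSON object becomes JSON-LD once it gains an @context (and usually an @type). The helper name `to_jsonld` is hypothetical, and the sketch assumes the existing keys already match property names in the chosen vocabulary, which is not true of arbitrary JSON.

```python
import json


def to_jsonld(record, schema_type, context="http://schema.org/"):
    """Return a copy of `record` with JSON-LD @context and @type keys added.

    Assumes the keys of `record` are already valid property names in the
    vocabulary identified by `context`.
    """
    return {"@context": context, "@type": schema_type, **record}


# Plain JSON as an application might already produce it.
plain = {"name": "Avatar", "genre": "Science fiction"}

document = to_jsonld(plain, "Movie")
print(json.dumps(document, indent=2))
```

Keys that do not match the vocabulary would need an explicit mapping in the context, which this sketch leaves out.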
Linked Data Notifications (LDN) is a W3C Recommendation that describes a communications protocol based on HTTP, URI, and RDF on how servers (receivers) can receive messages pushed to them by applications (senders), as well as how other applications (consumers) may retrieve those messages. Any web resource can advertise a receiving endpoint (inbox) for notification messages. Messages are expressed in RDF, and can contain arbitrary data.
The Thing Description (TD) (or W3C WoT Thing Description (TD)) is a royalty-free, open information model with a JSON-based representation format for the Internet of Things (IoT). A TD provides a unified way to describe the capabilities of an IoT device or service with its offered data model and functions, protocol usage, and further metadata. Using Thing Descriptions helps reduce the complexity of integrating IoT devices and their capabilities into IoT applications.