GRDDL

GRDDL (pronounced "griddle") is a markup format for Gleaning Resource Descriptions from Dialects of Languages. It is a W3C Recommendation that enables users to obtain RDF triples from XML documents, including XHTML. The GRDDL specification shows examples using XSLT, but it was intended to be abstract enough to allow for other implementations as well. It became a Recommendation on September 11, 2007. [1]

Mechanism

XHTML and transformations

A document specifies its associated transformations in one of several ways.

For instance, an XHTML document may contain the following markup:

<head profile="http://www.w3.org/2003/g/data-view http://dublincore.org/documents/dcq-html/ http://gmpg.org/xfn/11">
<link rel="transformation" href="grokXFN.xsl" />

Including the following URI in the profile attribute of the head element informs document consumers that GRDDL transformations are available in the page:

http://www.w3.org/2003/g/data-view 

The available transformations are revealed through one or more link elements:

<link rel="transformation" href="grokXFN.xsl" />

This code is valid for XHTML 1.x only. The profile attribute has been dropped in HTML5, including its XML serialisation.
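
The discovery steps described in this section can be sketched in Python. This is a minimal illustration rather than a conforming GRDDL agent: the function name find_transformations and the sample document are assumptions made for the example, only link elements are inspected, and the input is assumed to be well-formed XHTML without named entities that the standard-library parser cannot resolve.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def find_transformations(xhtml_text):
    """Return href values of GRDDL transformation links (hypothetical helper).

    Returns an empty list when the head element does not list the GRDDL
    profile URI in its profile attribute.
    """
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return []
    # The profile attribute holds a whitespace-separated list of URIs.
    profiles = head.get("profile", "").split()
    if GRDDL_PROFILE not in profiles:
        return []
    hrefs = []
    for link in head.iter(f"{{{XHTML_NS}}}link"):
        # rel is also a whitespace-separated list of tokens.
        rel_tokens = link.get("rel", "").split()
        if "transformation" in rel_tokens and link.get("href"):
            hrefs.append(link.get("href"))
    return hrefs


# Sample document invented for this example.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view http://gmpg.org/xfn/11">
    <title>Example</title>
    <link rel="transformation" href="grokXFN.xsl" />
  </head>
  <body></body>
</html>"""

print(find_transformations(SAMPLE))
```

The returned href values are relative to the document, so a real agent would still need to resolve them against the document's URL before fetching the stylesheet.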

Microformats and profile transformations

If an XHTML page contains microformats, there is usually a specific profile for them.

For instance, a document with hCard information should have:

<head profile="http://www.w3.org/2003/g/data-view http://www.w3.org/2006/03/hcard">

When fetched, http://www.w3.org/2006/03/hcard contains:

<head profile="http://www.w3.org/2003/g/data-view">

and

<p>Use of this profile licenses RDF data extracted by <a rel="profileTransformation" href="../vcard/hcard2rdf.xsl">hcard2rdf.xsl</a> from <a href="http://www.w3.org/2006/vcard/ns">the 2006 vCard/RDF work</a>.</p>

A GRDDL-aware agent can then use that profileTransformation to extract all hCard data from pages that reference that profile.
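
As a rough sketch of that lookup, the following Python scans a fetched profile document for elements whose rel attribute contains profileTransformation and resolves each href against the profile's URL. The function name profile_transformations and the trimmed-down PROFILE_DOC are assumptions for illustration; scanning every element that carries a rel attribute is a simplification, and no network access is performed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin


def profile_transformations(profile_text, profile_url):
    """Return absolute URLs of profileTransformation links (hypothetical helper)."""
    root = ET.fromstring(profile_text)
    found = []
    for element in root.iter():
        # rel is a whitespace-separated list of tokens.
        rel_tokens = element.get("rel", "").split()
        href = element.get("href")
        if "profileTransformation" in rel_tokens and href:
            # Relative links are resolved against the profile document's URL.
            found.append(urljoin(profile_url, href))
    return found


# Trimmed-down stand-in for the profile document, invented for this example.
PROFILE_DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view"><title>hCard profile</title></head>
  <body>
    <p>Use of this profile licenses RDF data extracted by
      <a rel="profileTransformation" href="../vcard/hcard2rdf.xsl">hcard2rdf.xsl</a>
      from <a href="http://www.w3.org/2006/vcard/ns">the 2006 vCard/RDF work</a>.</p>
  </body>
</html>"""

print(profile_transformations(PROFILE_DOC, "http://www.w3.org/2006/03/hcard"))
```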

XML and transformations

In a similar fashion to XHTML, GRDDL transformations can be attached to XML documents.
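
In the GRDDL specification, this attachment is made with a transformation attribute in the http://www.w3.org/2003/g/data-view# namespace on the root element. The Python sketch below reads that attribute; the function name root_transformations, the sample document and the stylesheet URL http://example.com/foo2rdf.xsl are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"


def root_transformations(xml_text):
    """Return the transformation URIs named on the root element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # The attribute value is treated as a whitespace-separated list of URIs.
    return root.get(f"{{{GRDDL_NS}}}transformation", "").split()


# Sample document and stylesheet URL invented for this example.
SAMPLE_XML = """<foo xmlns="http://example.com/1.0/"
     xmlns:grddl="http://www.w3.org/2003/g/data-view#"
     grddl:transformation="http://example.com/foo2rdf.xsl">
  <!-- document content here -->
</foo>"""

print(root_transformations(SAMPLE_XML))
```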

XML namespace transformations

Just like a profileTransformation, an XML namespace can have a transformation associated with it.

This allows entire XML dialects (for instance, KML or Atom) to provide meaningful RDF.

An XML document simply points to a namespace:

<foo xmlns="http://example.com/1.0/"><!-- document content here --></foo>

and when fetched, http://example.com/1.0/ points to a namespaceTransformation.
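
The first step of this lookup, finding the namespace URI that an agent would then fetch, can be sketched in Python as follows. The function name root_namespace is an assumption for the example, and the sketch stops before any network request or inspection of the namespace document.

```python
import xml.etree.ElementTree as ET


def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None (hypothetical helper)."""
    tag = ET.fromstring(xml_text).tag
    # ElementTree writes qualified names as "{namespace-uri}local-name".
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


DOC = '<foo xmlns="http://example.com/1.0/"><!-- document content here --></foo>'
print(root_namespace(DOC))
```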

This also allows very large amounts of the existing XML data in the wild to become RDF/XML with minimal effort from the namespace author.

Output

Once a document has been transformed, there is an RDF representation of that data.

This output is generally put into a database and queried via SPARQL.

Implementations

GRDDL consumers (also known as GRDDL-aware agents)

See also

References

Notes