SHACL

Shapes Constraint Language
Abbreviation	SHACL
Status	Published, W3C Recommendation
Year started	2015
First published	October 8, 2015
Organization	W3C
Committee	RDF Data Shapes Working Group
Editors	Holger Knublauch; Dimitris Kontokostas
Base standards	RDF; SPARQL
Related standards	RDF Schema; OWL
Domain	Semantic Web
Website	www.w3.org/TR/shacl/

Last updated August 30, 2024

Shapes Constraint Language^[1] (SHACL) is a World Wide Web Consortium (W3C) standard language for describing Resource Description Framework (RDF) graphs. SHACL has been designed to enhance the semantic and technical interoperability layers of ontologies expressed as RDF graphs.^[3]

SHACL models are defined in terms of constraints on the content, structure and meaning of a graph. SHACL is a highly expressive language. Among other things, it includes features to express conditions that constrain the number of values that a property may have, the type of such values, numeric ranges, string matching patterns, and logical combinations of such constraints. SHACL also includes an extension mechanism to express more complex conditions in languages such as SPARQL and JavaScript. SHACL Rules add inferencing capabilities to SHACL, allowing users to define what new statements can be inferred from existing (asserted) statements.

Terminology

SHACL lets its users describe shapes of data, targeting where a specific shape applies.

Property Shapes

A property shape describes characteristics of graph nodes that can be reached via a specific path. A path can be a single predicate (property) or a chain of predicates. A property shape must always specify a path, which is done using the sh:path predicate. One can think of property shapes that use simple paths as describing values of certain properties, e.g., values of an age property or values of a works for property. Complex paths can specify a combination of different predicates in a chain, including the inverse direction, alternative predicates and transitive chains.

Property shapes can be defined as part of a node shape. In this case, a node shape points to property shapes using the sh:property predicate. Property shapes can also be "stand-alone", i.e., completely independent from any node shapes.
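
How a path selects value nodes can be sketched in a few lines of Python. This is a toy illustration, not a SHACL engine: the graph is a plain set of tuples, the ex: names are made up, and the tuple-based path encoding is a hypothetical stand-in for the RDF structures that sh:path actually uses. Transitive paths are left out for brevity.

```python
# Toy illustration of SHACL-style property paths over an in-memory graph.
# The graph is a set of (subject, predicate, object) tuples; the "ex:" names
# and the tuple-based path encoding are hypothetical.

GRAPH = {
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:acme", "ex:locatedIn", "ex:berlin"),
    ("ex:alice", "ex:age", 34),
}


def value_nodes(graph, node, path):
    """Return the set of nodes reachable from `node` via `path`.

    A path is either a predicate name (simple path), ("inverse", predicate),
    ("sequence", [path, ...]) or ("alternative", [path, ...]).
    """
    if isinstance(path, str):
        return {o for s, p, o in graph if s == node and p == path}
    kind, arg = path
    if kind == "inverse":
        # Only simple predicates are inverted in this sketch.
        return {s for s, p, o in graph if o == node and p == arg}
    if kind == "sequence":
        current = {node}
        for step in arg:
            current = {v for n in current for v in value_nodes(graph, n, step)}
        return current
    if kind == "alternative":
        return {v for step in arg for v in value_nodes(graph, node, step)}
    raise ValueError(f"unsupported path kind: {kind}")


# Simple path: the values of one property.
simple = value_nodes(GRAPH, "ex:alice", "ex:worksFor")
# Chain of predicates: where is the employer located?
chained = value_nodes(GRAPH, "ex:alice", ("sequence", ["ex:worksFor", "ex:locatedIn"]))
# Inverse direction: who works for this organization?
staff = value_nodes(GRAPH, "ex:acme", ("inverse", "ex:worksFor"))
```

A property shape's constraints then apply to whatever set of value nodes its path yields for each focus node.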

Node Shapes

A node shape describes characteristics of specific graph nodes irrespective of how they are reached. It can, for example, state that certain graph nodes must be literals or URIs. It is common to include property shapes in a node shape, effectively defining the values of many different properties of a node.

For example, a node shape for an employee may incorporate property shapes for age and works for properties.

Constraints

A constraint is a way to describe different characteristics of values. A shape will contain one or more constraint declarations. SHACL provides many pre-built constraint types. For example, sh:datatype is used to describe the type of literal values, e.g., whether they are strings, integers or dates. sh:minCount is used to describe the minimum required number of values. sh:minLength and sh:maxLength are used to describe the permitted number of characters for a value.
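
The following Python sketch shows the kind of check that three such constraints imply. The dictionary keys mirror the SHACL names sh:minCount, sh:datatype and sh:minLength, but the dictionary encoding, the datatype-to-Python mapping and the check_values helper are assumptions made for illustration; a real processor reads shapes from an RDF graph and handles far more cases.

```python
# Toy checks for three SHACL Core constraint kinds. The encoding and the
# check_values helper are hypothetical, not part of any SHACL API.

PYTHON_TYPE_FOR = {"xsd:string": str, "xsd:integer": int}


def check_values(values, constraints):
    """Return a list of human-readable problems for a property's values."""
    problems = []

    min_count = constraints.get("sh:minCount")
    if min_count is not None and len(values) < min_count:
        problems.append(f"expected at least {min_count} value(s), got {len(values)}")

    datatype = constraints.get("sh:datatype")
    if datatype is not None:
        expected = PYTHON_TYPE_FOR[datatype]
        for value in values:
            # bool is a subclass of int in Python, so exclude it explicitly.
            if not isinstance(value, expected) or isinstance(value, bool):
                problems.append(f"{value!r} is not of datatype {datatype}")

    min_length = constraints.get("sh:minLength")
    if min_length is not None:
        for value in values:
            if len(str(value)) < min_length:
                problems.append(f"{value!r} is shorter than {min_length} characters")

    return problems


age_shape = {"sh:minCount": 1, "sh:datatype": "xsd:integer"}
name_shape = {"sh:datatype": "xsd:string", "sh:minLength": 2}

ok = check_values([34], age_shape)                  # no problems
missing = check_values([], age_shape)               # violates sh:minCount
wrong_type = check_values(["thirty-four"], age_shape)  # violates sh:datatype
too_short = check_values(["A"], name_shape)         # violates sh:minLength
```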

Targets

A target connects a shape with data it describes. The simplest way to specify a target is to say that a node shape is also a class. This means that its definition is applicable to all members (instances) of a class. Other ways to define a target of a shape are by:

Explicitly saying that a shape targets members of a certain class. This can be done instead of making a node shape also a class.
Saying that a shape targets a specific resource by giving its URI.
Saying that a shape targets all subjects or all objects of triples with a certain predicate.
Using a SPARQL query to select a set of resources to be targeted.

Target declarations can be included in a node shape or in a property shape. However, when a property shape is a part of a node shape, its own targets are ignored.

SHACL uses rdfs:subClassOf statements to identify targets. A shape targeting members of a class also targets members of all its subclasses. In other words, all SHACL definitions for a class are inherited by its subclasses.
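
Target resolution, including the subclass behaviour described above, can be sketched as follows. The keys sh:targetClass, sh:targetNode, sh:targetSubjectsOf and sh:targetObjectsOf are the SHACL Core target predicates; the tuple-based graph, the ex: data and the focus_nodes helper are hypothetical, and SPARQL-based targets are not covered.

```python
# Toy target resolution over a set of (subject, predicate, object) tuples.
# The data and helper names are hypothetical.

GRAPH = {
    ("ex:Manager", "rdfs:subClassOf", "ex:Employee"),
    ("ex:alice", "rdf:type", "ex:Employee"),
    ("ex:bob", "rdf:type", "ex:Manager"),
    ("ex:bob", "ex:manages", "ex:alice"),
}


def subclasses_of(graph, cls):
    """Return `cls` plus every class that reaches it via rdfs:subClassOf."""
    found, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if p == "rdfs:subClassOf" and o == current and s not in found:
                found.add(s)
                todo.append(s)
    return found


def focus_nodes(graph, shape):
    """Collect the nodes a shape applies to, from its target declarations."""
    nodes = set()
    for cls in shape.get("sh:targetClass", []):
        classes = subclasses_of(graph, cls)
        nodes |= {s for s, p, o in graph if p == "rdf:type" and o in classes}
    # An explicitly named node is a target whether or not it appears in the data.
    nodes |= set(shape.get("sh:targetNode", []))
    for pred in shape.get("sh:targetSubjectsOf", []):
        nodes |= {s for s, p, o in graph if p == pred}
    for pred in shape.get("sh:targetObjectsOf", []):
        nodes |= {o for s, p, o in graph if p == pred}
    return nodes


# ex:bob is only typed as ex:Manager, but is still targeted via the subclass link.
employees = focus_nodes(GRAPH, {"sh:targetClass": ["ex:Employee"]})
```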

Validation

SHACL enables validation of graphs. A SHACL validation engine takes as input a graph to be validated (called the data graph) and a graph containing SHACL shape declarations (called the shapes graph) and produces a validation report, also expressed as a graph. All these graphs can be represented in any Resource Description Framework (RDF) serialization format, including JSON-LD and Turtle.
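
Because the report is itself a graph, it can be pictured as a set of triples. The sketch below builds such triples from a list of failures. The class and property names (sh:ValidationReport, sh:conforms, sh:result, sh:focusNode, sh:resultPath, sh:sourceConstraintComponent) come from the SHACL vocabulary; the build_report helper, the input dictionaries and the blank-node labels are assumptions for illustration only.

```python
# Sketch: express validation results as report triples.
# build_report and its input format are hypothetical.

def build_report(results):
    """Return a list of (subject, predicate, object) tuples for a report."""
    report = "_:report"
    triples = [
        (report, "rdf:type", "sh:ValidationReport"),
        # A data graph conforms only when no validation results were produced.
        (report, "sh:conforms", len(results) == 0),
    ]
    for index, result in enumerate(results):
        node = f"_:result{index}"
        triples.append((report, "sh:result", node))
        triples.append((node, "rdf:type", "sh:ValidationResult"))
        triples.append((node, "sh:focusNode", result["focus_node"]))
        triples.append((node, "sh:resultPath", result["path"]))
        triples.append((node, "sh:sourceConstraintComponent", result["component"]))
    return triples


empty = build_report([])
failing = build_report([
    {
        "focus_node": "ex:alice",
        "path": "ex:age",
        "component": "sh:MinCountConstraintComponent",
    }
])
```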

SHACL is unusual in that it builds in not only the ability to specify a severity level for validation results, but also the ability to return suggestions on how data may be fixed when a validation result is raised. The built-in levels are Violation, Warning and Info, defaulting to Violation if no sh:severity has been specified for a shape. Users of SHACL can add other, custom levels of severity. Validation results may also have values for other properties, as described in the specification. For example, the property sh:resultMessage is designed to communicate additional textual details to users, including recommendations on how data may be fixed to address the validation result. In cases where a constraint does not have any values for sh:message in the shapes graph, the SHACL processor may automatically generate other values for sh:resultMessage. Some SHACL processors (e.g., the one implemented by TopQuadrant) have made these suggestions actionable in software, automating their application at the user's request.
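
The severity default and the message fallback can be sketched like this. The names sh:severity, sh:message, sh:resultSeverity and sh:resultMessage are SHACL terms; the dictionary encoding of shapes, the make_result helper and the sample messages are hypothetical.

```python
from collections import Counter

# Severity used when a shape declares no sh:severity.
DEFAULT_SEVERITY = "sh:Violation"


def make_result(shape, focus_node, generated_message):
    """Build one validation result (as a dict) for a failed constraint.

    `generated_message` stands in for text a processor might produce on its
    own when the shape supplies no sh:message.
    """
    messages = shape.get("sh:message") or [generated_message]
    return {
        "sh:focusNode": focus_node,
        "sh:resultSeverity": shape.get("sh:severity", DEFAULT_SEVERITY),
        "sh:resultMessage": messages,
    }


# A shape that downgrades failures to warnings and suggests a fix.
age_shape = {
    "sh:severity": "sh:Warning",
    "sh:message": ["Please add the employee's age."],
}
# A shape with no severity or message of its own.
name_shape = {}

results = [
    make_result(age_shape, "ex:alice", "Less than 1 value"),
    make_result(name_shape, "ex:bob", "Less than 1 value"),
]
counts = Counter(result["sh:resultSeverity"] for result in results)
```

Grouping results by severity, as in counts, is one way an application might decide whether to block a change (violations) or merely notify the user (warnings and infos).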

Specifications

The World Wide Web Consortium has published the following SHACL specifications:

SHACL^[1] (W3C Technical Recommendation) is the main document, defining the features of SHACL Core and its extension mechanism called SHACL-SPARQL. SHACL Core defines the basic syntax and structure of shapes, constraints, the built-in kinds of constraints, and how to link shapes to data nodes. SHACL-SPARQL defines how to express constraints that are not covered by the built-in constraint kinds.
SHACL Advanced Features^[4] (W3C Working Group Note), the most recent version of which is maintained by the SHACL Community Group, defines support for SHACL Rules, a feature (inspired by SPIN rules) for data transformations, inferences and mappings based on data shapes. It also includes extensions of SHACL-SPARQL such as user-defined functions.
SHACL JavaScript Extensions^[5] (W3C Working Group Note) defines how JavaScript can be used to express constraints, rules, functions and other features. This covers similar ground as SHACL-SPARQL, but using JavaScript as its execution language.
SHACL Compact Syntax^[6] (SHACL Community Group Report).

Open-source tools

The SHACL Test Suite and Implementation Report^[7] linked to from the SHACL W3C specification lists some open source tools that could be used for SHACL validation as of June 2019. By the end of 2019 many commercial RDF database and framework vendors announced support for at least SHACL Core.

Some of the open source tools listed in the report are:

dotNetRDF SHACL - an online SHACL validator service written for the .NET Framework.^[8]^[9]
pySHACL - an open source SHACL validator library, also usable from the command line, written in Python.^[10]
SHaclEX - a Scala implementation of both SHACL and ShEx.^[11]
TopBraid SHACL API - an open source implementation of SHACL by TopQuadrant, based on Apache Jena. It covers SHACL Core and SHACL-SPARQL validation as well as SHACL Advanced Features, SHACL JavaScript Extensions and SHACL Compact Syntax. The same code is used in the TopBraid commercial products.^[12]

SHACL Playground is a free SHACL validation service implemented in JavaScript.^[13]

Eclipse RDF4J is an open source Java framework by the Eclipse Foundation for processing RDF data, which supports SHACL validation.^[14]

Commercial tools

SHACL is supported by most RDF graph technology vendors, including Cambridge Semantics (Anzo),^[15] Franz (AllegroGraph), Metaphacts,^[16] Ontotext (GraphDB),^[17] Stardog^[18] and TopQuadrant. There is even support in commercial products that use the property graph data model, such as Neo4j.^[19]

Levels of implementation may vary. At a minimum, vendors support SHACL Core. Some also support SHACL-SPARQL for higher expressivity, while others may support SHACL Advanced Features, which include rules and functions.

References

[1] Knublauch, Holger; Kontokostas, Dimitris, eds. (2017-07-20). "Shapes Constraint Language (SHACL)". W3C. RDF Data Shapes Working Group. Retrieved 2021-04-06.
[2] "Shapes Constraint Language (SHACL) Publication History - W3C". W3C. 2017-07-20. Retrieved 2021-04-06.
[3] "CAMSS Assessment of SHACL by the European Commission".
[4] Knublauch, Holger; Allemang, Dean; Steyskal, Simon, eds. (2017-06-08). "SHACL Advanced Features". W3C. RDF Data Shapes Working Group. Retrieved 2021-04-06.
[5] Knublauch, Holger; Maria, Pano, eds. (2018-01-09). "SHACL JavaScript Extensions". W3C. SHACL Community Group.
[6] Knublauch, Holger; Maria, Pano, eds. (2018-01-09). "SHACL Compact Syntax". W3C. SHACL Community Group.
[7] Labra Gayo, Jose Emilio; Knublauch, Holger; Kontokostas, Dimitris, eds. (2021-01-22). "SHACL Test Suite and Implementation Report". W3C.
[8] Lang, Samu (n.d.). "dotNetRDF SHACL". langsamu.net. Retrieved 2021-04-06.
[9] Lang, Samu (2019-06-01). "dotNetRDF SHACL validator service". GitHub. Retrieved 2021-04-07.
[10] Sommer, Ashley; Car, Nicholas (2018-08-15). "RDFLib/pySHACL: A Python validator for SHACL". GitHub. Retrieved 2021-04-06.
[11] Labra Gayo, Jose Emilio; et al. (Web Semantics Oviedo, University of Oviedo). "weso/shaclex: SHACL/ShEx implementation". GitHub. Retrieved 2021-04-06.
[12] Knublauch, Holger (2015-05-24). "TopQuadrant/shacl: SHACL API in Java based on Apache Jena". GitHub. Retrieved 2021-04-06.
[13] Knublauch, Holger (2017-05-01). "SHACL Playground". SHACL Playground. Retrieved 2021-04-07.
[14] "Programming With RDF4J: Validation With SHACL". Retrieved 2024-08-29.
[15] "AnzoGraph DB 3.1: Validate Data with SHACL (Preview)". Retrieved 2024-08-29.
[16] "metaphactory: Data Quality Service". Retrieved 2024-08-29.
[17] "GraphDB 10.7: SHACL validation". Retrieved 2024-08-29.
[18] "Stardog: Data Quality Constraints: SHACL Constraints". Retrieved 2024-08-29.
[19] "Validating Neo4j graphs against SHACL". Retrieved 2024-08-29.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

