XML Signature

Last updated January 20, 2025

XML Signature (also called XMLDSig, XML-DSig, XML-Sig) defines an XML syntax for digital signatures and is defined in the W3C recommendation XML Signature Syntax and Processing. Functionally, it has much in common with PKCS #7 but is more extensible and geared towards signing XML documents. It is used by various Web technologies such as SOAP, SAML, and others.

XML signatures can be used to sign data–a resource–of any type, typically XML documents, but anything that is accessible via a URL can be signed. An XML signature used to sign a resource outside its containing XML document is called a detached signature; if it is used to sign some part of its containing document, it is called an enveloped signature;^[1] if it contains the signed data within itself it is called an enveloping signature.^[2]

Structure

An XML Signature consists of a Signature element in the http://www.w3.org/2000/09/xmldsig# namespace. The basic structure is as follows:

<Signature><SignedInfo><CanonicalizationMethod/><SignatureMethod/><Reference><Transforms/><DigestMethod/><DigestValue/></Reference><Reference/>etc. </SignedInfo><SignatureValue/><KeyInfo/><Object/></Signature>

The SignedInfo element contains or references the signed data and specifies what algorithms are used.
- The SignatureMethod and CanonicalizationMethod elements are used by the SignatureValue element and are included in SignedInfo to protect them from tampering.
- One or more Reference elements specify the resource being signed by URI reference and any transformations to be applied to the resource prior to signing.
  - Transforms contains the transformations applied to the resource prior to signing. A transformation can be a XPath-expression that selects a defined subset of the document tree.^[3]
  - DigestMethod specifies the hash algorithm before applying the hash.
  - DigestValue contains the Base64 encoded result of applying the hash algorithm to the transformed resource(s) defined in the Reference element attributes.
The SignatureValue element contains the Base64 encoded signature result - the signature generated with the parameters specified in the SignatureMethod element - of the SignedInfo element after applying the algorithm specified by the CanonicalizationMethod.
KeyInfo element optionally allows the signer to provide recipients with the key that validates the signature, usually in the form of one or more X.509 digital certificates. The relying party must identify the key from context if KeyInfo is not present.
The Object element (optional) contains the signed data if this is an enveloping signature.

Validation and security considerations

When validating an XML Signature, a procedure called Core Validation is followed.

Reference Validation: Each Reference's digest is verified by retrieving the corresponding resource and applying any transforms and then the specified digest method to it. The result is compared to the recorded DigestValue; if they do not match, validation fails.
Signature Validation: The SignedInfo element is serialized using the canonicalization method specified in CanonicalizationMethod, the key data is retrieved using KeyInfo or by other means, and the signature is verified using the method specified in SignatureMethod.

This procedure establishes whether the resources were really signed by the alleged party. However, because of the extensibility of the canonicalization and transform methods, the verifying party must also make sure that what was actually signed or digested is really what was present in the original data, in other words, that the algorithms used there can be trusted not to change the meaning of the signed data.

Because the signed document's structure can be tampered with leading to "signature wrapping" attacks, the validation process should also cover XML document structure. Signed element and signature element should be selected using absolute XPath expression, not getElementByName methods.^[4]

XML canonicalization

The creation of XML Signatures is substantially more complex than the creation of an ordinary digital signature because a given XML Document (an "Infoset", in common usage among XML developers) may have more than one legal serialized representation. For example, whitespace inside an XML Element is not syntactically significant, so that <Elem > is syntactically identical to <Elem>.

Since the digital signature ensures data integrity, a single-byte difference would cause the signature to vary. Moreover, if an XML document is transferred from computer to computer, the line terminator may be changed from CR to LF to CR LF, etc. A program that digests and validates an XML document may later render the XML document in a different way, e.g. adding excess space between attribute definitions with an element definition, or using relative (vs. absolute) URLs, or by reordering namespace definitions. Canonical XML is especially important when an XML Signature refers to a remote document, which may be rendered in time-varying ways by an errant remote server.

To avoid these problems and guarantee that logically-identical XML documents give identical digital signatures, an XML canonicalization transform (frequently abbreviated C14n) is employed when signing XML documents (for signing the SignedInfo, a canonicalization is mandatory). These algorithms guarantee that semantically-identical documents produce exactly identical serialized representations.

Another complication arises because of the way that the default canonicalization algorithm handles namespace declarations; frequently a signed XML document needs to be embedded in another document; in this case the original canonicalization algorithm will not yield the same result as if the document is treated alone. For this reason, the so-called Exclusive Canonicalization, which serializes XML namespace declarations independently of the surrounding XML, was created.

Benefits

XML Signature is more flexible than other forms of digital signatures such as Pretty Good Privacy and Cryptographic Message Syntax, because it does not operate on binary data, but on the XML Infoset, allowing to work on subsets of the data (this is also possible with binary data in non-standard ways, for example encoding blocks of binary data in base64 ASCII), having various ways to bind the signature and signed information, and perform transformations. Another core concept is canonicalization, that is to sign only the "essence", eliminating meaningless differences like whitespace and line endings.

Issues

There are criticisms directed at the architecture of XML security in general,^[5] and at the suitability of XML canonicalization in particular as a front end to signing and encrypting XML data due to its complexity, inherent processing requirement, and poor performance characteristics.^[6]^[7]^[8] The argument is that performing XML canonicalization causes excessive latency that is simply too much to overcome for transactional, performance sensitive SOA applications.

These issues are being addressed in the XML Security Working Group.^[9]^[10]

Without proper policy and implementation^[4] the use of XML Dsig in SOAP and WS-Security can lead to vulnerabilities,^[11] such as XML signature wrapping.^[12]

Applications

An example of applications of XML Signatures:

Digital signing of XBRL annual reports by auditors in the Netherlands. A PKIoverheid X.509 certificate, approved by the Royal National Institute of Chartered Accountants [ nl ], is required. The electronic signature is legally binding. The SBR Assurance standard^[13] is part of the Dutch Standard Business Reporting program.

Related Research Articles

A document type definition (DTD) is a specification file that contains set of markup declarations that define a document type for an SGML-family markup language. The DTD specification file can be used to validate documents.

A Uniform Resource Identifier (URI), formerly Universal Resource Identifier, is a unique sequence of characters that identifies an abstract or physical resource, such as resources on a webpage, mail address, phone number, books, real-world objects such as people and places, concepts. URIs are used to identify anything described using the Resource Description Framework (RDF), for example, concepts that are part of an ontology defined using the Web Ontology Language (OWL), and people who are described using the Friend of a Friend vocabulary would each have an individual URI.

Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications—all of them free open standards—define XML.

In computer programming, an S-expression is an expression in a like-named notation for nested list (tree-structured) data. S-expressions were invented for and popularized by the programming language Lisp, which uses them for source code as well as data.

XSD, a recommendation of the World Wide Web Consortium (W3C), specifies how to formally describe the elements in an Extensible Markup Language (XML) document. It can be used by programmers to verify each piece of item content in a document, to assure it adheres to the description of the element it is placed in.

In computing, RELAX NG is a schema language for XML—a RELAX NG schema specifies a pattern for the structure and content of an XML document. A RELAX NG schema is itself an XML document but RELAX NG also offers a popular compact, non-XML syntax. Compared to other XML schema languages RELAX NG is considered relatively simple.

Schematron is a rule-based validation language for making assertions about the presence or absence of patterns in XML trees. It is a structural schema language expressed in XML using a small number of elements and XPath languages. In many implementations, the Schematron XML is processed into XSLT code for deployment anywhere that XSLT can be used.

An XML schema is a description of a type of XML document, typically expressed in terms of constraints on the structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by XML itself. These constraints are generally expressed using some combination of grammatical rules governing the order of elements, Boolean predicates that the content must satisfy, data types governing the content of elements and attributes, and more specialized rules such as uniqueness and referential integrity constraints.

XPath 2.0 is a version of the XPath language defined by the World Wide Web Consortium, W3C. It became a recommendation on 23 January 2007. As a W3C Recommendation it was superseded by XPath 3.0 on 10 April 2014.

In computer science, canonicalization is a process for converting data that has more than one possible representation into a "standard", "normal", or canonical form. This can be done to compare different representations for equivalence, to count the number of distinct data structures, to improve the efficiency of various algorithms by eliminating repeated calculations, or to make it possible to impose a meaningful sorting order.

In computer hypertext, a URI fragment is a string of characters that refers to a resource that is subordinate to another, primary resource. The primary resource is identified by a Uniform Resource Identifier (URI), and the fragment identifier points to the subordinate resource.

XML Encryption (XML-Enc) is a specification governed by a World Wide Web Consortium (W3C) recommendation, that defines how to encrypt the contents of an XML element.

XML Information Set is a W3C specification describing an abstract data model of an XML document in terms of a set of information items. The definitions in the XML Information Set specification are meant to be used in other specifications that need to refer to the information in a well-formed XML document.

XML documents have a hierarchical structure and can conceptually be interpreted as a tree structure, called an XML tree.

Canonical XML is a normal form of XML, intended to allow relatively simple comparison of pairs of XML documents for equivalence; for this purpose, the Canonical XML transformation removes non-meaningful differences between the documents. Any XML document can be converted to Canonical XML.

Security Assertion Markup Language 2.0 (SAML 2.0) is a version of the SAML standard for exchanging authentication and authorization identities between security domains. SAML 2.0 is an XML-based protocol that uses security tokens containing assertions to pass information about a principal between a SAML authority, named an Identity Provider, and a SAML consumer, named a Service Provider. SAML 2.0 enables web-based, cross-domain single sign-on (SSO), which helps reduce the administrative overhead of distributing multiple authentication tokens to the user. SAML 2.0 was ratified as an OASIS Standard in March 2005, replacing SAML 1.1. The critical aspects of SAML 2.0 are covered in detail in the official documents SAMLCore, SAMLBind, SAMLProf, and SAMLMeta.

XPath is an expression language designed to support the query or transformation of XML documents. It was defined by the World Wide Web Consortium (W3C) in 1999, and can be used to compute values from the content of an XML document. Support for XPath exists in applications that support XML, such as web browsers, and many programming languages.

Content Assembly Mechanism (CAM) is an XML-based standard for creating and managing information exchanges that are interoperable and deterministic descriptions of machine-processable information content flows into and out of XML structures. CAM is a product of the OASIS Content Assembly Technical Committee.

Virtual Token Descriptor for eXtensible Markup Language (VTD-XML) refers to a collection of cross-platform XML processing technologies centered on a non-extractive XML, "document-centric" parsing technique called Virtual Token Descriptor (VTD). Depending on the perspective, VTD-XML can be viewed as one of the following:

References

↑ "XML Signature Syntax and Processing Version 1.1".
↑ "XML Signature Syntax and Processing Version 1.1".
↑ XML-Signature XPath Filter 2.0
1 2 Pawel Krawczyk (2013). "Secure SAML validation to prevent XML signature wrapping attacks".
↑ "Why XML Security is Broken".
↑ Performance of Web Services Security
↑ Performance Comparison of Security Mechanisms for Grid Services
↑ Zhang, Jimmy (January 9, 2007). "Accelerate WSS applications with VTD-XML". JavaWorld . Retrieved 2020-07-24.
↑ W3C Workshop on Next Steps for XML Signature and XML Encryption, 2007
↑ "XML Security 2.0 Requirements and Design Considerations".
↑ "XML Signature Element Wrapping Attacks and Countermeasures" (PDF). IBM Research Division. Retrieved 2023-09-07.
↑ Juraj Somorovsky; Andreas Mayer; Jorg Schwenk; Marco Kampmann; Meiko Jensen (2012). "On Breaking SAML: Be Whoever You Want to Be" (PDF).
↑ "SBR Assurance" . Retrieved 2023-09-07.

External links

XML Signature Syntax and Processing
Canonical XML
Additional XML Security Uniform Resource Identifiers (URIs)
Exclusive XML Canonicalization
XMLSignatures Java binding for XMLBeans and JAXB.
Step-by-Step example of how a signature is created.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] "XML Signature Syntax and Processing Version 1.1".

[2] "XML Signature Syntax and Processing Version 1.1".

[3] XML-Signature XPath Filter 2.0

[pk-4] 1 2 Pawel Krawczyk (2013). "Secure SAML validation to prevent XML signature wrapping attacks".

[5] "Why XML Security is Broken".

[6] Performance of Web Services Security

[7] Performance Comparison of Security Mechanisms for Grid Services

[8] Zhang, Jimmy (January 9, 2007). "Accelerate WSS applications with VTD-XML". JavaWorld . Retrieved 2020-07-24.

[9] W3C Workshop on Next Steps for XML Signature and XML Encryption, 2007

[10] "XML Security 2.0 Requirements and Design Considerations".

[11] "XML Signature Element Wrapping Attacks and Countermeasures" (PDF). IBM Research Division. Retrieved 2023-09-07.

[12] Juraj Somorovsky; Andreas Mayer; Jorg Schwenk; Marco Kampmann; Meiko Jensen (2012). "On Breaking SAML: Be Whoever You Want to Be" (PDF).

[sbra-13] "SBR Assurance" . Retrieved 2023-09-07.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]