XML validation

XML validation is the process of checking a document written in XML (eXtensible Markup Language) to confirm that it is both well-formed and "valid", meaning that it follows a defined structure. A well-formed document follows the basic syntactic rules of XML, which are the same for all XML documents. [1] A valid document also respects the rules dictated by a particular DTD or XML schema. [2] Automated tools, called validators, can perform well-formedness tests and many other validation tests, but not those that require human judgement, such as deciding whether a schema has been applied correctly to a data set.
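The two stages can be illustrated with a minimal Python sketch. It assumes a hypothetical `<book>` document type whose only rule is that the root contains exactly one `<title>` child; that hand-written rule stands in for a real DTD or XML schema, and the function name `check` is likewise illustrative. The standard-library parser used here is non-validating, so it only reports well-formedness errors.

```python
import xml.etree.ElementTree as ET


def check(document):
    """Classify a document as not well-formed, well-formed but invalid, or valid.

    The validity rule is a hypothetical stand-in for a DTD or schema:
    the root must be <book> and must contain exactly one <title> child.
    """
    # Stage 1: well-formedness. The parser rejects any syntax violation.
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        return "not well-formed"

    # Stage 2: validity against the (hypothetical) structural rule.
    if root.tag != "book" or len(root.findall("title")) != 1:
        return "well-formed but invalid"
    return "valid"


print(check("<book><title>XML</title></book>"))    # valid
print(check("<book><author>Ann</author></book>"))  # well-formed but invalid
print(check("<book><title>XML</book>"))            # not well-formed
```

A production validator would load the rules from a DTD or schema file rather than hard-coding them, but the order of the checks is the same: a document that fails the first stage is never tested against the second.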

Related Research Articles

A document type definition (DTD) is a set of markup declarations that define a document type for an SGML-family markup language.

Standard Generalized Markup Language: markup language

The Standard Generalized Markup Language is a standard for defining generalized markup languages for documents. ISO 8879 Annex A.1 states that generalized markup is "based on two postulates".

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications—all of them free open standards—define XML.

XSD (XML Schema Definition), a recommendation of the World Wide Web Consortium (W3C), specifies how to formally describe the elements in an Extensible Markup Language (XML) document. Programmers can use it to verify each piece of content in a document, checking that it adheres to the description of the element in which it is placed.

In computing, RELAX NG is a schema language for XML: a RELAX NG schema specifies a pattern for the structure and content of an XML document. A RELAX NG schema is itself an XML document, but RELAX NG also offers a popular compact, non-XML syntax. Compared to other XML schema languages, RELAX NG is considered relatively simple.

Chemical Markup Language is an approach to managing molecular information using tools such as XML and Java. It was the first domain-specific implementation based strictly on XML, first based on a DTD and later on an XML Schema, the most robust and widely used system for precise information management in many areas. It has been developed over more than a decade by Murray-Rust, Rzepa and others and has been tested in many areas and on a variety of machines.

An XML schema is a description of a type of XML document, typically expressed in terms of constraints on the structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by XML itself. These constraints are generally expressed using some combination of grammatical rules governing the order of elements, Boolean predicates that the content must satisfy, data types governing the content of elements and attributes, and more specialized rules such as uniqueness and referential integrity constraints.
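The kinds of constraint listed above can be sketched in Python. The example below is not a schema language: it hand-codes three rules for a hypothetical `<inventory>` document (a fixed child order, an integer data type, and unique `id` attributes) to show what a schema would declare and a validator would enforce. All element names and the function `validate_inventory` are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical grammatical rule: each <item> holds <name> then <quantity>.
CHILD_ORDER = ["name", "quantity"]


def validate_inventory(xml_text):
    """Return a list of constraint violations for an <inventory> document."""
    errors = []
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    seen_ids = set()

    for item in root.findall("item"):
        # Grammatical rule: children must appear in the declared order.
        if [child.tag for child in item] != CHILD_ORDER:
            errors.append("item children must be name, quantity in that order")
            continue

        # Data type rule: quantity must be a non-negative decimal integer.
        quantity = (item.findtext("quantity") or "").strip()
        if not re.fullmatch(r"[0-9]+", quantity):
            errors.append(f"quantity {quantity!r} is not a non-negative integer")

        # Uniqueness rule: no two items may share an id attribute.
        item_id = item.get("id")
        if item_id in seen_ids:
            errors.append(f"duplicate id {item_id!r}")
        seen_ids.add(item_id)

    return errors


sample = (
    '<inventory>'
    '<item id="a"><name>Bolt</name><quantity>many</quantity></item>'
    '<item id="a"><name>Nut</name><quantity>3</quantity></item>'
    '</inventory>'
)
for problem in validate_inventory(sample):
    print(problem)
```

In practice these rules would be written declaratively in a DTD, XSD, or RELAX NG schema and checked by a validating processor; hard-coding them as above is only meant to make the constraint categories concrete.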

W3C Markup Validation Service: validator service by the World Wide Web Consortium

The Markup Validation Service is a validator by the World Wide Web Consortium (W3C) that allows Internet users to check HTML and XHTML documents for well-formed markup. Markup validation is an important step towards ensuring the technical quality of web pages. However, it is not a complete measure of web standards conformance. Though W3C validation is important for browser compatibility and site usability, it has not been confirmed what effect it has on search engine optimization.

XML Information Set is a W3C specification describing an abstract data model of an XML document in terms of a set of information items. The definitions in the XML Information Set specification are meant to be used in other specifications that need to refer to the information in a well-formed XML document.

Rick Jelliffe: Australian computer programmer

Richard (Rick) Alan Jelliffe is an Australian programmer and standards activist, particularly associated with web standards, markup languages, internationalization and schema languages. He is the founder and Chief Technical Officer of Topologi Pty. Ltd, an XML tools vendor in Sydney. He has a degree in economics from the University of Sydney.

Oxygen XML Editor: multi-platform XML editor, XSLT/XQuery debugger and profiler

The Oxygen XML Editor is a multi-platform XML editor, XSLT/XQuery debugger and profiler with Unicode support. It is a Java application, so it can run on Windows, Mac OS X, and Linux. A version that runs as an Eclipse plugin is also available.

Extensible Forms Description Language (XFDL) is a high-level computer language that facilitates defining a form as a single, stand-alone object using elements and attributes from the Extensible Markup Language (XML). Technically, it is a class of XML originally specified in a World Wide Web Consortium (W3C) Note. XFDL offers precise control over form layout, permitting replacement of existing business/government forms with electronic documents in a human-readable, open standard.

Data exchange is the process of taking data structured under a source schema and transforming it into data structured under a target schema, so that the target data is an accurate representation of the source data. Data exchange allows data to be shared between different computer programs.

eXtensible HyperText Markup Language (XHTML) is part of the family of XML markup languages. It mirrors or extends versions of the widely used HyperText Markup Language (HTML), the language in which Web pages are formulated.

Content Assembly Mechanism (CAM) is an XML-based standard for creating and managing information exchanges that are interoperable and deterministic descriptions of machine-processable information content flows into and out of XML structures. CAM is a product of the OASIS Content Assembly Technical Committee.

A structured document is an electronic document where some method of markup is used to identify the whole and parts of the document as having various meanings beyond their formatting. For example, a structured document might identify a certain portion as a "chapter title" rather than as "Helvetica bold 24" or "indented Courier". Such portions in general are commonly called "components" or "elements" of a document.

XHTML+RDFa is an extended version of the XHTML markup language for supporting RDF through a collection of attributes and processing rules in the form of well-formed XML documents. XHTML+RDFa is one of the techniques used to develop Semantic Web content by embedding rich semantic markup. Version 1.1 of the language is a superset of XHTML 1.1, integrating the attributes according to RDFa Core 1.1. In other words, it adds RDFa support through XHTML Modularization.

Liquid XML Studio IDE is a Windows-based XML editor and XML data binding toolkit. It includes graphical editors for authoring XML documents, XML Schema, WSDL documents, XSLT documents and HTML documents. It also includes a user interface extension for Microsoft Visual Studio through the Visual Studio Industry Partner (VSIP) program.

A well-formed document in XML is a document that "adheres to the syntax rules specified by the XML 1.0 specification in that it must satisfy both physical and logical structures".
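A few common violations of those syntax rules can be demonstrated with a short Python sketch. It relies on the standard-library `xml.etree.ElementTree` parser, whose `ParseError` carries a `position` attribute giving the line and column of the first problem; the sample snippets and the helper name `well_formedness_error` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Illustrative snippets, each breaking (or obeying) a basic XML syntax rule.
SAMPLES = {
    "improperly nested tags": "<a><b></a></b>",
    "more than one root element": "<a/><b/>",
    "unquoted attribute value": "<a id=1/>",
    "well-formed": '<a id="1"><b/></a>',
}


def well_formedness_error(text):
    """Return None if text is well-formed, else the (line, column) of the error."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err.position
    return None


for label, text in SAMPLES.items():
    position = well_formedness_error(text)
    status = "ok" if position is None else f"error at line/column {position}"
    print(f"{label}: {status}")
```

Because these rules are the same for every XML document, a check like this needs no DTD or schema; it only establishes that the document can be parsed at all.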

References

  1. "Well-Formed XML Documents". Extensible Markup Language (XML) 1.1. W3C. 2004.
  2. "Constraints and Validation Rules". XML Schema Part 1: Structures Second Edition. W3C. 2004.