Filename extension | .xpl |
---|---|
Internet media type | application/xproc+xml |
Developed by | World Wide Web Consortium |
Type of format | Stylesheet language |
Extended from | XML |
Standard | XProc 3.0 |

XProc is an XML transformation language for processing documents in pipelines: chaining conversions and other steps together to achieve the desired results. The current (stable) version is 3.0 [1]. It is a W3C Recommendation. It can handle documents in XML, HTML, JSON, text, and binary formats.
Its main characteristics are:
The following is a (very) simple XProc pipeline:

    <p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
      <p:input port="source"/>
      <p:output port="result"/>
      <p:add-attribute attribute-name="timestamp" attribute-value="{current-dateTime()}"/>
      <p:delete match="@data"/>
    </p:declare-step>

This pipeline works as follows:

- It has an input port called source. This is where the original document flows in.
- It has an output port called result. This is where the resulting document flows out.
- The document arriving on the source port automatically flows into the first step of the pipeline. This p:add-attribute step adds an attribute called timestamp with the current date and time.
- The result then flows automatically into the p:delete step that removes all attributes called data.
- Because p:delete is the last step, the resulting document flows out through the output result port.

So if you supply the following XML document to this pipeline:

    <example data="321">
      <item data="123">Somedata...</item>
    </example>

It comes out as:

    <example timestamp="2024-09-11T15:05:22.82+02:00">
      <item>Somedata...</item>
    </example>

The exact date and time recorded in the timestamp attribute of course depends on when the pipeline is executed.
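
To make the data flow concrete, the effect of the two steps can be mimicked outside XProc. The following is only a rough Python sketch using the standard library, not how an XProc processor works internally. The function names `add_attribute`, `delete_attributes` and `run_pipeline` are hypothetical, and the sketch assumes that p:add-attribute only touches the document element, as the example output suggests.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def add_attribute(root, name, value):
    """Mimic p:add-attribute: set the attribute on the document element only (assumption)."""
    root.set(name, value)
    return root


def delete_attributes(root, name):
    """Mimic p:delete match="@data": drop the named attribute from every element."""
    for element in root.iter():
        element.attrib.pop(name, None)
    return root


def run_pipeline(source_xml):
    """Chain the two steps: the output of the first is the input of the second."""
    root = ET.fromstring(source_xml)  # the document arriving on the "source" port
    timestamp = datetime.now().astimezone().isoformat()
    root = add_attribute(root, "timestamp", timestamp)
    root = delete_attributes(root, "data")
    return ET.tostring(root, encoding="unicode")  # what leaves through "result"


if __name__ == "__main__":
    print(run_pipeline('<example data="321"><item data="123">Somedata...</item></example>'))
```

As in the XProc version, the printed timestamp differs on every run. Unlike XProc, the chaining between the steps has to be written out by hand here.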
The learning page of the XProc website [2] contains links to all the learning and reference materials the XProc community group is aware of. There is a special 101 section with introductory learning materials.
The idea of a dedicated language for describing XML processing goes back to the early days of XML, at the end of the twentieth century. However, it was not until the end of 2005 that the W3C started a working group called the XML Processing Model Working Group. This resulted in the XProc 1.0 Recommendation, dated May 11, 2010 [3].
There were various attempts to create working XProc 1.0 processors. The only two currently available as open source products that implement the full 1.0 standard are XML Calabash [4] and MorganaXProc [5].
After the release of version 1.0, the XProc working group continued to discuss a next version. Ideas were raised for a version 2.0 based on a non-XML syntax, but this did not gain much support from the community. Enthusiasm and energy in the working group gradually declined, and in 2016 it ceased to exist.
In June 2017 the XProc Next Community Group [6] was founded and started working on a new version, now completely XML-based. Because this was a completely different approach from the 2.0 initiative, the version number was increased to 3.0. A stable version was released on 12 September 2022 [1].
In 2024 the community group started work on a minor update, version 3.1.
The XProc logo was created by Bethan Tovey-Walsh. It depicts a fish called Kanava, which is Finnish for pipeline.