In software, an XML pipeline is formed when XML (Extensible Markup Language) processes, especially XML transformations and XML validations, are connected.
For instance, given two transformations T1 and T2, the two can be connected so that an input XML document is transformed by T1 and then the output of T1 is fed as input document to T2. Simple pipelines like the one described above are called linear; a single input document always goes through the same sequence of transformations to produce a single output document.
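As an illustration only, the linear case can be sketched in Python with the standard library's `xml.etree.ElementTree`. The transformations `t1_uppercase_tags` and `t2_wrap` and the helper `run_linear_pipeline` are hypothetical names chosen for this sketch; they stand in for T1 and T2 and do not come from any pipeline standard.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def t1_uppercase_tags(root):
    """Hypothetical T1: return a copy of the tree with upper-cased tag names."""
    out = ET.Element(root.tag.upper(), root.attrib)
    out.text, out.tail = root.text, root.tail
    for child in root:
        out.append(t1_uppercase_tags(child))
    return out


def t2_wrap(root):
    """Hypothetical T2: wrap the whole document in an <ENVELOPE> element."""
    envelope = ET.Element("ENVELOPE")
    envelope.append(root)
    return envelope


def run_linear_pipeline(document, steps):
    """Feed the output of each step to the next one, in order."""
    return reduce(lambda doc, step: step(doc), steps, document)


source = ET.fromstring("<order><item>pen</item></order>")
result = run_linear_pipeline(source, [t1_uppercase_tags, t2_wrap])
print(ET.tostring(result, encoding="unicode"))
# <ENVELOPE><ORDER><ITEM>pen</ITEM></ORDER></ENVELOPE>
```

Because the steps are held in an ordered list, every input document passes through the same sequence, which is what makes the pipeline linear.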
Linear operations can be divided into at least the following kinds:
Operations that work at the inner document level.
Operations that take the input document as a whole.
Operations that handle a sequence of documents as a whole; these are mainly introduced in XProc.
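The difference between these levels can be sketched roughly as follows. The three functions are hypothetical examples written for this illustration, using Python's `xml.etree.ElementTree`; they are not operations defined by XProc or any other pipeline language.

```python
import xml.etree.ElementTree as ET


def rename_elements(root, old, new):
    """Inner-document level: modify individual elements inside one document."""
    for element in list(root.iter(old)):
        element.tag = new
    return root


def has_root(root, expected_tag):
    """Whole-document level: treat the document as a single unit."""
    return root.tag == expected_tag


def merge_documents(documents, wrapper_tag):
    """Sequence level: handle a sequence of documents as a whole."""
    wrapper = ET.Element(wrapper_tag)
    wrapper.extend(documents)
    return wrapper


docs = [ET.fromstring("<note><p>one</p></note>"),
        ET.fromstring("<note><p>two</p></note>")]
docs = [rename_elements(d, "p", "para") for d in docs]
all_notes = all(has_root(d, "note") for d in docs)
merged = merge_documents(docs, "notes")
print(ET.tostring(merged, encoding="unicode"))
# <notes><note><para>one</para></note><note><para>two</para></note></notes>
```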
Non-linear operations on pipelines may include:
Some standards also categorize transformations as macro (changes affecting an entire file) or micro (changes affecting only an element or attribute).
XML pipeline languages are used to define pipelines. A program written with an XML pipeline language is implemented by software known as an XML pipeline engine, which creates processes, connects them together and finally executes the pipeline. Existing XML pipeline languages include:
Different XML pipeline implementations support different granularities of flow.
Until May 2010, there was no widely used standard for XML pipeline languages. With the publication of XProc as a W3C Recommendation in May 2010, [6] wider adoption was expected.
Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications—all of them free open standards—define XML.
In computing, the term Extensible Stylesheet Language (XSL) is used to refer to a family of languages used to transform and render XML documents.
XSLT is a language originally designed for transforming XML documents into other XML documents, or other formats such as HTML for web pages, plain text or XSL Formatting Objects, which may subsequently be converted to other formats, such as PDF, PostScript and PNG. Support for JSON and plain-text transformation was added in later versions of the XSLT specification.
In computing, the Java API for XML Processing, or JAXP, one of the Java XML Application programming interfaces, provides the capability of validating and parsing XML documents. It has three basic parsing interfaces:
Schematron is a rule-based validation language for making assertions about the presence or absence of patterns in XML trees. It is a structural schema language expressed in XML using a small number of elements and XPath expressions. In many implementations, the Schematron XML is processed into XSLT code for deployment anywhere that XSLT can be used.
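The idea of rule-based assertions over an XML tree can be illustrated with a small Python sketch. This is not Schematron and does not use its syntax; the `RULES` table and the `validate` function are hypothetical, and the context paths use only the limited path syntax supported by `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Each rule: (context path, test on each matched element, failure message).
RULES = [
    ("./item", lambda el: el.get("id") is not None, "item must have an id"),
    ("./item", lambda el: el.find("name") is not None, "item must contain a name"),
]


def validate(root, rules):
    """Return the messages of all assertions that fail, in rule order."""
    failures = []
    for context, test, message in rules:
        for element in root.findall(context):
            if not test(element):
                failures.append(message)
    return failures


doc = ET.fromstring("<list><item id='1'><name>x</name></item><item/></list>")
failures = validate(doc, RULES)
print(failures)
# ['item must have an id', 'item must contain a name']
```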
XForms is an XML format used for collecting inputs from web forms. XForms was designed to be the next generation of HTML / XHTML forms, but is generic enough that it can also be used in a standalone manner or with presentation languages other than XHTML to describe a user interface and a set of common data manipulation tasks.
XSL-FO is a markup language for XML document formatting that is most often used to generate PDF files. XSL-FO is part of XSL, a set of W3C technologies designed for the transformation and formatting of XML data. The other parts of XSL are XSLT and XPath. Version 1.1 of XSL-FO was published in 2006.
Apache Cocoon, usually abbreviated as Cocoon, is a web application framework built around the concepts of Pipeline, separation of concerns, and component-based web development. The framework focuses on XML and XSLT publishing and is built using the Java programming language. Cocoon's use of XML is intended to improve compatibility of publishing formats, such as HTML and PDF. The content management systems Apache Lenya and Daisy have been created on top of the framework. Cocoon is also commonly used as a data warehousing ETL tool or as middleware for transporting data between systems.
An XML schema is a description of a type of XML document, typically expressed in terms of constraints on the structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by XML itself. These constraints are generally expressed using some combination of grammatical rules governing the order of elements, Boolean predicates that the content must satisfy, data types governing the content of elements and attributes, and more specialized rules such as uniqueness and referential integrity constraints.
In software engineering, a pipeline consists of a chain of processing elements, arranged so that the output of each element is the input of the next. The concept is analogous to a physical pipeline. Usually some amount of buffering is provided between consecutive elements. The information that flows in these pipelines is often a stream of records, bytes, or bits, and the elements of a pipeline may be called filters. This arrangement is also called the pipes and filters design pattern, which is monolithic. Its advantages are simplicity and low cost, while its disadvantages are a lack of elasticity, fault tolerance, and scalability. Connecting elements into a pipeline is analogous to function composition.
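The analogy with function composition can be made concrete with a minimal Python sketch. The filters `strip_blank` and `to_upper` and the helper `compose` are hypothetical names for this example; generators are used so that records flow through one at a time, and no explicit buffering between elements is modeled.

```python
from functools import reduce


def strip_blank(records):
    """Filter: drop records that are empty or whitespace only."""
    for record in records:
        if record.strip():
            yield record


def to_upper(records):
    """Filter: upper-case every record."""
    for record in records:
        yield record.upper()


def compose(*filters):
    """Chain filters left to right, so each output feeds the next input."""
    return lambda stream: reduce(lambda s, f: f(s), filters, stream)


pipeline = compose(strip_blank, to_upper)
output = list(pipeline(["alpha", "  ", "beta"]))
print(output)
# ['ALPHA', 'BETA']
```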
GRDDL is a markup format for Gleaning Resource Descriptions from Dialects of Languages. It is a W3C Recommendation, and enables users to obtain RDF triples out of XML documents, including XHTML. The GRDDL specification shows examples using XSLT; however, it was intended to be abstract enough to allow for other implementations as well. It became a Recommendation on September 11, 2007.
The identity transform is a data transformation that copies the source data into the destination data without change.
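For an XML tree, an identity transform can be sketched in Python as a recursive copy. The function name `identity_transform` is chosen for this illustration. Note that the default `xml.etree.ElementTree` parser discards comments and processing instructions, so the sketch only covers elements, attributes and text.

```python
import xml.etree.ElementTree as ET


def identity_transform(node):
    """Copy an element, its attributes, its text and its children unchanged."""
    out = ET.Element(node.tag, dict(node.attrib))
    out.text, out.tail = node.text, node.tail
    for child in node:
        out.append(identity_transform(child))
    return out


source = ET.fromstring('<a id="1"><b>text</b><c/></a>')
copy = identity_transform(source)

# The copy is a distinct tree that serializes exactly like the source.
same_output = ET.tostring(copy) == ET.tostring(source)
print(same_output)
```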
In computing, the two primary stylesheet languages are Cascading Style Sheets (CSS) and the Extensible Stylesheet Language (XSL). While they are both called stylesheet languages, they have very different purposes and ways of going about their tasks.
The Oxygen XML Editor is a multi-platform XML editor, XSLT/XQuery debugger and profiler with Unicode support. It is a Java application so it can run in Windows, Mac OS X, and Linux. It also has a version that can run as an Eclipse plugin.
XSLT defines many elements to describe the transformations that should be applied to a document. This article lists some of these elements. For an introduction to XSLT, see the main article.
XProc is a W3C Recommendation that defines an XML transformation language for specifying XML pipelines.
XPath is an expression language designed to support the query or transformation of XML documents. It was defined by the World Wide Web Consortium (W3C) in 1999, and can be used to compute values from the content of an XML document. Support for XPath exists in applications that support XML, such as web browsers, and many programming languages.
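As a small illustration of computing values from a document, Python's standard `xml.etree.ElementTree` module accepts a limited subset of XPath in `findall` and `findtext`; it is not a full XPath implementation. The sample document below is made up for this sketch.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<library>"
    "<book year='1999'><title>A</title></book>"
    "<book year='2007'><title>B</title></book>"
    "</library>"
)

# Path step selection: every <title> child of every <book> child of the root.
titles = [t.text for t in doc.findall("./book/title")]

# Attribute predicate: only books whose year attribute equals '2007'.
recent = [b.findtext("title") for b in doc.findall(".//book[@year='2007']")]

print(titles)
# ['A', 'B']
print(recent)
# ['B']
```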
XQuery is a query and functional programming language that queries and transforms collections of structured and unstructured data, usually in the form of XML, text and with vendor-specific extensions for other data formats. The language is developed by the XML Query working group of the W3C. The work is closely coordinated with the development of XSLT by the XSL Working Group; the two groups share responsibility for XPath, which is a subset of XQuery.
An XML transformation language is a programming language designed specifically to transform an input XML document into an output document which satisfies some specific goal.
Tritium is a simple scripting language for efficiently transforming structured data like HTML, XML, and JSON. It is similar in purpose to XSLT but has a syntax influenced by jQuery, Sass, and CSS versus XSLT's XML based syntax.