StAX

Last updated

Streaming API for XML (StAX) is an application programming interface (API) to read and write XML documents, originating from the Java programming language community.

Contents

Traditionally, XML APIs are either:

Both have advantages: DOM, for example, allows for random access to the document, and event driven algorithm like SAX has a small memory footprint and is typically much faster.

These two access metaphors can be thought of as polar opposites. A tree based API allows unlimited, random access and manipulation, while an event based API is a 'one shot' pass through the source document.

StAX was designed as a median between these two opposites. In the StAX metaphor, the programmatic entry point is a cursor that represents a point within the document. The application moves the cursor forward - 'pulling' the information from the parser as it needs. This is different from an event based API - such as SAX - which 'pushes' data to the application - requiring the application to maintain state between events as necessary to keep track of location within the document.

Origins

StAX has its roots in a number of incompatible pull APIs for XML, most notably XMLPULL, the authors of which (Stefan Haustein and Aleksander Slominski) collaborated with, amongst others, BEA Systems, Oracle, Sun and James Clark.

Examples

From JSR-173 Specification• Final, V1.0 (used under fair use).

Quote:

The following Java API shows the main methods for reading XML in the cursor approach.
publicinterfaceXMLStreamReader{publicintnext()throwsXMLStreamException;publicbooleanhasNext()throwsXMLStreamException;publicStringgetText();publicStringgetLocalName();publicStringgetNamespaceURI();// ...other methods not shown }
The writing side of the API has methods that correspond to the reading side for “StartElement” and “EndElement” event types.
publicinterfaceXMLStreamWriter{publicvoidwriteStartElement(StringlocalName)throwsXMLStreamException;publicvoidwriteEndElement()throwsXMLStreamException;publicvoidwriteCharacters(Stringtext)throwsXMLStreamException;// ...other methods not shown }
5.3.1 XMLStreamReader
This example illustrates how to instantiate an input factory, create a reader and iterate over the elements of an XML document.
XMLInputFactoryxmlInputFactory=XMLInputFactory.newInstance();XMLStreamReaderxmlStreamReader=xmlInputFactory.createXMLStreamReader(...);while(xmlStreamReader.hasNext()){xmlStreamReader.next();}

See also

Competing and complementary ways to process XML in Java (the order is loosely based on initial date of introduction):


Related Research Articles

<span class="mw-page-title-main">Document Object Model</span> Convention for representing and interacting with objects in HTML, XHTML, and XML documents

The Document Object Model (DOM) is a cross-platform and language-independent interface that treats an HTML or XML document as a tree structure wherein each node is an object representing a part of the document. The DOM represents a document with a logical tree. Each branch of the tree ends in a node, and each node contains objects. DOM methods allow programmatic access to the tree; with them one can change the structure, style or content of a document. Nodes can have event handlers attached to them. Once an event is triggered, the event handlers get executed.

<span class="mw-page-title-main">XML</span> Markup language by the W3C for encoding of data

Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications—all of them free open standards—define XML.

Java Platform, Standard Edition is a computing platform for development and deployment of portable code for desktop and server environments. Java SE was formerly known as Java 2 Platform, Standard Edition (J2SE).

In computing, Java XML APIs were developed by Sun Microsystems, consisting separate computer programming application programming interfaces (APIs).

In computing, the Java API for XML Processing (JAXP), one of the Java XML application programming interfaces (APIs), provides the capability of validating and parsing XML documents. It has three basic parsing interfaces:

SAX is an event-driven online algorithm for lexing and parsing XML documents, with an API developed by the XML-DEV mailing list. SAX provides a mechanism for reading data from an XML document that is an alternative to that provided by the Document Object Model (DOM). Where the DOM operates on the document as a whole—building the full abstract syntax tree of an XML document for convenience of the user—SAX parsers operate on each piece of the XML document sequentially, issuing parsing events while making a single pass through the input stream.

In object-oriented programming, the factory method pattern is a design pattern that uses factory methods to deal with the problem of creating objects without having to specify their exact classes. Rather than by calling a constructor, this is accomplished by invoking a factory method to create an object. Factory methods can be specified in an interface and implemented by subclasses or implemented in a base class and optionally overridden by subclasses. It is one of the 23 classic design patterns described in the book Design Patterns and is subcategorized as a creational pattern.

Expat is a stream-oriented XML 1.0 parser library, written in C, more precisely C99. As one of the first available open-source XML parsers, Expat has found a place in many open-source projects. Such projects include the Apache HTTP Server, Mozilla, Perl, Python and PHP. It is also bound in many other languages.

<span class="mw-page-title-main">Node (computer science)</span> Basic unit of a data structure

A node is a basic unit of a data structure, such as a linked list or tree data structure. Nodes contain data and also may link to other nodes. Links between nodes are often implemented by pointers.

<span class="mw-page-title-main">Java syntax</span> Set of rules defining correctly structured program

The syntax of Java is the set of rules defining how a Java program is written and interpreted.

<span class="mw-page-title-main">JDOM</span>

JDOM is an open-source Java-based document object model for XML that was designed specifically for the Java platform so that it can take advantage of its language features. JDOM integrates with Document Object Model (DOM) and Simple API for XML (SAX), supports XPath and XSLT. It uses external parsers to build documents. JDOM was developed by Jason Hunter and Brett McLaughlin starting in March 2000. It has been part of the Java Community Process as JSR 102, though that effort has since been abandoned.

XMLBeans is a Java-to-XML binding framework which is part of the Apache Software Foundation XML project.

In the Java computer programming language, an annotation is a form of syntactic metadata that can be added to Java source code. Classes, methods, variables, parameters and Java packages may be annotated. Like Javadoc tags, Java annotations can be read from source files. Unlike Javadoc tags, Java annotations can also be embedded in and read from Java class files generated by the Java compiler. This allows annotations to be retained by the Java virtual machine at run-time and read via reflection. It is possible to create meta-annotations out of the existing ones in Java.

Javolution is a real-time library aiming to make Java or Java-Like/C++ applications faster and more time predictable. Indeed, time-predictability can easily be ruined by the use of the standard library which is not acceptable for safety-critical systems. The open source Javolution library addresses these concerns for the Java platform and native applications. It provides numerous high-performance classes and utilities useful to non real-time applications as well. Such as:

In programming and software design, an event is an action or occurrence recognized by software, often originating asynchronously from the external environment, that may be handled by the software. Computer events can be generated or triggered by the system, by the user, or in other ways. Typically, events are handled synchronously with the program flow. That is, the software may have one or more dedicated places where events are handled, frequently an event loop.

JFugue is an open source programming library that allows one to program music in the Java programming language without the complexities of MIDI. It was first released in 2002 by David Koelle. Version 2 was released under a proprietary license. Versions 3 and 4 were released under the LGPL-2.1-or-later license. The current version, JFugue 5.0, was released in March 2015, under the Apache-2.0 license. Brian Eubanks has described JFugue as "useful for applications that need a quick and easy way to play music or to generate MIDI files."

XPath is an expression language designed to support the query or transformation of XML documents. It was defined by the World Wide Web Consortium (W3C) in 1999, and can be used to compute values from the content of an XML document. Support for XPath exists in applications that support XML, such as web browsers, and many programming languages.

Virtual Token Descriptor for eXtensible Markup Language (VTD-XML) refers to a collection of cross-platform XML processing technologies centered on a non-extractive XML, "document-centric" parsing technique called Virtual Token Descriptor (VTD). Depending on the perspective, VTD-XML can be viewed as one of the following:

<span class="mw-page-title-main">XQuery API for Java</span> Application programming interface

XQuery API for Java (XQJ) refers to the common Java API for the W3C XQuery 1.0 specification.

LibSBML is an open-source software library that provides an application programming interface (API) for the SBML format. The libSBML library can be embedded in a software application or used in a web servlet as part of the application or servlet's implementation of support for reading, writing, and manipulating SBML documents and data streams. The core of libSBML is written in ISO standard C++; the library provides API for many programming languages via interfaces generated with the help of SWIG.