Developer(s) | Apache Software Foundation |
---|---|
Stable release | Xerces J: 2.12.2 (24 January 2022); Xerces C++: 3.2.3 (10 April 2020) |
Operating system | Cross-platform |
Type | XML parser library |
License | Apache License 2.0 |
Website | xerces |
In computing, Xerces is Apache's collection of software libraries for parsing, validating, serializing and manipulating XML. The library implements a number of standard APIs for XML parsing, including DOM, SAX and SAX2. The implementation is available in the Java, C++ and Perl programming languages.
The name "Xerces" is believed to commemorate the extinct Xerces blue butterfly (Glaucopsyche xerces). [1]
There are several language versions of the Xerces parser:
Language | Release Date | Version |
---|---|---|
Java | 2022-01-24 | 2.12.2 |
C++ | 2020-04-10 | 3.2.3 |
Perl | 2014-04-30 | 2.7.0 |
The features supported by Xerces depend on the language, the Java version having the most features.
Feature | Java [3] | C++ [4] | Perl |
---|---|---|---|
eXtensible Markup Language (XML) 1.0 Fourth Edition Recommendation | Yes | Partial | Partial |
eXtensible Markup Language (XML) 1.1 Second Edition Recommendation | Yes | Partial | Partial |
Namespaces in XML 1.1 Second Edition Recommendation | Yes | Partial | Partial |
Namespaces in XML 1.0 Second Edition Recommendation | Yes | Partial | Partial |
XML Inclusions (XInclude) Version 1.0 Second Edition Recommendation | Yes | Yes | Yes |
Simple API for XML (SAX) | Yes | Yes | Yes |
Streaming API For XML (StAX) | Yes | No | No |
DOM Level 2 Core Specification | Yes | Yes | Yes |
DOM Level 2 Traversal and Range Specification | Yes | Yes | Yes |
Document Object Model (DOM) Level 3 Core, Load and Save | Yes | Yes | Yes |
Element Traversal Specification | Yes | Yes | Yes |
XML Schema 1.0 Structures and Datatypes | Yes | Yes | Yes |
XML Schema 1.1 Structures and Datatypes | Yes | No | No |
XML Schema Definition Language (XSD): Component Designators (SCD) | Yes | No | No |
Java APIs for XML Processing (JAXP) 1.4 | Yes | No | No |
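As a rough illustration of how the JAXP and XML Schema 1.0 rows fit together, the following sketch validates a small document against an inline schema using only the standard `javax.xml.validation` API. It is not taken from the Xerces documentation: the class name `ValidationSketch`, the `isValid` helper and the one-element schema are hypothetical, and the code runs against whatever JAXP implementation is on the classpath, which need not be the Apache Xerces distribution.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class ValidationSketch {
    // Hypothetical schema: a single <count> element holding an xs:int.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"count\" type=\"xs:int\"/>"
        + "</xs:schema>";

    // Returns true if the document is well-formed and valid against XSD.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, validation errors surface as exceptions.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<count>42</count>"));        // prints "true"
        System.out.println(isValid("<count>forty-two</count>")); // prints "false"
    }
}
```

The sketch treats any `SAXException` as "invalid", so a document that is not well-formed is reported the same way as one that violates the schema.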
The Document Object Model (DOM) is a cross-platform and language-independent interface that treats an XML or HTML document as a tree structure wherein each node is an object representing a part of the document. The DOM represents a document with a logical tree. Each branch of the tree ends in a node, and each node contains objects. DOM methods allow programmatic access to the tree; with them one can change the structure, style or content of a document. Nodes can have event handlers attached to them. Once an event is triggered, the event handlers get executed.
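The tree model described here can be sketched in Java with the standard JAXP and `org.w3c.dom` interfaces. This is a minimal example under stated assumptions, not Xerces-specific code: the class `DomSketch`, its helper methods and the sample `library` document are hypothetical, and the parser used is whichever JAXP implementation the runtime provides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomSketch {
    // Parses an XML string into an in-memory DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Recursively counts the element nodes at or below the given node.
    static int countElements(Node node) {
        int count = node.getNodeType() == Node.ELEMENT_NODE ? 1 : 0;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            count += countElements(children.item(i));
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<library><book>Dune</book><book>Emma</book></library>");
        System.out.println(countElements(doc)); // prints 3

        // Change the structure of the tree through DOM methods.
        Element added = doc.createElement("book");
        added.setTextContent("Ulysses");
        doc.getDocumentElement().appendChild(added);
        System.out.println(countElements(doc)); // prints 4
    }
}
```

Because the whole document is held in memory, the tree can be traversed and modified in any order, at the cost of memory proportional to the document size.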
Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was officially changed to Raku in October 2019.
Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications—all of them free open standards—define XML.
The Java programming language XML APIs developed by Sun Microsystems consist of the following separate computer-programming APIs:
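In computing, the Java API for XML Processing, or JAXP, one of the Java XML application programming interfaces, provides the capability of validating and parsing XML documents. It has three basic parsing interfaces: the Document Object Model (DOM) interface, the Simple API for XML (SAX) interface, and the Streaming API for XML (StAX) interface.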
SAX is an event-driven online algorithm for lexing and parsing XML documents, with an API developed by the XML-DEV mailing list. SAX provides a mechanism for reading data from an XML document that is an alternative to that provided by the Document Object Model (DOM). Where the DOM operates on the document as a whole—building the full abstract syntax tree of an XML document for convenience of the user—SAX parsers operate on each piece of the XML document sequentially, issuing parsing events while making a single pass through the input stream.
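The single-pass, event-driven style can be sketched with the standard JAXP `SAXParser`. The example is illustrative only: the class `SaxSketch` and the `elementNames` helper are hypothetical, and the handler records nothing beyond the order in which start-element events arrive.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxSketch {
    // Collects element names in the order their start tags are reported.
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // Called once per start tag while the input is streamed;
                // no tree is built and nothing is retained by the parser.
                names.add(qName);
            }
        });
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<a><b/><c/></a>")); // prints [a, b, c]
    }
}
```

Unlike the DOM approach, the handler sees each piece of the document once and must keep any state it needs itself, which is what keeps memory use independent of document size.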
YAML is a human-readable data-serialization language. It is commonly used for configuration files and in applications where data is being stored or transmitted. YAML targets many of the same communications applications as Extensible Markup Language (XML) but has a minimal syntax which intentionally differs from Standard Generalized Markup Language (SGML). It uses both Python-style indentation to indicate nesting, and a more compact format that uses [...] for lists and {...} for maps; thus JSON files are valid YAML 1.2.
Expat is a stream-oriented XML 1.0 parser library, written in C. As one of the first available open-source XML parsers, Expat has found a place in many open-source projects. Such projects include the Apache HTTP Server, Mozilla, Perl, Python and PHP. It is also bound in many other languages.
The Apache XML project is part of the Apache Software Foundation and focuses on XML-related projects.
JSON is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays. It is a common data format with diverse uses in electronic data interchange, including that of web applications with servers.
Apache Log4j is a Java-based logging utility originally written by Ceki Gülcü. It is part of the Apache Logging Services, a project of the Apache Software Foundation. Log4j is one of several Java logging frameworks.
libxml2 is a software library for parsing XML documents. It is also the basis for the libxslt library which processes XSLT-1.0 stylesheets.
WURFL is a set of proprietary application programming interfaces (APIs) and an XML configuration file which contains information about device capabilities and features for a variety of mobile devices, focused on mobile device detection. Until version 2.2, WURFL was released under an "open source / public domain" license. Prior to version 2.2, device information was contributed by developers around the world and the WURFL was updated frequently, reflecting new wireless devices coming on the market. In June 2011, the founder of the WURFL project, Luca Passani, and Steve Kamerman, the author of Tera-WURFL, a popular PHP WURFL API, formed ScientiaMobile, Inc to provide commercial mobile device detection support and services using WURFL. As of August 30, 2011, the ScientiaMobile WURFL APIs are licensed under a dual-license model, using the AGPL license for non-commercial use and a proprietary commercial license. The current version of the WURFL database itself is no longer open source.
Apache Jena is an open source Semantic Web framework for Java. It provides an API to extract data from and write to RDF graphs. The graphs are represented as an abstract "model". A model can be sourced with data from files, databases, URLs or a combination of these. A model can also be queried through SPARQL 1.1.
Apache Gump is an open source continuous integration system, which aims to build and test all the open source Java projects, every night. Its aim is to make sure that all the projects are compatible, at both the API level and in terms of functionality matching specifications. It is hosted at gump.apache.org, and runs every night on the official Sun JVM.
Thrift is an interface definition language and binary communication protocol used for defining and creating services for numerous programming languages. It was developed at Facebook for "scalable cross-language services development" and as of 2020 is an open source project in the Apache Software Foundation.
XML documents typically refer to external entities, for example the public and/or system ID for the Document Type Definition. These external relationships are expressed using URIs, typically as URLs.
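One way to see how such external references are resolved is to intercept them with a SAX `EntityResolver`. The sketch below maps a single system ID to local content by hand. It is a simplified stand-in for a real catalog, not an implementation of the XML Catalogs format: the class `ResolverSketch`, the URL `http://example.invalid/note.dtd` and the DTD text are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ResolverSketch {
    // Hypothetical system ID and the local DTD text substituted for it.
    static final String DTD_URL = "http://example.invalid/note.dtd";
    static final String LOCAL_DTD =
        "<!ELEMENT note (#PCDATA)><!ENTITY greeting \"hello\">";

    static Document parseWithLocalDtd(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            if (DTD_URL.equals(systemId)) {
                // Serve the DTD from memory instead of fetching the URL.
                return new InputSource(new StringReader(LOCAL_DTD));
            }
            // Refuse anything unmapped so the sketch never touches the network.
            throw new SAXException("No local mapping for " + systemId);
        });
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseWithLocalDtd(
            "<!DOCTYPE note SYSTEM \"" + DTD_URL + "\"><note>&greeting;</note>");
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

A catalog generalizes this idea by keeping the mapping from public and system identifiers to local resources in a separate file rather than in code.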
LibSBML is an open-source software library that provides an application programming interface (API) for the SBML format. The libSBML library can be embedded in a software application or used in a web servlet as part of the application or servlet's implementation of support for reading, writing, and manipulating SBML documents and data streams. The core of libSBML is written in ISO standard C++; the library provides API for many programming languages via interfaces generated with the help of SWIG.
Apache Attic is a project of the Apache Software Foundation that provides processes to make it clear when an Apache project has reached its end-of-life. The Attic project was created in November 2008. Retired projects can also be retained.
Apparently, the parser was named after the now extinct Xerces blue butterfly, a native of the San Francisco peninsula.
XML::Xerces is the Perl API to the Apache project's Xerces XML parser. It is implemented using the Xerces C++ API, and it provides access to most of the C++ API from Perl.