Apache Xerces

Developer(s): Apache Software Foundation
Stable release: 2.12.2 (Xerces J), 24 January 2022; 3.2.3 (Xerces C++), 10 April 2020
Operating system: Cross-platform
Type: XML parser library
License: Apache License 2.0
Website: xerces.apache.org

In computing, Xerces is the Apache Software Foundation's collection of software libraries for parsing, validating, serializing and manipulating XML. The libraries implement several standard APIs for XML parsing, including DOM, SAX and SAX2. Implementations are available for the Java, C++ and Perl programming languages.
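
As a rough illustration of the DOM support mentioned above, the following Java sketch parses a small document through the standard JAXP `DocumentBuilderFactory` and counts matching elements. It assumes that JAXP resolves to a Xerces-based parser, which is typically the case when Xerces-J is on the classpath or when the JDK's bundled parser (itself derived from Xerces) is used. The class name `DomSketch`, the method `countElements` and the sample document are hypothetical and not part of Xerces.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class for illustration; not part of the Xerces API.
class DomSketch {

    // Parses the XML text into an in-memory DOM tree and returns how many
    // elements in the whole document have the given tag name.
    static int countElements(String xml, String tagName) {
        try {
            // JAXP picks the parser implementation available at run time
            // (assumed here to be Xerces or a Xerces-derived parser).
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tagName).getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<library><book/><book/><journal/></library>";
        System.out.println(countElements(xml, "book")); // prints 2
    }
}
```

Going through JAXP rather than a parser-specific class keeps the sketch independent of which parser implementation is actually installed.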

The name "Xerces" is believed to commemorate the extinct Xerces blue butterfly (Glaucopsyche xerces). [1]

Xerces language versions

There are several language versions of the Xerces parser:

Language  Release date  Version
Java      2022-01-24    2.12.2
C++       2020-04-10    3.2.3
Perl      2014-04-30    2.7.0

Features

The features supported by Xerces depend on the language version; the Java version supports the most.

Feature                                                              Java [3]  C++ [4]   Perl
eXtensible Markup Language (XML) 1.0 Fourth Edition Recommendation   Yes       Partial   Partial
eXtensible Markup Language (XML) 1.1 Second Edition Recommendation   Yes       Partial   Partial
Namespaces in XML 1.1 Second Edition Recommendation                  Yes       Partial   Partial
Namespaces in XML 1.0 Second Edition Recommendation                  Yes       Partial   Partial
XML Inclusions (XInclude) Version 1.0 Second Edition Recommendation  Yes       Yes       Yes
Simple API for XML (SAX)                                             Yes       Yes       Yes
Streaming API For XML (StAX)                                         Yes       No        No
DOM Level 2 Core Specification                                       Yes       Yes       Yes
DOM Level 2 Traversal and Range Specification                        Yes       Yes       Yes
Document Object Model (DOM) Level 3 Core, Load and Save              Yes       Yes       Yes
Element Traversal Specification                                      Yes       Yes       Yes
XML Schema 1.0 Structures and Datatypes                              Yes       Yes       Yes
XML Schema 1.1 Structures and Datatypes                              Yes       No        No
XML Schema Definition Language (XSD): Component Designators (SCD)    Yes       No        No
Java APIs for XML Processing (JAXP) 1.4                              Yes       No        No
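
Of the APIs in the table above, SAX differs from DOM in that the parser reports events while reading instead of building a tree. A minimal Java sketch of that style, again going through JAXP and assuming a Xerces-based parser is the one JAXP selects, could look like this. The class name `SaxSketch`, the method `elementNames` and the sample input are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for illustration; not part of the Xerces API.
class SaxSketch {

    // Returns the qualified names of all elements in document order,
    // collected from SAX startElement events in a single pass.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attributes) {
                    names.add(qName); // one event per opening tag
                }
            });
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b/><c/></a>")); // prints [a, b, c]
    }
}
```

Because no tree is kept in memory, this event-driven style tends to suit large inputs where only a single pass over the document is needed.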

References

  1. Benz, Brian; Durant, John (7 May 2004). XML Programming Bible. John Wiley & Sons (published 2004). p. 87. ISBN 9780764555763. Retrieved 2014-10-01. "Apparently, the parser was named after the now extinct Xerces blue butterfly, a native of the San Francisco peninsula."
  2. "Apache Xerces Perl". xerces.apache.org. Retrieved 2019-12-08. "XML::Xerces is the Perl API to the Apache project's Xerces XML parser. It is implemented using the Xerces C++ API, and it provides access to most of the C++ API from Perl."
  3. "Features". xerces.apache.org. Retrieved 2019-12-08.
  4. "Features". xerces.apache.org. Retrieved 2019-12-08.