Expat (software)

Last updated
Expat
Original author(s) James Clark
Developer(s) Clark Cooper, et al.
Initial release1998;26 years ago (1998)
Stable release
2.6.3 [1]   OOjs UI icon edit-ltr-progressive.svg / 4 September 2024;13 days ago (4 September 2024)
Repository
Written in C
Operating system Portable
Type XML parser library
License MIT License [2]
Website libexpat.github.io

Expat is a stream-oriented XML 1.0 parser library, written in C, more precisely C99. [3] As one of the first available open-source XML parsers, Expat has found a place in many open-source projects. Such projects include the Apache HTTP Server, Mozilla, Perl, Python and PHP. It is also bound in many other languages.

Contents

Naming

According to the original creator, the name Expat came from the fact that he was an expat at the time.[ citation needed ] The "ex" and the "pa" are for XML and parsing.

Timeline

Software developer James Clark released version 1.0 in 1998 while serving as technical lead on the XML Working Group at the World Wide Web Consortium.[ citation needed ] Clark released two more versions, 1.1 and 1.2, before turning the project over to a group led by Clark Cooper and Fred Drake in 2000. The new group released version 1.95.0 in September 2000 and continues to release new versions to incorporate bug fixes and enhancements.

Availability

GitHub hosts the Expat project. Versions exist for most[ quantify ] major[ citation needed ] operating-systems.

Deployment

To use the Expat library, programs first register handler functions with Expat. When Expat parses an XML document, it calls the registered handlers as it finds relevant tokens in the input stream. These tokens and their associated handler calls are called events. Typically, programs register handler functions for XML element start or stop events and character events. Expat provides facilities for more sophisticated event handling such as XML Namespace declarations, processing instructions and DTD events.

Expat's parsing events resemble the events defined in the Simple API for XML (SAX), but Expat is not a SAX-compliant parser. Projects incorporating the Expat library often build SAX and possibly DOM parsers on top of Expat. While Expat is mainly a stream-based (push) parser, it supports stopping and restarting parsing at arbitrary times, thus making the implementation of a pull parser relatively easy as well.

Related Research Articles

<span class="mw-page-title-main">Document Object Model</span> Convention for representing and interacting with objects in HTML, XHTML, and XML documents

The Document Object Model (DOM) is a cross-platform and language-independent interface that treats an HTML or XML document as a tree structure wherein each node is an object representing a part of the document. The DOM represents a document with a logical tree. Each branch of the tree ends in a node, and each node contains objects. DOM methods allow programmatic access to the tree; with them one can change the structure, style or content of a document. Nodes can have event handlers attached to them. Once an event is triggered, the event handlers get executed.

<span class="mw-page-title-main">XML</span> Markup language by the W3C for encoding of data

Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications—all of them free open standards—define XML.

XSLT is a language originally designed for transforming XML documents into other XML documents, or other formats such as HTML for web pages, plain text or XSL Formatting Objects, which may subsequently be converted to other formats, such as PDF, PostScript and PNG. Support for JSON and plain-text transformation was added in later updates to the XSLT 1.0 specification.

In computing, the Java API for XML Processing (JAXP), one of the Java XML application programming interfaces (APIs), provides the capability of validating and parsing XML documents. It has three basic parsing interfaces:

GNU Bison, commonly known as Bison, is a parser generator that is part of the GNU Project. Bison reads a specification in Bison syntax, warns about any parsing ambiguities, and generates a parser that reads sequences of tokens and decides whether the sequence conforms to the syntax specified by the grammar.

SAX is an event-driven online algorithm for lexing and parsing XML documents, with an API developed by the XML-DEV mailing list. SAX provides a mechanism for reading data from an XML document that is an alternative to that provided by the Document Object Model (DOM). Where the DOM operates on the document as a whole—building the full abstract syntax tree of an XML document for convenience of the user—SAX parsers operate on each piece of the XML document sequentially, issuing parsing events while making a single pass through the input stream.

HaXml is a collection of utilities for parsing, filtering, transforming, and generating Extensible Markup Language (XML) documents using the programming language Haskell.

Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term parsing comes from Latin pars (orationis), meaning part.

YAML is a human-readable data serialization language. It is commonly used for configuration files and in applications where data are being stored or transmitted. YAML targets many of the same communications applications as Extensible Markup Language (XML) but has a minimal syntax that intentionally differs from Standard Generalized Markup Language (SGML). It uses Python-style indentation to indicate nesting and does not require quotes around most string values.

James Clark is a software engineer and creator of various open-source software including groff, expat and several XML specifications.

<span class="mw-page-title-main">JSON</span> Open standard file format and data interchange

JSON is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays. It is a commonly used data format with diverse uses in electronic data interchange, including that of web applications with servers.

An INI file is a configuration file for computer software that consists of plain text with a structure and syntax comprising key–value pairs organized in sections. The name of these configuration files comes from the filename extension INI, short for initialization, used in the MS-DOS operating system which popularized this method of software configuration. The format has become an informal standard in many contexts of configuration, but many applications on other operating systems use different file name extensions, such as conf and cfg.

Streaming API for XML (StAX) is an application programming interface (API) to read and write XML documents, originating from the Java programming language community.

Tea is a high-level scripting language for the Java environment. It combines features of Scheme, Tcl, and Java.

libxml2 is a software library for parsing XML documents. It is also the basis for the libxslt library which processes XSLT-1.0 stylesheets.

jQuery is a JavaScript library designed to simplify HTML DOM tree traversal and manipulation, as well as event handling, CSS animations, and Ajax. It is free, open-source software using the permissive MIT License. As of August 2022, jQuery is used by 77% of the 10 million most popular websites. Web analysis indicates that it is the most widely deployed JavaScript library by a large margin, having at least three to four times more usage than any other JavaScript library.

A loop-switch sequence is a programming antipattern where a clear set of steps is implemented as a switch-within-a-loop. The loop-switch sequence is a specific derivative of spaghetti code.

XML documents typically refer to external entities, for example the public and/or system ID for the Document Type Definition. These external relationships are expressed using URIs, typically as URLs.

Virtual Token Descriptor for eXtensible Markup Language (VTD-XML) refers to a collection of cross-platform XML processing technologies centered on a non-extractive XML, "document-centric" parsing technique called Virtual Token Descriptor (VTD). Depending on the perspective, VTD-XML can be viewed as one of the following:

LibSBML is an open-source software library that provides an application programming interface (API) for the SBML format. The libSBML library can be embedded in a software application or used in a web servlet as part of the application or servlet's implementation of support for reading, writing, and manipulating SBML documents and data streams. The core of libSBML is written in ISO standard C++; the library provides API for many programming languages via interfaces generated with the help of SWIG.

References

  1. "Release 2.6.3 · libexpat/libexpat" . Retrieved 4 September 2024.
  2. "COPYING". Github. Retrieved 16 September 2019.
  3. Pipping, Sebastian (2024-02-06). "Expat 2.6.0 released, includes security fixes". www.xml.com. Retrieved 2024-09-04.