BaseX

Original author(s) Christian Grün
Initial release 2007
Stable release 10.7 / August 4, 2023
Repository
Written in Java
Platform Java SE
Available in English, Dutch, French, German, Hungarian, Indonesian, Italian, Japanese, Mongolian, Romanian, Russian, Spanish [1]
Type XML database
License BSD-3-Clause [2]
Website basex.org

BaseX is a native, lightweight XML database management system and XQuery processor, developed as a community project on GitHub. [3] It specializes in storing, querying, and visualizing large XML documents and collections. [4] BaseX is platform-independent and distributed under the BSD-3-Clause license. [2]

In contrast to other document-oriented databases, XML databases provide support for standardized query languages such as XPath and XQuery. BaseX is highly conformant to World Wide Web Consortium (W3C) specifications [5] [6] and the official Update and Full Text extensions. The included GUI enables users to interactively search, explore, and analyze their data, and to evaluate XPath/XQuery expressions in real time (i.e., while the user types).
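
To illustrate what a path query over XML looks like, the following minimal sketch evaluates an XPath-style location path with Python's standard library. It does not use BaseX or its API; `xml.etree.ElementTree` supports only a small subset of XPath, and the sample document and the function name `titles_from_year` are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = """
<library>
  <book year="2005"><title>XML Basics</title></book>
  <book year="2011"><title>XQuery in Practice</title></book>
  <journal year="2011"><title>Database Review</title></journal>
</library>
"""


def titles_from_year(xml_text, year):
    """Return the titles of all entries whose 'year' attribute equals year."""
    root = ET.fromstring(xml_text)
    # Location path: any descendant element carrying the requested year
    # attribute, followed by a step to its <title> child.
    path = f".//*[@year='{year}']/title"
    return [title.text for title in root.findall(path)]


if __name__ == "__main__":
    print(titles_from_year(SAMPLE, "2011"))
```

A full XPath or XQuery processor accepts far richer expressions (axes, functions, FLWOR expressions) than the subset shown here.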

Technologies

Database layout

BaseX uses a tabular representation of XML tree structures to store XML documents. The database acts as a container for a single document or a collection of documents. The XPath Accelerator encoding scheme and Staircase Join Operator have been taken as inspiration for speeding up XPath location steps. [8] Additionally, BaseX provides several types of indices to improve the performance of path operations, attribute lookups, text comparisons and full-text searches. [9]
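
The general idea behind a tabular tree encoding can be sketched as follows. This is a simplified illustration of the pre/post numbering used by the XPath Accelerator scheme, not BaseX's actual storage layout, which differs in its details. The `Row` type and the functions `encode`, `descendants`, and `build_name_index` are hypothetical names chosen for this example, and the name lookup is only a stand-in for the idea of an index.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple


class Row(NamedTuple):
    pre: int      # position in document order (preorder rank)
    post: int     # position in postorder traversal
    parent: int   # pre value of the parent row, -1 for the root
    name: str     # element name


def encode(xml_text):
    """Flatten an XML tree into a list of rows ordered by pre value."""
    root = ET.fromstring(xml_text)
    rows = {}
    counters = {"pre": 0, "post": 0}

    def visit(elem, parent_pre):
        pre = counters["pre"]
        counters["pre"] += 1
        for child in elem:
            visit(child, pre)
        # The post rank is assigned once all children have been visited.
        rows[pre] = Row(pre, counters["post"], parent_pre, elem.tag)
        counters["post"] += 1

    visit(root, -1)
    return [rows[i] for i in range(len(rows))]


def descendants(table, pre):
    """Descendant axis: rows that start after and finish before the context row."""
    ctx = table[pre]
    return [r for r in table if r.pre > ctx.pre and r.post < ctx.post]


def build_name_index(table):
    """Simplified lookup structure: element name -> list of pre values."""
    index = {}
    for row in table:
        index.setdefault(row.name, []).append(row.pre)
    return index


if __name__ == "__main__":
    table = encode("<a><b><c/><d/></b><e/></a>")
    for row in table:
        print(row)
    print([r.name for r in descendants(table, 1)])
```

With this encoding, a descendant step becomes a range comparison on two integer columns instead of a pointer-chasing tree traversal, which is what makes the tabular form attractive for evaluating XPath location steps.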

History

BaseX was started by Christian Grün at the University of Konstanz in 2005. In 2007, BaseX went open source and has been under the BSD-3-Clause license since then. [10] [11]

Supported systems

The BaseX server is a pure Java 1.8 application and thus runs on any system that provides a suitable Java implementation. It has been tested on Windows, Mac OS X, Linux and OpenBSD. [12] In particular, packages are available for Debian [13] and Ubuntu. [14]

References

  1. "Translations - BaseX Documentation".
  2. "BaseX Open Source". Retrieved 2021-06-28.
  3. GitHub: BaseX
  4. "Overview on database instances created with BaseX". Retrieved 30 June 2011.
  5. "W3C: XQuery Test Suite Result Summary". World Wide Web Consortium. Retrieved 30 June 2011.
  6. "W3C: XPath and XQuery Full Text 1.0 Test Suite Result Summary". World Wide Web Consortium. Retrieved 30 June 2011.
  7. BaseX XQJ API
  8. Christian Grün; Marc Kramis; Alexander Holupirek; Marc H. Scholl; Marcel Waldvogel (30 June 2006). "Pushing XPath accelerator to its limits" (PDF). Universität Konstanz. Archived from the original (PDF) on 27 September 2011. Retrieved 30 June 2011.
  9. "Storing and Querying Large XML Instances" (PDF). Universität Konstanz. Archived from the original (PDF) on 9 October 2011. Retrieved 30 June 2011.
  10. "BaseX 5.0: XML Database with Visual Frontend". Linux Magazine. Retrieved 30 June 2011.
  11. "Open Source Kompetenzzentrum of the German Bundesverwaltungsamt" (in German). Archived from the original on 3 November 2011. Retrieved 30 June 2011.
  12. "Startup - BaseX Documentation".
  13. "Debian -- Package search results -- basex".
  14. "basex package: Ubuntu". 25 April 2023.