Xena (software)

Xena is open-source software for use in digital preservation. Xena is short for XML Electronic Normalising for Archives.

Xena is a Java application that was developed by the National Archives of Australia. It is available free of charge under the GNU General Public License.

Version 6.1.0 was released on 31 July 2013. Source code and binaries for Linux, OS X and Windows are available from SourceForge. However, as of 2018, Xena is no longer maintained or supported.

Mode of operation

Xena attempts to avoid digital obsolescence by converting files into an openly specified format, such as ODF or PNG. If the file format is not supported, or if the Binary Normalisation option is selected, Xena Base64-encodes the binary file as ASCII text and wraps the output in XML metadata. The resulting .xena file is plain text, although the encoded data itself is not directly human-readable. The exact original file can be retrieved with an internal viewer, which strips the metadata and reverses the Base64 encoding.
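The binary normalisation round trip described above can be illustrated with a minimal Python sketch. It is not Xena's code and does not reproduce the .xena schema: the element names (`wrapper`, `meta`, `source`, `content`) and the function names are assumptions chosen for illustration only.

```python
import base64
import xml.etree.ElementTree as ET


def binary_normalise(data: bytes, source_name: str) -> str:
    """Base64-encode the bytes and wrap them in a small XML envelope.

    The element names are hypothetical, not the real .xena schema.
    """
    root = ET.Element("wrapper")
    meta = ET.SubElement(root, "meta")
    ET.SubElement(meta, "source").text = source_name
    content = ET.SubElement(root, "content", encoding="base64")
    content.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def restore_original(xml_text: str) -> bytes:
    """Strip the XML envelope and reverse the Base64 encoding."""
    root = ET.fromstring(xml_text)
    # An empty payload is serialised as an empty element, whose text is None.
    encoded = root.find("content").text or ""
    return base64.b64decode(encoded)


original = bytes(range(256))            # every possible byte value
wrapped = binary_normalise(original, "sample.bin")
restored = restore_original(wrapped)    # byte-for-byte copy of the input
```

Because Base64 uses only ASCII characters, the wrapped output is plain text even when the input contains arbitrary bytes, and decoding returns the original byte for byte.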

Features

Platforms supported by Xena are Microsoft Windows, Linux and Mac OS X.

Xena uses a series of plugins to identify file formats and convert them to an appropriate openly specified format.

Xena has an application programming interface which allows any reasonably skilled Java developer to develop a plugin to cover a new file type.
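The general idea of plugin-based format identification and conversion can be sketched as follows. This is a conceptual illustration in Python, not Xena's Java API: the registry, the `register` decorator, and the handler names are all hypothetical, and only the file signatures (magic bytes) for PNG, PDF and GIF are real.

```python
from typing import Callable, Optional

# Well-known file signatures mapped to short format labels.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"%PDF-", "pdf"),
    (b"GIF8", "gif"),
]

# Hypothetical plugin registry: format label -> handler function.
PLUGINS: dict[str, Callable[[bytes], str]] = {}


def register(fmt: str):
    """Decorator that registers a handler for one format label."""
    def decorator(func: Callable[[bytes], str]) -> Callable[[bytes], str]:
        PLUGINS[fmt] = func
        return func
    return decorator


def identify(data: bytes) -> Optional[str]:
    """Return the format label whose signature prefixes the data, if any."""
    for magic, fmt in SIGNATURES:
        if data.startswith(magic):
            return fmt
    return None


@register("png")
def handle_png(data: bytes) -> str:
    return "keep as PNG"


@register("gif")
def handle_gif(data: bytes) -> str:
    return "convert to PNG"


def choose_action(data: bytes) -> str:
    """Dispatch to a registered plugin, else fall back to binary normalisation."""
    handler = PLUGINS.get(identify(data))
    if handler is None:
        return "binary normalisation"
    return handler(data)
```

In this sketch, supporting a new file type means adding a signature and registering one more handler; anything that no handler claims falls through to the binary fallback.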

Xena can process individual files or whole directories. When processing a whole directory, it can preserve the original directory structure of the converted records.
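One way to preserve a directory structure during batch conversion is to compute each output path relative to the source root and recreate it under the target root. The sketch below assumes this approach; the function names and the convention of appending `.xena` to the original file name are illustrative assumptions, not Xena's documented naming scheme.

```python
from pathlib import Path, PurePath


def mirrored_path(src: PurePath, source_root: PurePath, target_root: PurePath) -> PurePath:
    """Return the output path that mirrors src's position under source_root."""
    rel = src.relative_to(source_root)
    # Appending ".xena" to the full name is an assumption made for illustration.
    return target_root / rel.with_name(rel.name + ".xena")


def plan_outputs(source_root: Path, target_root: Path) -> dict[Path, Path]:
    """Map every file below source_root to its mirrored output path."""
    return {
        src: mirrored_path(src, source_root, target_root)
        for src in sorted(source_root.rglob("*"))
        if src.is_file()
    }
```

For example, with a source root of `/in` and a target root of `/out`, the file `/in/a/b/report.doc` maps to `/out/a/b/report.doc.xena`, so the subdirectories `a/b` are kept.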

Xena can create plain text versions of files in formats such as TIFF, Word and PDF, using the Tesseract optical character recognition software.

The Xena interface or Xena Viewer can be used to view a Xena file (extension .xena) or export it in its target file format. A Xena file contains the normalised file as well as any extra information relevant to the normalisation process. The Xena Viewer supports bulk export of Xena files to target file formats.

Xena can be used via its graphical user interface or the command line.

For Xena to be fully functional, it requires a local installation of the following external software:

Supported file types

Xena will recognise and process the file types listed below, plus a few others of minor importance. Unsupported file types automatically undergo binary normalisation.

Office file formats:

Graphics:

Archive Files:

Audio files:

Databases:

Other file types:

Reviews

A review in Practical e-Records dated 22 April 2010 rated Xena at 82/100 points. Xena has no target preservation format for video files.[1]

References

  1. "Review of XENA Normalization Software". 2010-04-22. Archived from the original on 2012-07-08.