JHOVE

Developer(s): JSTOR, Harvard University Library and Open Preservation Foundation
Stable release: 1.22 / April 29, 2019
Written in: Java
Operating system: Cross-platform
License: GNU LGPL
Website: jhove.openpreservation.org

JHOVE (JSTOR/Harvard Object Validation Environment, pronounced "jove") is a format-specific digital object validation API written in Java. It began as a joint project of JSTOR and the Harvard University Library to develop an extensible framework for format validation. The Open Preservation Foundation took over stewardship of JHOVE in February 2015.

JHOVE is available for download as version 1.22. [1] It is licensed under the LGPLv2. The download includes both a command-line and a GUI version. The software is designed so that third parties can attach different "heads" to it, which allows it to be integrated with other applications that need to validate files. It runs on any Unix, Windows, or Macintosh OS X platform that supports Java 1.6. [2]
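As a rough illustration of how another application might drive the command-line version, the sketch below assembles a JHOVE invocation and optionally runs it as a child process. It is not part of JHOVE itself. The launcher name `jhove`, the `-m` (module) and `-h` (output handler) options, and the names `PDF-hul` and `XML` are assumptions based on common JHOVE usage and should be checked against the installed version. The class `JhoveCommandSketch` and its methods are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class JhoveCommandSketch {

    /**
     * Builds the argument list for a JHOVE command-line call.
     * The "-m" and "-h" options are assumed, not taken from this article.
     */
    static List<String> buildCommand(String launcher, String module,
                                     String handler, String file) {
        List<String> cmd = new ArrayList<>();
        cmd.add(launcher);
        if (module != null) {          // restrict the check to one format module
            cmd.add("-m");
            cmd.add(module);
        }
        if (handler != null) {         // choose the output handler
            cmd.add("-h");
            cmd.add(handler);
        }
        cmd.add(file);
        return cmd;
    }

    /** Runs the command, passing its output through to this process. */
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        String file = args.length > 0 ? args[0] : "sample.pdf";
        List<String> cmd = buildCommand("jhove", "PDF-hul", "XML", file);
        System.out.println(String.join(" ", cmd));
        // Only start the process when a file was actually supplied.
        if (args.length > 0) {
            System.out.println("exit code: " + run(cmd));
        }
    }
}
```

Building the argument list separately from running it keeps the part that depends on an installed JHOVE isolated, so the command can be logged or inspected before anything is executed.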

Currently supported formats are AIFF, ASCII, Bytestream, GIF, HTML, JPEG, JPEG 2000, PDF, TIFF, UTF-8, WAV, and XML. Documents are analyzed and checked for being well-formed (consistent with the basic requirements of the format) and valid (generally signifying internal consistency). JHOVE notes when a file satisfies specific profiles within formats (e.g., PDF/X, HTML 4.0).
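The difference between well-formed and valid can be illustrated for XML with the parser that ships with the JDK. This is only a sketch of the two-level idea and not JHOVE's own implementation. Here "valid" is taken to mean conformance to the document's DTD, and the `Status` enum and the class name `XmlStatusSketch` are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class XmlStatusSketch {

    /** Hypothetical labels for the two levels of checking described above. */
    enum Status { NOT_WELL_FORMED, WELL_FORMED_NOT_VALID, WELL_FORMED_AND_VALID }

    static Status check(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // also check the document against its DTD
        DocumentBuilder builder = factory.newDocumentBuilder();

        final boolean[] invalid = { false };
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            // Recoverable errors are validity problems: parsing continues.
            public void error(SAXParseException e) { invalid[0] = true; }
            // Fatal errors mean the basic syntax is broken: abort the parse.
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });

        try {
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            return Status.NOT_WELL_FORMED;
        }
        return invalid[0] ? Status.WELL_FORMED_NOT_VALID
                          : Status.WELL_FORMED_AND_VALID;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>";
        System.out.println(check(dtd + "<note><to>A</to></note>"));     // matches the DTD
        System.out.println(check(dtd + "<note><from>A</from></note>")); // undeclared element
        System.out.println(check(dtd + "<note><to>A</note>"));          // unclosed tag
    }
}
```

A document that fails the first level never reaches the second, which is why the sketch reports a single combined status rather than two independent flags.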

A successor, JHOVE2, is also available; however, it has a completely separate code base and was last updated in 2014.

Related Research Articles

Portable Document Format (PDF), standardized as ISO 32000, is a file format developed by Adobe in 1992 to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems. Based on the PostScript language, each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, vector graphics, raster images and other information needed to display it. PDF has its roots in "The Camelot Project" initiated by Adobe co-founder John Warnock in 1991.

Exif Metadata standard in digital images

Exchangeable image file format is a standard that specifies the formats for images, sound, and ancillary tags used by digital cameras, scanners and other systems handling image and sound files recorded by digital cameras. The specification uses the following existing file formats with the addition of specific metadata tags: JPEG discrete cosine transform (DCT) for compressed image files, TIFF Rev. 6.0 for uncompressed image files, and RIFF WAV for audio files. It is not used in JPEG 2000 or GIF.

Digital Negative (DNG) is a patented, open, lossless raw image format developed by Adobe and used for digital photography. Adobe's license allows use without cost on the condition that the licensee prominently displays text saying it is licensed from Adobe in source and documentation, and that the license may be revoked if the licensee brings any patent action against Adobe or its affiliates related to the reading or writing of files that comply with the DNG Specification. It was launched on September 27, 2004. The launch was accompanied by the first version of the DNG specification, plus various products, including a free-of-charge DNG converter utility. All Adobe photo manipulation software released since the launch supports DNG.

UVC-based preservation is an archival strategy for handling the preservation of digital objects. It employs the use of a Universal Virtual Computer (UVC)—a virtual machine (VM) specifically designed for archival purposes, that allows both emulation and migration to a language-neutral format like XML.

In library and archival science, digital preservation is a formal endeavor to ensure that digital information of continuing value remains accessible and usable. It involves planning, resource allocation, and application of preservation methods and technologies, and it combines policies, strategies and actions to ensure access to reformatted and "born-digital" content, regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time. The Preservation and Reformatting Section of the Association for Library Collections and Technical Services, a division of the American Library Association, defined digital preservation as a combination of "policies, strategies and actions that ensure access to digital content over time." According to Harrod's Librarians' Glossary, digital preservation is the method of keeping digital material alive so that it remains usable as technological advances render the original hardware and software specifications obsolete.

The Extensible Metadata Platform (XMP) is an ISO standard, originally created by Adobe Systems Inc., for the creation, processing and interchange of standardized and custom metadata for digital documents and data sets.

DSpace

DSpace is an open source repository software package typically used for creating open access repositories for scholarly and/or published digital content. While DSpace shares some feature overlap with content management systems and document management systems, the DSpace repository software serves a specific need as a digital archives system, focused on the long-term storage, access and preservation of digital content. The optional DSpace registry lists almost three thousand repositories all over the world.

A camera raw image file contains minimally processed data from the image sensor of either a digital camera, a motion picture film scanner, or other image scanner. Raw files are named so because they are not yet processed and therefore are not ready to be printed or edited with a bitmap graphics editor. Normally, the image is processed by a raw converter in a wide-gamut internal color space where precise adjustments can be made before conversion to a "positive" file format such as TIFF or JPEG for storage, printing, or further manipulation. There are dozens of raw formats in use by different manufacturers of digital image capture equipment.

PDF/A is an ISO-standardized version of the Portable Document Format (PDF) specialized for use in the archiving and long-term preservation of electronic documents. PDF/A differs from PDF by prohibiting features unsuitable for long-term archiving, such as font linking and encryption. The ISO requirements for PDF/A file viewers include color management guidelines, support for embedded fonts, and a user interface for reading embedded annotations.

PRONOM is a web-based technical registry to support digital preservation services, developed by The National Archives of the United Kingdom. PRONOM was the first and remains, to date, the only operational public file format registry in the world, although the "Magic File" repository of the File Command has served this role in a less formal capacity for two decades. Other projects to develop technical registries, including the UK Digital Curation Centre's Representation Information Registry, and the Global Digital Format Registry project at Harvard University, are now in progress.

LightZone

LightZone is a free, open-source digital photo editor software application. It was originally developed as commercial software by the now-defunct Light Crafts. Its main purpose is to handle the workflow, including non-destructive editing when handling images in various RAW formats. It is comparable to Adobe Lightroom.

ExifTool Software

ExifTool is a free and open-source software program for reading, writing, and manipulating image, audio, video, and PDF metadata. It is platform independent, available as both a Perl library (Image::ExifTool) and command-line application. ExifTool is commonly incorporated into different types of digital workflows and supports many types of metadata including Exif, IPTC, XMP, JFIF, GeoTIFF, ICC Profile, Photoshop IRB, FlashPix, AFCP and ID3, as well as the manufacturer-specific metadata formats of many digital cameras.

Sidecar files, also known as buddy files or connected files, are computer files that store data which is not supported by the format of a source file.

Database preservation usually involves converting the information stored in a database to a form likely to be accessible in the long term as technology changes, without losing the initial characteristics of the data.

Xena is open-source software for use in digital preservation. Xena is short for XML Electronic Normalising for Archives.

An .ipa file is an iOS application archive file which stores an iOS app. Each .ipa file includes a binary and can only be installed on an iOS or ARM-based macOS device. Files with the .ipa extension can be uncompressed by changing the extension to .zip and unzipping.

This article presents a comparison of digital image metadata viewers and metadata editors.

High Efficiency Image File Format (HEIF) is a container format for storing individual images and image sequences. The standard covers multimedia files that can also include other media streams, such as timed text, audio and video.

References