JHOVE

Developer(s): JSTOR, Harvard University Library, and Open Preservation Foundation
Stable release: 1.22 (April 29, 2019)
Written in: Java
Operating system: Cross-platform
License: GNU LGPL
Website: jhove.openpreservation.org

JHOVE (JSTOR/Harvard Object Validation Environment), pronounced "jove", is a format-specific digital object validation API written in Java. It began as a joint project of JSTOR and the Harvard University Library to develop an extensible framework for format validation. The Open Preservation Foundation took over stewardship of JHOVE in February 2015.

JHOVE is available for download as version 1.22. [1] It is licensed under the LGPLv2. The download includes both a command-line and a GUI version. JHOVE is designed so that third parties can attach different "heads" to the software, which allows it to be integrated with other applications that need to validate files. It runs on any Unix, Windows, or Mac OS X platform that supports Java 1.6. [2]
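As a rough illustration of how another application might drive the command-line version, the following Java sketch assembles an argument list for the JHOVE launcher. The class `JhoveCommand` is hypothetical and not part of JHOVE. It assumes a `jhove` executable on the PATH and the `-m` (module), `-h` (output handler), and `-o` (output file) options. The module and handler names used in the usage note are also assumptions, to be checked against the documentation of the installed version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JHOVE: builds a command line for the
// JHOVE launcher. Assumes a "jhove" executable on the PATH and the
// -m (module), -h (output handler), and -o (output file) options.
final class JhoveCommand {
    private JhoveCommand() {}

    // Any of module, handler, or outputFile may be null, in which case the
    // corresponding option is omitted and JHOVE's default applies.
    static List<String> build(String module, String handler,
                              String outputFile, String inputFile) {
        if (inputFile == null || inputFile.isEmpty()) {
            throw new IllegalArgumentException("inputFile is required");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("jhove");
        if (module != null) {
            cmd.add("-m");
            cmd.add(module);
        }
        if (handler != null) {
            cmd.add("-h");
            cmd.add(handler);
        }
        if (outputFile != null) {
            cmd.add("-o");
            cmd.add(outputFile);
        }
        cmd.add(inputFile);
        return cmd;
    }
}
```

A caller could pass the result to `new ProcessBuilder(...)`, for example with `JhoveCommand.build("PDF-hul", "xml", "report.xml", "sample.pdf")`, and then read the report file once the process exits. Keeping the argument assembly separate from process execution makes it testable without a JHOVE installation.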

Currently supported formats are AIFF, ASCII, Bytestream, GIF, HTML, JPEG, JPEG 2000, PDF, TIFF, UTF-8, WAV, and XML. Documents are analyzed and checked for well-formedness (consistency with the basic requirements of the format) and validity (generally signifying internal consistency). JHOVE also notes when a file satisfies specific profiles within a format (e.g., PDF/X, HTML 4.0).
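The distinction between well-formed and valid can be illustrated with XML, where both terms have standard definitions. The sketch below uses the XML parser bundled with the JDK, not JHOVE's own XML module, and the class `XmlStatusSketch` and its status names are hypothetical. It treats a fatal parse error as "not well-formed" and a DTD validation error as "well-formed but not valid".

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Illustrative only: uses the JDK's XML parser, not JHOVE's XML module.
final class XmlStatusSketch {
    enum Status { NOT_WELL_FORMED, WELL_FORMED_NOT_VALID, WELL_FORMED_AND_VALID }

    private XmlStatusSketch() {}

    static Status check(String xml) {
        // One-element array so the anonymous handler can record a validity error.
        final boolean[] validityError = {false};
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Validate against the DTD declared in the document itself.
            // A document with no DTD is reported as not valid.
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // Warnings do not affect the status.
                }

                @Override
                public void error(SAXParseException e) {
                    // Recoverable errors are validity violations.
                    validityError[0] = true;
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXException {
                    // Fatal errors are well-formedness violations.
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            return Status.NOT_WELL_FORMED;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return validityError[0]
                ? Status.WELL_FORMED_NOT_VALID
                : Status.WELL_FORMED_AND_VALID;
    }
}
```

With an inline DTD that declares `note` as containing a single `to` element, `<note><to>x</to></note>` passes both checks, `<note><from>x</from></note>` is well-formed but not valid, and `<note><to>x</note>` is not well-formed because the `to` element is never closed.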

A successor, JHOVE2, is also available; however, it has a completely separate code base and was last updated in 2014.


References