Transparency (data compression)

Last updated June 02, 2024

In data compression and psychoacoustics, transparency is the result of lossy data compression accurate enough that the compressed result is perceptually indistinguishable from the uncompressed input, i.e. perceptually lossless.

A transparency threshold is a given value at which transparency is reached. It is commonly used to describe compressed data bitrates. For example, the transparency threshold for MP3 to linear PCM audio is said to be between 175 and 245 kbit/s, at 44.1 kHz, when encoded as VBR MP3 (corresponding to the -V3 and -V0 settings of the highly popular LAME MP3 encoder).^[1] This means that when an MP3 that was encoded at those bitrates is being played back, it is indistinguishable from the original PCM, and the compression is transparent to the listener.
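To put the quoted threshold in perspective, the bitrates can be compared with that of the uncompressed source. The following is a minimal sketch that assumes CD-style PCM (16 bits per sample, two channels); only the 44.1 kHz sample rate is given above, and the function names are illustrative.

```python
# Assumed CD-style PCM parameters. Only the 44.1 kHz sample rate comes from
# the text; the bit depth and channel count are assumptions.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16  # assumption
CHANNELS = 2          # assumption


def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed linear PCM in kbit/s (1 kbit = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def compression_ratio(pcm_kbps, compressed_kbps):
    """How many times smaller the compressed stream is than the PCM stream."""
    return pcm_kbps / compressed_kbps


pcm = pcm_bitrate_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
for mp3_kbps in (175, 245):
    ratio = compression_ratio(pcm, mp3_kbps)
    print(f"{mp3_kbps} kbit/s MP3 vs {pcm:.1f} kbit/s PCM: about {ratio:.1f}:1")
```

Under these assumptions the PCM stream runs at 1411.2 kbit/s, so the quoted threshold corresponds to a reduction of roughly 5.8:1 to 8.1:1.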

The term transparent compression can also refer to a filesystem feature that allows compressed files to be read and written just like regular ones. In this case, the compressor is typically a general-purpose lossless compressor.
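The idea can be imitated at application level. The sketch below uses Python's standard `gzip` module so that calling code reads and writes ordinary text while the bytes on disk are compressed. It is only an analogy: a filesystem with transparent compression does this beneath the application, with no special API, and the helper names and file name here are illustrative.

```python
import gzip
import os
import tempfile


def write_text(path, text):
    """Write text to a gzip-compressed file; the caller only sees plain text."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        f.write(text)


def read_text(path):
    """Read text back; decompression happens inside gzip.open."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return f.read()


# Hypothetical file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "notes.txt.gz")
message = "transparent compression\n" * 1000

write_text(path, message)
assert read_text(path) == message  # lossless: the caller gets the same text back

print(os.path.getsize(path), "bytes on disk for",
      len(message.encode("utf-8")), "bytes of text")
```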

Determination

Transparency, like sound or video quality, is subjective. It depends most on the listener's familiarity with digital artifacts and their awareness that artifacts may in fact be present, and to a lesser extent on the compression method, the bit rate used, the input characteristics, and the listening or viewing conditions and equipment. Despite this, a general consensus sometimes forms about which compression options "should" provide transparent results for most people on most equipment. Because of this subjectivity and the changing nature of compression, recording, and playback technology, such opinions should be considered only rough estimates rather than established fact.

Judging transparency can be difficult because of observer bias, in which an observer's subjective like or dislike of a particular compression method influences their judgment. This bias is commonly referred to as placebo, although this use is slightly different from the medical use of the term.

To scientifically prove that a compression method is not transparent, double-blind tests may be useful. The ABX method is normally used, with a null hypothesis that the samples tested are the same and with an alternative hypothesis that the samples are in fact different.
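As an illustration of how ABX results are usually evaluated, the sketch below computes a one-sided binomial p-value: the probability of scoring at least as many correct identifications as observed if the listener were only guessing. The function name, the 16-trial session, and the 0.05 significance level are assumptions chosen for the example, not part of any standard.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of at least `correct` right answers in `trials` ABX trials
    under the null hypothesis that the listener guesses (p = 0.5 per trial)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


ALPHA = 0.05  # assumed significance level
TRIALS = 16   # assumed session length

for correct in (9, 12, 14):
    p = abx_p_value(correct, TRIALS)
    verdict = "difference detected" if p < ALPHA else "no evidence of a difference"
    print(f"{correct}/{TRIALS} correct: p = {p:.4f} ({verdict})")
```

With these numbers, 12 correct out of 16 gives p ≈ 0.038, which rejects the null hypothesis at the assumed level. A high p-value does not establish transparency; it only means the test failed to show an audible difference for that listener, sample, and equipment.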

All lossless data compression methods are transparent, by nature.
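This follows because a lossless round trip reproduces the input bit for bit, leaving nothing to perceive. A minimal sketch using Python's standard `zlib` module, with a synthetic byte pattern standing in for real media data:

```python
import zlib

# Synthetic, highly repetitive input standing in for real media data.
original = bytes(range(256)) * 64

compressed = zlib.compress(original, 9)  # 9 = highest compression level
restored = zlib.decompress(compressed)

# The decoded output is identical to the input, so no listener or viewer
# could tell them apart.
assert restored == original

print(f"{len(original)} bytes -> {len(compressed)} bytes, round trip exact")
```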

In image compression

Both DSC in DisplayPort and the default settings of JPEG XL^[2] are regarded as visually lossless. Visual losslessness is usually determined by a flicker test: the display initially shows the compressed and the original images side by side, switches them around for a tiny fraction of a second, and then goes back to the original. This test is more sensitive than a side-by-side comparison ("visually almost lossless"), as the human eye is highly sensitive to temporal changes in light.^[3] There is also a panning test that is purportedly more representative of sensitivity in the case of moving images than the flicker test.^[4]

Difference from a lack of artifacts

Perceptually lossless compression is always free of compression artifacts, but the converse is not true: a compressor can produce a signal that appears natural yet has altered contents. This confusion is widespread in radiology (specifically in the study of diagnostically acceptable irreversible compression), where visually lossless is taken to mean anything from artifact-free^[5] to indistinguishable in a side-by-side view,^[6] neither of which is as stringent as the flicker test.

References

[1] LAME Recommended Encoder Settings, hydrogenaudio
[2] cjxl(1) – Linux General Commands Manual
[3] "Annex B. Forced choice paradigm with interleaved images test protocol". ISO/IEC 29170-2:2015 Information technology — Advanced image coding and evaluation — Part 2: Evaluation procedure for nearly lossless coding. International Organization for Standardization.
[4] Allison, Robert; Wilcox, Laurie; Wang, Wei; Hoffman, David; Hou, Yuqian; Goel, James; Deas, Lesley; Stolitzka, Dale. Large Scale Subjective Evaluation of Display Stream Compression. The Society for Information Display's annual Display Week 2017.
[5] European Society of Radiology (April 2011). "Usability of irreversible image compression in radiological imaging. A position paper by the European Society of Radiology (ESR)". Insights into Imaging. 2 (2): 103–115. doi:10.1007/s13244-011-0071-x. PMC 3259360. PMID 22347940.
[6] Kim, Kil Joong; Kim, Bohyoung; Lee, Kyoung Ho; Mantiuk, Rafal; Richter, Thomas; Kang, Heung Sik (September 2013). "Use of Image Features in Predicting Visually Lossless Thresholds of JPEG2000 Compressed Body CT Images: Initial Trial". Radiology. 268 (3): 710–718. doi:10.1148/radiol.13122015. PMID 23630311.

Bosi, Marina; Richard E. Goldberg. Introduction to digital audio coding and standards. Springer, 2003. ISBN 1-4020-7357-7
Cvejic, Nedeljko; Tapio Seppänen. Digital audio watermarking techniques and technologies: applications and benchmarks. Idea Group Inc (IGI), 2007. ISBN 1-59904-513-3
Pohlmann, Ken C. Principles of digital audio. McGraw-Hill Professional, 2005. ISBN 0-07-144156-5
Spanias, Andreas; Ted Painter; Venkatraman Atti. Audio signal processing and coding. Wiley-Interscience, 2007. ISBN 0-471-79147-4
Syed, Mahbubur Rahman. Multimedia technologies: concepts, methodologies, tools, and applications, Volume 3. Idea Group Inc (IGI), 2008. ISBN 1-59904-953-8

External links

"Transparency", Hydrogen Audio Wiki

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
