Peak signal-to-noise ratio

Peak signal-to-noise ratio (PSNR) is an engineering term for the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation. Because many signals have a very wide dynamic range, PSNR is usually expressed as a logarithmic quantity using the decibel scale.

PSNR is commonly used to quantify reconstruction quality for images and video subject to lossy compression.

Definition

PSNR is most easily defined via the mean squared error (MSE). Given a noise-free m×n monochrome image I and its noisy approximation K, the MSE is defined as

\[ \mathrm{MSE} = \frac{1}{m\,n} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \bigl[ I(i,j) - K(i,j) \bigr]^2 \]

The PSNR (in dB) is defined as

\[ \mathrm{PSNR} = 10 \cdot \log_{10}\!\left( \frac{\mathit{MAX}_I^{\,2}}{\mathrm{MSE}} \right) = 20 \cdot \log_{10}(\mathit{MAX}_I) - 10 \cdot \log_{10}(\mathrm{MSE}) \]

Here, MAX_I is the maximum possible pixel value of the image (for example, 255 when samples are represented with 8 bits per pixel).
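
As a rough illustration of these definitions, the following Python sketch computes the MSE and PSNR for two monochrome images of equal size. The function name and the default peak value of 255 (8-bit data) are assumptions made for the example, not part of the original text.

    import numpy as np

    def psnr(reference, distorted, max_value=255.0):
        """PSNR in dB between a reference image and a distorted approximation."""
        reference = np.asarray(reference, dtype=np.float64)
        distorted = np.asarray(distorted, dtype=np.float64)
        mse = np.mean((reference - distorted) ** 2)   # mean squared error over all pixels
        if mse == 0:                                  # identical images: PSNR is infinite
            return float("inf")
        return 10.0 * np.log10(max_value ** 2 / mse)  # = 20*log10(MAX_I) - 10*log10(MSE)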

Application in color images

For color images with three RGB values per pixel, the definition of PSNR is the same except that the MSE is the sum over all squared value differences (now computed for each color channel, i.e. three times as many differences as in a monochrome image) divided by the image size and by three. Alternatively, the image is converted to a different color space, such as YCbCr or HSL, and PSNR is reported for each channel of that color space. [1] [2] A minimal sketch of the RGB case follows below.
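
The sketch below (assuming 8-bit data held in NumPy arrays of shape height × width × 3; the function name is illustrative) relies on the fact that averaging the squared differences over the whole array is the same as summing them and dividing by the image size and by three.

    import numpy as np

    def psnr_rgb(reference, distorted, max_value=255.0):
        """PSNR for H x W x 3 RGB arrays."""
        reference = np.asarray(reference, dtype=np.float64)
        distorted = np.asarray(distorted, dtype=np.float64)
        mse = np.mean((reference - distorted) ** 2)   # averages over m*n pixels and 3 channels
        return float("inf") if mse == 0 else 10.0 * np.log10(max_value ** 2 / mse)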

Quality estimation with PSNR

PSNR is most commonly used to measure the quality of reconstruction of lossy compression codecs (e.g., for image compression). The signal in this case is the original data, and the noise is the error introduced by compression. When comparing compression codecs, PSNR is an approximation to human perception of reconstruction quality.

Typical values for the PSNR in lossy image and video compression are between 30 and 50 dB, provided the bit depth is 8 bits, where higher is better. The processing quality of 12-bit images is considered high when the PSNR value is 60 dB or higher. [3] [4] For 16-bit data, typical values for the PSNR are between 60 and 80 dB. [5] [6] Acceptable values for wireless transmission quality loss are considered to be about 20 dB to 25 dB. [7] [8]
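
These ranges shift with bit depth partly because MAX_I in the PSNR formula is the maximum possible sample value: for full-range integer data with B bits per sample, MAX_I = 2^B − 1, as this small, purely illustrative snippet shows.

    # Maximum possible sample value for full-range integer data (illustrative only).
    for bits in (8, 12, 16):
        print(f"{bits}-bit data: MAX_I = {2 ** bits - 1}")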

In the absence of noise, the two images I and K are identical, and thus the MSE is zero. In this case the PSNR is infinite (or undefined, see Division by zero). [9]

Example luma PSNR values for a cjpeg-compressed image at various quality levels: original uncompressed image; Q=90, PSNR 45.53 dB; Q=30, PSNR 36.81 dB; Q=10, PSNR 31.45 dB.

Performance comparison

Although a higher PSNR generally indicates a higher-quality reconstruction, this is not always the case. One has to be extremely careful with the range of validity of this metric; it is only conclusively valid when used to compare results from the same codec (or codec type) and the same content. [10]

Generally, when it comes to estimating the quality of images and videos as perceived by humans, PSNR has been shown to perform very poorly compared to other quality metrics. [10] [11]

Variants

PSNR-HVS [12] is an extension of PSNR that incorporates properties of the human visual system such as contrast perception.

PSNR-HVS-M improves on PSNR-HVS by additionally taking into account visual masking. [13] In a 2007 study, it delivered better approximations of human visual quality judgements than PSNR and SSIM by a large margin. It was also shown to have a distinct advantage over DCTune and PSNR-HVS. [14]

Related Research Articles

In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.

Lossy compression: Data compression approach that reduces data size while discarding or changing some of it

In information technology, lossy compression or irreversible compression is the class of data compression methods that uses inexact approximations and partial data discarding to represent the content. These techniques are used to reduce data size for storing, handling, and transmitting content. The different versions of the photo of the cat on this page show how higher degrees of approximation create coarser images as more details are removed. This is opposed to lossless data compression which does not degrade the data. The amount of data reduction possible using lossy compression is much higher than using lossless techniques.

Image compression: Reduction of image size to save storage and transmission costs

Image compression is a type of data compression applied to digital images, to reduce their cost for storage or transmission. Algorithms may take advantage of visual perception and the statistical properties of image data to provide superior results compared with generic data compression methods which are used for other digital data.

Video codec: Digital video processing

A video codec is software or hardware that compresses and decompresses digital video. In the context of video compression, codec is a portmanteau of encoder and decoder, while a device that only compresses is typically called an encoder, and one that only decompresses is a decoder.

Compression artifact: Distortion of media caused by lossy data compression

A compression artifact is a noticeable distortion of media caused by the application of lossy compression. Lossy data compression involves discarding some of the media's data so that it becomes small enough to be stored within the desired disk space or transmitted (streamed) within the available bandwidth. If the compressor cannot store enough data in the compressed version, the result is a loss of quality, or introduction of artifacts. The compression algorithm may not be intelligent enough to discriminate between distortions of little subjective importance and those objectionable to the user.

FFmpeg: Multimedia framework

FFmpeg is a free and open-source software project consisting of a suite of libraries and programs for handling video, audio, and other multimedia files and streams. At its core is the command-line ffmpeg tool itself, designed for processing of video and audio files. It is widely used for format transcoding, basic editing, video scaling, video post-production effects and standards compliance.

Generation loss is the loss of quality between subsequent copies or transcodes of data. Anything that reduces the quality of the representation when copying, and would cause further reduction in quality on making a copy of the copy, can be considered a form of generation loss. File size increases are a common result of generation loss, as the introduction of artifacts may actually increase the entropy of the data through each generation.

Video quality is a characteristic of a video passed through a video transmission or processing system that describes perceived video degradation. Video processing systems may introduce some amount of distortion or artifacts in the video signal that negatively impact the user's perception of the system. For many stakeholders in video production and distribution, ensuring video quality is an important task.

Block-matching algorithm: System used in computer graphics applications

A Block Matching Algorithm is a way of locating matching macroblocks in a sequence of digital video frames for the purposes of motion estimation. The underlying supposition behind motion estimation is that the patterns corresponding to objects and background in a frame of video sequence move within the frame to form corresponding objects on the subsequent frame. This can be used to discover temporal redundancy in the video sequence, increasing the effectiveness of inter-frame video compression by defining the contents of a macroblock by reference to the contents of a known macroblock which is minimally different.

The structural similarity index measure (SSIM) is a method for predicting the perceived quality of digital television and cinematic pictures, as well as other kinds of digital images and videos. It is also used for measuring the similarity between two images. The SSIM index is a full reference metric; in other words, the measurement or prediction of image quality is based on an initial uncompressed or distortion-free image as reference.

A video codec is software or a device that provides encoding and decoding for digital video, and which may or may not include the use of video compression and/or decompression. Most codecs are typically implementations of video coding formats.

Progressive Graphics File: File format

PGF is a wavelet-based bitmapped image format that employs lossless and lossy data compression. PGF was created to improve upon and replace the JPEG format. It was developed at the same time as JPEG 2000 but with a focus on speed over compression ratio.

Image quality can refer to the level of accuracy with which different imaging systems capture, process, store, compress, transmit and display the signals that form an image. Another definition refers to image quality as "the weighted combination of all of the visually significant attributes of an image". The difference between the two definitions is that the former focuses on the characteristics of signal processing in different imaging systems and the latter on the perceptual assessments that make an image pleasant for human viewers.

A human visual system model is used by image processing, video processing and computer vision experts to deal with biological and psychological processes that are not yet fully understood. Such a model is used to simplify the behaviors of what is a very complex system. As our knowledge of the true visual system improves, the model is updated.

Uncompressed video is digital video that either has never been compressed or was generated by decompressing previously compressed digital video. It is commonly used by video cameras, video monitors, video recording devices, and in video processors that perform functions such as image resizing, image rotation, deinterlacing, and text and graphics overlay. It is conveyed over various types of baseband digital video interfaces, such as HDMI, DVI, DisplayPort and SDI. Standards also exist for the carriage of uncompressed video over computer networks.

High Efficiency Video Coding (HEVC), also known as H.265 and MPEG-H Part 2, is a video compression standard designed as part of the MPEG-H project as a successor to the widely used Advanced Video Coding. In comparison to AVC, HEVC offers from 25% to 50% better data compression at the same level of video quality, or substantially improved video quality at the same bit rate. It supports resolutions up to 8192×4320, including 8K UHD, and unlike the primarily 8-bit AVC, HEVC's higher fidelity Main 10 profile has been incorporated into nearly all supporting hardware.

Diagnostically acceptable irreversible compression (DAIC) is the amount of lossy compression which can be used on a medical image to produce a result that does not prevent the reader from using the image to make a medical diagnosis.

Guetzli: JPEG encoder

Guetzli is a freely licensed JPEG encoder that Jyrki Alakuijala, Robert Obryk, and Zoltán Szabadka have developed in Google's Zürich research branch. The encoder seeks to produce significantly smaller files than prior encoders at equivalent quality, albeit at very low speed. It is named after the Swiss German diminutive expression for biscuits, in line with the names of other compression technology from Google.

Video Multimethod Assessment Fusion (VMAF) is an objective full-reference video quality metric developed by Netflix in cooperation with the University of Southern California, The IPI/LS2N lab Nantes Université, and the Laboratory for Image and Video Engineering (LIVE) at The University of Texas at Austin. It predicts subjective video quality based on a reference and distorted video sequence. The metric can be used to evaluate the quality of different video codecs, encoders, encoding settings, or transmission variants.

References

  1. Oriani, Emanuele. "qpsnr: A quick PSNR/SSIM analyzer for Linux". Retrieved 6 April 2011.
  2. "pnmpsnr User Manual". Retrieved 6 April 2011.
  3. Faragallah, Osama S.; El-Hoseny, Heba; El-Shafai, Walid; El-Rahman, Wael Abd; El-Sayed, Hala S.; El-Rabaie, El-Sayed M.; El-Samie, Fathi E. Abd; Geweid, Gamal G. N. (2021). "A Comprehensive Survey Analysis for Present Solutions of Medical Image Fusion and Future Directions". IEEE Access. 9: 11358–11371. doi: 10.1109/ACCESS.2020.3048315 . ISSN   2169-3536. This paper presents a survey study of medical imaging modalities and their characteristics. In addition, different medical image fusion approaches and their appropriate quality metrics are presented.
  4. Chervyakov, Nikolay; Lyakhov, Pavel; Nagornov, Nikolay (2020-02-11). "Analysis of the Quantization Noise in Discrete Wavelet Transform Filters for 3D Medical Imaging". Applied Sciences. 10 (4): 1223. doi: 10.3390/app10041223 . ISSN   2076-3417. The image processing quality is considered high if PSNR value is greater than 60 dB for images with 12 bits per color.
  5. Welstead, Stephen T. (1999). Fractal and wavelet image compression techniques. SPIE Publication. pp. 155–156. ISBN   978-0-8194-3503-3.
  6. Raouf Hamzaoui, Dietmar Saupe (May 2006). Barni, Mauro (ed.). Fractal Image Compression. Vol. 968. CRC Press. pp. 168–169. ISBN   9780849335563 . Retrieved 5 April 2011.{{cite book}}: |journal= ignored (help)
  7. Thomos, N., Boulgouris, N. V., & Strintzis, M. G. (2006, January). Optimized Transmission of JPEG2000 Streams Over Wireless Channels. IEEE Transactions on Image Processing, 15 (1).
  8. Xiangjun, L., & Jianfei, C. Robust transmission of JPEG2000 encoded images over packet loss channels. ICME 2007 (pp. 947-950). School of Computer Engineering, Nanyang Technological University.
  9. Salomon, David (2007). Data Compression: The Complete Reference (4 ed.). Springer. p. 281. ISBN   978-1846286025 . Retrieved 26 July 2012.
  10. Huynh-Thu, Q.; Ghanbari, M. (2008). "Scope of validity of PSNR in image/video quality assessment". Electronics Letters. 44 (13): 800. Bibcode:2008ElL....44..800H. doi:10.1049/el:20080522.
  11. Huynh-Thu, Quan; Ghanbari, Mohammed (2012-01-01). "The accuracy of PSNR in predicting video quality for different video scenes and frame rates". Telecommunication Systems. 49 (1): 35–48. doi:10.1007/s11235-010-9351-x. ISSN   1018-4864. S2CID   43713764.
  12. Egiazarian, Karen, Jaakko Astola, Nikolay Ponomarenko, Vladimir Lukin, Federica Battisti, and Marco Carli (2006). "New full-reference quality metrics based on HVS." In Proceedings of the Second International Workshop on Video Processing and Quality Metrics, vol. 4.
  13. Ponomarenko, N.; Ieremeiev, O.; Lukin, V.; Egiazarian, K.; Carli, M. (February 2011). "Modified image visual quality metrics for contrast change and mean shift accounting". 2011 11th International Conference the Experience of Designing and Application of CAD Systems in Microelectronics (CADSM): 305–311.
  14. Nikolay Ponomarenko; Flavia Silvestri; Karen Egiazarian; Marco Carli; Jaakko Astola; Vladimir Lukin, "On between-coefficient contrast masking of DCT basis functions" (PDF), CD-ROM Proceedings of the Third International Workshop on Video Processing and Quality Metrics for Consumer Electronics VPQM-07, 25–26 January 2007, Scottsdale, AZ.