Perceptual Evaluation of Video Quality

Last updated

Perceptual Evaluation of Video Quality(PEVQ) is an end-to-end (E2E) measurement algorithm to score the picture quality of a video presentation by means of a 5-point mean opinion score (MOS). It is, therefore, a video quality model. PEVQ was benchmarked by the Video Quality Experts Group (VQEG) in the course of the Multimedia Test Phase 2007–2008. Based on the performance results, in which the accuracy of PEVQ was tested against ratings obtained by human viewers, PEVQ became part of the new International Standard. [1]

Contents

Application

The measurement algorithm can be applied to analyze visible artifacts caused by a digital video encoding/decoding (or transcoding) process, radio- or IP-based transmission networks and end-user devices. Application scenarios address next generation networking and mobile services and include IPTV (Standard-definition television and HDTV), streaming video, Mobile TV, video telephony, video conferencing and video messaging.

The measurement paradigm is to assess degradations of a decoded video sequence output from the network (for example as received by a TV set top box) in comparison to the original reference picture (broadcast from the studio). Consequently, the setup is referred to as end-to-end (E2E) quality testing.

Algorithm

The development for picture quality analysis algorithms available today started with still image models which were later enhanced to also cover motion pictures. PEVQ is full-reference algorithm (see the classification of models in video quality) and analyzes the picture pixel-by-pixel after a temporal alignment (also referred to as 'temporal registration') of corresponding frames of reference and test signal. PEVQ MOS results range from 1 (bad) to 5 (excellent) and indicate the perceived quality of the decoded sequence.

PEVQ is based on modeling the behavior of the human visual system. In addition to an overall MOS score, PEVQ quantifies abnormalities in the video signal by a variety of KPIs, including PSNR, distortion indicators and lip-sync delay.

See also

Related Research Articles

In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.

H.263 is a video compression standard originally designed as a low-bit-rate compressed format for videotelephony. It was standardized by the ITU-T Video Coding Experts Group (VCEG) in a project ending in 1995/1996. It is a member of the H.26x family of video coding standards in the domain of the ITU-T.

Quality of service (QoS) is the description or measurement of the overall performance of a service, such as a telephony or computer network, or a cloud computing service, particularly the performance seen by the users of the network. To quantitatively measure quality of service, several related aspects of the network service are often considered, such as packet loss, bit rate, throughput, transmission delay, availability, jitter, etc.

Mean opinion score (MOS) is a measure used in the domain of Quality of Experience and telecommunications engineering, representing overall quality of a stimulus or system. It is the arithmetic mean over all individual "values on a predefined scale that a subject assigns to his opinion of the performance of a system quality". Such ratings are usually gathered in a subjective quality evaluation test, but they can also be algorithmically estimated.

Perceptual Speech Quality Measure (PSQM) is a computational and modeling algorithm defined in Recommendation ITU-T P.861 that objectively evaluates and quantifies voice quality of voice-band speech codecs. It may be used to rank the performance of these speech codecs with differing speech input levels, talkers, bit rates and transcodings. P.861 was withdrawn and replaced by Recommendation ITU-T P.862 (PESQ), which contains an improved speech assessment algorithm.

Video quality is a characteristic of a video passed through a video transmission or processing system that describes perceived video degradation. Video processing systems may introduce some amount of distortion or artifacts in the video signal that negatively impact the user's perception of the system. For many stakeholders in video production and distribution, ensuring video quality is an important task.

Subjective video quality is video quality as experienced by humans. It is concerned with how video is perceived by a viewer and designates their opinion on a particular video sequence. It is related to the field of Quality of Experience. Measuring subjective video quality is necessary because objective quality assessment algorithms such as PSNR have been shown to correlate poorly with subjective ratings. Subjective ratings may also be used as ground truth to develop new algorithms.

Α video codec is software or a device that provides encoding and decoding for digital video, and which may or may not include the use of video compression and/or decompression. Most codecs are typically implementations of video coding formats.

Quality of experience (QoE) is a measure of the delight or annoyance of a customer's experiences with a service. QoE focuses on the entire service experience; it is a holistic concept, similar to the field of user experience, but with its roots in telecommunication. QoE is an emerging multidisciplinary field based on social psychology, cognitive science, economics, and engineering science, focused on understanding overall human quality requirements.

Perceptual Evaluation of Audio Quality (PEAQ) is a standardized algorithm for objectively measuring perceived audio quality, developed in 1994–1998 by a joint venture of experts within Task Group 6Q of the International Telecommunication Union's Radiocommunication Sector (ITU-R). It was originally released as ITU-R Recommendation BS.1387 in 1998 and last updated in 2023. It utilizes software to simulate perceptual properties of the human ear and then integrates multiple model output variables into a single metric.

Latency refers to a short period of delay between when an audio signal enters a system, and when it emerges. Potential contributors to latency in an audio system include analog-to-digital conversion, buffering, digital signal processing, transmission time, digital-to-analog conversion, and the speed of sound in the transmission medium.

Genista Corporation was a company that used computational models of human visual and auditory systems to measure what human viewers see and hear. The company offered quality measurement technology that estimated the experienced quality that would be measured by a mean opinion score (MOS) resulting from subjective tests using actual human test subjects.

Perceptual Evaluation of Speech Quality (PESQ) is a family of standards comprising a test methodology for automated assessment of the speech quality as experienced by a user of a telephony system. It was standardized as Recommendation ITU-T P.862 in 2001. PESQ is used for objective voice quality testing by phone manufacturers, network equipment vendors and telecom operators. Its usage requires a license. The first edition of PESQ's successor POLQA entered into force in 2011.

Image quality can refer to the level of accuracy with which different imaging systems capture, process, store, compress, transmit and display the signals that form an image. Another definition refers to image quality as "the weighted combination of all of the visually significant attributes of an image". The difference between the two definitions is that one focuses on the characteristics of signal processing in different imaging systems and the latter on the perceptual assessments that make an image pleasant for human viewers.

High Efficiency Video Coding (HEVC), also known as H.265 and MPEG-H Part 2, is a video compression standard designed as part of the MPEG-H project as a successor to the widely used Advanced Video Coding. In comparison to AVC, HEVC offers from 25% to 50% better data compression at the same level of video quality, or substantially improved video quality at the same bit rate. It supports resolutions up to 8192×4320, including 8K UHD, and unlike the primarily 8-bit AVC, HEVC's higher fidelity Main 10 profile has been incorporated into nearly all supporting hardware.

VQuad-HD(Objective perceptual multimedia video quality measurement of HDTV) is a video quality testing technology for high definition video signals. It is a full-reference model, meaning that it requires access to the original and the degraded signal to estimate the quality.

Perceptual Objective Listening Quality Analysis (POLQA) was the working title of an ITU-T standard that covers a model to predict speech quality by means of analyzing digital speech signals. The model was standardized as Recommendation ITU-T P.863 in 2011. The second edition of the standard appeared in 2014, and the third, currently in-force edition was adopted in 2018 under the title Perceptual objective listening quality prediction.

The MOtion-tuned Video Integrity Evaluation (MOVIE) index is a model and set of algorithms for predicting the perceived quality of digital television and cinematic pictures, as well as other kinds of digital images and videos.

ZPEG is a motion video technology that applies a human visual acuity model to a decorrelated transform-domain space, thereby optimally reducing the redundancies in motion video by removing the subjectively imperceptible. This technology is applicable to a wide range of video processing problems such as video optimization, real-time motion video compression, subjective quality monitoring, and format conversion.

Video Multimethod Assessment Fusion (VMAF) is an objective full-reference video quality metric developed by Netflix in cooperation with the University of Southern California, The IPI/LS2N lab Nantes Université, and the Laboratory for Image and Video Engineering (LIVE) at The University of Texas at Austin. It predicts subjective video quality based on a reference and distorted video sequence. The metric can be used to evaluate the quality of different video codecs, encoders, encoding settings, or transmission variants.

References

Further reading