Nasir Ahmed | |
---|---|
Born | 1940 |
Spouse | Esther Parente-Ahmed |
Children | Michael Ahmed Parente |
Doctoral advisor | Shlomo Karni |
Nasir Ahmed (born 1940) is an Indian-American electrical engineer and computer scientist. He is Professor Emeritus of Electrical and Computer Engineering at the University of New Mexico (UNM). He is best known for inventing the discrete cosine transform (DCT) in the early 1970s. The DCT is the most widely used data compression transformation, the basis for most digital media standards (image, video and audio), and is commonly used in digital signal processing. He also described the discrete sine transform (DST), which is related to the DCT. [1]
The discrete cosine transform (DCT), a transform widely used in lossy compression, was first conceived by Ahmed while he was working at Kansas State University, and he proposed the technique to the National Science Foundation in 1972. He originally intended the DCT for image compression. [2] [3] Ahmed developed a working DCT algorithm with his PhD student T. Natarajan and friend K. R. Rao in 1973, [2] and they presented their results in a January 1974 paper. [4] [5] [6] It described what is now called the type-II DCT (DCT-II), [7]: 51 as well as its inverse, the type-III DCT (a.k.a. IDCT). [4]
Ahmed was the lead author of the benchmark publication, [8] [9] "Discrete Cosine Transform" (with T. Natarajan and K. R. Rao), [4] which has been cited as a fundamental development in many works [10] since its publication. The basic research work and events that led to the development of the DCT were summarized in a later publication by Ahmed entitled "How I came up with the Discrete Cosine Transform". [2]
The DCT is widely used for digital image compression. [11] [12] [13] It is a core component of the 1992 JPEG image compression technology developed by the Joint Photographic Experts Group [14] working group and standardized jointly by the ITU, [15] ISO and IEC. A 1996 publication by K. R. Rao and J. J. Hwang gives a tutorial discussion of how the DCT is used to achieve digital video compression in various international standards defined by ITU and MPEG (Moving Picture Experts Group), [16]: JPEG: Chapter 8; H.261: Chapter 9; MPEG-1: Chapter 10; MPEG-2: Chapter 11 and an overview was presented in two 2006 publications by Yao Wang. [17] [18] The image and video compression properties of the DCT resulted in its being an integral component of the following widely used international standard technologies:
Standard | Technologies |
---|---|
JPEG | Storage and transmission of photographic images on the World Wide Web (JPEG/JFIF); and widely used in digital cameras and other photographic image capture devices (JPEG/Exif). |
MPEG-1 Video | Video distribution on CD or via the World Wide Web. |
MPEG-2 Video (or H.262) | Storage and handling of digital images in broadcast applications: digital TV, HDTV, cable, satellite, high speed internet; video distribution on DVD. |
H.261 | First of a family of video coding standards (1988). Used primarily in older video conferencing and video telephone products. |
H.263 | Videotelephony and videoconferencing. |
The form of DCT used in signal compression applications is sometimes referred to as DCT-2 in the context of a family of discrete cosine transforms, [19] or as DCT-II.
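To make the DCT-II and its inverse concrete, here is a minimal sketch in plain Python. It assumes the common unnormalized textbook definition of the DCT-II and a DCT-III scaled by 2/N as the inverse; the function names `dct2` and `idct2` are illustrative and do not come from the 1974 paper or from any library. It uses the direct O(N²) sums rather than a fast algorithm.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi/N * (n + 1/2) * k)."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_pts * (n + 0.5) * k) for n in range(n_pts))
        for k in range(n_pts)
    ]

def idct2(coeffs):
    """Inverse of dct2: a DCT-III scaled by 2/N, with the k = 0 term halved."""
    n_pts = len(coeffs)
    return [
        (2.0 / n_pts) * (
            coeffs[0] / 2.0
            + sum(
                coeffs[k] * math.cos(math.pi / n_pts * (n + 0.5) * k)
                for k in range(1, n_pts)
            )
        )
        for n in range(n_pts)
    ]

# Assumed example data: a smooth ramp of eight samples.
samples = [8.0, 16.0, 24.0, 32.0, 40.0, 48.0, 56.0, 64.0]
coeffs = dct2(samples)
restored = idct2(coeffs)
```

In this sketch `coeffs[0]` is simply the sum of the samples, and `restored` matches `samples` up to floating-point rounding, which illustrates that the transform pair by itself discards no information.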
More recent standards have used integer-based transforms that have similar properties to the DCT but are explicitly based on integer processing rather than being defined by trigonometric functions. [20] As a result of these transforms having similar symmetry properties to the DCT and being, to some degree, approximations of the DCT, they have sometimes been called "integer DCT" transforms. Such transforms are used for video compression in the following technologies pertaining to more recent standards. The "integer DCT" designs are conceptually similar to the conventional DCT but are simplified to provide exactly specified decoding with reduced computational complexity.
Standard | Technologies |
---|---|
VC-1 | Windows Media Video 9, SMPTE 421. |
H.264/MPEG-4 AVC | The most commonly used format for recording, compression and distribution of high definition video; streaming internet video; Blu-ray Discs; HDTV broadcasts (terrestrial, cable and satellite). |
H.265/HEVC | Successor to the H.264/MPEG-4 AVC standard having substantially improved compression capability. |
H.266/VVC | Successor to HEVC having substantially improved compression capability. |
WebP Images | A graphic format that supports the lossy compression of digital images. Developed by Google. |
WebM Video | A multimedia open source format intended to be used with HTML5. Developed by Google. |
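As an illustration of the "integer DCT" idea described above, the following sketch applies the commonly cited 4×4 forward core transform matrix of H.264/AVC using integer arithmetic only. The helper names are illustrative, and the scaling that a real codec folds into its quantization stage is deliberately left out, so this is a simplified picture rather than a conforming implementation.

```python
# Commonly cited 4x4 forward core transform matrix of H.264/AVC.
# Its rows follow the sign pattern of the 4-point DCT-II basis vectors.
CORE = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def matmul(a, b):
    """Plain integer matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_core_transform(block):
    """Y = C * X * C^T using integers only (scaling step omitted in this sketch)."""
    return matmul(matmul(CORE, block), transpose(CORE))

# Assumed example data: a flat 4x4 block, which has only a DC component.
flat_block = [[10] * 4 for _ in range(4)]
coeffs = forward_core_transform(flat_block)

# The rows are mutually orthogonal but not unit length: C * C^T is diagonal.
gram = matmul(CORE, transpose(CORE))
```

For the flat block only `coeffs[0][0]` is nonzero, mirroring how the DCT concentrates a constant signal in its DC coefficient. Because every operation is on integers, an encoder and a decoder built this way produce bit-identical results, which is the "exactly specified decoding" property mentioned above.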
A DCT variant, the modified discrete cosine transform (MDCT), is used in modern audio compression formats such as MP3, [21] Advanced Audio Coding (AAC), and Vorbis (Ogg).
The discrete sine transform (DST) is derived from the DCT, by replacing the Neumann condition at x=0 with a Dirichlet condition. [7] : 35 The DST was described in the 1974 paper by Ahmed, Natarajan and Rao. [4]
Ahmed was later involved in the development of a DCT lossless compression algorithm with Giridhar Mandyam and Neeraj Magotra at the University of New Mexico in 1995. This allows the DCT technique to be used for lossless compression of images. It is a modification of the original DCT algorithm, and incorporates elements of inverse DCT and delta modulation. It is a more effective lossless compression algorithm than entropy coding. [22]
In season 5, episode 8 of NBC's This Is Us, Ahmed's story was told to highlight the importance of image and video transmission over the Internet in modern society, particularly during the COVID-19 pandemic. The episode ends with a picture of Ahmed and his wife, along with captions explaining the importance of his work, and that producers spoke to the couple over video chat to understand their story and incorporate it into the episode. [23]
Audio signal processing is a subfield of signal processing that is concerned with the electronic manipulation of audio signals. Audio signals are electronic representations of sound waves—longitudinal waves which travel through air, consisting of compressions and rarefactions. The energy contained in audio signals or sound power level is typically measured in decibels. As audio signals may be represented in either digital or analog format, processing may occur in either domain. Analog processors operate directly on the electrical signal, while digital processors operate mathematically on its digital representation.
In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.
Digital signal processing (DSP) is the use of digital processing, such as by computers or more specialized digital signal processors, to perform a wide variety of signal processing operations. The digital signals processed in this manner are a sequence of numbers that represent samples of a continuous variable in a domain such as time, space, or frequency. In digital electronics, a digital signal is represented as a pulse train, which is typically generated by the switching of a transistor.
In information technology, lossy compression or irreversible compression is the class of data compression methods that uses inexact approximations and partial data discarding to represent the content. These techniques are used to reduce data size for storing, handling, and transmitting content. Higher degrees of approximation create coarser images as more details are removed. This is opposed to lossless data compression, which does not degrade the data. The amount of data reduction possible using lossy compression is much higher than using lossless techniques.
Image compression is a type of data compression applied to digital images, to reduce their cost for storage or transmission. Algorithms may take advantage of visual perception and the statistical properties of image data to provide superior results compared with generic data compression methods which are used for other digital data.
Transform coding is a type of data compression for "natural" data like audio signals or photographic images. The transformation is typically lossless on its own but is used to enable better quantization, which then results in a lower quality copy of the original input.
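The split between a lossless transform and a lossy quantizer can be sketched in a few lines of Python. The example below assumes an unnormalized DCT-II as the transform and a uniform scalar quantizer with a single step size; the function names, the step size, and the sample data are all illustrative choices, not part of any standard.

```python
import math

def dct2(x):
    """Unnormalized DCT-II (direct O(N^2) sum)."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_pts * (n + 0.5) * k) for n in range(n_pts))
        for k in range(n_pts)
    ]

def idct2(coeffs):
    """Inverse of dct2 (DCT-III scaled by 2/N, k = 0 term halved)."""
    n_pts = len(coeffs)
    return [
        (2.0 / n_pts) * (
            coeffs[0] / 2.0
            + sum(
                coeffs[k] * math.cos(math.pi / n_pts * (n + 0.5) * k)
                for k in range(1, n_pts)
            )
        )
        for n in range(n_pts)
    ]

def quantize(coeffs, step):
    """Uniform scalar quantizer: this is the only lossy step."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    return [q * step for q in levels]

# Assumed example data: a smooth ramp, typical of "natural" signals.
samples = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
step = 4.0

levels = quantize(dct2(samples), step)          # small integers, several zeros
decoded = idct2(dequantize(levels, step))       # lower-quality copy
max_error = max(abs(a - b) for a, b in zip(samples, decoded))
```

Skipping `quantize` and `dequantize` returns the input up to rounding error, while including them yields several zero-valued `levels` (cheap to encode) at the cost of a reconstruction error that, for this transform pair, stays below one quantizer step.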
Motion compensation in computing is an algorithmic technique used to predict a frame in a video given the previous and/or future frames by accounting for motion of the camera and/or objects in the video. It is employed in the encoding of video data for video compression, for example in the generation of MPEG-2 files. Motion compensation describes a picture in terms of the transformation of a reference picture to the current picture. The reference picture may be previous in time or even from the future. When images can be accurately synthesized from previously transmitted/stored images, the compression efficiency can be improved.
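A minimal sketch of the idea, assuming exhaustive block matching with a sum-of-absolute-differences (SAD) cost, which is one common way to estimate the motion used for compensation. The frame contents, block size, and search range are made-up example values, and the function names are illustrative.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in the current frame
    and the block displaced by (dx, dy) in the reference frame."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )

def best_motion_vector(cur, ref, bx, by, size, search):
    """Exhaustive search over displacements in [-search, search]."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# Assumed example: an 8x8 reference frame, and a current frame whose content
# equals the reference content found 2 pixels right and 1 pixel down.
ref = [[x * x + 10 * y for x in range(8)] for y in range(8)]
cur = [
    [ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0 for x in range(8)]
    for y in range(8)
]

cost, dx, dy = best_motion_vector(cur, ref, bx=2, by=2, size=3, search=2)
```

Here the search recovers the displacement `(dx, dy) = (2, 1)` with zero cost, so the block can be described by the motion vector alone. In a real encoder the match is rarely perfect, and the remaining difference (the residual) is what gets transformed and quantized.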
A video codec is software or hardware that compresses and decompresses digital video. In the context of video compression, codec is a portmanteau of encoder and decoder, while a device that only compresses is typically called an encoder, and one that only decompresses is a decoder.
A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequencies. The DCT, first proposed by Nasir Ahmed in 1972, is a widely used transformation technique in signal processing and data compression. It is used in most digital media, including digital images, digital video, digital audio, digital television, digital radio, and speech coding. DCTs are also important to numerous other applications in science and engineering, such as digital signal processing, telecommunication devices, reducing network bandwidth usage, and spectral methods for the numerical solution of partial differential equations.
Digital image processing is the use of a digital computer to process digital images through an algorithm. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and distortion during processing. Since images are defined over two dimensions, digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing are mainly affected by three factors: first, the development of computers; second, the development of mathematics; third, the increased demand for a wide range of applications in environment, agriculture, military, industry and medical science.
A compression artifact is a noticeable distortion of media caused by the application of lossy compression. Lossy data compression involves discarding some of the media's data so that it becomes small enough to be stored within the desired disk space or transmitted (streamed) within the available bandwidth. If the compressor cannot store enough data in the compressed version, the result is a loss of quality, or introduction of artifacts. The compression algorithm may not be intelligent enough to discriminate between distortions of little subjective importance and those objectionable to the user.
In mathematics, the discrete sine transform (DST) is a Fourier-related transform similar to the discrete Fourier transform (DFT), but using a purely real matrix. It is equivalent to the imaginary parts of a DFT of roughly twice the length, operating on real data with odd symmetry (since the Fourier transform of a real and odd function is imaginary and odd), where in some variants the input and/or output data are shifted by half a sample.
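The stated relationship to the DFT can be checked numerically. The sketch below assumes the DST-I variant (no half-sample shifts) and a direct O(N²) DFT written with the standard `cmath` module; the names `dst1` and `dft` and the input values are illustrative. Extending N real samples to an odd-symmetric sequence of length 2(N+1) gives a purely imaginary DFT whose entries equal −2i times the DST-I coefficients.

```python
import cmath
import math

def dst1(x):
    """DST-I: X[k] = sum_n x[n] * sin(pi * (n + 1) * (k + 1) / (N + 1))."""
    n_pts = len(x)
    return [
        sum(
            x[n] * math.sin(math.pi * (n + 1) * (k + 1) / (n_pts + 1))
            for n in range(n_pts)
        )
        for k in range(n_pts)
    ]

def dft(y):
    """Direct discrete Fourier transform with the e^{-2*pi*i*idx*k/M} convention."""
    m = len(y)
    return [
        sum(y[idx] * cmath.exp(-2j * cmath.pi * idx * k / m) for idx in range(m))
        for k in range(m)
    ]

# Assumed example data.
x = [1.0, 4.0, -2.0, 3.0]

# Odd extension of length 2 * (N + 1): 0, x, 0, then x reversed and negated.
odd_ext = [0.0] + x + [0.0] + [-v for v in reversed(x)]
spectrum = dft(odd_ext)

direct = dst1(x)
from_dft = [-spectrum[k + 1].imag / 2.0 for k in range(len(x))]
```

`direct` and `from_dft` agree up to rounding, and the real parts of `spectrum` are numerically zero, as expected for real input with odd symmetry.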
The modified discrete cosine transform (MDCT) is a transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next block. This overlapping, in addition to the energy-compaction qualities of the DCT, makes the MDCT especially attractive for signal compression applications, since it helps to avoid artifacts stemming from the block boundaries. As a result of these advantages, the MDCT is the most widely used lossy compression technique in audio data compression. It is employed in most modern audio coding standards, including MP3, Dolby Digital (AC-3), Vorbis (Ogg), Windows Media Audio (WMA), ATRAC, Cook, Advanced Audio Coding (AAC), High-Definition Coding (HDC), LDAC, Dolby AC-4, and MPEG-H 3D Audio, as well as speech coding standards such as AAC-LD (LD-MDCT), G.722.1, G.729.1, CELT, and Opus.
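The lapped structure can be illustrated with a small sketch. It assumes the usual MDCT definition that maps 2N inputs to N outputs, an inverse normalized by 1/N, and no window function (practical codecs apply a window, which is omitted here for brevity). The function names and the test signal are illustrative.

```python
import math

def mdct(frame):
    """MDCT of a frame of 2N samples, producing N coefficients."""
    n_half = len(frame) // 2
    return [
        sum(
            frame[n]
            * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2.0) * (k + 0.5))
            for n in range(2 * n_half)
        )
        for k in range(n_half)
    ]

def imdct(coeffs):
    """Inverse MDCT: N coefficients back to 2N time-aliased samples."""
    n_half = len(coeffs)
    return [
        sum(
            coeffs[k]
            * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2.0) * (k + 0.5))
            for k in range(n_half)
        )
        / n_half
        for n in range(2 * n_half)
    ]

def overlap_add_roundtrip(signal, n_half):
    """Transform frames that overlap by half, invert them, and add the overlaps."""
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        block = imdct(mdct(signal[start:start + 2 * n_half]))
        for i, value in enumerate(block):
            out[start + i] += value
    return out

# Assumed example data: 16 samples, processed with N = 4 (frames of 8).
signal = [float((3 * i * i + 5 * i) % 17) for i in range(16)]
rebuilt = overlap_add_roundtrip(signal, 4)
```

A single inverted frame does not match its input, because each frame of 2N samples is reduced to only N coefficients. The aliasing in one frame is cancelled by the neighbouring frame, so samples 4 through 11, which are covered by two frames, are recovered in `rebuilt` up to rounding. The first and last four samples are covered by only one frame and are not recovered in this sketch.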
H.261 is an ITU-T video compression standard, first ratified in November 1988. It is the first member of the H.26x family of video coding standards in the domain of the ITU-T Study Group 16 Video Coding Experts Group. It was the first video coding standard that was useful in practical terms.
A timeline of events related to information theory, quantum information theory and statistical physics, data compression, error correcting codes and related subjects.
Lossless JPEG is a 1993 addition to JPEG standard by the Joint Photographic Experts Group to enable lossless compression. However, the term may also be used to refer to all lossless compression schemes developed by the group, including JPEG 2000, JPEG-LS, and JPEG XL.
Kamisetty Ramamohan Rao was an Indian-American electrical engineer. He was a professor of Electrical Engineering at the University of Texas at Arlington. Academically known as K. R. Rao, he is credited with the co-invention of the discrete cosine transform (DCT), along with Nasir Ahmed and T. Natarajan, due to their landmark publication, "Discrete Cosine Transform".
A video coding format is a content representation format of digital video content, such as in a data file or bitstream. It typically uses a standardized video compression algorithm, most commonly based on discrete cosine transform (DCT) coding and motion compensation. A specific software, firmware, or hardware implementation capable of compression or decompression in a specific video coding format is called a video codec.
An audio coding format is a content representation format for storage or transmission of digital audio. Examples of audio coding formats include MP3, AAC, Vorbis, FLAC, and Opus. A specific software or hardware implementation capable of audio compression and decompression to/from a specific audio coding format is called an audio codec; an example of an audio codec is LAME, one of several codecs that implement encoding and decoding of audio in the MP3 audio coding format in software.
JPEG XT is an image compression standard which specifies backward-compatible extensions of the base JPEG standard.