Nasir Ahmed (engineer)

Nasir Ahmed
Nasir Ahmed in 2012
Born: 1940
Spouse(s): Esther Parente-Ahmed
Children: Michael Ahmed Parente
Doctoral advisor: Shlomo Karni

Nasir Ahmed (born 1940 in Bangalore, India) is an Indian and American electrical engineer and computer scientist. He is Professor Emeritus of Electrical and Computer Engineering at the University of New Mexico (UNM). He is best known for inventing the discrete cosine transform (DCT) in the early 1970s. The DCT is the most widely used data compression transform; it is the basis for most digital media standards (image, video and audio) and is commonly used in digital signal processing. He also described the discrete sine transform (DST), which is related to the DCT. [1]

Discrete cosine transform (DCT)

The discrete cosine transform (DCT) is a lossy compression algorithm that was first conceived by Ahmed while he was working at Kansas State University; he proposed the technique to the National Science Foundation in 1972. He originally intended the DCT for image compression. [2] [3] Ahmed developed a working DCT algorithm with his PhD student T. Natarajan and friend K. R. Rao in 1973, [2] and they presented their results in a January 1974 paper. [4] [5] [6] The paper described what is now called the type-II DCT (DCT-II), [7]:51 as well as its inverse, the type-III DCT (also known as the IDCT). [4]
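
The relationship between the type-II DCT and its type-III inverse can be illustrated with a short sketch. The code below is a direct, unoptimized evaluation of the textbook sums, not the algorithm from the 1974 paper; the function names `dct2` and `dct3`, the unnormalized scaling convention, and the sample values are assumptions chosen for illustration.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi/N * (n + 1/2) * k)."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_pts * (n + 0.5) * k) for n in range(n_pts))
        for k in range(n_pts)
    ]


def dct3(coeffs):
    """DCT-III scaled by 2/N, which makes it the inverse of dct2 above."""
    n_pts = len(coeffs)
    return [
        (2.0 / n_pts) * (
            coeffs[0] / 2.0
            + sum(
                coeffs[k] * math.cos(math.pi / n_pts * (n + 0.5) * k)
                for k in range(1, n_pts)
            )
        )
        for n in range(n_pts)
    ]


# Illustrative input: a smooth ramp of eight samples (hypothetical data).
samples = [8.0, 16.0, 24.0, 32.0, 40.0, 48.0, 56.0, 64.0]
coeffs = dct2(samples)
restored = dct3(coeffs)
```

Under this scaling, `coeffs[0]` is the plain sum of the samples, and `restored` matches `samples` up to floating-point rounding. Practical codecs use fast factorizations and different normalizations rather than these O(N²) sums.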

Ahmed was the lead author of the benchmark publication, [8] [9] Discrete Cosine Transform (with T. Natarajan and K. R. Rao), [4] which has been cited as a fundamental development in many works [10] since its publication. The basic research work and events that led to the development of the DCT were summarized in a later publication by Ahmed entitled "How I Came Up With the Discrete Cosine Transform". [2]

The DCT is widely used for digital image compression. [11] [12] [13] It is a core component of the 1992 JPEG image compression technology developed by the Joint Photographic Experts Group [14] and standardized jointly by the ITU, [15] ISO and IEC. A 1996 book by K. R. Rao and J. J. Hwang gives a tutorial discussion of how the DCT is used to achieve digital video compression in various international standards defined by the ITU and MPEG (Moving Picture Experts Group) [16] (JPEG: Chapter 8; H.261: Chapter 9; MPEG-1: Chapter 10; MPEG-2: Chapter 11), and an overview was presented in two 2006 publications by Yao Wang. [17] [18] The image and video compression properties of the DCT resulted in its being an integral component of the following widely used international standard technologies:

Standard: Technologies
JPEG: Storage and transmission of photographic images on the World Wide Web (JPEG/JFIF); also widely used in digital cameras and other photographic image capture devices (JPEG/Exif).
MPEG-1 Video: Video distribution on CD or via the World Wide Web.
MPEG-2 Video (or H.262): Storage and handling of digital images in broadcast applications: digital TV, HDTV, cable, satellite, high-speed internet; video distribution on DVD.
H.261: First of a family of video coding standards (1988). Used primarily in older video conferencing and video telephone products.
H.263: Videotelephony and videoconferencing.

The form of DCT used in signal compression applications is sometimes referred to as DCT-2 in the context of a family of discrete cosine transforms, [19] or as DCT-II.
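
Why the DCT-II suits lossy compression can be sketched by discarding high-frequency coefficients and measuring the reconstruction error. This is only a crude stand-in for the quantization stage of a real codec such as JPEG; the helper names, the eight-sample block, and the "keep the first few coefficients" rule are all assumptions made for illustration.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a list of samples."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_pts * (n + 0.5) * k) for n in range(n_pts))
        for k in range(n_pts)
    ]


def dct3(coeffs):
    """DCT-III scaled by 2/N, the inverse of dct2 above."""
    n_pts = len(coeffs)
    return [
        (2.0 / n_pts) * (
            coeffs[0] / 2.0
            + sum(
                coeffs[k] * math.cos(math.pi / n_pts * (n + 0.5) * k)
                for k in range(1, n_pts)
            )
        )
        for n in range(n_pts)
    ]


def reconstruct_from_first(samples, keep):
    """Zero all but the first `keep` (lowest-frequency) coefficients, then invert."""
    coeffs = dct2(samples)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return dct3(truncated)


def squared_error(a, b):
    """Sum of squared differences between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


# Hypothetical block of eight pixel intensities.
block = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
errors = [
    squared_error(block, reconstruct_from_first(block, keep))
    for keep in range(1, len(block) + 1)
]
```

Because the DCT-II basis vectors are mutually orthogonal, each additional retained coefficient can only reduce the squared error, and keeping all eight restores the block up to rounding. Keeping only the first coefficient reproduces the block's mean value in every position.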

More recent standards have used integer-based transforms that have similar properties to the DCT but are explicitly based on integer processing rather than being defined by trigonometric functions. [20] Because these transforms have similar symmetry properties to the DCT and are, to some degree, approximations of it, they have sometimes been called "integer DCT" transforms. The "integer DCT" designs are conceptually similar to the conventional DCT but are simplified to provide exactly specified decoding with reduced computational complexity. Such transforms are used for compression in the following more recent standards and technologies:

Standard: Technologies
VC-1: Windows Media Video 9, SMPTE 421.
H.264/MPEG-4 AVC: The most commonly used format for recording, compression and distribution of high-definition video; streaming internet video; Blu-ray Discs; HDTV broadcasts (terrestrial, cable and satellite).
H.265/HEVC: Successor to the H.264/MPEG-4 AVC standard, with substantially improved compression capability.
H.266/VVC: Successor to HEVC, with substantially improved compression capability.
WebP Images: A graphic format that supports the lossy compression of digital images. Developed by Google.
WebM Video: An open-source multimedia format intended to be used with HTML5. Developed by Google.
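
As an illustration of what "integer processing" means here, the following sketch applies the 4x4 integer core transform matrix that is commonly documented for H.264/AVC. The matrix is not given in this article and is included as an outside assumption; the helper names are hypothetical, and the scaling and quantization stages that the standard pairs with this transform are omitted.

```python
# 4x4 integer core transform matrix as commonly documented for H.264/AVC
# (an assumption brought in for illustration; not taken from this article).
CORE = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two matrices given as lists of rows, using integer arithmetic."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def forward_core_4x4(block):
    """Apply the separable transform CORE * block * CORE^T to a 4x4 block."""
    return matmul(matmul(CORE, block), transpose(CORE))


# A flat block: every sample has the same (hypothetical) value.
flat_block = [[10] * 4 for _ in range(4)]
coeffs = forward_core_4x4(flat_block)

# Rows of CORE are mutually orthogonal but not unit length.
gram = matmul(CORE, transpose(CORE))
```

For the flat block, all of the energy lands in the top-left coefficient, mirroring the behaviour of the conventional DCT, and every intermediate value stays an integer. The unequal row norms visible on the diagonal of `gram` are the reason a separate scaling step is needed in practice.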

A DCT variant, the modified discrete cosine transform (MDCT), is used in modern audio compression formats such as MP3, [21] Advanced Audio Coding (AAC), and Vorbis (Ogg).
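
A minimal sketch of the MDCT idea follows: overlapping frames of 2N samples are each reduced to N coefficients, and the aliasing introduced by each frame cancels when neighbouring decoded frames are overlap-added. The sine window, the frame size, the 2/N inverse scaling, and all function names are assumptions for illustration; actual codecs use their own windows, block sizes, and fast algorithms.

```python
import math


def sine_window(n_half):
    """Sine window of length 2N; satisfies w[n]^2 + w[n + N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_half)) for n in range(2 * n_half)]


def mdct(frame):
    """Map a frame of 2N samples to N MDCT coefficients (direct sum)."""
    n_half = len(frame) // 2
    return [
        sum(
            frame[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half)
        )
        for k in range(n_half)
    ]


def imdct(coeffs):
    """Map N coefficients back to 2N time-aliased samples, scaled by 2/N."""
    n_half = len(coeffs)
    return [
        (2.0 / n_half) * sum(
            coeffs[k] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for k in range(n_half)
        )
        for n in range(2 * n_half)
    ]


def mdct_roundtrip(signal, n_half):
    """Window, transform, invert, window again, and overlap-add.

    Assumes len(signal) is a multiple of n_half.
    """
    window = sine_window(n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    output = [0.0] * len(padded)
    for start in range(0, len(padded) - n_half, n_half):
        frame = [padded[start + n] * window[n] for n in range(2 * n_half)]
        decoded = imdct(mdct(frame))
        for n in range(2 * n_half):
            output[start + n] += decoded[n] * window[n]
    return output[n_half:n_half + len(signal)]


# Hypothetical 16-sample signal processed with N = 4.
signal = [float((3 * i * i + 5 * i) % 17) for i in range(16)]
restored = mdct_roundtrip(signal, 4)
```

Each frame on its own is not invertible, since 2N samples are reduced to N coefficients; it is the 50% overlap combined with a suitable window that makes `restored` match `signal` up to rounding.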

The discrete sine transform (DST) is derived from the DCT by replacing the Neumann condition at x=0 with a Dirichlet condition. [7]:35 The DST was described in the 1974 DCT paper by Ahmed, Natarajan and Rao. [4]

Ahmed was later involved in the development of a DCT-based lossless compression algorithm with Giridhar Mandyam and Neeraj Magotra at the University of New Mexico in 1995. This allows the DCT technique to be used for lossless compression of images. It is a modification of the original DCT algorithm and incorporates elements of the inverse DCT and delta modulation. It is a more effective lossless compression algorithm than entropy coding. [22]

In season 5, episode 8 of NBC's This Is Us, Ahmed's story was told to highlight the importance of image and video transmission over the Internet in modern society, particularly during the COVID-19 pandemic. The episode ends with a picture of Ahmed and his wife, along with captions explaining the importance of his work and noting that the producers spoke to the couple over video chat to understand their story and incorporate it into the episode. [23]

References

  1. "Who is Nasir Ahmed? Real love story of Indian-American engineer on 'This Is Us' who is credited for .jpg algorithm". meaww.com. Retrieved 8 April 2022.
  2. Ahmed, Nasir (January 1991). "How I Came Up With the Discrete Cosine Transform". Digital Signal Processing. 1 (1): 4–5. doi:10.1016/1051-2004(91)90086-Z.
  3. Stanković, Radomir S.; Astola, Jaakko T. (2012). "Reminiscences of the Early Work in DCT: Interview with K.R. Rao" (PDF). Reprints from the Early Days of Information Sciences. Tampere International Center for Signal Processing. 60. ISBN 978-9521528187. ISSN 1456-2774. Archived (PDF) from the original on 30 December 2021. Retrieved 30 December 2021 via ETHW.
  4. Ahmed, Nasir; Natarajan, T. Raj; Rao, K.R. (1 January 1974). "Discrete Cosine Transform". IEEE Transactions on Computers. IEEE Computer Society. C-23 (1): 90–93. doi:10.1109/T-C.1974.223784. eISSN 1557-9956. ISSN 0018-9340. LCCN 75642478. OCLC 1799331. S2CID 39023640.
  5. Rao, K. Ramamohan; Yip, Patrick C. (11 September 1990). Discrete Cosine Transform: Algorithms, Advantages, Applications. Signal, Image and Speech Processing. Academic Press. arXiv:1109.0337. doi:10.1016/c2009-0-22279-3. ISBN 978-0125802031. LCCN 89029800. OCLC 1008648293. OL 2207570M. S2CID 12270940.
  6. "T.81 – Digital compression and coding of continuous-tone still images – requirements and guidelines" (PDF). CCITT. September 1992. Retrieved 12 July 2019.
  7. Britanak, Vladimir; Yip, Patrick C.; Rao, K. R. (6 November 2006). Discrete Cosine and Sine Transforms: General Properties, Fast Algorithms and Integer Approximations. Academic Press. ISBN 978-0123736246. LCCN 2006931102. OCLC 220853454. OL 18495589M. S2CID 118873224.
  8. Selected Papers on Visual Communication: Technology and Applications, (SPIE Press Book), Editors T. Russell Hsing and Andrew G. Tescher, April 1990, pp. 145-149.
  9. Selected Papers and Tutorial in Digital Image Processing and Analysis, Volume 1, Digital Image Processing and Analysis, (IEEE Computer Society Press), Editors R. Chellappa and A. A. Sawchuk, June 1985, p. 47.
  10. DCT citations via Google Scholar.
  11. Andrew B. Watson (1994). "Image Compression Using the Discrete Cosine Transform" (PDF). Mathematica Journal. 4 (1): 81–88.
  12. image compression.
  13. Transform coding.
  14. Wallace, G. K. (February 1992). "The JPEG Still Image Compression Standard" (PDF). IEEE Transactions on Consumer Electronics. 38 (1).
  15. CCITT 1992.
  16. Rao, K. R.; Hwang, J. J. (18 July 1996). Techniques and Standards for Image, Video, and Audio Coding. Prentice Hall. ISBN 978-0133099072. LCCN 96015550. OCLC 34617596. OL 978319M. S2CID 56983045.
  17. Yao Wang, Video Coding Standards: Part I, 2006
  18. Yao Wang, Video Coding Standards: Part II, 2006
  19. Gilbert Strang (1999). "The Discrete Cosine Transform" (PDF). SIAM Review. 41 (1): 135–147. Bibcode:1999SIAMR..41..135S. doi:10.1137/S0036144598336745.
  20. Lee, Jae-Beom; Kalva, Hari (2008). The VC-1 and H.264 Video Compression Standards for Broadband Video Services. Springer Science+Business Media, LLC. pp. 217–245.
  21. Guckert, John (Spring 2012). "The Use of FFT and MDCT in MP3 Audio Compression" (PDF). University of Utah. Retrieved 14 July 2019.
  22. Mandyam, Giridhar D.; Ahmed, Nasir; Magotra, Neeraj (17 April 1995). "DCT-based scheme for lossless image compression". Digital Video Compression: Algorithms and Technologies 1995. SPIE. 2419: 474–478. Bibcode:1995SPIE.2419..474M. doi:10.1117/12.206386. S2CID 13894279.
  23. Mizoguchi, Karen (16 February 2021). "How This Is Us Honored the Real-Life 'Genius' Who Made It Possible for the Pearsons to Stay Connected amid COVID". People.com. Retrieved 21 March 2022.