Transcoding

Transcoding is the direct digital-to-digital conversion of one encoding to another, [1] such as for video data files, audio files (e.g., MP3, WAV), or character encodings (e.g., UTF-8, ISO/IEC 8859). This is usually done when a target device (or workflow) does not support the format or has limited storage capacity that mandates a reduced file size, [2] or to convert incompatible or obsolete data to a better-supported or modern format.

In the analog video world, transcoding can be performed while files are being searched, as well as for presentation. For example, Cineon and DPX files have been widely used as a common format for digital cinema, but a two-hour movie occupies about 8 terabytes (TB). [2] That large size can increase the cost and difficulty of handling movie files. Transcoding into a lossless JPEG 2000 format helps, because it has better compression performance than other lossless coding technologies; in many cases, JPEG 2000 can compress images to half their size. [2]

Transcoding is commonly a lossy process, introducing generation loss; however, transcoding can be lossless if the output is either losslessly compressed or uncompressed. [2] Transcoding into a lossy format introduces varying degrees of generation loss, while transcoding from lossy to lossless or uncompressed is technically a lossless conversion because no information is lost; when the conversion is irreversible, however, it is more correctly described as destructive.

Process

Transcoding is a two-step process in which the original data is decoded to an intermediate uncompressed format (e.g., PCM for audio; YUV for video), which is then encoded into the target format.
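The two steps can be sketched with character encodings, one of the cases named in the introduction. This is a minimal Python illustration rather than a general-purpose tool: the helper name `transcode_text` is hypothetical, and Python's Unicode `str` stands in for the intermediate, decoded representation.

```python
def transcode_text(data: bytes, source: str, target: str) -> bytes:
    """Transcode encoded text from one character encoding to another."""
    # Step 1: decode the source bytes into an intermediate representation
    # (here, a Python str of Unicode code points).
    intermediate = data.decode(source)
    # Step 2: encode the intermediate representation into the target format.
    return intermediate.encode(target)


latin1_bytes = "café".encode("iso-8859-1")   # 4 bytes: b'caf\xe9'
utf8_bytes = transcode_text(latin1_bytes, "iso-8859-1", "utf-8")
print(utf8_bytes)                            # 5 bytes: b'caf\xc3\xa9'
```

Audio and video transcoders follow the same decode-then-encode shape, with PCM samples or YUV frames in place of the `str`.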

Re-encoding/recoding

One may also re-encode data in the same format, for a number of reasons:

Editing
If one wishes to edit data in a compressed format (for instance, perform image editing on a JPEG image), one will generally decode it, edit it, then re-encode it. This re-encoding causes digital generation loss; thus if one wishes to edit a file repeatedly, one should only decode it once, and make all edits on that copy, rather than repeatedly re-encoding it. Similarly, if encoding to a lossy format is required, it should be deferred until the data is finalised, e.g. after mastering.
Lower bitrate
Transrating is a process similar to transcoding in which files are coded to a lower bitrate without changing video formats; [3] this can include sample rate conversion, but may use an identical sampling rate with higher compression. This allows one to fit given media into smaller storage space (for instance, fitting a DVD onto a Video CD), or over a lower bandwidth channel.
Image scaling
Changing the picture size of video is known as transsizing, and is used if the output resolution differs from the resolution of the media. On a powerful enough device, image scaling can be done on playback, but it can also be done by re-encoding, particularly as part of transrating (such as a downsampled image requiring a lower bitrate).
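When transrating to fit a given storage space, the target bitrate follows from simple arithmetic: capacity in bits divided by running time. The sketch below assumes this simplified model and ignores container overhead; the function names and the figures (a 700 MB target, a 60-minute video, 128 kbit/s audio) are illustrative assumptions, not values from any standard.

```python
def total_bitrate_bps(capacity_bytes: int, duration_s: float) -> float:
    """Highest average bitrate (bits per second) that fits the capacity."""
    return capacity_bytes * 8 / duration_s


def video_bitrate_bps(capacity_bytes: int, duration_s: float,
                      audio_bitrate_bps: float) -> float:
    """Bitrate left for video after reserving a fixed audio bitrate."""
    return total_bitrate_bps(capacity_bytes, duration_s) - audio_bitrate_bps


# Illustrative figures only: a 700 MB target and a 60-minute video.
capacity = 700 * 10**6
duration = 60 * 60
total = total_bitrate_bps(capacity, duration)
video = video_bitrate_bps(capacity, duration, audio_bitrate_bps=128_000)
print(f"total: {total / 1000:.0f} kbit/s, video: {video / 1000:.0f} kbit/s")
```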

One can also use formats with bitrate peeling, which allow one to lower the bitrate easily without re-encoding, but the quality is often lower than that of a re-encode. For example, as of 2008, Vorbis bitrate peeling gives quality inferior to re-encoding.

Drawbacks

The key drawback of transcoding in lossy formats is decreased quality. Compression artifacts are cumulative, so transcoding causes a progressive loss of quality with each successive generation, known as digital generation loss. For this reason, transcoding (in lossy formats) is generally discouraged unless unavoidable.
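The cumulative effect can be illustrated with a toy model in Python. Uniform quantization stands in here for a real lossy codec, which is an assumption made purely for illustration; the function names are hypothetical, and real codecs lose information in far more elaborate ways.

```python
def lossy_encode(samples, step):
    """Toy lossy codec: snap each non-negative integer sample to the
    nearest multiple of `step` (ties round up)."""
    return [((s + step // 2) // step) * step for s in samples]


def total_error(original, decoded):
    """Sum of absolute differences from the original samples."""
    return sum(abs(a - b) for a, b in zip(original, decoded))


original = list(range(100))

# One generation: original -> codec with step 7.
direct = lossy_encode(original, 7)

# Two generations: original -> codec with step 10 -> codec with step 7.
chained = lossy_encode(lossy_encode(original, 10), 7)

print(total_error(original, direct), total_error(original, chained))
```

In this toy case the two-generation copy ends up further from the original than the copy encoded directly with the same final codec, which is why lossy copies are best made from the source rather than from another lossy copy.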

For users who want to be able to re-encode audio into any format, and for digital audio editing, it is best to retain a master copy in a lossless format (such as FLAC, ALAC, TTA, or WavPack). These formats take around half the storage space of the original uncompressed PCM formats (such as WAV and AIFF), and usually have the added benefit of metadata options, which are either completely missing or very limited in PCM formats. Lossless formats can be transcoded to PCM formats, or directly from one lossless format to another, without any loss in quality. They can also be transcoded into a lossy format, but the resulting copies cannot then be transcoded into another format of any kind (PCM, lossless, or lossy) without a subsequent loss of quality.
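Lossless-to-lossless transcoding can be checked bit for bit. The sketch below uses Python's general-purpose `zlib` and `lzma` compressors as stand-ins for lossless audio formats, since the standard library has no FLAC or WavPack codec; the byte pattern used as the "master" and the function name are likewise only placeholders.

```python
import lzma
import zlib


def transcode_lossless(zlib_data: bytes) -> bytes:
    """Transcode zlib-compressed data to LZMA (.xz) without loss."""
    raw = zlib.decompress(zlib_data)   # decode to the uncompressed form
    return lzma.compress(raw)          # encode in the target lossless format


master = bytes(range(256)) * 64        # stand-in for uncompressed PCM data
as_zlib = zlib.compress(master)
as_lzma = transcode_lossless(as_zlib)

# Decoding the transcoded copy returns the master bit for bit.
assert lzma.decompress(as_lzma) == master
```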

For image editing, users are advised to capture or save images in a raw or uncompressed format and then edit a copy of that master version, converting to lossy formats only if smaller files are needed for final distribution. As with audio, transcoding from a lossy format to another format of any type will result in a loss of quality.

For video editing and video conversion, images are normally compressed during the recording process itself, because the uncompressed files would be huge and their storage demands too cumbersome for the user. However, the amount of compression used at the recording stage is highly variable and depends on a number of factors, including the quality of the images being recorded (e.g. analog or digital, standard definition or high definition) and the type of equipment available to the user, which is often related to budget constraints, as the highest-quality digital video equipment and storage space may be expensive. Effectively, this means that any transcoding will involve some cumulative image loss. The most practical way to minimize loss of quality is therefore to treat the original recording as the master copy and to transcode every subsequent version, which will often be in a different format and a smaller file size, only from that master copy.

Usage

Although transcoding can be found in many areas of content adaptation, it is commonly used in the area of mobile phone content adaptation. In this case, transcoding is a must, due to the diversity of mobile devices and their capabilities. This diversity requires an intermediate state of content adaptation in order to make sure that the source content will adequately function on the target device to which it is sent.

Transcoding video from most consumer digital cameras can reduce the file size significantly while keeping the quality about the same. This is possible because most consumer cameras are real-time, power-constrained devices with neither the processing power nor the robust power supplies of desktop computers, so they cannot compress as efficiently as a slower, offline encoder.

One of the most popular technologies in which transcoding is used is the Multimedia Messaging Service (MMS), which is the technology used to send or receive messages with media (image, sound, text and video) between mobile phones. For example, when a camera phone is used to take a digital picture, a high-quality image of usually at least 640x480 pixels is created. When sending the image to another phone, this high resolution image might be transcoded to a lower resolution image with fewer colors in order to better fit the target device's screen size and color limitations. This size and color reduction improves the user experience on the target device, and is sometimes the only way for content to be sent between different mobile devices.
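The size and color reduction described above can be sketched on a raw pixel grid. This is an illustrative Python toy, not how any particular MMS gateway works: the function name `adapt_image` is hypothetical, scaling is done by plain subsampling without filtering, and color depth is reduced by keeping only the most significant bits of each channel.

```python
def adapt_image(pixels, factor, bits):
    """Shrink an RGB image and reduce its color depth.

    pixels: list of rows, each a list of (r, g, b) tuples with 8-bit channels.
    factor: keep every `factor`-th pixel in each direction.
    bits:   number of most significant bits to keep per channel (1 to 8).
    """
    shift = 8 - bits
    return [
        [tuple((c >> shift) << shift for c in px) for px in row[::factor]]
        for row in pixels[::factor]
    ]


source = [[(200, 100, 50)] * 4 for _ in range(4)]   # 4x4 test image
small = adapt_image(source, factor=2, bits=3)
print(len(small), len(small[0]), small[0][0])       # 2 2 (192, 96, 32)
```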

Transcoding is extensively used by home theatre PC software to reduce the usage of disk space by video files. The most common operation in this application is the transcoding of MPEG-2 files to the MPEG-4 or H.264 format.

With over 500 million videos on the web and a plethora of mobile devices, real-time many-to-many transcoding (any input format to any output format) is becoming a necessity to provide true search capability for any multimedia content on any mobile device.

History

Before the advent of semiconductors and integrated circuits, real-time resolution and frame-rate transcoding between different analog video standards was achieved with a CRT/camera tube combination. The CRT part writes not onto a phosphor but onto a thin dielectric target; the camera part reads the deposited charge pattern from the back side of this target at a different scan rate. [4] The setup could also be used as a genlock.

Citations

  1. Margaret Rouse. "transcoding". Archived from the original on 2018-01-14. Retrieved 2018-01-14.
  2. "Advancements in Compression and Transcoding: 2008 and Beyond", Society of Motion Picture and Television Engineers (SMPTE), 2008, webpage: SMPTE-spm.
  3. Branson, Ryan (6 July 2015). "Why is Bit Rate Important When Converting Videos to MP3?". Online Video Converter. Retrieved 10 August 2015.
  4. "GEC 7828 Scan conversion tube data sheet" (PDF). General Electric Corporation. 10 April 1961. Retrieved 21 April 2017.
