Digital video


Digital video is an electronic representation of moving visual images (video) in the form of encoded digital data. This is in contrast to analog video, which represents moving visual images with analog signals. Digital video comprises a series of digital images displayed in rapid succession.



Digital video was first introduced commercially in 1986 with the Sony D1 format[citation needed], which recorded an uncompressed standard definition component video signal in digital form. In addition to uncompressed formats, popular compressed digital video formats today include H.264 and MPEG-4. Modern interconnect standards for digital video include HDMI, DisplayPort, Digital Visual Interface (DVI) and serial digital interface (SDI).


Digital video can be copied with no degradation in quality. In contrast, when analog sources are copied, they experience generation loss. Digital video can be stored on digital media such as Blu-ray Disc, on computer data storage or streamed over the Internet to end users who watch content on a desktop computer screen or a digital smart TV. In everyday practice, digital video content such as TV shows and movies also includes a digital audio soundtrack.



From the late 1970s to the early 1980s, several types of video production equipment that were digital in their internal workings were introduced. These included time base correctors (TBC) [lower-alpha 1] and digital video effects (DVE) units. [lower-alpha 2] They operated by taking a standard analog composite video input and digitizing it internally. This made it easier either to correct or enhance the video signal, as in the case of a TBC, or to manipulate and add effects to the video, as in the case of a DVE unit. The digitized and processed video information was then converted back to standard analog video for output.


Later in the 1970s, manufacturers of professional video broadcast equipment, such as Bosch (through its Fernseh division) and Ampex, developed prototype digital videotape recorders (VTR) in their research and development labs. Bosch's machine used a modified 1-inch Type B videotape transport and recorded an early form of CCIR 601 digital video. Ampex's prototype digital video recorder used a modified 2-inch Quadruplex VTR (an Ampex AVR-3) fitted with custom digital video electronics and a special "octaplex" 8-head headwheel (regular analog 2" Quad machines used only 4 heads). Like standard 2" Quad, the Ampex prototype, nicknamed "Annie" by its developers, still recorded audio in analog as linear tracks on the tape. None of these machines was ever marketed commercially.


Digital video was first introduced commercially in 1986 with the Sony D1 format, which recorded an uncompressed standard definition component video signal in digital form. Component video connections required three cables, whereas most television facilities were wired for composite NTSC or PAL video using one cable. Due to this incompatibility and the cost of the recorder, D1 was used primarily by large television networks and other component-video capable video studios.

In 1988, Sony and Ampex co-developed and released the D2 digital videocassette format, which recorded video digitally without compression in ITU-601 format, much like D1. But D2 had the major difference of encoding the video in composite form to the NTSC standard, thereby only requiring single-cable composite video connections to and from a D2 VCR, making it a perfect fit for the majority of television facilities at the time. D2 was a successful format in the television broadcast industry throughout the late '80s and the '90s. D2 was also widely used in that era as the master tape format for mastering laserdiscs. [lower-alpha 3]

D1 and D2 would eventually be replaced by cheaper systems using video compression, most notably Sony's Digital Betacam, [lower-alpha 4] which was introduced into the networks' television studios. Other examples of digital video formats utilizing compression were Ampex's DCT (the first to employ it when introduced in 1992), the industry-standard DV and MiniDV and their professional variations, Sony's DVCAM and Panasonic's DVCPRO, and Betacam SX, a lower-cost variant of Digital Betacam using MPEG-2 compression.[citation needed]

One of the first digital video products to run on personal computers was PACo: The PICS Animation Compiler from The Company of Science & Art in Providence, RI, which was developed starting in 1990 and first shipped in May 1991. PACo could stream unlimited-length video with synchronized sound from a single file (with the ".CAV" file extension) on CD-ROM. Creation required a Mac; playback was possible on Macs, PCs, and Sun SPARCstations. [1]

QuickTime, Apple Computer's multimedia framework, appeared in June 1991. Audio Video Interleave from Microsoft followed in 1992. Initial consumer-level content creation tools were crude, requiring an analog video source to be digitized to a computer-readable format. While low-quality at first, consumer digital video increased rapidly in quality, first with the introduction of playback standards such as MPEG-1 and MPEG-2 (adopted for use in television transmission and DVD media), and then with the introduction of the DV tape format, which allowed recordings to be transferred directly to digital video files using a FireWire port on an editing computer. This simplified the process, allowing non-linear editing systems (NLE) to be deployed cheaply and widely on desktop computers with no external playback or recording equipment needed.

The widespread adoption of digital video and accompanying compression formats has reduced the bandwidth needed for a high-definition video signal (with HDV and AVCHD, as well as several commercial variants such as DVCPRO-HD, all using less bandwidth than a standard definition analog signal). These savings have increased the number of channels available on cable television and direct broadcast satellite systems, created opportunities for spectrum reallocation of terrestrial television broadcast frequencies, and made tapeless camcorders based on flash memory possible, among other innovations and efficiencies.


Digital video comprises a series of digital images displayed in rapid succession. In the context of video these images are called frames. [lower-alpha 5] The rate at which frames are displayed is known as the frame rate and is measured in frames per second (FPS). Every frame is an orthogonal bitmap digital image and so comprises a raster of pixels. Pixels have only one property, their color. The color of a pixel is represented by a fixed number of bits. The more bits, the more subtle the variations of color that can be reproduced. This is called the color depth of the video.


In interlaced video each frame is composed of two halves of an image. The first half contains only the odd-numbered lines of a full frame. The second half contains only the even-numbered lines. Those halves are referred to individually as fields. Two consecutive fields compose a full frame. If an interlaced video has a frame rate of 30 frames per second the field rate is 60 fields per second. All the properties discussed here apply equally to interlaced video but one should be careful not to confuse the fields-per-second rate with the frames-per-second rate.
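The relationship between a frame and its two fields can be sketched in a few lines of Python. This is only an illustration: the helper names `split_into_fields` and `weave` are hypothetical, a frame is modeled as a plain list of scan lines, and the frame is assumed to have an even number of lines.

```python
def split_into_fields(frame):
    """Split a frame (a list of scan lines) into its two interlaced fields.

    Lines are numbered from 1, so list index 0 holds line 1 (an odd line).
    """
    odd_field = frame[0::2]   # lines 1, 3, 5, ...
    even_field = frame[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


def weave(odd_field, even_field):
    """Interleave two fields back into a full frame.

    Assumes both fields have the same number of lines.
    """
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.extend([odd_line, even_line])
    return frame


frame = [f"line {n}" for n in range(1, 7)]
odd, even = split_into_fields(frame)
print(odd)                         # ['line 1', 'line 3', 'line 5']
print(even)                        # ['line 2', 'line 4', 'line 6']
print(weave(odd, even) == frame)   # True
```

Weaving only reproduces the original picture exactly when both fields were captured at the same instant; with a real interlaced camera the two fields are sampled at different times.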

Bit rate and BPP

By definition, bit rate is a measure of the rate of information content of the digital video stream. In the case of uncompressed video, bit rate corresponds directly to the quality of the video, because it is proportional to every property that affects quality (frame size, color depth and frame rate). Bit rate is an important property when transmitting video because the transmission link must be capable of supporting that bit rate. Bit rate is also important for storage because the video size is proportional to the bit rate and the duration. The bit rate of uncompressed video is too high for most practical applications, so video compression is used to greatly reduce it. Bits per pixel (BPP) is a measure of the efficiency of compression. A true-color video with no compression at all may have a BPP of 24 bits/pixel. Chroma subsampling can reduce the BPP to 16 or 12 bits/pixel. Applying JPEG compression to every frame can reduce the BPP to 8 or even 1 bit/pixel. Video compression algorithms such as MPEG-1, MPEG-2 or MPEG-4 allow for fractional BPP values.
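As a rough illustration of where the 24, 16 and 12 bits/pixel figures come from, the sketch below computes the average BPP for common chroma subsampling schemes. It rests on assumptions not stated in the text: 8 bits per sample, one luma and two chroma planes, and the conventional J:a:b notation measured over a reference block J pixels wide and 2 rows high. The function name `subsampled_bpp` is hypothetical.

```python
def subsampled_bpp(j, a, b, bits_per_sample=8):
    """Average bits per pixel for a J:a:b chroma subsampling scheme.

    Assumes a reference block j pixels wide and 2 rows high, with
    a chroma samples in the first row and b in the second row,
    for each of the two chroma planes.
    """
    pixels = j * 2                # pixels in the reference block
    luma_samples = pixels         # luma is never subsampled
    chroma_samples = 2 * (a + b)  # two chroma planes
    return bits_per_sample * (luma_samples + chroma_samples) / pixels


print(subsampled_bpp(4, 4, 4))  # 24.0 (no subsampling)
print(subsampled_bpp(4, 2, 2))  # 16.0
print(subsampled_bpp(4, 2, 0))  # 12.0
```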

Constant bit rate versus variable bit rate

BPP represents the average number of bits per pixel. Some compression algorithms keep the BPP almost constant throughout the entire duration of the video. In this case, the video output also has a constant bit rate (CBR). CBR video is suitable for real-time, non-buffered, fixed-bandwidth video streaming (e.g. in videoconferencing). Because not all frames can be compressed to the same level without quality suffering more in scenes of high complexity, other algorithms constantly adjust the BPP. They keep it high while compressing complex scenes and low for less demanding scenes. This way, one gets the best quality at the smallest average bit rate (and, accordingly, the smallest file size). With this method the bit rate is variable because it tracks the variations of the BPP.
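The difference between the two strategies can be shown with a toy model. The sketch below is not how any real encoder performs rate control; the per-frame "complexity" scores and both function names are hypothetical, and the variable-rate version simply shares a fixed total budget in proportion to complexity.

```python
def cbr_allocation(complexities, avg_bits_per_frame):
    """Constant bit rate: every frame gets the same number of bits."""
    return [avg_bits_per_frame for _ in complexities]


def vbr_allocation(complexities, avg_bits_per_frame):
    """Variable bit rate: same total budget, split in proportion to complexity."""
    total_budget = avg_bits_per_frame * len(complexities)
    total_complexity = sum(complexities)
    return [total_budget * c / total_complexity for c in complexities]


# Hypothetical complexity scores for four frames (higher = harder to compress).
complexities = [1, 1, 4, 2]

print(cbr_allocation(complexities, 1000))  # [1000, 1000, 1000, 1000]
print(vbr_allocation(complexities, 1000))  # [500.0, 500.0, 2000.0, 1000.0]
```

Both allocations spend 4000 bits in total, so the average bit rate is the same; only the distribution across frames differs.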

Technical overview

Standard film stocks such as 16 mm and 35 mm record at 24 frames per second. For video, there are two frame rate standards: NTSC, which runs at 30/1.001 (about 29.97) frames per second or 59.94 fields per second, and PAL, which runs at 25 frames per second or 50 fields per second. Digital video cameras come in two different image capture formats: interlaced and progressive scan. Interlaced cameras record the image in alternating sets of lines: the odd-numbered lines are scanned, then the even-numbered lines, then the odd-numbered lines again, and so on. One set of odd or even lines is referred to as a "field", and a consecutive pairing of two fields of opposite parity is called a frame. Progressive-scan cameras record each frame as distinct, with all scan lines captured at the same moment in time. Thus, interlaced video samples the scene motion twice as often as progressive video does for the same number of frames per second. Progressive-scan camcorders generally produce a slightly sharper image. However, motion may not be as smooth as in interlaced video, which uses 50 or 59.94 fields per second, particularly if they employ the 24 frames per second standard of film.
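The frame and field rates quoted above can be checked with exact rational arithmetic. The following is a small sketch using Python's standard `fractions` module; the constant and function names are illustrative only.

```python
from fractions import Fraction

# 30/1.001 expressed exactly as a ratio of integers.
NTSC_FRAME_RATE = Fraction(30000, 1001)
PAL_FRAME_RATE = Fraction(25)


def fields_per_second(frame_rate):
    """Each interlaced frame consists of two fields."""
    return 2 * frame_rate


print(round(float(NTSC_FRAME_RATE), 2))                     # 29.97
print(round(float(fields_per_second(NTSC_FRAME_RATE)), 2))  # 59.94
print(fields_per_second(PAL_FRAME_RATE))                    # 50
```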

Digital video can be copied with no degradation in quality. No matter how many generations of copies are made from a digital source, each copy will be as clear as the original first-generation footage. However, a change in parameters such as frame size, or a change of digital format, can decrease the quality of the video because the image data has to be recalculated. Digital video can be manipulated and edited on a non-linear editing (NLE) workstation, a computer-based device intended to edit video and audio. More and more, videos are edited on readily available, increasingly affordable consumer-grade computer hardware and software. However, such editing systems require ample disk space for video footage. Because there are so many video formats and parameters, it is impractical to give a single figure for how much disk space a given number of minutes of footage requires.

Digital video has a significantly lower cost than 35 mm film. Compared with the high cost of film stock, the tape stock (or other electronic media, such as flash memory or hard disk drives) used for recording digital video is very inexpensive. Digital video also allows footage to be viewed on location without the expensive chemical processing required by film. Physical delivery of tapes for broadcast is also no longer necessary. Digital television (including higher quality HDTV) started to spread in most developed countries in the early 2000s. Digital video is also used in modern mobile phones and video conferencing systems, and for Internet distribution of media, including streaming video and peer-to-peer movie distribution. However, even within Europe many TV stations do not broadcast in HD because of restricted budgets for new HD processing equipment.

Many types of video compression exist for serving digital video over the Internet and on optical discs. The file sizes of digital video used for professional editing are generally not practical for these purposes, and the video requires further compression with codecs such as Sorenson, H.264 and, more recently, Apple ProRes, especially for HD. Probably the most widely used formats for delivering video over the Internet are MPEG-4, QuickTime, Flash and Windows Media, while MPEG-2 is used almost exclusively for DVDs, providing an exceptional image in minimal size but requiring a high level of CPU consumption to decompress.

As of 2011, the highest resolution demonstrated for digital video generation is 35 megapixels (8192 x 4320). The highest speed is attained in industrial and scientific high speed cameras that are capable of filming 1024x1024 video at up to 1 million frames per second for brief periods of recording.


An example video can have a duration (T) of 1 hour (3600sec), a frame size of 640x480 (WxH) at a color depth of 24 bits and a frame rate of 25fps. This example video has the following properties:

The most important properties are bit rate and video size. The formulas relating those two with all other properties are:

BR = W * H * CD * FPS
VS = BR * T = W * H * CD * FPS * T
(units are: BR in bit/s, W and H in pixels, CD in bits, VS in bits, T in seconds)

while some secondary formulas are:

pixels_per_frame = W * H
pixels_per_second = W * H * FPS
bits_per_frame = W * H * CD
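Plugging the example video (640x480, 24-bit color, 25 fps, 1 hour) into these formulas gives concrete numbers. The sketch below is just the arithmetic written out in Python; the function name `uncompressed_video_stats` is hypothetical, and gigabytes are taken as 10^9 bytes.

```python
def uncompressed_video_stats(w, h, cd, fps, t):
    """Apply the uncompressed-video formulas; returns values in pixels and bits."""
    bit_rate = w * h * cd * fps  # BR, in bit/s
    return {
        "pixels_per_frame": w * h,
        "pixels_per_second": w * h * fps,
        "bits_per_frame": w * h * cd,
        "bit_rate": bit_rate,
        "video_size": bit_rate * t,  # VS, in bits
    }


stats = uncompressed_video_stats(w=640, h=480, cd=24, fps=25, t=3600)
print(stats["pixels_per_frame"])      # 307200
print(stats["pixels_per_second"])     # 7680000
print(stats["bits_per_frame"])        # 7372800
print(stats["bit_rate"])              # 184320000 (184.32 Mbit/s)
print(stats["video_size"])            # 663552000000
print(stats["video_size"] / 8 / 1e9)  # 82.944 (gigabytes)
```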

The above are accurate for uncompressed video. Because of the relatively high bit rate of uncompressed video, video compression is extensively used. In the case of compressed video each frame requires only a small percentage of the original bits. Assuming a compression algorithm that shrinks the input data by a factor of CF, the bit rate and video size would equal:

BR = W * H * CD * FPS / CF
VS = BR * T = W * H * CD * FPS * T / CF

Note that it is not necessary that all frames are equally compressed by a factor of CF. In practice they are not, so CF is the average factor of compression for all the frames taken together.

The above equation for the bit rate can be rewritten by combining the compression factor and the color depth like this:

BR = W * H * ( CD / CF ) * FPS

The value (CD / CF) represents the average bits per pixel (BPP). As an example, if there is a color depth of 12 bits/pixel and an algorithm that compresses at 40x, then BPP equals 0.3 (12/40). So in the case of compressed video the formula for bit rate is:

BR = W * H * BPP * FPS

The same formula is valid for uncompressed video because in that case one can assume that the "compression" factor is 1 and that the average bits per pixel equal the color depth.
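The BPP form of the formula can be tried out on the numbers used above. This is a minimal sketch with hypothetical function names; the 640x480 frame size and 25 fps frame rate are borrowed from the earlier example video and are not tied to the 12-bit, 40x case in the text.

```python
def average_bpp(cd, cf=1):
    """Average bits per pixel: color depth divided by the compression factor."""
    return cd / cf


def bit_rate(w, h, bpp, fps):
    """BR = W * H * BPP * FPS, in bit/s."""
    return w * h * bpp * fps


# Compressed: 12 bits/pixel color depth, 40x compression.
bpp = average_bpp(12, 40)
print(bpp)                                   # 0.3
print(round(bit_rate(640, 480, bpp, 25)))    # 2304000

# Uncompressed: compression factor 1, so BPP equals the color depth.
print(bit_rate(640, 480, average_bpp(24), 25))  # 184320000.0
```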

Interfaces and cables

Many interfaces have been designed specifically to handle the requirements of uncompressed digital video (from roughly 400 Mbit/s to 10 Gbit/s):

The following interface has been designed for carrying MPEG-Transport compressed video:

Compressed video is also carried using UDP-IP over Ethernet. Two approaches exist for this:

Alternative methods of carrying video over IP include:

Storage formats


All current formats, which are listed below, are PCM based.



See also


  1. For example the Thomson-CSF 9100 Digital Video Processor, an internally all-digital full-frame TBC introduced in 1980.
  2. For example the Ampex ADO, and the Nippon Electric Corporation (NEC) DVE.
  3. Prior to D2, most laserdiscs were mastered using analog 1" Type C videotape.
  4. Digital Betacam is still heavily used as an electronic field production (EFP) recording format by professional television producers.
  5. In fact the still images correspond to frames only in the case of progressive scan video. In interlaced video they correspond to fields. See section about interlacing for clarification.



  1. "CoSA Lives: The Story of the Company Behind After Effects". Archived from the original on 2011-02-27. Retrieved 2009-11-16.
  2. the term video size is used instead of just size in order to avoid confusion with the frame size