Common Intermediate Format

CIF (Common Intermediate Format or Common Interchange Format), also known as FCIF (Full Common Intermediate Format), is a standardized format for the picture resolution, frame rate, color space, and color subsampling of digital video sequences used in video teleconferencing systems. It was first defined in the H.261 standard in 1988.

[Figure: Comparison of CIF formats and D-1]

As the word "common" in its name implies, CIF was designed as a common compromise format that is relatively easy to convert for use with either PAL or NTSC displays and cameras. CIF defines a video sequence with a resolution of 352 × 288, which has a simple relationship to the PAL picture size, but with a frame rate of 30000/1001 (roughly 29.97) frames per second, as in NTSC, and with color encoded using a YCbCr representation with 4:2:0 color sampling. This compromise, pairing a PAL-derived picture size with the NTSC frame rate, was established as a way to reach international agreement so that video conferencing systems in different countries could communicate with each other without needing two separate modes for displaying the received video.

Technical details

The simple way to convert NTSC video to CIF is to capture every other field (e.g., the top fields) of the interlaced video, downsample it horizontally by 2:1 to convert 704 samples per line to 352, and upsample it vertically by a ratio of 6:5 to convert 240 lines to 288. The simple way to convert PAL video to CIF is similarly to capture every other field, downsample it horizontally by 2:1, and introduce some jitter in the frame rate by skipping or repeating frames as necessary. Since H.261 systems typically operated at low bit rates, they also typically operated at low frame rates by skipping many of the camera source frames, so introducing some jitter in the frame rate tended not to be noticeable. More sophisticated conversion schemes (e.g., using deinterlacing to improve the vertical resolution from an NTSC camera) could also be used in higher-quality systems.
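
The NTSC case can be illustrated with a minimal sketch in Python (not part of any standard, and using hypothetical function names): one captured field of 240 lines × 704 luma samples is decimated 2:1 horizontally by averaging sample pairs and upsampled 6:5 vertically by linear interpolation. A real converter would use properly designed anti-aliasing and interpolation filters; chroma is ignored here.

```python
# Hypothetical sketch of the "simple" NTSC-field-to-CIF luma conversion described
# above: one field (240 lines x 704 samples) is decimated 2:1 horizontally and
# interpolated 6:5 vertically (240 -> 288 lines). The filtering is deliberately
# crude; a real converter would use proper anti-aliasing/interpolation filters.
import numpy as np

def ntsc_field_to_cif_luma(field: np.ndarray) -> np.ndarray:
    """240 x 704 luma samples from one NTSC field -> 288 x 352 CIF luma."""
    assert field.shape == (240, 704)
    # Horizontal 2:1 decimation: average each pair of samples (a crude low-pass).
    horiz = field.reshape(240, 352, 2).mean(axis=2)
    # Vertical 6:5 upsampling: linear interpolation from 240 to 288 lines.
    src_rows = np.arange(240)
    dst_rows = np.linspace(0, 239, 288)
    cif = np.empty((288, 352))
    for col in range(352):
        cif[:, col] = np.interp(dst_rows, src_rows, horiz[:, col])
    return cif

# Example with synthetic data:
print(ntsc_field_to_cif_luma(np.random.rand(240, 704)).shape)  # (288, 352)
```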

In contrast to the single CIF compromise format that originated with the H.261 standard, the SIF (Source Input Format) first defined in the MPEG-1 standard has two variants; it is otherwise very similar to CIF. SIF on 525-line ("NTSC") based systems is 352 × 240 with a frame rate of 30000/1001 frames per second, while on 625-line ("PAL") based systems it has the same picture size as CIF (352 × 288) but a frame rate of 25 frames per second.

Some references to CIF are intended to refer only to its resolution (352 × 288), without intending to refer to its frame rate.

The YCbCr color representation had been previously defined in the first standard digital video source format, CCIR 601, in 1982. However, CCIR 601 uses 4:2:2 color sampling, which subsamples the Cb and Cr components only horizontally. H.261 additionally used vertical color subsampling, resulting in what is known as 4:2:0.
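
As an illustrative sketch (with a hypothetical helper name, not anything defined in the standards), the following shows how the Cb and Cr plane dimensions differ between 4:2:2 and 4:2:0 for a CIF-sized picture:

```python
# Illustrative only: chroma (Cb/Cr) plane sizes under 4:2:2 (horizontal-only
# subsampling, as in CCIR 601) versus 4:2:0 (horizontal and vertical, as in CIF).
def chroma_plane_size(width: int, height: int, scheme: str) -> tuple[int, int]:
    if scheme == "4:2:2":
        return width // 2, height           # Cb/Cr halved horizontally only
    if scheme == "4:2:0":
        return width // 2, height // 2      # Cb/Cr halved in both directions
    raise ValueError(f"unknown scheme: {scheme}")

print(chroma_plane_size(352, 288, "4:2:2"))  # (176, 288)
print(chroma_plane_size(352, 288, "4:2:0"))  # (176, 144)
```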

QCIF means "Quarter CIF". As "quarter" implies, a QCIF frame has one quarter of the area of a CIF frame, obtained by halving both the width and the height, giving a resolution of 176 × 144.

Terms also used are SQCIF (Sub Quarter CIF, sometimes Sub-QCIF), 4CIF (4 × CIF), 9CIF (9 × CIF) and 16CIF (16 × CIF). The resolutions for all of these formats are summarized in the table below.

Format             Video resolution    Storage aspect ratio (SAR)
SQCIF              128 × 96            4:3
QCIF               176 × 144           11:9
SIF(525)           352 × 240           22:15
CIF/SIF(625)       352 × 288           11:9
4SIF(525)          704 × 480           22:15
4CIF/4SIF(625)     704 × 576           11:9
9CIF               1056 × 864          11:9
16CIF              1408 × 1152         11:9
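
As a quick illustrative check of the table (a sketch, not normative text), QCIF and the nCIF formats simply scale the 352 × 288 CIF raster by √n in each dimension; SQCIF and the 525-line SIF variants do not follow this rule:

```python
# Illustrative check of the table above: QCIF and the nCIF formats scale the
# 352 x 288 CIF raster by sqrt(n) per dimension (n = 1/4 for QCIF).
CIF_W, CIF_H = 352, 288
for name, factor in [("QCIF", 0.5), ("CIF", 1), ("4CIF", 2), ("9CIF", 3), ("16CIF", 4)]:
    print(f"{name:6s} {int(CIF_W * factor)} x {int(CIF_H * factor)}")
```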

xCIF pixels are not square; instead they have a "native" pixel aspect ratio (PAR) of 12:11 (PAR = DAR / SAR = (4/3) / (11/9) = 12/11), as in the standard for 625-line systems (see CCIR 601). On square-pixel displays (e.g., computer screens and many modern televisions), xCIF rasters should be rescaled so that the picture covers a 4:3 area in order to avoid a "stretched" look: CIF content expanded horizontally by a factor of 12/11 results in a 4:3 raster of 384 × 288 square pixels (384 = 352 × 12/11). This can be done in a 384 × 288 window on a graphics display of any aspect ratio, or enlarged to full screen on any larger 4:3 display.
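
The pixel-aspect-ratio arithmetic can be verified with a short sketch (an illustration, not a prescribed procedure):

```python
# Illustrative sketch of the PAR arithmetic above: PAR = DAR / SAR, and the
# square-pixel width needed to show CIF content with a 4:3 display aspect ratio.
from fractions import Fraction

dar = Fraction(4, 3)        # display aspect ratio
sar = Fraction(352, 288)    # storage aspect ratio of CIF (11:9)
par = dar / sar             # pixel aspect ratio
print(par)                  # 12/11
print(352 * par)            # 384 -> a 384 x 288 square-pixel raster
```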

The CIF and QCIF picture dimensions were specifically chosen to be multiples of 16 because of the way that discrete cosine transform based video compression/decompression was handled in H.261, using 16 × 16 macroblocks and 8 × 8 transform blocks. So a CIF-size image (352 × 288) contains 22 × 18 macroblocks and a QCIF image (176 × 144) contains 11 × 9 macroblocks. The 16 × 16 macroblock concept was later also used in other compression standards such as MPEG-1, MPEG-2, MPEG-4 Part 2, H.263, and H.264/MPEG-4 AVC.
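
A brief sketch (illustrative only, with a hypothetical helper name) confirming the macroblock counts quoted above:

```python
# Illustrative check of the macroblock counts quoted above (16 x 16 macroblocks).
def macroblocks(width: int, height: int, block: int = 16) -> tuple[int, int]:
    assert width % block == 0 and height % block == 0
    return width // block, height // block

print(macroblocks(352, 288))  # (22, 18) macroblocks for CIF
print(macroblocks(176, 144))  # (11, 9) macroblocks for QCIF
```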

Related Research Articles

<span class="mw-page-title-main">Digital video</span> Digital electronic representation of moving visual images

Digital video is an electronic representation of moving visual images (video) in the form of encoded digital data. This is in contrast to analog video, which represents moving visual images in the form of analog signals. Digital video comprises a series of digital images displayed in rapid succession, usually at 24, 25, 30, or 60 frames per second. Digital video has many advantages such as easy copying, multicasting, sharing and storage.

H.263 is a video compression standard originally designed as a low-bit-rate compressed format for videotelephony. It was standardized by the ITU-T Video Coding Experts Group (VCEG) in a project ending in 1995/1996. It is a member of the H.26x family of video coding standards in the domain of the ITU-T.

MPEG-1 is a standard for lossy compression of video and audio. It is designed to compress VHS-quality raw digital video and CD audio down to about 1.5 Mbit/s without excessive quality loss, making video CDs, digital cable/satellite TV and digital audio broadcasting (DAB) practical.

<span class="mw-page-title-main">MPEG-2</span> Video encoding standard

MPEG-2 is a standard for "the generic coding of moving pictures and associated audio information". It describes a combination of lossy video compression and lossy audio data compression methods, which permit storage and transmission of movies using currently available storage media and transmission bandwidth. While MPEG-2 is not as efficient as newer standards such as H.264/AVC and H.265/HEVC, backwards compatibility with existing hardware and software means it is still widely used, for example in over-the-air digital television broadcasting and in the DVD-Video standard.

<span class="mw-page-title-main">PAL</span> Colour encoding system for analogue television

Phase Alternating Line (PAL) is a colour encoding system for analog television. It was one of three major analogue colour television standards, the others being NTSC and SECAM. In most countries it was broadcast at 625 lines, 50 fields per second, and associated with CCIR analogue broadcast television systems B, D, G, H, I or K. The articles on analog broadcast television systems further describe frame rates, image resolution, and audio modulation.

<span class="mw-page-title-main">Video</span> Electronic moving image

Video is an electronic medium for the recording, copying, playback, broadcasting, and display of moving visual media. Video was first developed for mechanical television systems, which were quickly replaced by cathode-ray tube (CRT) systems, which, in turn, were replaced by flat-panel displays of several types.

<span class="mw-page-title-main">Interlaced video</span> Technique for doubling the perceived frame rate of a video display

Interlaced video is a technique for doubling the perceived frame rate of a video display without consuming extra bandwidth. The interlaced signal contains two fields of a video frame captured consecutively. This enhances the viewer's perception of motion and reduces flicker by taking advantage of the characteristics of the human visual system.

<span class="mw-page-title-main">Chroma subsampling</span> Practice of encoding color images

Chroma subsampling is the practice of encoding images by implementing less resolution for chroma information than for luma information, taking advantage of the human visual system's lower acuity for color differences than for luminance.

<span class="mw-page-title-main">Rec. 601</span> Standard from the International Telecommunication Union

ITU-R Recommendation BT.601, more commonly known by the abbreviations Rec. 601 or BT.601, is a standard originally issued in 1982 by the CCIR for encoding interlaced analog video signals in digital video form. It includes methods of encoding 525-line 60 Hz and 625-line 50 Hz signals, both with an active region covering 720 luminance samples and 360 chrominance samples per line. The color encoding system is known as YCbCr 4:2:2.

<span class="mw-page-title-main">ATSC standards</span> Standards for digital television in the US

Advanced Television Systems Committee (ATSC) standards are an international set of standards for broadcast and digital television transmission over terrestrial, cable and satellite networks. It is largely a replacement for the analog NTSC standard and, like that standard, is used mostly in the United States, Mexico, Canada, South Korea and Trinidad & Tobago. Several former NTSC users, such as Japan, did not adopt ATSC during their digital television transition, instead choosing other systems such as ISDB, developed by Japan, or DVB, developed in Europe.

H.261 is an ITU-T video compression standard, first ratified in November 1988. It is the first member of the H.26x family of video coding standards in the domain of the ITU-T Study Group 16 Video Coding Experts Group. It was the first video coding standard that was useful in practical terms.

<span class="mw-page-title-main">White Book (CD standard)</span> CD standard for storing still pictures and motion video

The White Book refers to a standard of compact disc that stores not only sound but also still pictures and motion video. It was released in 1993 by Sony, Philips, Matsushita, and JVC. These discs, most commonly found in Asia, are usually called "Video CDs" (VCD). In some ways, VCD can be thought of as the successor to the Laserdisc and the predecessor to DVD. Note that Video CD should not be confused with CD Video, which was an earlier and entirely different format.

Broadcast television systems are the encoding or formatting systems for the transmission and reception of terrestrial television signals.

<span class="mw-page-title-main">480i</span> Standard-definition video mode

480i is the video mode used for standard-definition digital video in the Caribbean, Japan, South Korea, Taiwan, Philippines, Myanmar, Western Sahara, and most of the Americas. The other common standard definition digital standard, used in the rest of the world, is 576i.

<span class="mw-page-title-main">576i</span> Standard-definition video mode

576i is a standard-definition digital video mode, originally used for digitizing 625 line analogue television in most countries of the world where the utility frequency for electric power distribution is 50 Hz. Because of its close association with the legacy colour encoding systems, it is often referred to as PAL, PAL/SECAM or SECAM when compared to its 60 Hz NTSC-colour-encoded counterpart, 480i.

H.262 or MPEG-2 Part 2 is a video coding format standardised and jointly maintained by ITU-T Study Group 16 Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG), and developed with the involvement of many companies. It is the second part of the ISO/IEC MPEG-2 standard. The ITU-T Recommendation H.262 and ISO/IEC 13818-2 documents are identical.

Source Input Format (SIF), defined in MPEG-1, is a video format that was developed to allow the storage and transmission of digital video.

Television standards conversion is the process of changing a television transmission or recording from one video system to another. Converting video between different numbers of lines, frame rates, and color models is a complex technical problem. However, the international exchange of television programming makes standards conversion necessary so that video may be viewed in another nation with a differing standard. Typically, video is fed into a video standards converter, which produces a copy according to a different video standard. One of the most common conversions is between the NTSC and PAL standards.

High-definition television (HDTV) describes a television or video system which provides a substantially higher image resolution than the previous generation of technologies. The term has been used since at least 1933; in more recent times, it refers to the generation following standard-definition television (SDTV). It is the standard video format used in most broadcasts: terrestrial broadcast television, cable television, and satellite television.

<span class="mw-page-title-main">DVD-Video</span> Format used to store digital video on DVD discs

DVD-Video is a consumer video format used to store digital video on DVDs. DVD-Video was the dominant consumer home video format in Asia, North America, Europe, and Australia in the 2000s until it was supplanted by the high-definition Blu-ray Disc; both receive competition as delivery methods by streaming services such as Netflix and Disney+. Discs using the DVD-Video specification require a DVD drive and an MPEG-2 decoder. Commercial DVD movies are encoded using a combination of MPEG-2 compressed video and audio of varying formats. Typically, the data rate for DVD movies ranges from 3 to 9.5 Mbit/s, and the bit rate is usually adaptive. DVD-Video was first available in Japan on November 1, 1996, followed by a release on March 26, 1997, in the United States—to line up with the 69th Academy Awards that same day.
