Common Intermediate Format

CIF (Common Intermediate Format or Common Interchange Format), also known as FCIF (Full Common Intermediate Format), is a standardized format for the picture resolution, frame rate, color space, and color subsampling of digital video sequences used in video teleconferencing systems. It was first defined in the H.261 standard in 1988.

Comparison of CIF formats and D-1


As the word "common" in its name implies, CIF was designed as a compromise format that is relatively easy to convert for use with either PAL or NTSC standard displays and cameras. CIF defines a video sequence with a resolution of 352 × 288, which has a simple relationship to the PAL picture size, but with a frame rate of 30000/1001 (roughly 29.97) frames per second, as in NTSC. Color is encoded using a YCbCr representation with 4:2:0 color sampling. The compromise was established as a way to reach international agreement so that video conferencing systems in different countries could communicate with each other without needing two separate modes for displaying the received video.

Technical details

The simple way to convert NTSC video to CIF is to capture every other field (e.g., the top fields) of interlaced video, downsample it horizontally by 2:1 to convert 704 samples per line to 352 samples per line, and upsample it vertically by a ratio of 6:5 to convert 240 lines to 288 lines. The simple way to convert PAL video to CIF is to similarly capture every other field, downsample it horizontally by 2:1, and introduce some jitter in the frame rate by skipping or repeating frames as necessary. Since H.261 systems typically operated at low bit rates, they also typically operated at low frame rates by skipping many of the camera source frames, so introducing some jitter in the frame rate tended not to be noticeable. More sophisticated conversion schemes (e.g., using deinterlacing to improve the vertical resolution from an NTSC camera) could also be used in higher quality systems.
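The scaling ratios quoted above can be checked with a little exact arithmetic. The following is a minimal sketch, not a conversion implementation: the NTSC field size of 704 × 240 comes from the text, while the PAL field size of 704 × 288 is an assumption implied by the description (only a 2:1 horizontal downsample is needed). The names `NTSC_FIELD`, `PAL_FIELD`, and `scale_factors` are hypothetical.

```python
from fractions import Fraction

# Field sizes as (samples per line, lines per field).
NTSC_FIELD = (704, 240)  # from the text
PAL_FIELD = (704, 288)   # assumption: implied by the text, not stated outright
CIF = (352, 288)

def scale_factors(src, dst):
    """Return (horizontal, vertical) output/input ratios as exact fractions."""
    return Fraction(dst[0], src[0]), Fraction(dst[1], src[1])

ntsc_h, ntsc_v = scale_factors(NTSC_FIELD, CIF)
pal_h, pal_v = scale_factors(PAL_FIELD, CIF)

print("NTSC field -> CIF:", ntsc_h, ntsc_v)  # 1/2 horizontally, 6/5 vertically
print("PAL field  -> CIF:", pal_h, pal_v)    # 1/2 horizontally, 1 vertically
```

Under these assumptions only the NTSC path needs vertical resampling, which is consistent with the picture size having been chosen to match PAL.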

In contrast to the CIF compromise that originated with the H.261 standard, there are two variants of the SIF (Source Input Format) that was first defined in the MPEG-1 standard. SIF is otherwise very similar to CIF. SIF on 525-line ("NTSC") based systems is 352 × 240 with a frame rate of 30000/1001 frames per second, and on 625-line ("PAL") based systems, it has the same picture size as CIF (352 × 288) but with a frame rate of 25 frames per second.

Some references to CIF are intended to refer only to its resolution (352 × 288), without intending to refer to its frame rate.

The YCbCr color representation had been previously defined in the first standard digital video source format, CCIR 601, in 1982. However, CCIR 601 uses 4:2:2 color sampling, which subsamples the Cb and Cr components only horizontally. H.261 additionally used vertical color subsampling, resulting in what is known as 4:2:0.
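To make the difference between 4:2:2 and 4:2:0 concrete, the sketch below computes the luma and chroma plane dimensions for a CIF picture under each scheme. The function names and the divisor table are illustrative assumptions, and the sample counts say nothing about bit depth or how any particular system lays the planes out in memory.

```python
# (horizontal divisor, vertical divisor) applied to each of the Cb and Cr planes.
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),  # no subsampling
    "4:2:2": (2, 1),  # horizontal only, as in CCIR 601
    "4:2:0": (2, 2),  # horizontal and vertical, as in H.261
}

def plane_sizes(width, height, sampling):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for one picture."""
    dx, dy = CHROMA_DIVISORS[sampling]
    return (width, height), (width // dx, height // dy)

def samples_per_frame(width, height, sampling):
    """Total Y + Cb + Cr samples in one picture."""
    (lw, lh), (cw, ch) = plane_sizes(width, height, sampling)
    return lw * lh + 2 * cw * ch

for scheme in ("4:2:2", "4:2:0"):
    print(scheme, plane_sizes(352, 288, scheme), samples_per_frame(352, 288, scheme))
```

For CIF, 4:2:0 gives two 176 × 144 chroma planes, so a picture carries 1.5 samples per pixel rather than the 2 samples per pixel of 4:2:2.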

QCIF means "Quarter CIF". To have one quarter of the area, as "quarter" implies, the height and width of the frame are halved.

Terms also used are SQCIF (Sub Quarter CIF, sometimes Sub-QCIF), 4CIF (4 × CIF), 9CIF (9 × CIF) and 16CIF (16 × CIF). The resolutions for all of these formats are summarized in the table below.

Format            Video resolution    Storage aspect ratio (SAR)
SQCIF             128 × 96            4:3
QCIF              176 × 144           11:9
SIF(525)          352 × 240           22:15 (≈13:9)
CIF/SIF(625)      352 × 288           11:9
4SIF(525)         704 × 480           22:15 (≈13:9)
4CIF/4SIF(625)    704 × 576           11:9
9CIF              1056 × 864          11:9
16CIF             1408 × 1152         11:9
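The QCIF, 4CIF, 9CIF, and 16CIF rows of the table follow from scaling each CIF dimension by the square root of the area factor. The sketch below reproduces them; the `LINEAR_SCALE` table and function name are hypothetical helpers. SQCIF (128 × 96) is left out because it is not a simple multiple of CIF.

```python
from fractions import Fraction

CIF_WIDTH, CIF_HEIGHT = 352, 288

# Linear scale per dimension; the area scales by the square of this value.
LINEAR_SCALE = {
    "QCIF": Fraction(1, 2),
    "CIF": Fraction(1),
    "4CIF": Fraction(2),
    "9CIF": Fraction(3),
    "16CIF": Fraction(4),
}

def cif_family_resolution(name):
    """Return (width, height) for a format derived from CIF by linear scaling."""
    scale = LINEAR_SCALE[name]
    return int(CIF_WIDTH * scale), int(CIF_HEIGHT * scale)

for name in LINEAR_SCALE:
    width, height = cif_family_resolution(name)
    sar = Fraction(width, height)  # storage aspect ratio, reduced to lowest terms
    print(f"{name:6s} {width} x {height}  SAR {sar.numerator}:{sar.denominator}")
```

Because both dimensions scale by the same factor, every format in this family keeps the 11:9 storage aspect ratio of CIF.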

xCIF pixels are not square. They have a "native" pixel aspect ratio (PAR) of 12:11, obtained by dividing the 4:3 display aspect ratio (DAR) by the 11:9 storage aspect ratio (SAR): PAR = DAR : SAR = 4/3 : 11/9 = 12/11. This matches the pixel aspect ratio of the standard for 625-line systems (see CCIR 601). On square-pixel displays (e.g., computer screens and many modern televisions), xCIF rasters should be rescaled so that the picture covers a 4:3 area, in order to avoid a "stretched" look: CIF content expanded horizontally by 12:11 results in a 4:3 raster of 384 × 288 square pixels (384 = 352 × 12/11). The rescaled picture can be shown in a 384 × 288 window on a larger graphics display of any aspect ratio, or enlarged to full screen on any larger 4:3 display.
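The pixel aspect ratio arithmetic can be reproduced with exact fractions. This is a minimal sketch assuming a 4:3 display aspect ratio, as in the text; the function names are hypothetical.

```python
from fractions import Fraction

def pixel_aspect_ratio(width, height, dar=Fraction(4, 3)):
    """PAR = DAR / SAR, where SAR is the stored width:height ratio."""
    return dar / Fraction(width, height)

def square_pixel_width(width, height, dar=Fraction(4, 3)):
    """Width to rescale to, keeping the height, for a square-pixel display."""
    return width * pixel_aspect_ratio(width, height, dar)

print(pixel_aspect_ratio(352, 288))  # 12/11
print(square_pixel_width(352, 288))  # 384
print(square_pixel_width(176, 144))  # 192
```

Rescaling only the width keeps the original line count, which is why the square-pixel CIF raster is 384 × 288 rather than some other 4:3 size.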

The CIF and QCIF picture dimensions were specifically chosen to be multiples of 16 because of the way that discrete cosine transform based video compression/decompression was handled in H.261, using 16 × 16 macroblocks and 8 × 8 transform blocks. So a CIF-size image (352 × 288) contains 22 × 18 macroblocks and a QCIF image (176 × 144) contains 11 × 9 macroblocks. The 16 × 16 macroblock concept was later also used in other compression standards such as MPEG-1, MPEG-2, MPEG-4 Part 2, H.263, and H.264/MPEG-4 AVC.
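The macroblock counts quoted above follow directly from dividing each picture dimension by 16. The sketch below checks them and rejects dimensions that are not multiples of the macroblock size; the function name is a hypothetical helper, not part of any codec API.

```python
MACROBLOCK_SIZE = 16  # luma samples per macroblock side, as in H.261

def macroblock_grid(width, height, size=MACROBLOCK_SIZE):
    """Return (columns, rows) of macroblocks covering the picture exactly."""
    if width % size or height % size:
        raise ValueError(f"{width} x {height} is not a multiple of {size}")
    return width // size, height // size

for name, (w, h) in {"CIF": (352, 288), "QCIF": (176, 144)}.items():
    cols, rows = macroblock_grid(w, h)
    print(f"{name}: {cols} x {rows} macroblocks, {cols * rows} in total")
```

This gives 22 × 18 = 396 macroblocks for CIF and 11 × 9 = 99 for QCIF.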

