Common Intermediate Format

CIF (Common Intermediate Format or Common Interchange Format), also known as FCIF (Full Common Intermediate Format), is a standardized format for the picture resolution, frame rate, color space, and color subsampling of digital video sequences used in video teleconferencing systems. It was first defined in the H.261 standard in 1988.

[Figure: Comparison of CIF formats and D-1]


As the word "common" in its name implies, CIF was designed as a compromise format that is relatively easy to convert for use with either PAL or NTSC standard displays and cameras. CIF defines a video sequence with a resolution of 352 × 288, which has a simple relationship to the PAL picture size, but with a frame rate of 30000/1001 (roughly 29.97) frames per second like NTSC, and with color encoded using a YCbCr representation with 4:2:0 color sampling. The compromise was established as a way to reach international agreement, so that video conferencing systems in different countries could communicate with each other without needing two separate modes for displaying the received video.
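As a rough illustration of what these parameters imply, the raw (uncompressed) size of a CIF frame and the resulting bit rate follow directly from the definition. The snippet below is a sketch assuming 8 bits per sample:

```python
# Size of one uncompressed CIF frame in YCbCr with 4:2:0 sampling,
# assuming 8 bits per sample. In 4:2:0, each chroma plane (Cb and Cr)
# is subsampled 2:1 both horizontally and vertically.
WIDTH, HEIGHT = 352, 288

luma_samples = WIDTH * HEIGHT                      # 101376 Y samples
chroma_samples = 2 * (WIDTH // 2) * (HEIGHT // 2)  # Cb + Cr: 50688 samples
frame_bytes = luma_samples + chroma_samples        # 152064 bytes per frame

fps = 30000 / 1001                                 # ~29.97 frames per second
raw_bitrate = frame_bytes * 8 * fps                # bits per second

print(frame_bytes)                  # 152064
print(round(raw_bitrate / 1e6, 1))  # ~36.5 Mbit/s uncompressed
```

The roughly 36.5 Mbit/s raw rate makes clear why compression was essential: H.261 was designed for channels of p × 64 kbit/s.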

Technical details

The simple way to convert NTSC video to CIF is to capture every other field (e.g., the top fields) of the interlaced video, downsample it horizontally by 2:1 to convert 704 samples per line to 352, and upsample it vertically by 6:5 to convert 240 lines to 288. The simple way to convert PAL video to CIF is similar: capture every other field and downsample it horizontally by 2:1 (no vertical resampling is needed, since a PAL field already has 288 active lines), then match the CIF frame rate by skipping or repeating frames as necessary, which introduces some jitter. Since H.261 systems typically operated at low bit rates, they also typically operated at low frame rates by skipping many of the camera source frames, so this jitter tended not to be noticeable. More sophisticated conversion schemes (e.g., using deinterlacing to improve the vertical resolution from an NTSC camera) could also be used in higher-quality systems.
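The field-based NTSC conversion described above can be sketched as follows. This is an illustrative toy (simple pair averaging horizontally, nearest-line vertical upsampling), not a production resampler, and the function name is invented for the example:

```python
# Illustrative sketch: converting one NTSC field (704 x 240 luma samples)
# to the CIF raster (352 x 288) via 2:1 horizontal downsampling followed
# by 6:5 vertical upsampling. A real converter would use proper
# interpolation filters rather than averaging and line repetition.

def ntsc_field_to_cif(field):
    """field: 240 rows of 704 samples. Returns 288 rows of 352 samples."""
    # 2:1 horizontal downsample by averaging adjacent sample pairs.
    narrowed = [[(row[2 * x] + row[2 * x + 1]) // 2 for x in range(352)]
                for row in field]
    # 6:5 vertical upsample: nearest-line repetition for simplicity.
    return [narrowed[(y * 240) // 288] for y in range(288)]

field = [[128] * 704 for _ in range(240)]  # a flat gray test field
cif = ntsc_field_to_cif(field)
print(len(cif), len(cif[0]))  # 288 352
```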

In contrast to the CIF compromise that originated with the H.261 standard, the SIF (Source Input Format) first defined in the MPEG-1 standard has two variants. SIF is otherwise very similar to CIF. SIF for 525-line ("NTSC") based systems is 352 × 240 with a frame rate of 30000/1001 frames per second, while for 625-line ("PAL") based systems it has the same picture size as CIF (352 × 288) but with a frame rate of 25 frames per second.

Some references to CIF are intended to refer only to its resolution (352 × 288), without intending to refer to its frame rate.

The YCbCr color representation had been previously defined in the first standard digital video source format, CCIR 601, in 1982. However, CCIR 601 uses 4:2:2 color sampling, which subsamples the Cb and Cr components only horizontally. H.261 additionally used vertical color subsampling, resulting in what is known as 4:2:0.
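The difference between the two sampling schemes amounts to how the chroma plane dimensions are derived from the luma plane; a minimal sketch (the helper name is illustrative):

```python
# Chroma plane dimensions under the two sampling schemes, for the CIF
# picture size. 4:2:2 (CCIR 601) halves chroma resolution horizontally
# only; 4:2:0 (H.261) halves it both horizontally and vertically.
def chroma_plane(width, height, scheme):
    if scheme == "4:2:2":
        return width // 2, height
    if scheme == "4:2:0":
        return width // 2, height // 2
    raise ValueError(f"unknown scheme: {scheme}")

print(chroma_plane(352, 288, "4:2:2"))  # (176, 288)
print(chroma_plane(352, 288, "4:2:0"))  # (176, 144)
```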

QCIF means "Quarter CIF": the height and width of the frame are halved, giving one quarter of the area, as "quarter" implies.

Terms also used are SQCIF (Sub Quarter CIF, sometimes Sub-QCIF), 4CIF (4 × CIF), 9CIF (9 × CIF) and 16CIF (16 × CIF). The resolutions for all of these formats are summarized in the table below.

Format          Video resolution   Storage aspect ratio (SAR)
SQCIF           128 × 96           4:3
QCIF            176 × 144          11:9
SIF(525)        352 × 240          22:15 (≈13:9)
CIF/SIF(625)    352 × 288          11:9
4SIF(525)       704 × 480          22:15 (≈13:9)
4CIF/4SIF(625)  704 × 576          11:9
9CIF            1056 × 864         11:9
16CIF           1408 × 1152        11:9
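Apart from SQCIF, the family names encode an area multiple of the base CIF raster, so those resolutions can be derived by scaling width and height by the square root of the area factor; a small sketch (the `ladder` variable is just for illustration):

```python
# Deriving the QCIF/CIF/4CIF/9CIF/16CIF ladder from the base raster.
# "n x CIF" means n times the CIF *area*, so each linear dimension
# scales by sqrt(n). SQCIF (128 x 96) does not follow this pattern.
CIF_W, CIF_H = 352, 288

ladder = {}
for name, area_factor in [("QCIF", 0.25), ("CIF", 1), ("4CIF", 4),
                          ("9CIF", 9), ("16CIF", 16)]:
    scale = area_factor ** 0.5
    ladder[name] = (int(CIF_W * scale), int(CIF_H * scale))

print(ladder)  # {'QCIF': (176, 144), ..., '16CIF': (1408, 1152)}
```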

xCIF pixels are not square; they have a "native" pixel aspect ratio (PAR) of 12:11, computed as PAR = DAR / SAR = (4:3) / (11:9) = 12:11, matching the standard for 625-line systems (see CCIR 601). On square-pixel displays (e.g., computer screens and many modern televisions), xCIF rasters should be rescaled so that the picture covers a 4:3 area, in order to avoid a "stretched" look: expanding CIF content horizontally by 12:11 yields a 4:3 raster of 384 × 288 square pixels (384 = 352 × 12/11). Such content can be shown in a 384 × 288 window on a larger graphics display of any aspect ratio, or enlarged to fill any larger 4:3 display.
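The aspect-ratio arithmetic above can be checked with exact rational arithmetic:

```python
# PAR = DAR / SAR for the CIF raster, using exact fractions.
from fractions import Fraction

DAR = Fraction(4, 3)        # intended display aspect ratio
SAR = Fraction(352, 288)    # storage aspect ratio of the raster (11:9)
PAR = DAR / SAR             # pixel aspect ratio

display_width = 352 * PAR   # width after rescaling to square pixels
print(PAR)                  # 12/11
print(display_width)        # 384
```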

The CIF and QCIF picture dimensions were specifically chosen to be multiples of 16 because of the way that discrete cosine transform based video compression/decompression was handled in H.261, using 16 × 16 macroblocks and 8 × 8 transform blocks. So a CIF-size image (352 × 288) contains 22 × 18 macroblocks and a QCIF image (176 × 144) contains 11 × 9 macroblocks. The 16 × 16 macroblock concept was later also used in other compression standards such as MPEG-1, MPEG-2, MPEG-4 Part 2, H.263, and H.264/MPEG-4 AVC.
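Dividing the picture dimensions by the 16 × 16 macroblock size confirms the counts given above; a minimal sketch (the helper name is illustrative):

```python
# Macroblock grid for a picture size, assuming the 16 x 16 macroblocks
# used in H.261. CIF and QCIF dimensions are exact multiples of 16.
def macroblock_grid(width, height, mb=16):
    assert width % mb == 0 and height % mb == 0
    return width // mb, height // mb

print(macroblock_grid(352, 288))  # (22, 18) -> 396 macroblocks
print(macroblock_grid(176, 144))  # (11, 9)  -> 99 macroblocks
```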
