# Three-two pull down

Three-two pull down (3:2 pull down) is a term used in filmmaking and television production for the post-production process of transferring film to video.

The process converts 24 frames per second into 29.97 frames per second: roughly speaking, every four film frames are turned into five video frames, together with a slight slowdown in speed. Film runs at a standard rate of 24 frames per second, whereas NTSC video has a signal frame rate of 29.97 frames per second. Every interlaced video frame consists of two fields. In the three-two pull down, the telecine adds a third video field (a half frame) to every second video frame, an addition the untrained eye cannot see. In the figure, the film frames A-D are true or original images, since each was photographed as a complete frame. In the NTSC footage on the right, the A, B and D frames are original frames; the third and fourth frames have been created by blending fields from different film frames.

## Video

### 2:3

In the United States and other countries where television uses the 59.94 Hz vertical scanning frequency, video is broadcast at 29.97 frame/s. For the film's motion to be accurately rendered on the video signal, a telecine must use a technique called the 2:3 pull down (or a variant called 3:2 pull down) to convert from 24 to 29.97 frame/s.

The term “pulldown” comes from the mechanical process of “pulling” (physically moving) the film downward within the film portion of the transport mechanism, to advance it from one frame to the next at a repetitive rate (nominally 24 frames/s). This is accomplished in two steps.

The first step is to slow down the film motion by 1/1000 to 23.976 frames/s (or 24 frames every 1.001 seconds). The difference in speed is imperceptible to the viewer. For a two-hour film, play time is extended by 7.2 seconds.
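The arithmetic can be verified with exact rational numbers (a minimal Python sketch; the 1000/1001 factor is the conventional exact form of the "1/1000" slowdown):

```python
from fractions import Fraction

# NTSC timing slows film by the exact ratio 1000/1001 (about 0.1%).
NTSC_FACTOR = Fraction(1000, 1001)

slowed = 24 * NTSC_FACTOR              # 24000/1001 frames per second
print(float(slowed))                   # 23.976023976...

# A two-hour film holds 7200 s * 24 frames/s of material.
total_frames = 7200 * 24
extra = total_frames / slowed - 7200   # extra play time at the slowed rate
print(float(extra))                    # 7.2 seconds
```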

The second step of the 2:3 pulldown is distributing cinema frames into video fields. At 23.976 frame/s, there are four frames of film for every five frames of 29.97 Hz video:

${\displaystyle {\frac {23.976}{29.97}}={\frac {4}{5}}}$

These four frames need to be "stretched" into five frames by exploiting the interlaced nature of video. Since an interlaced video frame is made up of two incomplete fields (one for the odd-numbered lines of the image, and one for the even-numbered lines), conceptually the four frames need to be distributed across ten fields to produce five video frames.

The term "2:3" comes from the pattern used to produce fields in the new video frames. The pattern 2-3 is an abbreviation of the actual pattern 2-3-2-3, which indicates that the first film frame is used in 2 fields, the second film frame in 3 fields, the third film frame in 2 fields, and the fourth film frame in 3 fields, producing a total of 10 fields, or 5 video frames. If the four film frames are called A, B, C and D, the five video frames produced are A1-A2, B1-B2, B2-C1, C2-D1 and D2-D2. That is, frame A is used 2 times (in both fields of the first video frame); frame B is used 3 times (in both fields of the second video frame and in one field of the third video frame); frame C is used 2 times (in the other field of the third video frame and in one field of the fourth video frame); and frame D is used 3 times (in the other field of the fourth video frame and in both fields of the fifth video frame). The 2-3-2-3 cycle repeats itself completely after four film frames have been exposed.
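The 2-3-2-3 field distribution can be sketched in a few lines of Python (an illustration only; the frame letters and the pairing of consecutive fields into video frames follow the notation used above, not any particular telecine implementation):

```python
# Distribute four film frames into ten video fields using the
# 2-3-2-3 cadence, then pair consecutive fields into video frames.
CADENCE_23 = (2, 3, 2, 3)  # fields contributed by film frames A, B, C, D

def video_frames(film, cadence):
    """Pair the pulled-down field stream into interlaced video frames."""
    fields = [f for frame, n in zip(film, cadence) for f in [frame] * n]
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields), 2)]

frames = video_frames("ABCD", CADENCE_23)
print(frames)
# [('A', 'A'), ('B', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'D')]
```

The third and fourth video frames mix fields from two different film frames, while the first, second and fifth match original film frames.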

### 3:2

The alternative "3:2" pattern is similar to the one shown above, except that it is shifted by one frame. For instance, a cycle that starts with film frame B yields the pattern B1-B2, B2-C1, C2-D1, D2-D2, A1-A2, i.e. 3-2-3-2, or simply 3-2. In other words, there is no real difference between the 2-3 and 3-2 patterns. In fact, the "3-2" notation is misleading, because according to SMPTE standards, for every four-frame film sequence the first frame is scanned twice, not three times. [1]
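As a quick check, pairing the field stream of a cycle that begins at frame B reproduces the shifted sequence (a self-contained Python snippet; the frame letters are illustrative):

```python
# Starting the cadence at film frame B gives the 3-2-3-2 counts.
fields = [f for frame, n in zip("BCDA", (3, 2, 3, 2)) for f in [frame] * n]
frames = [tuple(fields[i:i + 2]) for i in range(0, len(fields), 2)]
print(frames)
# [('B', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'D'), ('A', 'A')]
```

The five video frames are the same as in the 2-3 case, merely rotated by one.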

### Modern alternatives

The method above is the "classic" 2:3, which was used before frame buffers made it possible to hold more than one frame. It has the disadvantage of creating two dirty frames (frames mixed from two different film frames) and three clean frames (frames matching an unmodified film frame) in every five video frames.

The preferred method for performing a 2:3 creates only one dirty frame in every five (i.e. 3:3:2:2, 2:3:3:2 or 2:2:3:3). While this method introduces slightly more judder, it allows for easier upconversion (the dirty frame can be dropped without losing information) and better overall compression when encoding. The 2:3:3:2 pattern is supported by the Panasonic DVX-100B video camera under the name "Advanced Pulldown". Note that on an interlaced display such as a CRT, only fields are displayed, never whole frames, so no dirty frames are visible; dirty frames may appear with other methods of displaying the interlaced video.
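Counting mixed ("dirty") frames for each cadence illustrates the trade-off (a sketch using the same field-pairing idea; the cadence tuples correspond to the patterns named above):

```python
def dirty_count(cadence, film="ABCD"):
    """Count video frames whose two fields come from different film frames."""
    fields = [f for frame, n in zip(film, cadence) for f in [frame] * n]
    frames = [tuple(fields[i:i + 2]) for i in range(0, len(fields), 2)]
    return sum(1 for a, b in frames if a != b)

print(dirty_count((2, 3, 2, 3)))  # classic 2:3      -> 2 dirty frames
print(dirty_count((2, 3, 3, 2)))  # advanced pulldown -> 1 dirty frame
print(dirty_count((3, 3, 2, 2)))  #                   -> 1 dirty frame
```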

## Audio

The rate of NTSC video (initially color only, but soon thereafter both monochrome and color) is 29.97 frames per second, one-thousandth slower than 30 frame/s. The NTSC color encoding process mandated that the line rate be a sub-multiple of the 3.579545 MHz color "burst" frequency, giving a line rate of 15,734.2637 Hz (a 29.9700 Hz frame rate) rather than the 60 Hz AC "line-locked" line rate of 15,750.0000… Hz (a 30.0000… Hz frame rate). This seemingly odd relationship proved essential to eliminating moiré and other image defects.

Although the speed difference is slight, the error accumulates, and the audio can end up several seconds out of sync with the image. To correct this, the telecine can either pull up or pull down the audio. A pull up speeds up the sound by 0.1%, and is used when transferring video to film. A pull down slows the audio by 0.1%, as necessary when transferring film to video.
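The 0.1% audio correction can be written exactly as the ratio 1000/1001; a small Python sketch, assuming a hypothetical 48 kHz sound track:

```python
from fractions import Fraction

PULL_DOWN = Fraction(1000, 1001)  # slow audio ~0.1% (film -> video)
PULL_UP = Fraction(1001, 1000)    # speed audio ~0.1% (video -> film)

# A 48 kHz sound track pulled down effectively plays at:
rate = 48000 * PULL_DOWN
print(float(rate))                # about 47952.05 Hz

# Left uncorrected, the ~0.1% mismatch accumulates over a program:
drift = 7200 * (PULL_UP - 1)      # two hours, in seconds
print(float(drift))               # 7.2 seconds
```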

## References

1. Charles Poynton, *Digital Video and HDTV: Algorithms and Interfaces*, p. 430.