Frame rate

Frame rate, most commonly expressed in frames per second or FPS, is typically the frequency (rate) at which consecutive images (frames) are captured or displayed. This definition applies to film and video cameras, computer animation, and motion capture systems. In these contexts, frame rate may be used interchangeably with frame frequency and refresh rate, which are expressed in hertz. Additionally, in the context of computer graphics performance, FPS is the rate at which a system, particularly a GPU, is able to generate frames, and refresh rate is the frequency at which a display shows completed frames. [1] In electronic camera specifications, frame rate refers to the maximum rate at which frames can be captured; in practice, other settings (such as exposure time) may reduce the actual capture frequency below that figure.

Human vision

The temporal sensitivity and resolution of human vision vary depending on the type and characteristics of the visual stimulus, and they differ between individuals. The human visual system can process 10 to 12 images per second and perceive them individually, while higher rates are perceived as motion. [2] Modulated light (such as a computer display) is perceived as stable by the majority of participants in studies when the rate is higher than 50 Hz. The frequency above which modulated light is perceived as steady is known as the flicker fusion threshold. However, when the modulated light is non-uniform and contains an image, the flicker fusion threshold can be much higher, in the hundreds of hertz. [3] With regard to image recognition, people have been found to recognize a specific image in an unbroken series of different images, each of which lasts as little as 13 milliseconds. [4] Persistence of vision sometimes accounts for a very short, single-millisecond visual stimulus having a perceived duration of between 100 ms and 400 ms. Multiple very short stimuli are sometimes perceived as a single stimulus; for example, a 10 ms green flash of light immediately followed by a 10 ms red flash is perceived as a single yellow flash. [5]

Film and video

Silent film

Early silent films had stated frame rates anywhere from 16 to 24 frames per second (FPS), [6] but since the cameras were hand-cranked, the rate often changed during the scene to fit the mood. Projectionists could also change the frame rate in the theater by adjusting a rheostat controlling the voltage powering the film-carrying mechanism in the projector. [7] Film companies often intended that theaters show their silent films at higher frame rates than they were filmed at. [8] These frame rates were enough to convey a sense of motion, but the motion was perceived as jerky. To minimize the perceived flicker, projectors employed dual- and triple-blade shutters, so each frame was displayed two or three times, increasing the flicker rate to 48 or 72 hertz and reducing eye strain. Thomas Edison said that 46 frames per second was the minimum needed for the eye to perceive motion: "Anything less will strain the eye." [9] [10] In the mid to late 1920s, the frame rate for silent film increased to 20–26 FPS. [9]

Sound film

When sound film was introduced in 1926, variations in film speed were no longer tolerated, as the human ear is more sensitive than the eye to changes in frequency. Many theaters had shown silent films at 22 to 26 FPS, which is why the industry chose 24 FPS for sound film as a compromise. [11] From 1927 to 1930, as various studios updated equipment, the rate of 24 FPS became standard for 35 mm sound film. [2] At 24 FPS, the film travels through the projector at a rate of 456 millimetres (18.0 in) per second. This allowed simple two-blade shutters to give a projected series of images at 48 per second, satisfying Edison's recommendation. Many modern 35 mm film projectors use three-blade shutters to give 72 images per second—each frame is flashed on screen three times. [9]
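The shutter and film-speed figures above are simple products, which the following minimal sketch reproduces. The function names are illustrative, and the 19 mm frame pitch is an assumption that does not appear in the text (it corresponds to standard four-perforation 35 mm film).

```python
# Assumption (not stated in the article): four-perforation 35 mm film
# advances 19 mm per frame (4 perforations x 4.75 mm pitch).
FRAME_PITCH_MM = 19.0


def flicker_rate_hz(frames_per_second, shutter_blades):
    """Each shutter blade flashes the current frame once, so every frame
    is shown `shutter_blades` times per frame period."""
    return frames_per_second * shutter_blades


def film_speed_mm_per_s(frames_per_second, frame_pitch_mm=FRAME_PITCH_MM):
    """Linear speed of the film through the projector gate."""
    return frames_per_second * frame_pitch_mm


print(flicker_rate_hz(24, 2))    # two-blade shutter at 24 FPS
print(flicker_rate_hz(24, 3))    # three-blade shutter at 24 FPS
print(film_speed_mm_per_s(24))   # film travel per second at 24 FPS
```

Under these assumptions the two-blade case gives 48 flashes per second, just above the 46 per second attributed to Edison, and the three-blade case gives 72.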

Animation

This animated cartoon of a galloping horse is displayed at 12 drawings per second, and the fast motion is on the edge of being objectionably jerky.

In drawn animation, moving characters are often shot "on twos", that is to say, one drawing is shown for every two frames of film (which usually runs at 24 frames per second), meaning there are only 12 drawings per second. [12] Even though the image update rate is low, the fluidity is satisfactory for most subjects. However, when a character is required to perform a quick movement, it is usually necessary to revert to animating "on ones", as "twos" are too slow to convey the motion adequately. A blend of the two techniques keeps the eye fooled without unnecessary production cost. [13]

Animation for most "Saturday morning cartoons" was produced as cheaply as possible and was most often shot on "threes" or even "fours", i.e. three or four frames per drawing. This translates to only 8 or 6 drawings per second respectively. Anime is also usually drawn on threes or twos. [14] [15]
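The relationship between "ones", "twos", "threes" and "fours" and the number of drawings per second can be sketched as a single division. The function name is illustrative, and the 24 FPS film rate is the usual value mentioned above.

```python
def drawings_per_second(film_fps, frames_per_drawing):
    """Number of distinct drawings shown each second when every drawing
    is held for `frames_per_drawing` consecutive film frames."""
    return film_fps / frames_per_drawing


for name, hold in [("ones", 1), ("twos", 2), ("threes", 3), ("fours", 4)]:
    rate = drawings_per_second(24, hold)
    print(f"on {name}: {rate:g} drawings per second")
```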

Modern video standards

Due to the mains frequency of electric grids, analog television broadcast was developed with frame rates of 50 Hz (most of the world) or 60 Hz (Canada, US, Mexico, Philippines, Japan, South Korea). The frequency of the electricity grid was extremely stable and therefore it was logical to use for synchronization.

The introduction of color television technology made it necessary to lower that 60 FPS frequency by 0.1% to avoid "dot crawl", a display artifact appearing on legacy black-and-white displays, showing up on highly-color-saturated surfaces. It was found that by lowering the frame rate by 0.1%, the undesirable effect was minimized.

As of 2021, video transmission standards in North America, Japan, and South Korea are still based on 60/1.001 ≈ 59.94 images per second. Two sizes of images are typically used: 1920×1080 ("1080i/p") and 1280×720 ("720p"). Confusingly, interlaced formats are customarily stated at half their image rate (29.97 or 25 FPS) and double their image height, but these statements are purely custom; in each format, 59.94 or 50 images per second are produced. A resolution of 1080i produces 59.94 or 50 1920×540 images per second, each squashed to half-height in the photographic process and stretched back to fill the screen on playback in a television set. The 720p format produces 59.94/50 or 29.97/25 1280×720 images per second, not squeezed, so that no expansion or squeezing of the image is necessary. This confusion was industry-wide in the early days of digital video software, and much software was written incorrectly by developers who believed that only 29.97 images were expected each second. While it was true that each picture element was polled and sent only 29.97 times per second, the pixel location immediately below that one was polled 1/60 of a second later, as part of a completely separate image for the next 1/60-second frame.
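The 59.94 and 29.97 figures are rounded values of exact ratios. A minimal sketch using exact fractions follows; the names `NTSC_FACTOR` and `ntsc_rate` are illustrative, not part of any standard.

```python
from fractions import Fraction

# Dividing by 1.001 is the same as multiplying by 1000/1001,
# which lowers the rate by roughly 0.1 %.
NTSC_FACTOR = Fraction(1000, 1001)


def ntsc_rate(nominal_rate):
    """Exact rate obtained by dividing a nominal rate by 1.001."""
    return nominal_rate * NTSC_FACTOR


image_rate = ntsc_rate(60)    # images (fields) per second
stated_rate = ntsc_rate(30)   # customary interlaced "frame rate"

print(image_rate, f"{float(image_rate):.3f}")    # 60000/1001 59.940
print(stated_rate, f"{float(stated_rate):.3f}")  # 30000/1001 29.970
```

Keeping the rate as a fraction avoids the drift that accumulates when timestamps are computed from the rounded decimal values.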

At its native 24 FPS rate, film could not be displayed on 60 Hz video without the necessary pulldown process, often leading to "judder": to convert 24 frames per second into 60 frames per second, every odd frame is repeated, playing twice, while every even frame is tripled. This creates uneven motion, appearing stroboscopic. Other conversions have similar uneven frame doubling. Newer video standards support 120, 240, or 300 frames per second, so frames can be evenly sampled for standard frame rates such as 24, 48 and 60 FPS film or 25, 30, 50 or 60 FPS video. Of course these higher frame rates may also be displayed at their native rates. [16] [17]
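The pulldown cadence described above can be sketched in a few lines. This is an illustration only: it repeats whole frames, whereas broadcast pulldown operates on interlaced fields, and both function names are hypothetical.

```python
def pulldown_2_3(frames):
    """Repeat frames alternately twice and three times, so that every
    pair of source frames fills five output slots (24 -> 60 per second)."""
    output = []
    for index, frame in enumerate(frames):
        # 1st, 3rd, 5th ... frames play twice; 2nd, 4th, 6th ... three times.
        repeats = 2 if index % 2 == 0 else 3
        output.extend([frame] * repeats)
    return output


def repeats_evenly(display_rate, source_rate):
    """True when every source frame can be shown the same whole number
    of times at the given display rate."""
    return display_rate % source_rate == 0


print(pulldown_2_3(["A", "B", "C", "D"]))
print(len(pulldown_2_3(list(range(24)))))  # output frames for one second of film

for display_rate in (60, 120, 240, 300):
    even = [rate for rate in (24, 25, 30, 48, 50, 60)
            if repeats_evenly(display_rate, rate)]
    print(display_rate, "Hz repeats evenly for:", even)
```

The uneven repeat counts in the first function are the source of judder; the second function shows why a higher display rate that is a whole multiple of the source rate avoids it.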

Electronic camera specifications

In electronic camera specifications, frame rate refers to the maximum rate at which frames can be captured (e.g. if the exposure time were set to near-zero), but in practice, other settings (such as exposure time) may reduce the actual capture frequency below the specified frame rate. [18]
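One way to picture the effect of exposure time is a simplified model in which a new frame cannot start until the previous exposure has finished. The sketch below assumes exactly that and ignores sensor readout and any other overhead, which real cameras add; the function name is hypothetical.

```python
def achievable_frame_rate(max_fps, exposure_time_s):
    """Simplified model: the frame period can be no shorter than the
    exposure time, so the rate is capped at 1 / exposure_time_s.
    Readout time and processing overhead are deliberately ignored."""
    if exposure_time_s <= 0:
        return float(max_fps)
    return min(float(max_fps), 1.0 / exposure_time_s)


print(achievable_frame_rate(120, 1 / 500))  # short exposure: specified rate is reachable
print(achievable_frame_rate(120, 1 / 50))   # long exposure: rate limited by exposure
```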

Computer games

In computer video games, frame rate plays an important part in the experience because, unlike film, games are rendered in real time. 60 frames per second has long been considered the "optimal" frame rate for smoothly animated gameplay. [19] Video games designed for PAL markets, before the sixth generation of video game consoles, had lower frame rates by design due to the 50 Hz output. This noticeably made fast-paced games, such as racing or fighting games, run slower. [20]

Frame rate up-conversion

Frame rate up-conversion (FRC) is the process of increasing the temporal resolution of a video sequence by synthesizing one or more intermediate frames between two consecutive frames. A low frame rate causes aliasing, yields abrupt motion artifacts, and degrades the video quality. Consequently, the temporal resolution is an important factor affecting video quality. Algorithms for FRC are widely used in applications, including visual quality enhancement, video compression and slow-motion video generation.
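To make the idea of synthesizing intermediate frames concrete, the sketch below uses plain linear blending (a cross-fade) between neighbouring frames. This is the crudest possible baseline, chosen only for brevity: it is not a flow-based or pixel hallucination-based method, and it produces ghosting on moving objects. Frames are assumed to be equally sized grayscale images stored as lists of rows, and the function names are illustrative.

```python
def blend_frames(frame_a, frame_b, t):
    """Cross-fade two equally sized grayscale frames at time t in [0, 1]."""
    return [
        [(1.0 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def upconvert(frames, factor):
    """Insert `factor - 1` blended frames between each consecutive pair,
    multiplying the frame rate by `factor`."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if not frames:
        return []
    output = []
    for frame_a, frame_b in zip(frames, frames[1:]):
        output.append(frame_a)
        for step in range(1, factor):
            output.append(blend_frames(frame_a, frame_b, step / factor))
    output.append(frames[-1])
    return output


# Two 1x1 frames, up-converted by a factor of 4.
print(upconvert([[[0.0]], [[8.0]]], 4))
```

For a sequence of n frames the result has (n - 1) × factor + 1 frames, since nothing is synthesized after the last input frame.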

Low frame rate video
Video with 4 times increased frame rate

Methods

Most FRC methods can be categorized into optical flow or kernel-based [21] [22] and pixel hallucination-based methods. [23] [24]

Flow-based FRC

Flow-based methods linearly combine the optical flows predicted between two input frames to approximate the flows from the target intermediate frame to each input frame. Some of these methods also use flow reversal (projection) for more accurate image warping. Moreover, there are algorithms that assign different weights to overlapping flow vectors depending on the object depth in the scene, via a flow projection layer.
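The linear combination of flows can be sketched as follows. The coefficients follow the form given in the Super SloMo paper cited as [22]; treat the code as an illustration of the idea under that assumption rather than a reproduction of any particular implementation. Flows are represented here as flat lists of (dx, dy) vectors, one per pixel, and the function name is hypothetical.

```python
def approximate_intermediate_flows(flow_0_to_1, flow_1_to_0, t):
    """Approximate the flows from an intermediate frame at time t (0 < t < 1)
    back to frame 0 and forward to frame 1, as linear combinations of the
    two flows estimated between the input frames."""
    flow_t_to_0 = []
    flow_t_to_1 = []
    for (fx, fy), (bx, by) in zip(flow_0_to_1, flow_1_to_0):
        # F_t->0 = -(1 - t) * t * F_0->1 + t^2 * F_1->0
        flow_t_to_0.append((-(1 - t) * t * fx + t * t * bx,
                            -(1 - t) * t * fy + t * t * by))
        # F_t->1 = (1 - t)^2 * F_0->1 - t * (1 - t) * F_1->0
        flow_t_to_1.append(((1 - t) ** 2 * fx - t * (1 - t) * bx,
                            (1 - t) ** 2 * fy - t * (1 - t) * by))
    return flow_t_to_0, flow_t_to_1


# One pixel moving 4 px to the right between the two input frames.
to_0, to_1 = approximate_intermediate_flows([(4.0, 0.0)], [(-4.0, 0.0)], 0.5)
print(to_0, to_1)
```

For motion at constant velocity, where the backward flow is the negative of the forward flow, the two results reduce to -t and (1 - t) times the forward flow, which is the expected displacement from the intermediate frame to each input frame.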

Pixel hallucination-based FRC

Pixel hallucination-based methods apply deformable convolution in the center-frame generator, replacing optical flows with offset vectors. There are also algorithms that interpolate middle frames with the help of deformable convolution in the feature domain. However, since these methods hallucinate pixels directly, unlike the flow-based FRC methods, the predicted frames tend to be blurry when fast-moving objects are present.

References

  1. Tamasi, Tony (3 December 2019). "What is Frame Rate and Why is it Important to PC Gaming?" . Retrieved 12 February 2023.
  2. Read, Paul; Meyer, Mark-Paul; Gamma Group (2000). Restoration of motion picture film. Conservation and Museology. Butterworth-Heinemann. pp. 24–26. ISBN 978-0-7506-2793-1.
  3. James Davis (2015), "Humans perceive flicker artefacts at 500 Hz", Sci. Rep., 5: 7861, doi:10.1038/srep07861, PMC 4314649, PMID 25644611
  4. Potter, Mary C. (December 28, 2013). "Detecting meaning in RSVP at 13 ms per picture" (PDF). Attention, Perception, & Psychophysics. 76 (2): 270–279. doi:10.3758/s13414-013-0605-z. hdl: 1721.1/107157 . PMID   24374558. S2CID   180862. Archived (PDF) from the original on 2022-10-09.
  5. Robert Efron (1973). "Conservation of temporal information by perceptual systems". Perception & Psychophysics. 14 (3): 518–530. doi: 10.3758/bf03211193 .
  6. Brown, Julie (2014). "Audio-visual Palimpsests: Resynchronizing Silent Films with 'Special' Music". In David Neumeyer (ed.). The Oxford Handbook of Film Music Studies. Oxford University Press. p. 588. ISBN   978-0195328493.
  7. Kerr, Walter (1975). Silent Clowns . Knopf. p.  36. ISBN   978-0394469072.
  8. Card, James (1994). Seductive cinema: the art of silent film . Knopf. p.  53. ISBN   978-0394572185.
  9. Brownlow, Kevin (Summer 1980). "Silent Films: What Was the Right Speed?". Sight & Sound. 49 (3): 164–167. Archived from the original on 8 July 2011. Retrieved 2 May 2012.
  10. Elsaesser, Thomas; Barker, Adam (1990). Early cinema: space, frame, narrative. BFI Publishing. p. 284. ISBN   978-0-85170-244-5.
  11. TWiT Netcast Network (2017-03-30), How 24 FPS Became Standard, archived from the original on 2021-11-04, retrieved 2017-03-31
  12. Chew, Johnny. "What Are Ones, Twos, and Threes in Animation?". Lifewire . Retrieved August 8, 2018.
  13. Whitaker, Harold; Halas, John; updated by Tom Sito (2009). Timing for animation (2nd ed.). Amsterdam: Elsevier/Focal Press. p. 52. ISBN 978-0240521602. Retrieved August 8, 2018.
  14. "Shot on threes (ones, twos, etc.)". Anime News Network.
  15. CLIP STUDIO (12 February 2016). "CLIP STUDIO PAINT アニメーション機能の使い方". Archived from the original on 2021-11-04 via YouTube.
  16. High Frame-Rate Television, BBC White Paper WHP 169, September 2008, M. Armstrong, D. Flynn, M. Hammond, S. Jolly, R. Salmon.
  17. Jon Fingas (November 27, 2014), "James Cameron's 'Avatar' sequels will stick to 48 frames per second", Engadget , retrieved April 15, 2017
  18. Whaley, Sean (21 November 2018). "What is Frame Rate and Why is it Important to PC Gaming?" . Retrieved 5 August 2021.
  19. Andy Edser (2024-03-26). "I used to be a frame rate snob but owning a Steam Deck has made me realise the error of my ways". PC Gamer. Retrieved 2024-10-01.
  20. Linneman, John (2018-12-02). "PlayStation Classic review: the games are great but the emulation is really poor". Digital Foundry. Eurogamer.net. Retrieved 2024-10-01.
  21. Niklaus, Simon; Mai, Long; Liu, Feng (2017). Video frame interpolation via adaptive separable convolution. ICCV. arXiv:1708.01692.
  22. Jiang, Huaizu; Sun, Deqing; Jampani, Varun; Yang, Ming-Hsuan; Learned-Miller, Erik; Kautz, Jan (2018). Super slomo: High quality estimation of multiple intermediate frames for video interpolation. CVPR. arXiv:1712.00080.
  23. Gui, Shurui; Wang, Chaoyue; Chen, Qihua; Tao, Dacheng (2020). "Featureflow: Robust video interpolation via structure-to-texture generation". 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. pp. 14001–14010. doi:10.1109/CVPR42600.2020.01402. ISBN 978-1-7281-7169-2.
  24. Choi, Myungsub; Kim, Heewon; Han, Bohyung; Xu, Ning; Lee, Kyoung Mu (2020). "Channel Attention is All You Need for Video Frame Interpolation". Proceedings of the AAAI Conference on Artificial Intelligence. 34 (7). AAAI: 10663–10671. doi:10.1609/aaai.v34i07.6693.
