Motiongram

Overview of the process of creating a motiongram
Motiongram of a 5-minute dance sequence
Example of how motiongrams can be used in a comparative study of dancers' movements (5 minutes).
Example of how motiongrams can be used in a comparative study of dancers' movements (40 seconds).

A motiongram is a spatiotemporal display of motion, created from a video recording. Motiongrams are created using a video processing technique in which each motion image is collapsed into a 1-pixel-wide stripe, and the stripes are plotted next to one another over time.

Challenge

The traditional way of representing (human) motion visually is either to display individual frames from a video file as a keyframe display, or to perform one or more forms of feature extraction and plot the resulting data. Neither approach is ideal for giving a good impression of the actual motion happening in the sequence. Keyframe displays show only postures and not motion, while feature graphs often rely on multiple analysis steps. Motiongrams are useful as an intermediary step that shows spatiotemporal information about a movement sequence without the need for a full analysis of the data.

History

The term motiongram was first proposed by Alexander Refsum Jensenius in 2006,[1] and the technique has since been refined by him and others. The technique was inspired by the pioneering work of Muybridge and Marey in the 19th century, as well as by the slit-scan community in the 20th century. Motiongrams resemble slit-scan photographs, but differ in that they are created from frame-differenced motion images. The result is that only motion is shown in the final display.

Implementation

An overview of creating a motiongram is shown in Figure 1. The process starts by reading a video stream and making simple image adjustments, e.g., changing the brightness and contrast. The next step produces the motion image by calculating the absolute frame difference between successive video frames, followed by noise removal. The motiongram is created by calculating the normalized mean value for each row or column in the motion image. This means that for each image matrix of size M×N, an M×1 or 1×N matrix is calculated. Drawing these 1-pixel-wide “stripes” next to each other over time results in the final motiongram.
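The steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in any of the tools mentioned in this article: the function names, the `threshold` parameter, the use of a plain threshold as noise removal, and normalization by the global maximum are all choices made for this example.

```python
import numpy as np


def motiongrams(frames, threshold=0.05):
    """Return (horizontal, vertical) motiongrams for grayscale frames.

    frames: array of shape (T, M, N) with values in [0, 1].
    threshold: hypothetical noise floor; smaller differences are zeroed.
    """
    frames = np.asarray(frames, dtype=np.float64)
    # Motion images: absolute difference between successive frames.
    motion = np.abs(np.diff(frames, axis=0))        # shape (T-1, M, N)
    # Crude noise removal (an assumption, stands in for real filtering).
    motion[motion < threshold] = 0.0
    # Mean of each row gives an M x 1 stripe per motion image;
    # stripes are placed side by side, so time runs left to right.
    horizontal = motion.mean(axis=2).T              # shape (M, T-1)
    # Mean of each column gives a 1 x N stripe per motion image;
    # stripes are stacked, so time runs top to bottom.
    vertical = motion.mean(axis=1)                  # shape (T-1, N)
    return _normalize(horizontal), _normalize(vertical)


def _normalize(image):
    """Scale to [0, 1] by the global maximum (one possible normalization)."""
    peak = image.max()
    return image / peak if peak > 0 else image


# Synthetic clip: one bright pixel moving down column 2 of a 5 x 6 frame.
clip = np.zeros((4, 5, 6))
for t in range(4):
    clip[t, t, 2] = 1.0

horizontal, vertical = motiongrams(clip)
print(horizontal.shape, vertical.shape)
```

With real footage, the frames would first be converted to grayscale and scaled to the range 0 to 1. The threshold would then need tuning to the noise level of the recording.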

The motiongram technique has been implemented as modules in the open framework Jamoma for Max, as well as for the EyesWeb platform and in MATLAB. There are also two standalone applications for OS X and Windows: VideoAnalysis (non-realtime) and AudioVideoAnalysis (realtime).

Examples

Motiongrams allow for quick navigation in video material and for comparative analysis of motion qualities. Although the display is quite rough, it makes it easy to see differences in the quantity of motion and similarities in upward/downward patterns between motion sequences.

Limitations

Since motiongrams are made by averaging over either the rows or the columns of each motion image, only one spatial dimension is shown in the final motiongram. Thus a horizontal motiongram will mainly represent vertical motion, while a vertical motiongram will mainly represent horizontal motion.
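The loss of one dimension can be illustrated with a small synthetic example. The sketch below uses hypothetical helper names and unnormalized row and column means. It builds two clips in which a dot moves in opposite horizontal directions along the same row, then compares their motiongrams.

```python
import numpy as np


def motion_images(frames):
    """Absolute differences between successive frames, shape (T-1, M, N)."""
    return np.abs(np.diff(np.asarray(frames, dtype=np.float64), axis=0))


def horizontal_motiongram(frames):
    """Row means of each motion image; time runs left to right."""
    return motion_images(frames).mean(axis=2).T


def vertical_motiongram(frames):
    """Column means of each motion image; time runs top to bottom."""
    return motion_images(frames).mean(axis=1)


def moving_dot(columns, row=2, shape=(5, 8)):
    """Clip with a single bright pixel at the given column in each frame."""
    frames = np.zeros((len(columns),) + shape)
    for t, col in enumerate(columns):
        frames[t, row, col] = 1.0
    return frames


rightward = moving_dot([0, 1, 2, 3])
leftward = moving_dot([7, 6, 5, 4])

# Row averaging discards the column position, so both clips look alike.
same_horizontal = np.array_equal(
    horizontal_motiongram(rightward), horizontal_motiongram(leftward)
)
# Column averaging keeps the column position, so the clips differ.
same_vertical = np.array_equal(
    vertical_motiongram(rightward), vertical_motiongram(leftward)
)
print(same_horizontal, same_vertical)
```

In this constructed case the two horizontal motiongrams are identical even though the motion directions are opposite, whereas the vertical motiongrams differ.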

References

  1. "Using motiongrams in the study of musical gestures". International Computer Music Association. Retrieved 1 August 2012.