Motion JPEG

Last updated

Motion JPEG (M-JPEG or MJPEG) is a video compression format in which each video frame or interlaced field of a digital video sequence is compressed separately as a JPEG image.

Contents

Originally developed for multimedia PC applications, Motion JPEG enjoys broad client support: most major web browsers and players provide native support, and plug-ins are available for the rest. Software and devices using the M-JPEG standard include web browsers, media players, game consoles, digital cameras, IP cameras, webcams, streaming servers, video cameras, and non-linear video editors.[ citation needed ]

History

Motion JPEG was originally developed for multimedia PC applications.[ citation needed ]

Early implementations of MJPEG were generally implemented in Hardware. C-Cube was an early proponent with their CL550 JPEG codec been used in several hardware implementations. It was announced [1] that the NeXTdimension from NeXT would ship with an onboard CL550 to implement MJPEG. This was however later shelved and wasn't included in the final product that was shipped. [2]

Apple provided a Software implementation of MJPEG in their QuickTime Player in the mid-1990s. [3]

Design

M-JPEG is an intraframe-only compression scheme (compared with the more computationally intensive technique of interframe prediction). Whereas modern interframe video formats, such as MPEG1, MPEG2 and H.264/MPEG-4 AVC, achieve real-world compression ratios of 1:50 or better, M-JPEG's lack of interframe prediction limits its efficiency to 1:20 or lower, depending on the tolerance to spatial artifacting in the compressed output. Because frames are compressed independently of one another, M-JPEG imposes lower processing and memory requirements on hardware devices.

As a purely intraframe compression scheme, the image quality of M-JPEG is directly a function of each video frame's static (spatial) complexity. Frames with large smooth transitions or monotone surfaces compress well and are more likely to hold their original details with few visible compression artifacts. Frames exhibiting complex textures, fine curves and lines (such as writing on a newspaper) are prone to exhibit discrete cosine transform (DCT) artifacts such as ringing, smudging and macroblocking. M-JPEG-compressed video is also insensitive to motion complexity, i.e. variation over time. It is neither hindered by highly random motion (such as the water-surface turbulence in a large waterfall), nor helped by the absence of motion (such as static landscape shot by tripod), which are two opposite extremes commonly used to test interframe video formats.

For QuickTime formats, Apple has defined two types of coding: MJPEG-A and MJPEG-B. MJPEG-B no longer retains valid JPEG Interchange Files within it, hence it is not possible to take a frame into a JPEG file without slightly modifying the headers.

JPEG is inefficient, using more bits to deliver similar quality, compared to more modern formats (such as JPEG 2000 and H.264/MPEG-4 AVC). Since the development of the original JPEG standard in the early 1990s, technology improvements have been made not only to the JPEG format but to the interframe compression schemas possible as well.

Features

Motion JPEG is simple to implement because it uses a mature compression standard (JPEG) with well-developed libraries, and it is an intraframe method of compression.[ citation needed ]

It tolerates rapidly changing motion in the video stream, whereas compression schemes using interframe compression can often experience unacceptable quality loss when the video content changes significantly between each frame.[ citation needed ]

Minimal hardware is required because it is not computationally intensive.[ citation needed ]

Standardisation

Unlike the video formats specified in international standards such as MPEG-2 and the format specified in the JPEG still-picture coding standard, there is no document that defines a single exact format that is universally recognized as a complete specification of “Motion JPEG” for use in all contexts. This raises compatibility concerns about file outputs from different manufacturers. However, each particular file format usually has some standard on how M-JPEG is encoded. For example, Microsoft documents their standard format to store M-JPEG in AVI files, [4] Apple documents how M-JPEG is stored in QuickTime files, RFC 2435 describes how M-JPEG is implemented in an RTP stream, and an M-JPEG CodecID is planned for the Matroska file format. [5]

Applications

M-JPEG is now used by video-capture devices such as digital cameras, IP cameras, and webcams, as well as by non-linear video editing systems. It is natively supported by the QuickTime Player, the PlayStation console, and web browsers such as Safari, Google Chrome, Mozilla Firefox and Microsoft Edge.

Video editing

M-JPEG is frequently used in non-linear video editing systems. Modern desktop CPUs are powerful enough to work with high-definition video, so no special hardware is required, and they in turn offer native random-access to any frame.

Game consoles

The PlayStation game console integrated M-JPEG like decompression hardware for in-game FMV sequences, while the PlayStation Portable handheld game console can play M-JPEG from the Memory Stick Pro Duo under the .avi extension with a resolution of 480×272. Both can record clips in M-JPEG with its Go!Cam camera.

Nintendo's Wii game console, as well as VTech's InnoTab, can play M-JPEG-encoded videos on SD card using its Photo Channel. The SanDisk Sansa e200 and the Zen V digital audio players play short M-JPEG videos. Recent firmware updates to the Nintendo 3DS can now record and play "3D-AVI" M-JPEG-encoded files, which is the same format used in the Fujifilm FinePix Real 3D series, from a SD card in 320×240 resolution so long as the video duration is 10 minutes or less.

Digital cameras

Prior to the recent rise in MPEG-4 encoding in consumer devices, a progressive scan form of M-JPEG saw widespread use in the “movie” modes of digital still cameras, allowing video encoding and playback through the integrated JPEG compression hardware with only a software modification. The resultant quality is still inferior compared to a similar-sized MPEG, particularly as the sound (when included) was uncompressed PCM and recorded at a low sample rate or low-compression, low processor-demand ADPCM.

To keep file sizes and transfer rates under control, frame sizes and rates, along with sound sampling rates, are kept relatively low with very high levels of compression for each individual frame. Resolutions of 160×120 or 320×240 are common sizes, typically at 10, 12 or 15 frames per second, with picture quality equivalent to a JPEG setting of “50” with mono ADPCM sound sampled at ~8 kHz. This results in a very basic, but serviceable video output at a similar storage cost to MPEG (~120 kB/s video rate, ~8 kB/s audio – or approx 1 Mbit/s at 320×240 resolution), but with minimal processing overheads. This video is typically stored in Microsoft's AVI or Apple's QuickTime Movie container files. In these files MPEGs, are viewable natively on most operating systems, however sometimes an additional codec must be installed.

The AMV video format, common on cheap "MP4" players, is a modified version of M-JPEG.

In addition to portable players (which are mainly "consumers" of the video), many video-enabled digital cameras use M-JPEG for video-capture. For instance:

A video that was recorded on a Canon 5D mark IV in DCI 4K using motion jpeg

Many network-enabled cameras provide M-JPEG streams that network clients can connect to. Mozilla and Webkit-based browsers have native support for viewing these M-JPEG streams.

Some network-enabled cameras provide their own M-JPEG interfaces as part of the normal feature set. For cameras that don't provide this feature natively, a server can be used to transcode the camera pictures into an M-JPEG stream and then provide that stream to other network clients.

Media players

Apple announced on September 1, 2010 that their newest version of the Apple TV would support M-JPEG up to 35 Mbit/s, 1280 by 720 pixels, 30 frames per second, audio in μlaw, PCM stereo audio in .avi file format.

Certain media players such as the Netgear NeoTV 550 do not support the playback of M-JPEG.

Video streaming

HTTP streaming separates each image into individual HTTP replies on a specified marker. HTTP streaming creates packets of a sequence of JPEG images that can be received by clients such as QuickTime or VLC.

In response to a GET request for a MJPEG file or stream, the server streams the sequence of JPEG frames over HTTP. A special mime-type content type multipart/x-mixed-replace;boundary=<boundary-name> informs the client to expect several parts (frames) as an answer delimited by <boundary-name>. This boundary name is expressly disclosed within the MIME-type declaration itself. The TCP connection is not closed as long as the client wants to receive new frames and the server wants to provide new frames. Two basic implementations of a M-JPEG streaming server are cambozola and MJPG-Streamer . The more robust ffmpeg-server also provides M-JPEG streaming support.

Native web browser support includes: Safari, Google Chrome, Microsoft Edge [8] and Firefox. [9] Other browsers, such as Internet Explorer can display M-JPEG streams with the help of external plugins. Cambozola is an applet that can show M-JPEG streams in Java-enabled browsers. M-JPEG is also natively supported by PlayStation and QuickTime. Most commonly, M-JPEG is used in IP based security cameras. [10]

Successors

Technology improvements can be found in the designs of H.263v2 Annex I and MPEG-4 Part 2, that use frequency-domain prediction of transform coefficient values, and in H.264/MPEG-4 AVC, that use spatial prediction and adaptive transform block size techniques. There are also more sophisticated entropy coding than what was practical when the first JPEG design was developed. All of these new developments make M-JPEG an inefficient recording mechanism.

See also

Related Research Articles

A codec is a device or computer program that encodes or decodes a data stream or signal. Codec is a portmanteau of coder/decoder.

In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.

MPEG-1 is a standard for lossy compression of video and audio. It is designed to compress VHS-quality raw digital video and CD audio down to about 1.5 Mbit/s without excessive quality loss, making video CDs, digital cable/satellite TV and digital audio broadcasting (DAB) practical.

Audio Video Interleave is a proprietary multimedia container format and Windows standard introduced by Microsoft in November 1992 as part of its Video for Windows software. AVI files can contain both audio and video data in a file container that allows synchronous audio-with-video playback. Like the DVD video format, AVI files support multiple streaming audio and video, although these features are seldom used.

<span class="mw-page-title-main">Motion compensation</span> Video compression technique, used to efficiently predict and generate video frames

Motion compensation in computing is an algorithmic technique used to predict a frame in a video given the previous and/or future frames by accounting for motion of the camera and/or objects in the video. It is employed in the encoding of video data for video compression, for example in the generation of MPEG-2 files. Motion compensation describes a picture in terms of the transformation of a reference picture to the current picture. The reference picture may be previous in time or even from the future. When images can be accurately synthesized from previously transmitted/stored images, the compression efficiency can be improved.

<span class="mw-page-title-main">Video codec</span> Digital video processing

A video codec is software or hardware that compresses and decompresses digital video. In the context of video compression, codec is a portmanteau of encoder and decoder, while a device that only compresses is typically called an encoder, and one that only decompresses is a decoder.

Windows Media Video (WMV) is a series of video codecs and their corresponding video coding formats developed by Microsoft. It is part of the Windows Media framework. WMV consists of three distinct codecs: The original video compression technology known as WMV, was originally designed for Internet streaming applications, as a competitor to RealVideo. The other compression technologies, WMV Screen and WMV Image, cater for specialized content. After standardization by the Society of Motion Picture and Television Engineers (SMPTE), WMV version 9 was adapted for physical-delivery formats such as HD DVD and Blu-ray Disc and became known as VC-1. Microsoft also developed a digital container format called Advanced Systems Format to store video encoded by Windows Media Video.

<span class="mw-page-title-main">VirtualDub</span>

VirtualDub is a free and open-source video capture and video processing utility for Microsoft Windows written by Avery Lee. It is designed to process linear video streams, including filtering and recompression. It uses AVI container format to store captured video. The first version of VirtualDub, written for Windows 95, to be released on SourceForge was uploaded on August 20, 2000.

Cinepak is a lossy video codec developed by Peter Barrett at SuperMac Technologies, and released in 1991 with the Video Spigot, and then in 1992 as part of Apple Computer's QuickTime video suite. One of the first video compression tools to achieve full motion video on CD-ROM, it was designed to encode 320×240 resolution video at 1× CD-ROM transfer rates. The original name of this codec was Compact Video, which is why its FourCC identifier is CVID. The codec was ported to Microsoft Windows in 1993. It was also used on fourth- and fifth-generation game consoles, such as the Atari Jaguar CD, Sega CD, Sega Saturn, and 3DO. libavcodec includes a Cinepak decoder and an encoder, both licensed under the terms of the LGPL.

H.262 or MPEG-2 Part 2 is a video coding format standardised and jointly maintained by ITU-T Study Group 16 Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG), and developed with the involvement of many companies. It is the second part of the ISO/IEC MPEG-2 standard. The ITU-T Recommendation H.262 and ISO/IEC 13818-2 documents are identical.

A container format or metafile is a file format that allows multiple data streams to be embedded into a single file, usually along with metadata for identifying and further detailing those streams. Notable examples of container formats include archive files and formats used for multimedia playback. Among the earliest cross-platform container formats were Distinguished Encoding Rules and the 1985 Interchange File Format.

These tables compare features of multimedia container formats, most often used for storing or streaming digital video or digital audio content. To see which multimedia players support which container format, look at comparison of media players.

Flash Video is a container file format used to deliver digital video content over the Internet using Adobe Flash Player version 6 and newer. Flash Video content may also be embedded within SWF files. There are two different Flash Video file formats: FLV and F4V. The audio and video data within FLV files are encoded in the same way as SWF files. The F4V file format is based on the ISO base media file format, starting with Flash Player 9 update 3. Both formats are supported in Adobe Flash Player and developed by Adobe Systems. FLV was originally developed by Macromedia. In the early 2000s, Flash Video was the de facto standard for web-based streaming video. Users include Hulu, VEVO, Yahoo! Video, metacafe, Reuters.com, and many other news providers.

Media Foundation (MF) is a COM-based multimedia framework pipeline and infrastructure platform for digital media in Windows Vista, Windows 7, Windows 8, Windows 8.1, Windows 10, and Windows 11. It is the intended replacement for Microsoft DirectShow, Windows Media SDK, DirectX Media Objects (DMOs) and all other so-called "legacy" multimedia APIs such as Audio Compression Manager (ACM) and Video for Windows (VfW). The existing DirectShow technology is intended to be replaced by Media Foundation step-by-step, starting with a few features. For some time there will be a co-existence of Media Foundation and DirectShow. Media Foundation will not be available for previous Windows versions, including Windows XP.

The following is a list of H.264/MPEG-4 AVC products and implementations.

A video coding format is a content representation format of digital video content, such as in a data file or bitstream. It typically uses a standardized video compression algorithm, most commonly based on discrete cosine transform (DCT) coding and motion compensation. A specific software, firmware, or hardware implementation capable of compression or decompression in a specific video coding format is called a video codec.

Apple Video is a lossy video compression and decompression algorithm (codec) developed by Apple Inc. and first released as part of QuickTime 1.0 in 1991. The codec is also known as QuickTime Video, by its FourCC RPZA and the name Road Pizza. When used in the AVI container, the FourCC AZPR is also used.

Apple ProRes is a high quality, "visually lossless" lossy video compression format developed by Apple Inc. for use in post-production that supports video resolution up to 8K. It is the successor of the Apple Intermediate Codec and was introduced in 2007 with Final Cut Studio 2. Much like the H.26x and MPEG standards, the ProRes family of codecs use compression algorithms based on the discrete cosine transform (DCT). ProRes is widely used as a final format delivery method for HD broadcast files in commercials, features, Blu-ray and streaming.

Better Portable Graphics (BPG) is a file format for coding digital images, which was created by programmer Fabrice Bellard in 2014. He has proposed it as a replacement for the JPEG image format as the more compression-efficient alternative in terms of image quality or file size. It is based on the intra-frame encoding of the High Efficiency Video Coding (HEVC) video compression standard. Tests on photographic images in July 2014 found that BPG produced smaller files for a given quality than JPEG, JPEG XR and WebP.

References

  1. "New Machines from NeXT (U-M computing News, Volume 5, Jan 1990)". 1990.
  2. "The NeXTdimension Compendium (compiled from June-August 1993)".
  3. "Developer's Guide: QuickTime for Macintosh Version 2.5" (PDF). Archived from the original (PDF) on 2022-07-16.
  4. "BMPDIB.TXT". www.fileformat.info.
  5. "Codec Mappings".
  6. "Press Release Details". www.usa.canon.com. Retrieved 2016-11-06.
  7. "Specifications & Features - Canon EOS 5D Mark IV - Canon UK". www.canon.co.uk. 2016-09-19. Retrieved 2016-11-06.
  8. "Dev guide: Video - Microsoft Edge Development". developer.microsoft.com. Retrieved 2016-08-25.
  9. M-JPEG streams sent to early versions of Mozilla Firefox had to be enclosed within an HTTP document to avoid flickering. See Bug 625012 (fixed in 2014).
  10. Martins, Claudemir (2017-04-25). "How CCTV codecs work (CCTV codec easily explained) by Learn CCTV". Learn CCTV.com. Retrieved 2023-10-22.