2D-plus-depth

2D image and its depth map; lighter depth map areas are regarded as closer to the viewer

2D-plus-Depth is a stereoscopic video coding format that is used for 3D displays, such as Philips WOWvx. Philips discontinued work on the WOWvx line in 2009, citing "current market developments". [1] This Philips technology is now used by the SeeCubic company, led by former key 3D engineers and scientists of Philips, which offers autostereoscopic 3D displays that use the 2D-plus-Depth format for 3D video input. [2]


Overview

The 2D-plus-Depth format is described in a Philips white paper [3] and in related articles. [4]

Each 2D image frame is supplemented with a greyscale depth map that indicates whether a specific pixel in the 2D image should be shown in front of the display (white) or behind the screen plane (black). The 256 greyscale levels can build a smooth gradient of depth within the image. Processing within the monitor uses this input to render the multiview images.
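As a rough illustration of how such a depth map can drive multiview rendering, the following Python sketch shifts each pixel of the 2D frame horizontally in proportion to its depth value, a simplified form of depth-image-based rendering. The max_disparity parameter and the painting order are illustrative assumptions, not part of the Philips specification; a display's built-in renderer is considerably more sophisticated.

```python
import numpy as np

def synthesize_view(image, depth, max_disparity=16):
    """Render one new viewpoint from a 2D frame and its depth map.

    image: (H, W, 3) uint8 colour frame
    depth: (H, W) uint8 depth map, 255 = nearest to the viewer, 0 = farthest
    max_disparity: maximum horizontal shift in pixels (an illustrative
                   value, not one prescribed by the format)
    """
    h, w = depth.shape
    # Map 0..255 depth to a signed per-pixel shift: pixels in front of the
    # screen plane (bright) move one way, pixels behind it (dark) the other.
    shift = np.round(((depth.astype(np.float32) / 255.0) - 0.5)
                     * 2.0 * max_disparity).astype(int)
    view = np.zeros_like(image)
    z_buffer = np.full((h, w), -1, dtype=np.int16)
    for y in range(h):
        for x in range(w):
            nx = x + shift[y, x]
            if 0 <= nx < w and depth[y, x] > z_buffer[y, nx]:
                view[y, nx] = image[y, x]          # nearer pixel wins
                z_buffer[y, nx] = depth[y, x]
    holes = z_buffer < 0   # uncovered occlusion areas, left for inpainting
    return view, holes
```

The holes returned here correspond to the uncovered occlusion areas discussed below, which a real renderer must inpaint or fill from additional data.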

Supported by various companies across the display industry, 2D-plus-Depth has been standardized by MPEG as an extension for 3D video, published as ISO/IEC 23002-3:2007(E). [5]

There is also an extension of the 2D-plus-Depth format called the WOWvx Declipse format, described in the same Philips white paper, "3D Interface Specifications". In this advanced format, two more planes are added to the original 2D image and its depth map for each frame: the background areas covered by foreground objects, and their respective depth map. Each frame in the Declipse format is therefore described with an image containing four parts, or quadrants. This extension improves potential visual quality by providing data for more correct and precise filling of the uncovered occlusion areas created by shifting foreground objects during the multiview generation process.
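A frame in such a quadrant layout can be split back into its four planes with a few lines of Python. The particular quadrant arrangement assumed below (top-left: 2D image, top-right: depth, bottom-left: background, bottom-right: background depth) is for illustration only; the white paper should be consulted for the exact layout.

```python
import numpy as np

def split_declipse_frame(frame):
    """Split a Declipse-style frame into its four component planes.

    frame: (H, W, 3) uint8 array holding all four quadrants.
    The quadrant layout used here is an assumption for illustration.
    """
    h, w = frame.shape[:2]
    hh, hw = h // 2, w // 2
    image            = frame[:hh, :hw]   # original 2D image
    depth            = frame[:hh, hw:]   # its depth map
    background       = frame[hh:, :hw]   # areas hidden by foreground objects
    background_depth = frame[hh:, hw:]   # depth of those background areas
    return image, depth, background, background_depth
```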

Advantages

2D-plus-Depth requires only a limited bandwidth increase compared to 2D (the compressed greyscale depth map adds roughly 5–20%), so it can be used in existing distribution infrastructures.

2D-plus-Depth offers flexibility and compatibility with existing production equipment and compression tools. [6] [7]

It allows applications to use different 3D display screen sizes and designs in the same system.

Another advantage is that depth maps are created in the course of 2D-to-stereo 3D conversion with almost any approach to this video transformation. [8] As a result, a 2D-plus-Depth representation of converted stereo footage exists in almost all cases. Considering the lack of 3D content shot in stereo and the number of converted 3D films, this is a big benefit.

Disadvantages

2D-plus-Depth is not compatible with existing 2D or 3D-Ready displays. The format has been criticized due to the limited amount of depth that can be displayed in an 8-bit greyscale.

2D-plus-Depth cannot handle transparency (semi-transparent objects in the scene) or occlusion (an object blocking the view of another); the 2D plus DOT format takes these factors into account. [9] Additionally, it cannot handle reflection, refraction (beyond simple transparency) and other optical phenomena.

Creation of accurate 2D-plus-Depth can be costly and difficult, though recent advances in range imaging have made this process more accessible. [10] [11]

2D-plus-Depth lacks the potential increase in resolution offered by using two complete images.

Depth cannot be reliably estimated from a monocular video in most cases. Notable exceptions are camera-motion scenes in which object motion is static or almost absent, and landscape scenes in which the depth map can be approximated well enough with a gradient; in these cases automatic depth estimation is possible. [12] [13] In the general case, only a semi-automatic approach is viable for 2D to 2D-plus-Depth conversion. Philips developed a 3D content creation software suite named BlueBox, [14] which includes semi-automated conversion of 2D content into the 2D-plus-Depth format and automatic generation of 2D-plus-Depth from stereo. A similar semi-automatic approach to high-quality 2D to 2D-plus-Depth conversion is implemented in YUVsoft's 2D to 3D Suite, available as a set of plugins for the After Effects and NUKE video compositing software. [15]
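The gradient approximation mentioned for landscape scenes is simple enough to sketch in Python: the function below builds a depth map that grades linearly from far at the top of the frame to near at the bottom. It is a crude heuristic for illustration only, not a general-purpose depth estimator.

```python
import numpy as np

def gradient_depth_map(height, width, near=255, far=0):
    """Approximate depth for a landscape-like scene with a vertical gradient.

    Assumes the bottom of the frame (foreground ground plane) is nearest
    and the top (sky/horizon) is farthest -- a rough heuristic only.
    Returns an (height, width) uint8 depth map, white = near.
    """
    column = np.linspace(far, near, height, dtype=np.float32)  # top -> bottom
    depth = np.tile(column[:, None], (1, width))
    return depth.astype(np.uint8)
```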

Stereoscopic to 2D-plus-Depth conversion involves several algorithms, including scene change detection, segmentation, motion estimation and image matching. Thanks to high-performance software and GPU technology, automatic stereo to 2D-plus-Depth conversion is now possible, even in live real-time mode. [16]
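A minimal sketch of the image-matching step, using OpenCV block matching as a stand-in for the production-grade algorithms referred to above (the parameter values are illustrative):

```python
import cv2
import numpy as np

def stereo_to_2d_plus_depth(left_bgr, right_bgr, num_disparities=64, block_size=15):
    """Estimate a 2D-plus-Depth pair from a rectified stereo pair.

    Returns the left view unchanged plus an 8-bit depth map in which
    brighter values mean closer to the viewer.
    """
    left_gray = cv2.cvtColor(left_bgr, cv2.COLOR_BGR2GRAY)
    right_gray = cv2.cvtColor(right_bgr, cv2.COLOR_BGR2GRAY)
    matcher = cv2.StereoBM_create(numDisparities=num_disparities, blockSize=block_size)
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    disparity[disparity < 0] = 0  # discard invalid matches
    # Larger disparity = nearer object, so a simple rescale to 0..255
    # already matches the 2D-plus-Depth convention (white = near).
    depth = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    return left_bgr, depth
```

Production converters combine such matching with the scene change detection, segmentation and motion estimation mentioned above to stabilise the depth maps over time.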

Alternatives

Other 3D formats include stereo (left-right or alternating frames), multiview 3D formats (such as Multiview Video Coding and Scalable Video Coding), and 2D plus Delta.

Related Research Articles

Stereoscopy

Stereoscopy is a technique for creating or enhancing the illusion of depth in an image by means of stereopsis for binocular vision. The word stereoscopy derives from Greek στερεός (stereos) 'firm, solid', and σκοπέω (skopeō) 'to look, to see'. Any stereoscopic image is called a stereogram. Originally, stereogram referred to a pair of stereo images which could be viewed using a stereoscope.

3D display

A 3D display is a display device capable of conveying depth to the viewer. Many 3D displays are stereoscopic displays, which produce a basic 3D effect by means of stereopsis but can cause eye strain and visual fatigue. Newer 3D displays such as holographic and light field displays produce a more realistic 3D effect by combining stereopsis and accurate focal length for the displayed content, and cause less visual fatigue than classical stereoscopic displays.

A volumetric display device is a display device that forms a visual representation of an object in three physical dimensions, as opposed to the planar image of traditional screens that simulate depth through a number of different visual effects. One definition offered by pioneers in the field is that volumetric displays create 3D imagery via the emission, scattering, or relaying of illumination from well-defined regions in (x,y,z) space.

Anaglyph 3D

Anaglyph 3D is the stereoscopic 3D effect achieved by means of encoding each eye's image using filters of different colors, typically red and cyan. Anaglyph 3D images contain two differently filtered colored images, one for each eye. When viewed through the "color-coded" "anaglyph glasses", each of the two images reaches the eye it's intended for, revealing an integrated stereoscopic image. The visual cortex of the brain fuses this into the perception of a three-dimensional scene or composition.

3D scanning

3D scanning is the process of analyzing a real-world object or environment to collect three dimensional data of its shape and possibly its appearance. The collected data can then be used to construct digital 3D models.

Autostereoscopy

Autostereoscopy is any method of displaying stereoscopic images without the use of special headgear, glasses or other viewing aids on the part of the viewer. Because headgear is not required, it is also called "glasses-free 3D" or "glassesless 3D". There are two broad approaches currently used to accommodate motion parallax and wider viewing angles: eye tracking, and multiple views so that the display does not need to sense where the viewer's eyes are located. Examples of autostereoscopic display technology include lenticular lens, parallax barrier and integral imaging, but notably not volumetric or holographic displays.

3D computer graphics

3D computer graphics, sometimes called CGI, 3-D-CGI or three-dimensional computer graphics, are graphics that use a three-dimensional representation of geometric data that is stored in the computer for the purposes of performing calculations and rendering digital images, usually 2D images but sometimes 3D images. The resulting images may be stored for viewing later or displayed in real time.

3DMLW is a discontinued open-source project and an XML-based markup language for representing interactive 3D and 2D content on the World Wide Web.

3D television

3D television (3DTV) is television that conveys depth perception to the viewer by employing techniques such as stereoscopic display, multi-view display, 2D-plus-depth, or any other form of 3D display. Most modern 3D television sets use an active shutter 3D system or a polarized 3D system, and some are autostereoscopic without the need of glasses. As of 2017, most 3D TV sets and services are no longer available from manufacturers.

Multiview Video Coding is a stereoscopic video coding standard for video compression that allows for encoding of video sequences captured simultaneously from multiple camera angles in a single video stream. It uses the 2D plus Delta method and is an amendment to the H.264 video compression standard, developed jointly by MPEG and VCEG, with contributions from a number of companies, primarily Panasonic and LG Electronics.

2D Plus Delta is a method of encoding a 3D image and is listed as a part of MPEG2 and MPEG4 standards, specifically on the H.264 implementation of the Multiview Video Coding extension. This technology originally started as a proprietary method for Stereoscopic Video Coding and content deployment that utilizes the left or right channel as the 2D version and the optimized difference or disparity (Delta) between that image channel view and a second eye image view is injected into the video stream as user data, secondary stream, independent stream, enhancement layer or NALu for deployment. The Delta data can be either a spatial stereo disparity, temporal predictive, bidirectional, or optimized motion compensation.

3D video coding is one of the processing stages required to deliver stereoscopic content to the home. There are three techniques which are used to achieve stereoscopic video:

  1. Color shifting (anaglyph)
  2. Pixel subsampling
  3. Enhanced video stream coding

A variety of computer graphic techniques have been used to display video game content throughout the history of video games. The predominance of individual techniques has evolved over time, primarily due to hardware advances and restrictions such as the processing power of central or graphics processing units.

Fujifilm FinePix Real 3D

The Fujifilm FinePix Real 3D W series is a line of consumer-grade digital cameras designed to capture stereoscopic images that recreate the perception of 3D depth, having both still and video formats while retaining standard 2D still image and video modes. The cameras feature a pair of lenses, and an autostereoscopic display which directs pixels of the two offset images to the user's left and right eyes simultaneously. Methods are included for extending or contracting the stereoscopic baseline, albeit with an asynchronous timer or manually depressing the shutter twice. The dual-lens architecture also enables novel modes such as simultaneous near and far zoom capture of a 2D image. The remainder of the camera is similar to other compact digital cameras.

iClone is a real-time 3D animation and rendering software program. Real-time playback is enabled by using a 3D videogame engine for instant on-screen rendering.

Nvidia 3D Vision is a discontinued stereoscopic gaming kit from Nvidia which consists of LC shutter glasses and driver software which enables stereoscopic vision for any Direct3D game, with various degrees of compatibility. There have been many examples of shutter glasses. Electrically controlled mechanical shutter glasses date back to the middle of the 20th century. LCD shutter glasses appeared in the 1980s, one example of which is Sega's SegaScope. This was available for Sega's game console, the Master System. The NVIDIA 3D Vision gaming kit introduced in 2008 made this technology available for mainstream consumers and PC gamers.

DVB 3D-TV

DVB 3D-TV is a standard, partially released at the end of 2010, that includes techniques and procedures for sending a three-dimensional video signal over existing DVB transmission standards. Currently there is a commercial requirements text for 3D TV broadcasters and set-top box manufacturers, but it contains no technical information.

Wiggle stereoscopy

Wiggle stereoscopy is an example of stereoscopy in which left and right images of a stereogram are animated. This technique is also called wiggle 3-D, wobble 3-D, wigglegram, or sometimes Piku-Piku.

2D to 3D video conversion is the process of transforming 2D ("flat") film into 3D form, which in almost all cases is stereo; it is therefore the process of creating imagery for each eye from one 2D image.

3D reconstruction from multiple images

3D reconstruction from multiple images is the creation of three-dimensional models from a set of images. It is the reverse process of obtaining 2D images from 3D scenes.

References

  1. "Philips Decides to Shut Down 3D Operation, March 27, 2009". Archived from the original on 2010-08-23. Retrieved 2010-06-03.
  2. "Dimenco's 3D Displays & Technology". Archived from the original on 2014-03-17. Retrieved 2014-03-17.
  3. "Philips 3D Solutions: 3D Interface Specifications White Paper" (PDF). Archived from the original (PDF) on 2011-07-15. Retrieved 2010-06-03.
  4. Redert, Andre; et al. (2006). Philips 3D Solutions: From Content Creation to Visualization. Third International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT'06). doi:10.1109/3DPVT.2006.107.
  5. Preview of "ISO/IEC 23002-3. Information technology — MPEG video technologies — Part 3: Representation of auxiliary video and supplemental information"
  6. "2D+Z and its advantages". Archived from the original on 2014-03-17. Retrieved 2014-03-17.
  7. Adaptation and optimization of coding algorithms for mobile 3DTV
  8. 2D to 3D Conversions by Scott Squires
  9. Bernard F. Coll, Faisal Ishtiaq, Kevin O'Connell (2010) "3DTV at home: Status, challenges and solutions for delivering a high quality experience" Archived 2011-07-19 at the Wayback Machine Proceedings VPQM 2010
  10. Wilson, Andrew (July 1, 2004). "CMOS single-chip sensor captures 3-D images". Archived 2013-02-05 at archive.today. Vision Systems.
  11. Spare, James (August 2004). "Machine vision: Adding the Third Dimension. Electronic perception technology is a new way to make direct, rather than derived, depth measurements of target objects". Archived 2006-10-18 at the Wayback Machine. Sensors.
  12. Automatic 2D to 2D-plus-Depth conversion sample for a camera motion scene
  13. Automatic depth estimation from scene geometry
  14. "BlueBox -- a 3D content creation service suite generating 3D content in the 2D-plus-Depth format". Archived from the original on 2011-07-15. Retrieved 2010-06-03.
  15. YUVsoft 2D to 3D Suite software for human-guided 2D-to-stereo 3D conversion
  16. An example of automatic stereo to 2D+Depth conversion