360 video projection

A 360 video projection is any of several ways to map a spherical field of view onto a flat image. It is used to encode and deliver a spherical, 360-degree image to viewers, as needed for 360-degree video and virtual reality. A 360 video projection is a specialized form of map projection, with characteristics tuned for the efficient representation, transmission, and display of 360° fields of view.

Different projections

Equirectangular

Example of equirectangular projection

An equirectangular projection maps the yaw and pitch (longitude and latitude) of a sphere linearly to the horizontal and vertical axes of a rectangular image. Straight lines in the scene appear curved in the flat image, which gives the projection its signature look. In addition, the distribution of pixel density (which can be visualized with Tissot's indicatrix) is suboptimal: the poles are oversampled, while the usually more important "equator" gets the lowest density.
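The linear mapping described above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, and placing yaw 0° at the image center and pitch +90° at the top row is an assumed convention (tools differ on this).

```python
import math

def equirect_pixel(yaw_deg, pitch_deg, width, height):
    """Map yaw in [-180, 180) and pitch in [-90, 90] to (x, y) image coordinates.

    Assumed convention: yaw 0 is the image center, pitch +90 is the top row.
    """
    x = (yaw_deg + 180.0) / 360.0 * width    # longitude -> horizontal axis, linear
    y = (90.0 - pitch_deg) / 180.0 * height  # latitude -> vertical axis, linear
    return x, y

def horizontal_oversampling(pitch_deg):
    """Pixels per unit of arc along a row, relative to the equator row.

    Every row has the same pixel width, but the circle of latitude it covers
    shrinks by cos(pitch), so density grows by 1 / cos(pitch) toward the poles.
    """
    return 1.0 / math.cos(math.radians(pitch_deg))

print(equirect_pixel(0.0, 0.0, 4096, 2048))   # straight ahead lands at the image center
print(round(horizontal_oversampling(60.0), 3))  # a row at 60 degrees pitch vs. the equator
```

Under these assumptions, a row at 60° pitch spends about twice as many pixels per unit of arc as the equator, which illustrates why the equator ends up with the lowest density.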

Cube Map

Cube mapping records the environment as the six faces of a cube. The image distortion is markedly reduced, especially when looking at the faces head-on. Still, the edges and corners of each face receive more pixels per unit of viewing angle than the center.
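The uneven pixel distribution across a cube face can be estimated with a short Python sketch. The face labels and axis orientation below are assumptions chosen for the example, not a standard layout. The density figure follows from the geometry of a flat face at unit distance from the center: a point at face coordinates (u, v) lies at distance r = sqrt(1 + u² + v²), and the solid angle covered by a small patch there shrinks as 1/r³.

```python
def cube_face_uv(x, y, z):
    """Pick the cube face hit by direction (x, y, z) and its coordinates in [-1, 1].

    Face labels and (u, v) orientation are an assumed convention.
    The direction must be non-zero.
    """
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:          # dominant axis decides the face
        return ("+x" if x > 0 else "-x"), y / ax, z / ax
    if ay >= az:
        return ("+y" if y > 0 else "-y"), x / ay, z / ay
    return ("+z" if z > 0 else "-z"), x / az, y / az

def relative_pixel_density(u, v):
    """Pixels per unit solid angle at (u, v), relative to the face center."""
    return (1.0 + u * u + v * v) ** 1.5

print(cube_face_uv(1.0, 0.5, 0.0))
print(round(relative_pixel_density(1.0, 0.0), 3))  # middle of an edge
print(round(relative_pixel_density(1.0, 1.0), 3))  # corner
```

With uniformly spaced pixels on each face, this gives roughly 2.8 times the center density at the middle of an edge and roughly 5.2 times at a corner.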

Equi-Angular Cubemap (EAC)

The Equi-Angular Cubemap (EAC) projection is a variant of the cubemap that distributes the pixels evenly by angle. This keeps the density of information consistent, regardless of which direction the viewer is looking. It was detailed by Google on March 14, 2017. [1] [2] In January 2018 the company started using this projection to stream 360-degree videos on YouTube. [3]
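A commonly described way to obtain this even angular spacing is to remap each cube-face coordinate through an arctangent. The Python sketch below assumes face coordinates normalized to [-1, 1] and uses hypothetical function names; it shows the per-axis remapping only and is not taken from Google's or YouTube's implementation.

```python
import math

def cube_to_eac(p):
    """Remap a standard cube-face coordinate p in [-1, 1] to an EAC coordinate.

    p is the tangent of the viewing angle measured from the face center,
    so atan(p) recovers the angle; 4/pi rescales [-pi/4, pi/4] to [-1, 1].
    """
    return (4.0 / math.pi) * math.atan(p)

def eac_to_cube(q):
    """Inverse mapping, used when sampling an EAC image for display."""
    return math.tan(q * math.pi / 4.0)

# Equal steps in q now correspond to equal steps in viewing angle along the axis.
for q in (0.0, 0.5, 1.0):
    angle_deg = math.degrees(math.atan(eac_to_cube(q)))
    print(q, round(angle_deg, 3))
```

Equal steps in the EAC coordinate correspond to equal steps in viewing angle along each face axis, whereas equal steps in the standard cubemap coordinate crowd more samples toward the face edges.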

GoPro adopted the EAC format in 2019 when it released the GoPro MAX. [4] [5] The company noted that EAC enabled it to use 25% fewer pixels by packing the equivalent of 5376x2688 pixels into an EAC projection of 4032x2688 pixels. This projection was then split horizontally into two streams of 4032x1344 pixels each and encoded, so that it could be decoded by regular UHD decoders.
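The 25% figure follows directly from the two frame sizes quoted above, as this short Python check shows. The helper names are hypothetical, and the split simply halves the frame height as described in the text.

```python
def pixel_saving(full_size, packed_size):
    """Fraction of pixels saved when packed_size replaces full_size (width, height)."""
    full = full_size[0] * full_size[1]
    packed = packed_size[0] * packed_size[1]
    return 1.0 - packed / full

def split_in_two(size):
    """Cut a frame along a horizontal line into a top and a bottom stream."""
    width, height = size
    return [(width, height // 2), (width, height // 2)]

saving = pixel_saving((5376, 2688), (4032, 2688))
print(f"{saving:.0%}")             # 25%
print(split_in_two((4032, 2688)))  # [(4032, 1344), (4032, 1344)]
```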

Mainstream video tools have not yet added support for EAC formats such as GoPro's .360. However, a custom fork of FFmpeg [6] [7] and a tool called max2sphere [8] do enable .360 processing.

Pyramid format

The Pyramid projection is a variation of the cubemap using a pyramid geometry. The video is rendered in multiple viewports (30 in Facebook's case), where the base of the pyramid contains the full resolution and sits directly in front of the viewer, while the sides are rendered with gradually decreasing resolution. It was detailed by Facebook on January 21, 2016, and is aimed mainly at VR video. [9] The company claims an 80% reduction in bandwidth with this projection, with the disadvantage that many more viewports need to be rendered and stored.
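The trade-off between bandwidth and storage can be made concrete with a rough back-of-the-envelope calculation in Python. It assumes, purely for illustration, that every viewport stream is the same size and that the claimed 80% reduction applies to each of them; real deployments may also store several quality levels, which this sketch ignores. The function name is hypothetical.

```python
def pyramid_storage_factor(num_viewports, bandwidth_reduction):
    """Total stored data relative to one full-resolution stream.

    Assumes each viewport stream is (1 - bandwidth_reduction) times the size
    of the original and that all viewports are stored.
    """
    return num_viewports * (1.0 - bandwidth_reduction)

# 30 viewports, each assumed to be 20% of the original size
print(round(pyramid_storage_factor(30, 0.80), 2))
```

Under these assumptions a viewer downloads about one fifth of the data at any moment, while the server stores roughly six times the size of a single full-resolution stream.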


References

  1. "Bringing pixels front and center in VR video". Google. 2017-03-14. Retrieved 2018-04-02.
  2. "Improving VR videos". YouTube Engineering and Developers Blog. Retrieved 2018-04-02.
  3. "[YouTube] 3D/2D 360° videos - now encoded in a new, proprietary & non-standard format · Issue #15267 · rg3/youtube-dl". GitHub. Retrieved 2018-04-02.
  4. "This is GoPro MAX: Tech, Specs + More". gopro.com. Retrieved 2022-09-08.
  5. "Reverse Engineering GoPro's 360 Video File Format (Part 1)". Trek View. 2021-09-10. Retrieved 2022-09-08.
  6. gmat. "goproMax-ffmpeg-v5". 2022-08-25. Retrieved 2022-09-08.
  7. "Using ffmpeg to Process Raw GoPro MAX .360's into Equirectangular Projections". Trek View. 2022-03-18. Retrieved 2022-09-08.
  8. "max2sphere". Trek View. 2022-08-06. Retrieved 2022-09-08.
  9. "Next-generation video encoding techniques for 360 video and VR". Facebook Code. Retrieved 2018-04-02.