Order-independent transparency

Last updated June 09, 2024

Order-independent transparency (OIT) is a class of techniques in rasterisation-based computer graphics for rendering transparency in a 3D scene that do not require geometry to be rendered in sorted order for alpha compositing.

Description

Commonly, 3D geometry with transparency is rendered by blending all surfaces, using alpha compositing, into a single buffer, which can be thought of as a canvas. Each surface occludes existing color and adds some of its own color depending on its alpha value, which determines how much light it transmits. The order in which surfaces are blended affects the total occlusion or visibility of each surface. For a correct result, surfaces must be blended from farthest to nearest or from nearest to farthest, depending on whether the alpha compositing operation is over or under. Ordering may be achieved by rendering the geometry in sorted order, for example by sorting triangles by depth, but this can take a significant amount of time, does not always produce a solution (in the case of intersecting or circularly overlapping geometry), and is complex to implement. Instead, order-independent transparency sorts geometry per pixel, after rasterisation. For exact results, this requires storing all fragments before sorting and compositing.
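The order dependence described above can be seen with a minimal sketch of the over operator. The function name `over`, the non-premultiplied color convention and the sample colors are assumptions chosen for illustration, not part of any particular API.

```python
def over(src_rgb, src_alpha, dst_rgb):
    """Blend one surface over the color already in the buffer.

    Assumes non-premultiplied colors: src_rgb is scaled by src_alpha here.
    """
    return tuple(src_alpha * s + (1.0 - src_alpha) * d
                 for s, d in zip(src_rgb, dst_rgb))

background = (0.0, 0.0, 0.0)
red = ((1.0, 0.0, 0.0), 0.5)   # (color, alpha)
blue = ((0.0, 0.0, 1.0), 0.5)

# Blue is blended first (treated as farther), then red on top of it.
red_in_front = over(*red, over(*blue, background))
# The same two surfaces blended in the opposite order.
blue_in_front = over(*blue, over(*red, background))

print(red_in_front)    # (0.5, 0.0, 0.25)
print(blue_in_front)   # (0.25, 0.0, 0.5)
```

The two results differ, so blending unsorted surfaces directly into the buffer gives a result that depends on submission order.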

History

The A-buffer, introduced in 1984, is a computer graphics technique that stores per-pixel lists of fragment data (including micro-polygon information) in REYES, a software rasteriser. It was originally designed for anti-aliasing but also supports transparency.

More recently, in 2001, depth peeling^[1] described a hardware-accelerated OIT technique. Because of limitations in graphics hardware, the scene's geometry had to be rendered many times. A number of techniques have followed that improve on the performance of depth peeling, such as Dual Depth Peeling (2008),^[2] but they retain the many-pass rendering limitation.

In 2009, two significant features were introduced in GPU hardware, drivers and graphics APIs that allowed fragment data to be captured and stored in a single rendering pass of the scene, something not previously possible: the ability to write to arbitrary GPU memory from shaders, and atomic operations. With these features, a new class of OIT techniques became possible that does not require many rendering passes of the scene's geometry.

The first was storing the fragment data in a 3D array,^[3] where fragments are stored along the z dimension for each pixel x/y. In practice, much of the 3D array is either unused or overflows, as a scene's depth complexity is typically uneven. To avoid overflow, the 3D array requires large amounts of memory, which in many cases is impractical.
Two approaches to reducing this memory overhead exist.
1. Packing the 3D array with a prefix sum scan, or linearizing,^[4] removes the unused memory issue but requires an additional rendering pass of the geometry to compute depth complexity. The "Sparsity-aware" S-Buffer, Dynamic Fragment Buffer,^[5] "deque" D-Buffer^[citation needed] and Linearized Layered Fragment Buffer^[6] all pack fragment data with a prefix sum scan and are demonstrated with OIT.
2. Storing fragments in per-pixel linked lists^[7] provides tight packing of this data. In late 2011, driver improvements reduced the atomic operation contention overhead, making the technique very competitive.^[6]
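Both storage schemes can be sketched on the CPU as follows. This is an illustrative serial sketch, not GPU code: the fragment tuples, the function names and the use of plain Python lists in place of GPU buffers and atomic operations are all assumptions.

```python
from itertools import accumulate

# Hypothetical input: fragments arrive in arbitrary order as (pixel, depth, payload).
fragments = [(2, 0.7, "a"), (0, 0.3, "b"), (2, 0.1, "c"), (0, 0.9, "d"), (2, 0.4, "e")]
NUM_PIXELS = 4

def pack_with_prefix_sum(fragments, num_pixels):
    # Pass 1: count fragments per pixel (the extra depth-complexity pass).
    counts = [0] * num_pixels
    for pixel, _, _ in fragments:
        counts[pixel] += 1
    # An exclusive prefix sum turns the counts into each pixel's start offset.
    offsets = [0] + list(accumulate(counts))[:-1]
    # Pass 2: write every fragment into its pixel's contiguous range.
    cursor = list(offsets)
    data = [None] * len(fragments)
    for pixel, depth, payload in fragments:
        data[cursor[pixel]] = (depth, payload)
        cursor[pixel] += 1
    return offsets, counts, data

def build_linked_lists(fragments, num_pixels):
    heads = [-1] * num_pixels   # per-pixel head index, -1 marks an empty list
    nodes = []                  # shared node pool: (depth, payload, next index)
    for pixel, depth, payload in fragments:
        index = len(nodes)      # stands in for an atomic counter increment
        nodes.append((depth, payload, heads[pixel]))
        heads[pixel] = index    # stands in for an atomic exchange on the head
    return heads, nodes

def walk(heads, nodes, pixel):
    # Follow the next indices from the head until the end marker.
    out, index = [], heads[pixel]
    while index != -1:
        depth, payload, index = nodes[index]
        out.append((depth, payload))
    return out

offsets, counts, data = pack_with_prefix_sum(fragments, NUM_PIXELS)
heads, nodes = build_linked_lists(fragments, NUM_PIXELS)
print(offsets)                 # [0, 2, 2, 5]
print(walk(heads, nodes, 2))   # [(0.4, 'e'), (0.1, 'c'), (0.7, 'a')]
```

The packed layout needs the counting pass before any fragment can be written, whereas the linked lists are built in one pass at the cost of a contended counter and head pointer. In both cases the fragments of a pixel are still unsorted.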

Exact OIT

Exact, as opposed to approximate, OIT accurately computes the final color, for which all fragments must be sorted. For high depth complexity scenes, sorting becomes the bottleneck.
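As a minimal sketch of the sort-and-composite step for a single pixel, assuming fragments are `(depth, rgb, alpha)` tuples with non-premultiplied colors and that a larger depth means farther from the camera (the function name `composite_exact` is hypothetical):

```python
def composite_exact(fragments, background):
    """Sort one pixel's fragments far to near, then blend each with 'over'."""
    color = background
    for depth, rgb, alpha in sorted(fragments, key=lambda f: f[0], reverse=True):
        color = tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(rgb, color))
    return color

red = (0.2, (1.0, 0.0, 0.0), 0.5)    # nearer fragment
blue = (0.8, (0.0, 0.0, 1.0), 0.5)   # farther fragment
black = (0.0, 0.0, 0.0)

# The stored order of the fragments no longer matters.
print(composite_exact([red, blue], black))   # (0.5, 0.0, 0.25)
print(composite_exact([blue, red], black))   # (0.5, 0.0, 0.25)
```

The sort runs once per pixel over all of that pixel's fragments, which is why it dominates the cost when depth complexity is high.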

One issue with the sorting stage is that occupancy is limited by local memory use; occupancy here is a SIMT attribute relating to the throughput and operation-latency hiding of GPUs. Backwards memory allocation^[8] (BMA) groups pixels by their depth complexity and sorts them in batches, improving the occupancy, and hence the performance, of low depth complexity pixels in a scene that may also contain high depth complexity pixels. An overall OIT performance increase of up to 3× is reported.
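The grouping idea can be illustrated with a rough sketch that buckets pixels by their fragment count so that each batch can be sorted with a local array sized for that batch only. The power-of-two batch sizes, the minimum size of 8 and the function names are assumptions made for this sketch and are not taken from the cited paper.

```python
from collections import defaultdict

def batch_capacity(count, smallest=8):
    """Smallest power-of-two multiple of `smallest` that holds `count` fragments."""
    capacity = smallest
    while capacity < count:
        capacity *= 2
    return capacity

def group_by_depth_complexity(counts):
    """Map each local-array capacity to the pixels that would be sorted with it."""
    batches = defaultdict(list)
    for pixel, count in enumerate(counts):
        if count > 0:                 # pixels without fragments need no sorting
            batches[batch_capacity(count)].append(pixel)
    return dict(batches)

# Hypothetical per-pixel fragment counts for six pixels.
counts = [3, 0, 40, 9, 8, 120]
batches = group_by_depth_complexity(counts)
print(batches)   # {8: [0, 4], 64: [2], 16: [3], 128: [5]}
```

In this sketch, the pixels in the 8-fragment batch would not pay for the 128-entry array that the deepest pixel needs.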

Sorting is typically performed in a local array; however, performance can be improved further by making use of the GPU's memory hierarchy and sorting in registers,^[9] similarly to an external merge sort, especially in conjunction with BMA.

Approximate OIT

Approximate OIT techniques relax the constraint of exact rendering to provide faster results. Higher performance can be gained from not having to store all fragments or only partially sorting the geometry. A number of techniques also compress, or reduce, the fragment data. These include:

Stochastic Transparency: draws at a higher resolution with full opacity but discards some fragments. Downsampling then yields transparency.^[10]
Adaptive Transparency:^[11] a two-pass technique in which the first pass constructs a visibility function that is compressed on the fly (this compression avoids having to fully sort the fragments) and the second pass uses this data to composite unordered fragments. Intel's pixel synchronization^[12] avoids the need to store all fragments, removing the unbounded memory requirement of many other OIT techniques.
Weighted Blended Order-Independent Transparency: replaces the over operator with a commutative approximation. Feeding depth information into the weight produces visually acceptable occlusion.^[13]
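As an illustration of the last technique, the following sketch accumulates a weighted sum of fragment colors and a product of transmittances, both of which are commutative, and resolves them against the background afterwards. The `weight` function here is a hypothetical falloff chosen for the example and is not one of the weights from the cited paper. Fragments are assumed to be `(depth, rgb, alpha)` tuples with non-premultiplied colors.

```python
def weight(depth):
    # Hypothetical falloff: nearer fragments (smaller depth) count for more.
    return 1.0 / (0.01 + depth * depth)

def composite_weighted(fragments, background):
    accum_rgb = [0.0, 0.0, 0.0]   # sum of rgb * alpha * weight
    accum_alpha = 0.0             # sum of alpha * weight
    transmittance = 1.0           # product of (1 - alpha)
    for depth, rgb, alpha in fragments:   # fragments may arrive in any order
        w = alpha * weight(depth)
        for i in range(3):
            accum_rgb[i] += rgb[i] * w
        accum_alpha += w
        transmittance *= 1.0 - alpha
    if accum_alpha == 0.0:
        return tuple(background)
    # Weighted average color, covering the background by the total opacity.
    return tuple((accum_rgb[i] / accum_alpha) * (1.0 - transmittance)
                 + background[i] * transmittance for i in range(3))

red = (0.2, (1.0, 0.0, 0.0), 0.5)
blue = (0.8, (0.0, 0.0, 1.0), 0.5)
black = (0.0, 0.0, 0.0)

a = composite_weighted([red, blue], black)
b = composite_weighted([blue, red], black)
```

Here `a` and `b` agree without any sorting. With a single fragment the result matches the over operator, while with several fragments it is only an approximation whose quality depends on the chosen weight.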

OIT in Hardware

The Sega Dreamcast games console included hardware support for automatic OIT.^[14]

References

[1] Everitt, Cass (2001-05-15). "Interactive Order-Independent Transparency". Nvidia. Archived from the original on 2011-09-27. Retrieved 2008-10-12.
[2] Bavoil, Louis (February 2008). "Order Independent Transparency with Dual Depth Peeling" (PDF). Nvidia. Retrieved 2013-03-12.
[3] Fang Liu, Meng-Cheng Huang, Xue-Hui Liu, and En-Hua Wu. "Single pass depth peeling via CUDA rasterizer", In SIGGRAPH 2009: Talks (SIGGRAPH '09), 2009.
[4] Craig Peeper. "Prefix sum pass to linearize A-buffer storage", Patent application, Dec 2008.
[5] Marilena Maule, João L.D. Comba, Rafael Torchelsen and Rui Bastos. "Memory-optimized order-independent transparency with Dynamic Fragment Buffer", In Computers & Graphics, 2014.
[6] Pyarelal Knowles, Geoff Leach and Fabio Zambetta. "Chapter 20: Efficient Layered Fragment Buffer Techniques", OpenGL Insights, pages 279-292, Editors Cozzi and Riccio, CRC Press, 2012.
[7] Jason C. Yang, Justin Hensley, Holger Grün, and Nicolas Thibieroz. "Real-time concurrent linked list construction on the GPU", In Proceedings of the 21st Eurographics conference on Rendering (EGSR'10), 2010.
[8] Knowles; et al. (Oct 2013). "Backwards Memory Allocation and Improved OIT" (PDF). Eurographics Digital Library. Archived from the original (PDF) on 2014-03-02. Retrieved 2014-01-21.
[9] Knowles; et al. (June 2014). "Fast Sorting for Exact OIT of Complex Scenes" (PDF). Springer Berlin Heidelberg. Archived from the original (PDF) on 2014-08-09. Retrieved 2014-08-05.
[10] Enderton, Eric (n.d.). "Stochastic Transparency" (PDF). IEEE Transactions on Visualization and Computer Graphics. 17 (8). Nvidia: 1036–1047. doi:10.1109/TVCG.2010.123. PMID 20921587. Retrieved 2013-03-12.
[11] Salvi; et al. (2013-07-18). "Adaptive Transparency". Retrieved 2014-01-21.
[12] Davies, Leigh (2013-07-18). "Order-Independent Transparency Approximation with Pixel Synchronization". Intel. Retrieved 2014-01-21.
[13] McGuire, Morgan; Bavoil, Louis (2013). "Weighted Blended Order-Independent Transparency". Journal of Computer Graphics Techniques. 2 (2): 122–141.
[14] "Optimizing Dreamcast Microsoft Direct3D Performance". Microsoft. 1999-03-01.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
