Tiled rendering

Last updated

Tiled rendering is the process of subdividing a computer graphics image by a regular grid in optical space and rendering each section of the grid, or tile, separately. The advantage to this design is that the amount of memory and bandwidth is reduced compared to immediate mode rendering systems that draw the entire frame at once. This has made tile rendering systems particularly common for low-power handheld device use. Tiled rendering is sometimes known as a "sort middle" architecture, because it performs the sorting of the geometry in the middle of the graphics pipeline instead of near the end. [1]


Basic concept

Creating a 3D image for display consists of a series of steps. First, the objects to be displayed are loaded into memory from individual models. The system then applies mathematical functions to transform the models into a common coordinate system, the world view. From this world view, a series of polygons (typically triangles) is created that approximates the original models as seen from a particular viewpoint, the camera. Next, a compositing system produces an image by rendering the triangles and applying textures to the outside. Textures are small images that are painted onto the triangles to produce realism. The resulting image is then combined with various special effects, and moved into a frame buffer, which video hardware then scans to produce the displayed image. This basic conceptual layout is known as the display pipeline.

Each of these steps increases the amount of memory needed to hold the resulting image. By the time it reaches the end of the pipeline the images are so large that typical graphics card designs often use specialized high-speed memory and a very fast computer bus to provide the required bandwidth to move the image in and out of the various sub-components of the pipeline. This sort of support is possible on dedicated graphics cards, but as power and size budgets become more limited, providing enough bandwidth becomes expensive in design terms.

Tiled renderers address this concern by breaking down the image into sections known as tiles, and rendering each one separately. This reduces the amount of memory needed during the intermediate steps, and the amount of data being moved about at any given time. To do this, the system sorts the triangles making up the geometry by location, allowing to quickly find which triangles overlap the tile boundaries. It then loads just those triangles into the rendering pipeline, performs the various rendering operations in the GPU, and sends the result to the frame buffer. Very small tiles can be used, 16×16 and 32×32 pixels are popular tile sizes, which makes the amount of memory and bandwidth required in the internal stages small as well. And because each tile is independent, it naturally lends itself to simple parallelization.

In a typical tiled renderer, geometry must first be transformed into screen space and assigned to screen-space tiles. This requires some storage for the lists of geometry for each tile. In early tiled systems, this was performed by the CPU, but all modern hardware contains hardware to accelerate this step. The list of geometry can also be sorted front to back, allowing the GPU to use hidden surface removal to avoid processing pixels that are hidden behind others, saving on memory bandwidth for unnecessary texture lookups. [2]

There are two main disadvantages of the tiled approach. One is that some triangles may be drawn several times if they overlap several tiles. This means the total rendering time would be higher than an immediate-mode rendering system. There are also possible issues when the tiles have to be stitched together to make a complete image, but this problem was solved long ago[ citation needed ]. More difficult to solve is that some image techniques are applied to the frame as a whole, and these are difficult to implement in a tiled render where the idea is to not have to work with the entire frame. These tradeoffs are well known, and of minor consequence for systems where the advantages are useful; tiled rendering systems are widely found in handheld computing devices.

Tiled rendering should not be confused with tiled/nonlinear framebuffer addressing schemes, which make adjacent pixels also adjacent in memory. [3] These addressing schemes are used by a wide variety of architectures, not just tiled renderers.

Early work

Much of the early work on tiled rendering was done as part of the Pixel Planes 5 architecture (1989). [4] [5]

The Pixel Planes 5 project validated the tiled approach and invented a lot of the techniques now viewed as standard for tiled renderers. It is the work most widely cited by other papers in the field.

The tiled approach was also known early in the history of software rendering. Implementations of Reyes rendering often divide the image into "tile buckets".

Commercial products – Desktop and console

Early in the development of desktop GPUs, several companies developed tiled architectures. Over time, these were largely supplanted by immediate-mode GPUs with fast custom external memory systems.

Major examples of this are:

Examples of non-tiled architectures that use large on-chip buffers are:

Commercial products – Embedded

Due to the relatively low external memory bandwidth, and the modest amount of on-chip memory required, tiled rendering is a popular technology for embedded GPUs. Current examples include:

Tile-based immediate mode rendering (TBIM):

Tile-based deferred rendering (TBDR):

Vivante produces mobile GPUs which have tightly coupled frame buffer memory (similar to the Xbox 360 GPU described above). Although this can be used to render parts of the screen, the large size of the rendered regions means that they are not usually described as using a tile-based architecture.

See also

Related Research Articles

<span class="mw-page-title-main">Rendering (computer graphics)</span> Process of generating an image from a model

Rendering or image synthesis is the process of generating a photorealistic or non-photorealistic image from a 2D or 3D model by means of a computer program. The resulting image is referred to as the render. Multiple models can be defined in a scene file containing objects in a strictly defined language or data structure. The scene file contains geometry, viewpoint, texture, lighting, and shading information describing the virtual scene. The data contained in the scene file is then passed to a rendering program to be processed and output to a digital image or raster graphics image file. The term "rendering" is analogous to the concept of an artist's impression of a scene. The term "rendering" is also used to describe the process of calculating effects in a video editing program to produce the final video output.

<span class="mw-page-title-main">Scanline rendering</span> 3D computer graphics image rendering method

Scanline rendering is an algorithm for visible surface determination, in 3D computer graphics, that works on a row-by-row basis rather than a polygon-by-polygon or pixel-by-pixel basis. All of the polygons to be rendered are first sorted by the top y coordinate at which they first appear, then each row or scan line of the image is computed using the intersection of a scanline with the polygons on the front of the sorted list, while the sorted list is updated to discard no-longer-visible polygons as the active scan line is advanced down the picture.

<span class="mw-page-title-main">Rasterisation</span> Conversion of a vector-graphics image to a raster image

In computer graphics, rasterisation or rasterization is the task of taking an image described in a vector graphics format (shapes) and converting it into a raster image. The rasterized image may then be displayed on a computer display, video display or printer, or stored in a bitmap file format. Rasterization may refer to the technique of drawing 3D models, or the conversion of 2D rendering primitives such as polygons, line segments into a rasterized format.

Direct3D is a graphics application programming interface (API) for Microsoft Windows. Part of DirectX, Direct3D is used to render three-dimensional graphics in applications where performance is important, such as games. Direct3D uses hardware acceleration if it is available on the graphics card, allowing for hardware acceleration of the entire 3D rendering pipeline or even only partial acceleration. Direct3D exposes the advanced graphics capabilities of 3D graphics hardware, including Z-buffering, W-buffering, stencil buffering, spatial anti-aliasing, alpha blending, color blending, mipmapping, texture blending, clipping, culling, atmospheric effects, perspective-correct texture mapping, programmable HLSL shaders and effects. Integration with other DirectX technologies enables Direct3D to deliver such features as video mapping, hardware 3D rendering in 2D overlay planes, and even sprites, providing the use of 2D and 3D graphics in interactive media ties.

<span class="mw-page-title-main">Texture mapping</span> Method of defining surface detail on a computer-generated graphic or 3D model

Texture mapping is a method for mapping a texture on a computer-generated graphic. Texture here can be high frequency detail, surface texture, or color.

<span class="mw-page-title-main">Graphics processing unit</span> Specialized electronic circuit; graphics accelerator

A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobile phones, personal computers, workstations, and game consoles.

<span class="mw-page-title-main">Shader</span> Type of program in a graphical processing unit (GPU)

In computer graphics, a shader is a computer program that calculates the appropriate levels of light, darkness, and color during the rendering of a 3D scene - a process known as shading. Shaders have evolved to perform a variety of specialized functions in computer graphics special effects and video post-processing, as well as general-purpose computing on graphics processing units.

PowerVR is a division of Imagination Technologies that develops hardware and software for 2D and 3D rendering, and for video encoding, decoding, associated image processing and DirectX, OpenGL ES, OpenVG, and OpenCL acceleration. PowerVR also develops AI accelerators called Neural Network Accelerator (NNA).

<span class="mw-page-title-main">Real-time computer graphics</span> Sub-field of computer graphics

Real-time computer graphics or real-time rendering is the sub-field of computer graphics focused on producing and analyzing images in real time. The term can refer to anything from rendering an application's graphical user interface (GUI) to real-time image analysis, but is most often used in reference to interactive 3D computer graphics, typically using a graphics processing unit (GPU). One example of this concept is a video game that rapidly renders changing 3D environments to produce an illusion of motion.

<span class="mw-page-title-main">Xenos (graphics chip)</span> GPU used in the Xbox 360

The Xenos is a custom graphics processing unit (GPU) designed by ATI, used in the Xbox 360 video game console developed and produced for Microsoft. Developed under the codename "C1", it is in many ways related to the R520 architecture and therefore very similar to an ATI Radeon X1800 XT series of PC graphics cards as far as features and performance are concerned. However, the Xenos introduced new design ideas that were later adopted in the TeraScale microarchitecture, such as the unified shader architecture. The package contains two separate dies, the GPU and an eDRAM, featuring a total of 337 million transistors.

<span class="mw-page-title-main">Radeon R100 series</span> Series of video cards

The Radeon R100 is the first generation of Radeon graphics chips from ATI Technologies. The line features 3D acceleration based upon Direct3D 7.0 and OpenGL 1.3, and all but the entry-level versions offloading host geometry calculations to a hardware transform and lighting (T&L) engine, a major improvement in features and performance compared to the preceding Rage design. The processors also include 2D GUI acceleration, video acceleration, and multiple display outputs. "R100" refers to the development codename of the initially released GPU of the generation. It is the basis for a variety of other succeeding products.

In computer graphics, the render output unit (ROP) or raster operations pipeline is a hardware component in modern graphics processing units (GPUs) and one of the final steps in the rendering process of modern graphics cards. The pixel pipelines take pixel and texel information and process it, via specific matrix and vector operations, into a final pixel or depth value; this process is called rasterization. Thus, ROPs control antialiasing, when more than one sample is merged into one pixel. The ROPs perform the transactions between the relevant buffers in the local memory – this includes writing or reading values, as well as blending them together. Dedicated antialiasing hardware used to perform hardware-based antialiasing methods like MSAA is contained in ROPs.

<span class="mw-page-title-main">ATI Rage series</span> Series of video cards

The ATI Rage is a series of graphics chipsets developed by ATI Technologies offering graphical user interface (GUI) 2D acceleration, video acceleration, and 3D acceleration developed by ATI Technologies. It is the successor to the ATI Mach series of 2D accelerators.

Talisman was a Microsoft project to build a new 3D graphics architecture based on quickly compositing 2D "sub-images" onto the screen, an adaptation of tiled rendering. In theory, this approach would dramatically reduce the amount of memory bandwidth required for 3D games and thereby lead to lower-cost graphics accelerators. The project took place during the introduction of the first high-performance 3D accelerators, and these quickly surpassed Talisman in both performance and price. No Talisman-based systems were ever released commercially, and the project was eventually cancelled in the late 1990s.

<span class="mw-page-title-main">Unified shader model</span> GPU whose shading hardware has equal capabilities for all stages of rendering

In the field of 3D computer graphics, the unified shader model refers to a form of shader hardware in a graphical processing unit (GPU) where all of the shader stages in the rendering pipeline have the same capabilities. They can all read textures and buffers, and they use instruction sets that are almost identical.

Order-independent transparency (OIT) is a class of techniques in rasterisational computer graphics for rendering transparency in a 3D scene, which do not require rendering geometry in sorted order for alpha compositing.

<span class="mw-page-title-main">InfiniteReality</span> Graphics subsystem by Silicon Graphics

InfiniteReality refers to a 3D graphics hardware architecture and a family of graphics systems that implemented the aforementioned hardware architecture that was developed and manufactured by Silicon Graphics from 1996 to 2005. The InfiniteReality was positioned as Silicon Graphics' high-end visualization hardware for their MIPS/IRIX platform and was used exclusively in their Onyx family of visualization systems, which are sometimes referred to as "graphics supercomputers" or "visualization supercomputers". The InfiniteReality was marketed to and used by large organizations such as companies and universities that are involved in computer simulation, digital content creation, engineering and research.

<span class="mw-page-title-main">Xbox technical specifications</span>

The Xbox technical specifications describe the various components of the Xbox video game console.

This is a glossary of terms relating to computer graphics.

Caustic Graphics was a computer graphics and fabless semiconductor company that developed technologies to bring real-time ray-traced computer graphics to the mass market.


  1. Molnar, Steven (1994-04-01). "A Sorting Classification of Parallel Rendering" (PDF). IEEE. Archived (PDF) from the original on 2014-09-12. Retrieved 2012-08-24.
  2. "PowerVR: A Master Class in Graphics Technology and Optimization" (PDF). Imagination Technologies. 2012-01-14. Archived (PDF) from the original on 2013-10-03. Retrieved 2014-01-11.
  3. Deucher, Alex (2008-05-16). "How Video Cards Work". X.Org Foundation. Archived from the original on 2010-05-21. Retrieved 2010-05-27.
  4. Mahaney, Jim (1998-06-22). "History". Pixel-Planes. University of North Carolina at Chapel Hill. Archived from the original on 2008-09-29. Retrieved 2008-08-04.
  5. Fuchs, Henry (1989-07-01). "Pixel-planes 5: a heterogeneous multiprocessor graphics system using processor-enhanced memories". Proceedings of the 16th annual conference on Computer graphics and interactive techniques - SIGGRAPH '89. Pixel-Planes. ACM. pp. 79–88. doi:10.1145/74333.74341. ISBN   0201504340. S2CID   1778124 . Retrieved 2012-08-24.
  6. Smith, Tony (1999-10-06). "GigaPixel takes on 3dfx, S3, Nvidia with... tiles". Gigapixel. The Register. Archived from the original on 2012-10-03. Retrieved 2012-08-24.
  7. mestour, mestour (2011-07-21). "Develop 2011: PS Vita is the most developer friendly hardware Sony has ever made". PS Vita. 3dsforums . Retrieved 2011-07-21.[ permanent dead link ]
  8. Kanter, David (August 1, 2016). "Tile-based Rasterization in Nvidia GPUs". Real World Technologies. Archived from the original on 2016-08-04. Retrieved April 1, 2016.
  9. "AMD Vega GPU Architecture Preview: Redesigned Memory Architecture". PC Perspective. Retrieved 2020-01-04.
  10. Smith, Ryan. "The AMD Vega GPU Architecture Teaser: Higher IPC, Tiling, & More, Coming in H1'2017". www.anandtech.com. Retrieved 2020-01-04.
  11. https://software.intel.com/sites/default/files/managed/db/88/The-Architecture-of-Intel-Processor-Graphics-Gen11_R1new.pdf [ bare URL PDF ]
  12. @intelnews (8 May 2019). "Intel's @gregorymbryant at today's..." (Tweet) via Twitter.
  13. https://newsroom.intel.com/wp-content/uploads/sites/11/2019/05/10th-Gen-Intel-Core-Product-Brief.pdf [ bare URL PDF ]
  14. LLC), Tara Meyer (Aquent. "XNA Game Studio 4.0 Refresh". msdn.microsoft.com. Archived from the original on 2015-01-07. Retrieved 2014-05-15.
  15. "Xbox One developer: upcoming SDK improvements will allow for more 1080p games".
  16. "Mali rendering strategy". ARM. Archived from the original on 2016-03-04. Retrieved 2018-10-27.
  17. "An update on the freedreno graphics driver". lwn.net. Archived from the original on 2015-09-05. Retrieved 2015-09-15.
  18. "The rise of mobile gaming on android" (PDF). Qualcomm. p. 5. Archived (PDF) from the original on 2014-11-09. Retrieved 17 September 2015.
  19. Simond, Brian Klug, Anand Lal Shimpi, Francois (September 11, 2011). "Samsung Galaxy S 2 (International) Review - The Best, Redefined". www.anandtech.com. Retrieved 2020-01-04.
  20. "Tile based rendering". Arm . Retrieved 2020-07-13.
  21. "A look at the PowerVR graphics architecture: Tile-based rendering". Imagination Technologies. Archived from the original on 2015-04-05. Retrieved 2015-09-15.
  22. "VideoCoreIV-AG100" (PDF). Broadcom. 2013-09-18. Archived (PDF) from the original on 2015-03-01. Retrieved 2015-01-10.
  23. "Bring your Metal app to Apple Silicon Macs". developer.apple.com. Retrieved 2020-07-13.