S3 Texture Compression

Last updated August 16, 2023

S3 Texture Compression (S3TC), sometimes also called DXTn, DXTC, or BCn, is a group of related lossy texture compression algorithms originally developed by Iourcha et al. of S3 Graphics, Ltd.^[1]^[2] for use in their Savage 3D computer graphics accelerator. The method of compression is strikingly similar to the previously published Color Cell Compression,^[3] which is in turn an adaptation of Block Truncation Coding, published in the late 1970s. Unlike some image compression algorithms (e.g. JPEG), S3TC combines fixed-rate data compression with a single memory access (cf. Color Cell Compression and some VQ-based schemes), which made it well suited to compressing textures in hardware-accelerated 3D computer graphics. Its subsequent inclusion in Microsoft's DirectX 6.0 and in OpenGL 1.3 (via the GL_EXT_texture_compression_s3tc extension) led to widespread adoption of the technology among hardware and software makers. Although S3 Graphics is no longer a competitor in the graphics accelerator market, license fees were levied and collected for the use of S3TC technology until October 2017, for example in game consoles and graphics cards. The wide use of S3TC led to a de facto requirement for OpenGL drivers to support it, but its patent-encumbered status presented a major obstacle to open source implementations,^[4] although implementation approaches that tried to avoid the patented parts existed.^[5]

Patent

Some of the multiple USPTO patents on S3 Texture Compression (e.g. US 5956431 A) expired on October 2, 2017.^[6] At least one continuation patent, US 6,775,417, had a 165-day extension, however, and expired on March 16, 2018.

Codecs

There are five variations of the S3TC algorithm (named DXT1 through DXT5, referring to the FourCC code assigned by Microsoft to each format), each designed for specific types of image data. All convert a 4×4 block of pixels to a 64-bit or 128-bit quantity, resulting in compression ratios of 6:1 with 24-bit RGB input data or 4:1 with 32-bit RGBA input data. S3TC is a lossy compression algorithm and therefore degrades image quality, an effect that is offset by the ability to increase texture resolution while maintaining the same memory requirements. Hand-drawn cartoon-like images do not compress well, nor does normal map data; both usually generate artifacts. ATI's 3Dc compression algorithm is a modification of DXT5 designed to overcome S3TC's shortcomings with regard to normal maps. id Software worked around the normal map compression issues in Doom 3 by moving the red component into the alpha channel before compression and moving it back during rendering in the pixel shader.^[7]
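The quoted ratios follow directly from the block sizes. The short sketch below reproduces the arithmetic; the function name `compression_ratio` is only illustrative and does not come from any library.

```python
def compression_ratio(bits_per_pixel: int, block_bits: int) -> float:
    """Uncompressed size of one 4x4 block divided by its compressed size."""
    pixels_per_block = 4 * 4
    return (pixels_per_block * bits_per_pixel) / block_bits


# 24-bit RGB stored in a 64-bit block (the DXT1 case)
rgb_ratio = compression_ratio(24, 64)
# 32-bit RGBA stored in a 128-bit block (the DXT2 to DXT5 case)
rgba_ratio = compression_ratio(32, 128)
print(rgb_ratio, rgba_ratio)
```

Because every block has the same size, the ratio is independent of the image content, which is what "fixed-rate" means here.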

Like many modern image compression algorithms, S3TC specifies only the method used to decompress images, allowing implementers to design the compression algorithm to suit their specific needs, although the patent also covered compression algorithms. The Nvidia GeForce 256 through GeForce 4 cards used 16-bit interpolation to render DXT1 textures, which resulted in banding when unpacking textures with color gradients. This created an unfavorable impression of texture compression that was unrelated to the fundamentals of the codec itself.

DXT1

DXT1 (also known as Block Compression 1 or BC1) is the smallest variation of S3TC, storing 16 input pixels in 64 bits of output, consisting of two 16-bit RGB 5:6:5 color values $c_{0}$ and $c_{1}$, and a 4×4 two-bit lookup table.

If $c_{0}>c_{1}$ (compare these colors by interpreting them as two 16-bit unsigned numbers), then two other colors are calculated, such that for each component, ${\textstyle c_{2}={2 \over 3}c_{0}+{1 \over 3}c_{1}}$ and ${\textstyle c_{3}={1 \over 3}c_{0}+{2 \over 3}c_{1}}$. This mode operates similarly to mode 0xC0 of the original Apple Video codec.^[8]

Otherwise, if $c_{0}\leq c_{1}$, then ${\textstyle c_{2}={1 \over 2}c_{0}+{1 \over 2}c_{1}}$ and $c_{3}$ is transparent black, corresponding to a premultiplied alpha format. This color sometimes causes a black border around the transparent area when linear texture filtering and alpha testing are used, because colors are interpolated between the color of an opaque texel and that of a neighbouring black transparent texel.

The lookup table is then consulted to determine the color value for each pixel, with a value of 0 corresponding to $c_{0}$ and a value of 3 corresponding to $c_{3}$.
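The decoding rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference decoder: it assumes the common little-endian layout ($c_{0}$, then $c_{1}$, then 32 bits of indices with the first pixel in the lowest two bits), expands 5:6:5 components by bit replication, and truncates when dividing. The function names are hypothetical, and real hardware may round differently.

```python
import struct


def rgb565_to_rgb888(c: int) -> tuple:
    """Expand a 16-bit 5:6:5 color to 8 bits per channel by bit replication."""
    r = (c >> 11) & 0x1F
    g = (c >> 5) & 0x3F
    b = c & 0x1F
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def decode_dxt1_block(block: bytes) -> list:
    """Decode one 8-byte DXT1 block into 16 (r, g, b, a) tuples, row-major."""
    # Assumed layout: c0 (16 bits), c1 (16 bits), 4x4 two-bit indices (32 bits).
    c0, c1, indices = struct.unpack("<HHI", block)
    p0 = rgb565_to_rgb888(c0)
    p1 = rgb565_to_rgb888(c1)
    if c0 > c1:
        # Four-color mode: two colors at 1/3 and 2/3 between the endpoints.
        p2 = tuple((2 * a + b) // 3 for a, b in zip(p0, p1))
        p3 = tuple((a + 2 * b) // 3 for a, b in zip(p0, p1))
        palette = [p0 + (255,), p1 + (255,), p2 + (255,), p3 + (255,)]
    else:
        # Three-color mode: one midpoint color plus transparent black.
        p2 = tuple((a + b) // 2 for a, b in zip(p0, p1))
        palette = [p0 + (255,), p1 + (255,), p2 + (255,), (0, 0, 0, 0)]
    # Each pixel selects a palette entry with its two-bit index.
    return [palette[(indices >> (2 * i)) & 0x3] for i in range(16)]


# Example: red and blue endpoints, first four pixels use indices 0, 1, 2, 3.
example_block = struct.pack("<HHI", 0xF800, 0x001F, 0b11100100)
example_pixels = decode_dxt1_block(example_block)
```

Swapping the two endpoint values in `example_block` switches the block to the three-color mode, so index 3 then decodes to transparent black.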

DXT2 and DXT3

DXT2 and DXT3 (collectively also known as Block Compression 2 or BC2) convert 16 input pixels (corresponding to a 4×4 pixel block) into 128 bits of output, consisting of 64 bits of alpha channel data (4 bits for each pixel) followed by 64 bits of color data, encoded the same way as DXT1 (except that the 4-color version of the DXT1 algorithm is always used instead of deciding which version to use based on the relative values of $c_{0}$ and $c_{1}$).

In DXT2, the color data is interpreted as being premultiplied by alpha; in DXT3, it is interpreted as not having been premultiplied by alpha. Typically, DXT2/3 are well suited to images with sharp alpha transitions between translucent and opaque areas.
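The explicit alpha half of a DXT2/3 block is simple enough to sketch directly. The snippet below is a hedged illustration: it assumes the 64 alpha bits are stored little-endian with the first pixel in the lowest four bits, and it expands each 4-bit value to 8 bits by multiplying by 17 (so 0xF becomes 255). The name `decode_explicit_alpha` is hypothetical.

```python
def decode_explicit_alpha(alpha_bytes: bytes) -> list:
    """Decode the 8-byte explicit alpha half of a DXT2/3 block to 16 values."""
    if len(alpha_bytes) != 8:
        raise ValueError("expected 8 bytes of alpha data")
    bits = int.from_bytes(alpha_bytes, "little")
    alphas = []
    for i in range(16):
        a4 = (bits >> (4 * i)) & 0xF  # 4 bits per pixel
        alphas.append(a4 * 17)        # 0..15 mapped onto 0..255
    return alphas


# Example: the first four pixels carry alpha nibbles 0x0, 0xF, 0x1, 0x2.
example_alphas = decode_explicit_alpha(bytes([0xF0, 0x21, 0, 0, 0, 0, 0, 0]))
```

With only 16 alpha levels and no interpolation, each pixel's alpha is independent of its neighbours, which is consistent with the format's suitability for sharp transitions.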

DXT4 and DXT5

DXT4 and DXT5 (collectively also known as Block Compression 3 or BC3) convert 16 input pixels into 128 bits of output, consisting of 64 bits of alpha channel data (two 8-bit alpha values and a 4×4 3-bit lookup table) followed by 64 bits of color data (encoded the same way as DXT1).

If $\alpha _{0}>\alpha _{1}$, then six other alpha values are calculated, such that ${\textstyle \alpha _{2}={{6\alpha _{0}+1\alpha _{1}} \over 7}}$, ${\textstyle \alpha _{3}={{5\alpha _{0}+2\alpha _{1}} \over 7}}$, ${\textstyle \alpha _{4}={{4\alpha _{0}+3\alpha _{1}} \over 7}}$, ${\textstyle \alpha _{5}={{3\alpha _{0}+4\alpha _{1}} \over 7}}$, ${\textstyle \alpha _{6}={{2\alpha _{0}+5\alpha _{1}} \over 7}}$, and ${\textstyle \alpha _{7}={{1\alpha _{0}+6\alpha _{1}} \over 7}}$.

Otherwise, if ${\textstyle \alpha _{0}\leq \alpha _{1}}$, four other alpha values are calculated such that ${\textstyle \alpha _{2}={{4\alpha _{0}+1\alpha _{1}} \over 5}}$, ${\textstyle \alpha _{3}={{3\alpha _{0}+2\alpha _{1}} \over 5}}$, ${\textstyle \alpha _{4}={{2\alpha _{0}+3\alpha _{1}} \over 5}}$, and ${\textstyle \alpha _{5}={{1\alpha _{0}+4\alpha _{1}} \over 5}}$, with $\alpha _{6}=0$ and $\alpha _{7}=255$.

The lookup table is then consulted to determine the alpha value for each pixel, with a value of 0 corresponding to $\alpha _{0}$ and a value of 7 corresponding to $\alpha _{7}$. DXT4's color data is premultiplied by alpha, whereas DXT5's is not. Because DXT4/5 use an interpolated alpha scheme, they generally produce better results for alpha (transparency) gradients than DXT2/3.
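The two alpha palettes can be written compactly as below. This is a sketch under assumptions rather than a definitive decoder: the function names are illustrative, the 48 index bits are assumed to be stored little-endian with the first pixel in the lowest three bits, and integer division truncates where hardware may round.

```python
def interpolated_alpha_palette(a0: int, a1: int) -> list:
    """Build the 8-entry alpha palette from the two stored 8-bit endpoints."""
    if a0 > a1:
        # Six interpolated values: (6*a0 + 1*a1)/7 down to (1*a0 + 6*a1)/7.
        return [a0, a1] + [((7 - k) * a0 + k * a1) // 7 for k in range(1, 7)]
    # Four interpolated values, then the fixed entries 0 and 255.
    return [a0, a1] + [((5 - k) * a0 + k * a1) // 5 for k in range(1, 5)] + [0, 255]


def decode_interpolated_alpha(alpha_bytes: bytes) -> list:
    """Decode the 8-byte alpha half of a DXT4/5 block to 16 alpha values."""
    palette = interpolated_alpha_palette(alpha_bytes[0], alpha_bytes[1])
    # Assumed layout: 16 three-bit indices packed into the remaining 6 bytes.
    bits = int.from_bytes(alpha_bytes[2:8], "little")
    return [palette[(bits >> (3 * i)) & 0x7] for i in range(16)]


# Example: endpoints 255 and 0, first three pixels use indices 0, 1 and 7.
example_indices = (0 << 0) | (1 << 3) | (7 << 6)
example_alpha_block = bytes([255, 0]) + example_indices.to_bytes(6, "little")
example_alpha_values = decode_interpolated_alpha(example_alpha_block)
```

Choosing $\alpha _{0}\leq \alpha _{1}$ trades two interpolation steps for exact fully transparent and fully opaque values, which an encoder might prefer for blocks that contain both.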

Further variants

BC4 and BC5

BC4 and BC5 (Block Compression 4 and 5) were added in Direct3D 10. They reuse the alpha channel encoding found in DXT4/5 (BC3).^[9]

BC4 stores 16 input single-channel (e.g. greyscale) pixels into 64 bits of output, encoded in nearly^[10] the same way as BC3 alphas. The expanded palette provides higher quality.
BC5 stores 16 input double-channel (e.g. tangent space normal map) pixels into 128 bits of output, consisting of two halves each encoded like BC4.

BC6H and BC7

BC6H (sometimes BC6) and BC7 (Block Compression 6H and 7) were added in Direct3D 11.^[9]

BC6H encodes 16 input RGB HDR (float16) pixels into 128 bits of output. It essentially treats each float16 as a 16-bit sign-magnitude integer value and interpolates such integers linearly. It works well for blocks without sign changes. A total of 14 modes are defined, though most differ minimally: only two prediction modes are really used.^[10]
BC7 encodes 16 input RGB8/RGBA8 pixels into 128 bits of output. It can be understood as a much-enhanced BC3.^[10]

BC6H and BC7 use much more complex algorithms, each with a selection of encoding modes, and achieve much better quality as a result.^[9] These two formats are also specified much more exactly, with ranges of accepted deviation, whereas earlier BCn modes decode slightly differently among GPU vendors.^[10]
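The idea of treating a float16 as an integer and interpolating linearly can be illustrated in isolation. The sketch below is not a BC6H decoder: it skips the mode selection, endpoint quantization, and final scaling steps of the real format, handles only non-negative values, and uses hypothetical function names. It only shows what interpolating the raw bit patterns of two half-precision values produces.

```python
import struct


def half_to_bits(x: float) -> int:
    """Return the 16-bit pattern of x stored as an IEEE 754 half float."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def bits_to_half(b: int) -> float:
    """Reinterpret a 16-bit pattern as an IEEE 754 half float."""
    return struct.unpack("<e", struct.pack("<H", b))[0]


def lerp_half_bits(x: float, y: float, num: int, den: int) -> float:
    """Interpolate between two non-negative halves via their bit patterns.

    num/den is the weight given to y; plain integer arithmetic is used.
    """
    if x < 0 or y < 0:
        raise ValueError("this sketch handles non-negative values only")
    bx, by = half_to_bits(x), half_to_bits(y)
    return bits_to_half((bx * (den - num) + by * num) // den)


# Halfway between the bit patterns of 1.0 and 4.0 lies the pattern of 2.0.
midpoint = lerp_half_bits(1.0, 4.0, 1, 2)
```

Because the exponent occupies the high bits, stepping linearly through bit patterns moves roughly logarithmically through values, which is why the midpoint of 1.0 and 4.0 comes out as 2.0 rather than 2.5.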

S3TC format comparison

FOURCC	DX 10/11 name	Description	Alpha premultiplied?	Compression ratio	Texture type
DXT1	BC1	1-bit alpha / opaque	Yes	6:1 (for 24-bit source image)	Simple non-alpha
DXT2	BC2	Explicit alpha	Yes	4:1	Sharp alpha
DXT3	BC2	Explicit alpha	No	4:1	Sharp alpha
DXT4	BC3	Interpolated alpha	Yes	4:1	Gradient alpha
DXT5	BC3	Interpolated alpha	No	4:1	Gradient alpha
—	BC4	Interpolated greyscale	—	2:1	Gradient
—	BC5	Interpolated two-channel	—	2:1	Gradient
—	BC6H	Interpolated HDR (no alpha)	—	6:1	Gradient
—	BC7	Interpolated alpha	?	4:1	Gradient
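Since every format in the table stores a 4×4 block in either 64 or 128 bits, the storage needed for a whole texture is easy to estimate. The sketch below assumes that dimensions are rounded up to whole blocks, which is the usual convention but is not stated in the table; the names `BLOCK_BYTES` and `compressed_size` are illustrative.

```python
# Bytes per 4x4 block for each format (64-bit or 128-bit blocks).
BLOCK_BYTES = {
    "BC1": 8, "BC2": 16, "BC3": 16, "BC4": 8,
    "BC5": 16, "BC6H": 16, "BC7": 16,
}


def compressed_size(width: int, height: int, fmt: str) -> int:
    """Bytes needed for one width x height image in the given BCn format."""
    blocks_x = (width + 3) // 4   # round up to whole blocks
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * BLOCK_BYTES[fmt]


bc1_bytes = compressed_size(1024, 1024, "BC1")
bc3_bytes = compressed_size(1024, 1024, "BC3")
print(bc1_bytes, bc3_bytes)
```

For a 1024×1024 texture, the uncompressed 32-bit RGBA size is 4 MiB, so the BC3 result is consistent with the 4:1 ratio listed in the table.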

Data preconditioning

BCn textures can be further compressed for on-disk storage and distribution. An application would decompress this extra layer and send the BCn data to the GPU as usual.
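The extra storage layer can be sketched with any general-purpose lossless codec. The example below uses zlib from the Python standard library purely as a stand-in (DEFLATE combines LZ77 with Huffman coding); it is an assumption for illustration, not the codec any particular engine or tool uses, and the function names are hypothetical.

```python
import zlib


def pack_for_disk(bcn_data: bytes) -> bytes:
    """Wrap already-encoded BCn blocks in a lossless layer for storage."""
    return zlib.compress(bcn_data, 9)


def unpack_for_gpu(stored: bytes) -> bytes:
    """Undo the storage layer; the result is uploaded as ordinary BCn data."""
    return zlib.decompress(stored)


# Example: 64 identical 8-byte blocks, such as a flat-colored BC1 region.
sample_blocks = bytes([0x00, 0xF8, 0x1F, 0x00, 0xE4, 0xE4, 0xE4, 0xE4]) * 64
stored = pack_for_disk(sample_blocks)
restored = unpack_for_gpu(stored)
```

The GPU never sees the outer layer: after `unpack_for_gpu`, the bytes are identical to the original BCn data, so sampling behaviour is unchanged.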

BCn can be combined with Oodle Texture, a lossy preprocessor that modifies the input texture so that the BCn output is more easily compressed by an LZ77 compressor (rate-distortion optimization). BC7 specifically can also use "bc7prep", a lossless pass that re-encodes the texture in a more compressible form (requiring its inverse at decompression).^[11]

crunch is another tool that performs RDO and optionally further re-encoding.^[12]

In 2021, Microsoft produced a "BCPack" compression algorithm specifically for BCn-compressed textures. The Xbox Series X and S have hardware support for decompressing BCPack streams.^[13]

References

[1] US 5956431 "Fixed-rate block-based image compression with inferred pixel values"
[2] US 5956431, Iourcha, Konstantine I.; Nayak, Krishna S. & Hong, Zhou, "System and method for fixed-rate block-based image compression with inferred pixel values", published Sep 21, 1999
[3] "1990 IEEE Color Cell Compression Paper". Ieeexplore.ieee.org. doi:10.1109/TENCON.1990.152671. S2CID 62015990.
[4] "S3TC situation on official DRI information page". Dri.freedesktop.org. Retrieved 2012-01-25.
[5] S2TC: A Possible Workaround For The S3TC Patent Situation on phoronix
[6] Yates, Tom (2017-02-15). "This is why I drink: a discussion of Fedora's legal state". LWN.net. Retrieved 2017-02-16. ... The patent on S3 texture compression expires on October 2, 2017, so Steam games might work better on Fedora after that date. ...
[7] Duffy, Robert (July 27, 2004). "DOOM 3 Video Requirements". Gamershell.com. Archived from the original on January 3, 2008. Retrieved 2012-01-25.
[8] Togni, Roberto, et al. "Apple RPZA". MultimediaWiki.
[9] Reed, Nathan. "Understanding BCn Texture Compression Formats". Nathan Reed's coding blog.
[10] Giesen, Fabian "ryg" (4 October 2021). "GPU BCn decoding". The ryg blog. Retrieved 24 July 2023.
[11] "Oodle Texture Compression". www.radgametools.com. Open source part mentioned: bc7enc_rdo
[12] "crunch open source texture compression library". GitHub. Retrieved 2016-09-13.
[13] "DirectStorage Overview - Microsoft Game Development Kit". 16 March 2023.

External links

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
