Block Truncation Coding

Block Truncation Coding (BTC) is a lossy image compression technique for greyscale images. It divides the original image into blocks and then uses a quantizer to reduce the number of grey levels in each block whilst maintaining the same mean and standard deviation. It is an early predecessor of the popular hardware DXTC technique, although the BTC compression method was first adapted to color long before DXTC, using a very similar approach called Color Cell Compression. [1] BTC has also been adapted to video compression. [2]

BTC was first proposed by Professors Mitchell and Delp at Purdue University. [3] Another variation of BTC is Absolute Moment Block Truncation Coding (AMBTC), in which the first absolute moment is preserved along with the mean instead of the standard deviation. AMBTC is computationally simpler than BTC and also typically results in a lower Mean Squared Error (MSE). AMBTC was proposed by Maximo Lema and Robert Mitchell. [4]
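Concretely, in AMBTC the two reconstruction levels are simply the means of the pixels on either side of the block mean, which is what removes the square roots needed by BTC. With $m$ pixels in the block, $q$ of which exceed the block mean $\bar{x}$, the low and high output levels are

$$x_L = \frac{1}{m-q}\sum_{x(i,j)\le\bar{x}} x(i,j), \qquad x_H = \frac{1}{q}\sum_{x(i,j)>\bar{x}} x(i,j).$$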

Using sub-blocks of 4×4 pixels gives a compression ratio of 4:1, assuming 8-bit integer values are used during transmission or storage. Larger blocks allow greater compression ("a" and "b" values spread over more pixels); however, quality also reduces with increasing block size due to the nature of the algorithm.
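For a 4×4 block of 8-bit pixels, the 4:1 ratio follows directly from the bit budget (assuming, as in the procedure below, that the bit plane is sent together with an 8-bit mean and an 8-bit standard deviation):

$$\frac{16 \times 8\ \text{bits (original)}}{16\ \text{bits (bit plane)} + 8\ \text{bits (mean)} + 8\ \text{bits (std.\ dev.)}} = \frac{128}{32} = 4:1$$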

The BTC algorithm was used for compressing Mars Pathfinder's rover images. [5]

Compression procedure

An image is divided into blocks of typically 4×4 pixels. For each block the mean and standard deviation of the pixel values are calculated; these statistics generally change from block to block. The pixel values selected for each reconstructed, or new, block are chosen so that each block of the BTC compressed image will have (approximately) the same mean and standard deviation as the corresponding block of the original image. A two-level quantization of the block is where the compression is gained; it is performed as follows:

$$y(i,j) = \begin{cases} 1, & x(i,j) > \bar{x} \\ 0, & x(i,j) \le \bar{x} \end{cases}$$

Here $x(i,j)$ are the pixel elements of the original block, $y(i,j)$ are the elements of the compressed block, and $\bar{x}$ is the block mean. In words this can be explained as: if a pixel value is greater than the mean it is assigned the value "1", otherwise "0". Values equal to the mean can be assigned either a "1" or a "0" depending on the preference of the person or organisation implementing the algorithm.
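A minimal sketch of this quantization step in Python with NumPy (the function name is illustrative, not from a reference implementation; pixels equal to the mean are assigned 0 here, one of the two permissible choices):

```python
import numpy as np

def btc_bitmap(block):
    """Two-level quantization: 1 where a pixel exceeds the block mean, 0 otherwise."""
    mean = block.mean()
    return (block > mean).astype(np.uint8)  # pixels equal to the mean map to 0
```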

This 16-bit block is stored or transmitted along with the values of the mean and standard deviation. Reconstruction is made with two values "a" and "b" which preserve the mean and the standard deviation. The values of "a" and "b" can be computed as follows:

$$a = \bar{x} - \sigma\sqrt{\frac{q}{m-q}}$$

$$b = \bar{x} + \sigma\sqrt{\frac{m-q}{q}}$$

where $\sigma$ is the standard deviation, $\bar{x}$ is the mean, $m$ is the total number of pixels in the block and $q$ is the number of pixels greater than the mean $\bar{x}$.
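Continuing the sketch above, the two levels can be computed directly from these formulas; the guard for flat blocks (all pixels on one side of the mean) is an added assumption, since the case is not discussed in the text:

```python
import numpy as np

def btc_levels(block):
    """Return the reconstruction levels (a, b) preserving the block mean and standard deviation."""
    mean = block.mean()
    sigma = block.std()              # population standard deviation of the block
    m = block.size                   # total number of pixels in the block
    q = int((block > mean).sum())    # number of pixels greater than the mean
    if q == 0 or q == m:             # flat block: both levels collapse to the mean
        return mean, mean
    a = mean - sigma * np.sqrt(q / (m - q))
    b = mean + sigma * np.sqrt((m - q) / q)
    return a, b
```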

To reconstruct the image, or create its approximation, elements assigned a 0 are replaced with the "a" value and elements assigned a 1 are replaced with the "b" value.
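Decoding is then a single substitution per pixel; a sketch matching the hypothetical helpers above, with the levels rounded to integer pixel values:

```python
import numpy as np

def btc_decode(bitmap, a, b):
    """Replace each 0 with a and each 1 with b to approximate the original block."""
    return np.where(bitmap == 1, int(round(b)), int(round(a))).astype(np.uint8)
```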

This demonstrates that the algorithm is asymmetric in that the encoder has much more work to do than the decoder. This is because the decoder is simply replacing 1's and 0's with the estimated value whereas the encoder is also required to calculate the mean, standard deviation and the two values to use. [6]

Example

Encoder

Take a 4×4 block from an image, in this case the mountain test image: [7]

Like any small block from an image, this appears rather boring to work with, as the numbers are all quite similar; this is the nature of lossy compression and why it works so well for images. We now need to calculate two values from this data: the mean and the standard deviation. The mean works out to 241.875; this is a simple calculation which should require no further explanation. The standard deviation is easily calculated as 4.36. From these, the values of "a" and "b" can be calculated using the previous equations; they come out to 236.935 and 245.718 respectively. The last calculation that needs to be done on the encoding side is to build the matrix of 1's and 0's to transmit, so that each pixel can be transmitted as a single bit.
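As a check, these numbers can be reproduced with a few lines of NumPy. The pixel values below are an assumed example block chosen to be consistent with the statistics quoted in the text (the original figure is not reproduced here):

```python
import numpy as np

# Assumed 4x4 block whose statistics match those quoted above.
block = np.array([[245, 239, 249, 239],
                  [245, 245, 239, 235],
                  [245, 245, 245, 245],
                  [245, 235, 235, 239]], dtype=float)

mean, sigma = block.mean(), block.std()
m, q = block.size, int((block > mean).sum())
a = mean - sigma * np.sqrt(q / (m - q))
b = mean + sigma * np.sqrt((m - q) / q)
bitmap = (block > mean).astype(np.uint8)   # the 16 bits actually transmitted

print(mean)                       # 241.875
print(round(sigma, 2))            # 4.36
print(round(a, 3), round(b, 3))   # 236.935 245.718
```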

Decoder

Now at the decoder side all we need to do is reassign the "a" and "b" values to the 1 and 0 pixels. This will give us the following block:

As can be seen, the block has been reconstructed with the two values of "a" and "b" rounded to integers (because image pixel values are stored as integers, not floating point numbers). When working through the theory, this is a good point to calculate the mean and standard deviation of the reconstructed block; they should approximately equal the original mean and standard deviation. Remember to use the integer values, otherwise additional quantization error is introduced, since everything was quantized to integers in the encoder.
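For instance, assuming nine of the sixteen pixels lie above the mean (which is consistent with the "a" and "b" values above) and using the rounded levels a = 237 and b = 246, the reconstructed block has

$$\bar{x}' = \frac{9 \times 246 + 7 \times 237}{16} \approx 242.06, \qquad \sigma' \approx 4.46,$$

close to, but not exactly, the original 241.875 and 4.36 because of the rounding to integers.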

Related Research Articles

JPEG: Lossy compression method for reducing the size of digital images

JPEG is a commonly used method of lossy compression for digital images, particularly for those images produced by digital photography. The degree of compression can be adjusted, allowing a selectable tradeoff between storage size and image quality. JPEG typically achieves 10:1 compression with little perceptible loss in image quality. Since its introduction in 1992, JPEG has been the most widely used image compression standard in the world, and the most widely used digital image format, with several billion JPEG images produced every day as of 2015.

Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. It was originally used for data compression. It works by dividing a large set of points (vectors) into groups having approximately the same number of points closest to them. Each group is represented by its centroid point, as in k-means and some other clustering algorithms.

A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequencies. The DCT, first proposed by Nasir Ahmed in 1972, is a widely used transformation technique in signal processing and data compression. It is used in most digital media, including digital images, digital video, digital audio, digital television, digital radio, and speech coding. DCTs are also important to numerous other applications in science and engineering, such as digital signal processing, telecommunication devices, reducing network bandwidth usage, and spectral methods for the numerical solution of partial differential equations.

Digital image processing is the use of a digital computer to process digital images through an algorithm. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and distortion during processing. Since images are defined over two dimensions, digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing are mainly affected by three factors: first, the development of computers; second, the development of mathematics; third, the increased demand for a wide range of applications in environment, agriculture, military, industry and medical science.

Golomb coding is a lossless data compression method using a family of data compression codes invented by Solomon W. Golomb in the 1960s. Alphabets following a geometric distribution will have a Golomb code as an optimal prefix code, making Golomb coding highly suitable for situations in which the occurrence of small values in the input stream is significantly more likely than large values.

Compression artifact: Distortion of media caused by lossy data compression

A compression artifact is a noticeable distortion of media caused by the application of lossy compression. Lossy data compression involves discarding some of the media's data so that it becomes small enough to be stored within the desired disk space or transmitted (streamed) within the available bandwidth. If the compressor cannot store enough data in the compressed version, the result is a loss of quality, or introduction of artifacts. The compression algorithm may not be intelligent enough to discriminate between distortions of little subjective importance and those objectionable to the user.

Rate–distortion theory is a major branch of information theory which provides the theoretical foundations for lossy data compression; it addresses the problem of determining the minimal number of bits per symbol, as measured by the rate R, that should be communicated over a channel, so that the source can be approximately reconstructed at the receiver without exceeding an expected distortion D.

Quantization (signal processing): Process of mapping a continuous set to a countable set

Quantization, in mathematics and digital signal processing, is the process of mapping input values from a large set to output values in a (countable) smaller set, often with a finite number of elements. Rounding and truncation are typical examples of quantization processes. Quantization is involved to some degree in nearly all digital signal processing, as the process of representing a signal in digital form ordinarily involves rounding. Quantization also forms the core of essentially all lossy compression algorithms.

YCbCr: Family of digital colour spaces

YCbCr, Y′CbCr, or Y Pb/Cb Pr/Cr, also written as YCBCR or Y′CBCR, is a family of color spaces used as a part of the color image pipeline in video and digital photography systems. Y′ is the luma component and CB and CR are the blue-difference and red-difference chroma components. Y′ is distinguished from Y, which is luminance, meaning that light intensity is nonlinearly encoded based on gamma corrected RGB primaries.

Gaussian blur: Type of image blur produced by a Gaussian function

In image processing, a Gaussian blur is the result of blurring an image by a Gaussian function.

Lossless JPEG is a 1993 addition to the JPEG standard by the Joint Photographic Experts Group to enable lossless compression. However, the term may also be used to refer to all lossless compression schemes developed by the group, including JPEG 2000 and JPEG-LS.

A co-occurrence matrix or co-occurrence distribution is a matrix that is defined over an image to be the distribution of co-occurring pixel values at a given offset. It is used as an approach to texture analysis with various applications especially in medical image analysis.

The mean absolute difference (univariate) is a measure of statistical dispersion equal to the average absolute difference of two independent values drawn from a probability distribution. A related statistic is the relative mean absolute difference, which is the mean absolute difference divided by the arithmetic mean, and equal to twice the Gini coefficient. The mean absolute difference is also known as the absolute mean difference and the Gini mean difference (GMD). The mean absolute difference is sometimes denoted by Δ or as MD.

Mean square quantization error (MSQE) is a figure of merit for the process of analog to digital conversion.

Progressive Graphics File: File format

PGF is a wavelet-based bitmapped image format that employs lossless and lossy data compression. PGF was created to improve upon and replace the JPEG format. It was developed at the same time as JPEG 2000 but with a focus on speed over compression ratio.

Ordered dithering: Image dithering algorithm

Ordered dithering is an image dithering algorithm. It is commonly used to display a continuous image on a display of smaller color depth. For example, Microsoft Windows uses it in 16-color graphics modes. The algorithm is characterized by noticeable crosshatch patterns in the result.

Color Cell Compression: Lossy color image compression algorithm

Color Cell Compression is a lossy image compression algorithm developed by Campbell et al., in 1986, which can be considered an early forerunner of modern texture compression algorithms, such as S3 Texture Compression and Adaptive Scalable Texture Compression. It is closely related to Block Truncation Coding, another lossy image compression algorithm, which predates Color Cell Compression, in that it uses the dominant luminance of a block of pixels to partition said pixels into two representative colors. The primary difference between Block Truncation Coding and Color Cell Compression is that the former was designed to compress grayscale images and the latter was designed to compress color images. Also, Block Truncation Coding requires that the standard deviation of the colors of pixels in a block be computed in order to compress an image, whereas Color Cell Compression does not use the standard deviation. Both algorithms, though, can compress an image down to effectively 2 bits per pixel.

Non-local means

Non-local means is an algorithm in image processing for image denoising. Unlike "local mean" filters, which take the mean value of a group of pixels surrounding a target pixel to smooth the image, non-local means filtering takes a mean of all pixels in the image, weighted by how similar these pixels are to the target pixel. This results in much greater post-filtering clarity, and less loss of detail in the image compared with local mean algorithms.

Kuwahara filter

The Kuwahara filter is a non-linear smoothing filter used in image processing for adaptive noise reduction. Most filters that are used for image smoothing are linear low-pass filters that effectively reduce noise but also blur out the edges. However the Kuwahara filter is able to apply smoothing on the image while preserving the edges.

ZPEG is a motion video technology that applies a human visual acuity model to a decorrelated transform-domain space, thereby optimally reducing the redundancies in motion video by removing the subjectively imperceptible. This technology is applicable to a wide range of video processing problems such as video optimization, real-time motion video compression, subjective quality monitoring, and format conversion.

References

  1. Liou, D.-M.; Huang, Y.; Reynolds, N. (1990). "A new microcomputer based imaging system with C³ technique". IEEE TENCON'90: 1990 IEEE Region 10 Conference on Computer and Communication Systems. Conference Proceedings. p. 555. doi:10.1109/TENCON.1990.152671. ISBN 0-87942-556-3. S2CID 62015990.
  2. Healy, D.; Mitchell, O. (1981). "Digital Video Bandwidth Compression Using Block Truncation Coding". IEEE Transactions on Communications. 29 (12): 1809. Bibcode:1981ITCom..29.1809H. doi:10.1109/TCOM.1981.1094938.
  3. Delp, E.; Mitchell, O. (1979). "Image Compression Using Block Truncation Coding". IEEE Transactions on Communications. 27 (9): 1335. Bibcode:1979STIA...8011525D. doi:10.1109/TCOM.1979.1094560.
  4. Lema, M.; Mitchell, O. (1984). "Absolute Moment Block Truncation Coding and Its Application to Color Images". IEEE Transactions on Communications. 32 (10): 1148. doi:10.1109/TCOM.1984.1095973.
  5. "Rover Camera Instrument Description". NASA. Retrieved 2021-05-18.
  6. Leis, J. (2008). ELE4607 Advanced Digital Communications, Module 3: Image & Video Coding. Lecture slides, University of Southern Queensland.
  7. Waterloo Fractal Coding and Analysis Group