Fractal compression is a lossy compression method for digital images, based on fractals. The method is best suited for textures and natural images, relying on the fact that parts of an image often resemble other parts of the same image. [1] Fractal algorithms convert these parts into mathematical data called "fractal codes" which are used to recreate the encoded image.
Fractal image representation may be described mathematically as an iterated function system (IFS). [2]
We begin with the representation of a binary image, where the image may be thought of as a subset of . An IFS is a set of contraction mappings ƒ1,...,ƒN,
According to these mapping functions, the IFS describes a two-dimensional set S as the fixed point of the Hutchinson operator
That is, H is an operator mapping sets to sets, and S is the unique set satisfying H(S) = S. The idea is to construct the IFS such that this set S is the input binary image. The set S can be recovered from the IFS by fixed point iteration: for any nonempty compact initial set A0, the iteration Ak+1 = H(Ak) converges to S.
The set S is self-similar because H(S) = S implies that S is a union of mapped copies of itself:
So we see the IFS is a fractal representation of S.
IFS representation can be extended to a grayscale image by considering the image's graph as a subset of . For a grayscale image u(x,y), consider the set S = {(x,y,u(x,y))}. Then similar to the binary case, S is described by an IFS using a set of contraction mappings ƒ1,...,ƒN, but in ,
A challenging problem of ongoing research in fractal image representation is how to choose the ƒ1,...,ƒN such that its fixed point approximates the input image, and how to do this efficiently.
A simple approach [2] for doing so is the following partitioned iterated function system (PIFS):
In the second step, it is important to find a similar block so that the IFS accurately represents the input image, so a sufficient number of candidate blocks for Di need to be considered. On the other hand, a large search considering many blocks is computationally costly. This bottleneck of searching for similar blocks is why PIFS fractal encoding is much slower than for example DCT and wavelet based image representation.
The initial square partitioning and brute-force search algorithm presented by Jacquin provides a starting point for further research and extensions in many possible directions—different ways of partitioning the image into range blocks of various sizes and shapes; fast techniques for quickly finding a close-enough matching domain block for each range block rather than brute-force searching, such as fast motion estimation algorithms; different ways of encoding the mapping from the domain block to the range block; etc. [3]
Other researchers attempt to find algorithms to automatically encode an arbitrary image as RIFS (recurrent iterated function systems) or global IFS, rather than PIFS; and algorithms for fractal video compression including motion compensation and three dimensional iterated function systems. [4] [5]
Fractal image compression has many similarities to vector quantization image compression. [6]
With fractal compression, encoding is extremely computationally expensive because of the search used to find the self-similarities. Decoding, however, is quite fast. While this asymmetry has so far made it impractical for real time applications, when video is archived for distribution from disk storage or file downloads fractal compression becomes more competitive. [7] [8]
At common compression ratios, up to about 50:1, fractal compression provides similar results to DCT-based algorithms such as JPEG. [9] At high compression ratios fractal compression may offer superior quality. For satellite imagery, ratios of over 170:1 [10] have been achieved with acceptable results. Fractal video compression ratios of 25:1–244:1 have been achieved in reasonable compression times (2.4 to 66 sec/frame). [11]
Compression efficiency increases with higher image complexity and color depth, compared to simple grayscale images.
An inherent feature of fractal compression is that images become resolution independent [12] after being converted to fractal code. This is because the iterated function systems in the compressed file scale indefinitely. This indefinite scaling property of a fractal is known as "fractal scaling".
The resolution independence of a fractal-encoded image can be used to increase the display resolution of an image. This process is also known as "fractal interpolation". In fractal interpolation, an image is encoded into fractal codes via fractal compression, and subsequently decompressed at a higher resolution. The result is an up-sampled image in which iterated function systems have been used as the interpolant. [13] Fractal interpolation maintains geometric detail very well compared to traditional interpolation methods like bilinear interpolation and bicubic interpolation. [14] [15] [16] Since the interpolation cannot reverse Shannon entropy however, it ends up sharpening the image by adding random instead of meaningful detail. One cannot, for example, enlarge an image of a crowd where each person's face is one or two pixels and hope to identify them.
Michael Barnsley led the development of fractal compression from 1985 at the Georgia Institute of Technology (where both Barnsley and Sloan were professors in the mathematics department). [17] The work was sponsored by DARPA and the Georgia Tech Research Corporation. The project resulted in several patents from 1987. [18] Barnsley's graduate student Arnaud Jacquin implemented the first automatic algorithm in software in 1992. [19] [20] All methods are based on the fractal transform using iterated function systems. Michael Barnsley and Alan Sloan formed Iterated Systems Inc. [21] in 1987 which was granted over 20 additional patents related to fractal compression.
A major breakthrough for Iterated Systems Inc. was the automatic fractal transform process which eliminated the need for human intervention during compression as was the case in early experimentation with fractal compression technology. In 1992, Iterated Systems Inc. received a US$2.1 million government grant [22] to develop a prototype digital image storage and decompression chip using fractal transform image compression technology.
Fractal image compression has been used in a number of commercial applications: onOne Software, developed under license from Iterated Systems Inc., Genuine Fractals 5 [23] which is a Photoshop plugin capable of saving files in compressed FIF (Fractal Image Format). To date the most successful use of still fractal image compression is by Microsoft in its Encarta multimedia encyclopedia, [24] also under license.
Iterated Systems Inc. supplied a shareware encoder (Fractal Imager), a stand-alone decoder, a Netscape plug-in decoder and a development package for use under Windows. The redistribution of the "decompressor DLL" provided by the ColorBox III SDK was governed by restrictive per-disk or year-by-year licensing regimes for proprietary software vendors and by a discretionary scheme that entailed the promotion of the Iterated Systems products for certain classes of other users. [25]
ClearVideo – also known as RealVideo (Fractal) – and SoftVideo were early fractal video compression products. ClearFusion was Iterated's freely distributed streaming video plugin for web browsers. In 1994 SoftVideo was licensed to Spectrum Holobyte for use in its CD-ROM games including Falcon Gold and Star Trek: The Next Generation A Final Unity. [26]
In 1996, Iterated Systems Inc. announced [27] an alliance with the Mitsubishi Corporation to market ClearVideo to their Japanese customers. The original ClearVideo 1.2 decoder driver is still supported [28] by Microsoft in Windows Media Player although the encoder is no longer supported.
Two firms, Total Multimedia Inc. and Dimension, both claim to own or have the exclusive licence to Iterated's video technology, but neither has yet released a working product. The technology basis appears to be Dimension's U.S. patents 8639053 and 8351509, which have been considerably analyzed. [29] In summary, it is a simple quadtree block-copying system with neither the bandwidth efficiency nor PSNR quality of traditional DCT-based codecs. In January 2016, TMMI announced that it was abandoning fractal-based technology altogether.
Research papers between 1997 and 2007 discussed possible solutions to improve fractal algorithms and encoding hardware. [30] [31] [32] [33] [34] [35] [36] [37] [38]
A library called Fiasco was created by Ullrich Hafner. In 2001, Fiasco was covered in the Linux Journal . [39] According to the 2000-04 Fiasco manual, Fiasco can be used for video compression. [40] The Netpbm library includes the Fiasco library. [41] [42]
Femtosoft developed an implementation of fractal image compression in Object Pascal and Java. [43]
In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.
Numerical analysis is the study of algorithms that use numerical approximation for the problems of mathematical analysis. It is the study of numerical methods that attempt to find approximate solutions of problems rather than the exact ones. Numerical analysis finds application in all fields of engineering and the physical sciences, and in the 21st century also the life and social sciences like economics, medicine, business and even the arts. Current growth in computing power has enabled the use of more complex numerical analysis, providing detailed and realistic mathematical models in science and engineering. Examples of numerical analysis include: ordinary differential equations as found in celestial mechanics, numerical linear algebra in data analysis, and stochastic differential equations and Markov chains for simulating living cells in medicine and biology.
A wavelet is a wave-like oscillation with an amplitude that begins at zero, increases or decreases, and then returns to zero one or more times. Wavelets are termed a "brief oscillation". A taxonomy of wavelets has been established, based on the number and direction of its pulses. Wavelets are imbued with specific properties that make them useful for signal processing.
Motion compensation in computing is an algorithmic technique used to predict a frame in a video given the previous and/or future frames by accounting for motion of the camera and/or objects in the video. It is employed in the encoding of video data for video compression, for example in the generation of MPEG-2 files. Motion compensation describes a picture in terms of the transformation of a reference picture to the current picture. The reference picture may be previous in time or even from the future. When images can be accurately synthesized from previously transmitted/stored images, the compression efficiency can be improved.
A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequencies. The DCT, first proposed by Nasir Ahmed in 1972, is a widely used transformation technique in signal processing and data compression. It is used in most digital media, including digital images, digital video, digital audio, digital television, digital radio, and speech coding. DCTs are also important to numerous other applications in science and engineering, such as digital signal processing, telecommunication devices, reducing network bandwidth usage, and spectral methods for the numerical solution of partial differential equations.
In mathematics, iterated function systems (IFSs) are a method of constructing fractals; the resulting fractals are often self-similar. IFS fractals are more related to set theory than fractal geometry. They were introduced in 1981.
In mathematics, the transfer operator encodes information about an iterated map and is frequently used to study the behavior of dynamical systems, statistical mechanics, quantum chaos and fractals. In all usual cases, the largest eigenvalue is 1, and the corresponding eigenvector is the invariant measure of the system.
In mathematics, the term chaos game originally referred to a method of creating a fractal, using a polygon and an initial point selected at random inside it. The fractal is created by iteratively creating a sequence of points, starting with the initial random point, in which each point in the sequence is a given fraction of the distance between the previous point and one of the vertices of the polygon; the vertex is chosen at random in each iteration. Repeating this iterative process a large number of times, selecting the vertex at random on each iteration, and throwing out the first few points in the sequence, will often produce a fractal shape. Using a regular triangle and the factor 1/2 will result in the Sierpinski triangle, while creating the proper arrangement with four points and a factor 1/2 will create a display of a "Sierpinski Tetrahedron", the three-dimensional analogue of the Sierpinski triangle. As the number of points is increased to a number N, the arrangement forms a corresponding (N-1)-dimensional Sierpinski Simplex.
In mathematics, a wavelet series is a representation of a square-integrable function by a certain orthonormal series generated by a wavelet. This article provides a formal, mathematical definition of an orthonormal wavelet and of the integral wavelet transform.
The Hilbert curve is a continuous fractal space-filling curve first described by the German mathematician David Hilbert in 1891, as a variant of the space-filling Peano curves discovered by Giuseppe Peano in 1890.
Block Truncation Coding (BTC) is a type of lossy image compression technique for greyscale images. It divides the original images into blocks and then uses a quantizer to reduce the number of grey levels in each block whilst maintaining the same mean and standard deviation. It is an early predecessor of the popular hardware DXTC technique, although BTC compression method was first adapted to color long before DXTC using a very similar approach called Color Cell Compression. BTC has also been adapted to video compression.
Matching pursuit (MP) is a sparse approximation algorithm which finds the "best matching" projections of multidimensional data onto the span of an over-complete dictionary . The basic idea is to approximately represent a signal from Hilbert space as a weighted sum of finitely many functions taken from . An approximation with atoms has the form
The filled-in Julia set of a polynomial is a Julia set and its interior, non-escaping set.
Mean shift is a non-parametric feature-space mathematical analysis technique for locating the maxima of a density function, a so-called mode-seeking algorithm. Application domains include cluster analysis in computer vision and image processing.
In mathematics, the collage theorem characterises an iterated function system whose attractor is close, relative to the Hausdorff metric, to a given set. The IFS described is composed of contractions whose images, as a collage or union when mapping the given set, are arbitrarily close to the given set. It is typically used in fractal compression.
The Barnsley fern is a fractal named after the British mathematician Michael Barnsley who first described it in his book Fractals Everywhere. He made it to resemble the black spleenwort, Asplenium adiantum-nigrum.
In coding theory, folded Reed–Solomon codes are like Reed–Solomon codes, which are obtained by mapping Reed–Solomon codewords over a larger alphabet by careful bundling of codeword symbols.
Sparse dictionary learning is a representation learning method which aims to find a sparse representation of the input data in the form of a linear combination of basic elements as well as those basic elements themselves. These elements are called atoms, and they compose a dictionary. Atoms in the dictionary are not required to be orthogonal, and they may be an over-complete spanning set. This problem setup also allows the dimensionality of the signals being represented to be higher than any one of the signals being observed. These two properties lead to having seemingly redundant atoms that allow multiple representations of the same signal, but also provide an improvement in sparsity and flexibility of the representation.
Asymmetric numeral systems (ANS) is a family of entropy encoding methods introduced by Jarosław (Jarek) Duda from Jagiellonian University, used in data compression since 2014 due to improved performance compared to previous methods. ANS combines the compression ratio of arithmetic coding, with a processing cost similar to that of Huffman coding. In the tabled ANS (tANS) variant, this is achieved by constructing a finite-state machine to operate on a large alphabet without using multiplication.