Scale space implementation

Last updated
Scale space
Scale-space axioms
Scale space implementation
Feature detection
Edge detection
Blob detection
Corner detection
Ridge detection
Interest point detection
Scale selection
Affine shape adaptation
Scale-space segmentation

In the areas of computer vision, image analysis and signal processing, the notion of scale-space representation is used for processing measurement data at multiple scales, and specifically enhance or suppress image features over different ranges of scale (see the article on scale space). A special type of scale-space representation is provided by the Gaussian scale space, where the image data in N dimensions is subjected to smoothing by Gaussian convolution. Most of the theory for Gaussian scale space deals with continuous images, whereas one when implementing this theory will have to face the fact that most measurement data are discrete. Hence, the theoretical problem arises concerning how to discretize the continuous theory while either preserving or well approximating the desirable theoretical properties that lead to the choice of the Gaussian kernel (see the article on scale-space axioms). This article describes basic approaches for this that have been developed in the literature.

Contents

Statement of the problem

The Gaussian scale-space representation of an N-dimensional continuous signal,

is obtained by convolving fC with an N-dimensional Gaussian kernel:

In other words:

However, for implementation, this definition is impractical, since it is continuous. When applying the scale space concept to a discrete signal fD, different approaches can be taken. This article is a brief summary of some of the most frequently used methods.

Separability

Using the separability property of the Gaussian kernel

the N-dimensional convolution operation can be decomposed into a set of separable smoothing steps with a one-dimensional Gaussian kernel G along each dimension

where

and the standard deviation of the Gaussian σ is related to the scale parameter t according to t = σ2.

Separability will be assumed in all that follows, even when the kernel is not exactly Gaussian, since separation of the dimensions is the most practical way to implement multidimensional smoothing, especially at larger scales. Therefore, the rest of the article focuses on the one-dimensional case.

The sampled Gaussian kernel

When implementing the one-dimensional smoothing step in practice, the presumably simplest approach is to convolve the discrete signal fD with a sampled Gaussian kernel:

where

(with t = σ2) which in turn is truncated at the ends to give a filter with finite impulse response

for M chosen sufficiently large (see error function) such that

A common choice is to set M to a constant C times the standard deviation of the Gaussian kernel

where C is often chosen somewhere between 3 and 6.

Using the sampled Gaussian kernel can, however, lead to implementation problems, in particular when computing higher-order derivatives at finer scales by applying sampled derivatives of Gaussian kernels. When accuracy and robustness are primary design criteria, alternative implementation approaches should therefore be considered.

For small values of ε (10−6 to 10−8) the errors introduced by truncating the Gaussian are usually negligible. For larger values of ε, however, there are many better alternatives to a rectangular window function. For example, for a given number of points, a Hamming window, Blackman window, or Kaiser window will do less damage to the spectral and other properties of the Gaussian than a simple truncation will. Notwithstanding this, since the Gaussian kernel decreases rapidly at the tails, the main recommendation is still to use a sufficiently small value of ε such that the truncation effects are no longer important.

The discrete Gaussian kernel

The ideal discrete Gaussian kernel (solid) compared with sampled ordinary Gaussian (dashed), for scales t = [0.5, 1, 2, 4] Discrete Gaussian kernel.svg
The ideal discrete Gaussian kernel (solid) compared with sampled ordinary Gaussian (dashed), for scales t = [0.5, 1, 2, 4]

A more refined approach is to convolve the original signal with the discrete Gaussian kernelT(n, t) [1] [2] [3]

where

and denotes the modified Bessel functions of integer order, n. This is the discrete counterpart of the continuous Gaussian in that it is the solution to the discrete diffusion equation (discrete space, continuous time), just as the continuous Gaussian is the solution to the continuous diffusion equation. [1] [2] [4]

This filter can be truncated in the spatial domain as for the sampled Gaussian

or can be implemented in the Fourier domain using a closed-form expression for its discrete-time Fourier transform:

With this frequency-domain approach, the scale-space properties transfer exactly to the discrete domain, or with excellent approximation using periodic extension and a suitably long discrete Fourier transform to approximate the discrete-time Fourier transform of the signal being smoothed. Moreover, higher-order derivative approximations can be computed in a straightforward manner (and preserving scale-space properties) by applying small support central difference operators to the discrete scale space representation. [5]

As with the sampled Gaussian, a plain truncation of the infinite impulse response will in most cases be a sufficient approximation for small values of ε, while for larger values of ε it is better to use either a decomposition of the discrete Gaussian into a cascade of generalized binomial filters or alternatively to construct a finite approximate kernel by multiplying by a window function. If ε has been chosen too large such that effects of the truncation error begin to appear (for example as spurious extrema or spurious responses to higher-order derivative operators), then the options are to decrease the value of ε such that a larger finite kernel is used, with cutoff where the support is very small, or to use a tapered window.

Recursive filters

Scale-space kernels. Ideal discrete gaussian based on bessel functions (red), and two-pole-pair forward/backward recursive smoothing filters (blue) with poles as described in the text. Top shows individual kernels, and bottom is their cumulative convolution with each other; t = [0.5, 1, 2, 4]. ScaleSpaceKernels.png
Scale-space kernels. Ideal discrete gaussian based on bessel functions (red), and two-pole-pair forward/backward recursive smoothing filters (blue) with poles as described in the text. Top shows individual kernels, and bottom is their cumulative convolution with each other; t = [0.5, 1, 2, 4].

Since computational efficiency is often important, low-order recursive filters are often used for scale-space smoothing. For example, Young and van Vliet [6] use a third-order recursive filter with one real pole and a pair of complex poles, applied forward and backward to make a sixth-order symmetric approximation to the Gaussian with low computational complexity for any smoothing scale.

By relaxing a few of the axioms, Lindeberg [1] concluded that good smoothing filters would be "normalized Pólya frequency sequences", a family of discrete kernels that includes all filters with real poles at 0 < Z < 1 and/or Z > 1, as well as with real zeros at Z < 0. For symmetry, which leads to approximate directional homogeneity, these filters must be further restricted to pairs of poles and zeros that lead to zero-phase filters.

To match the transfer function curvature at zero frequency of the discrete Gaussian, which ensures an approximate semi-group property of additive t, two poles at

can be applied forward and backwards, for symmetry and stability. This filter is the simplest implementation of a normalized Pólya frequency sequence kernel that works for any smoothing scale, but it is not as excellent an approximation to the Gaussian as Young and van Vliet's filter, which is not normalized Pólya frequency sequence, due to its complex poles.

The transfer function, H1, of a symmetric pole-pair recursive filter is closely related to the discrete-time Fourier transform of the discrete Gaussian kernel via first-order approximation of the exponential:

where the t parameter here is related to the stable pole position Z = p via:

Furthermore, such filters with N pairs of poles, such as the two pole pairs illustrated in this section, are an even better approximation to the exponential:

where the stable pole positions are adjusted by solving:

The impulse responses of these filters are not very close to gaussian unless more than two pole pairs are used. However, even with only one or two pole pairs per scale, a signal successively smoothed at increasing scales will be very close to a gaussian-smoothed signal. The semi-group property is poorly approximated when too few pole pairs are used.

Scale-space axioms that are still satisfied by these filters are:

The following are only approximately satisfied, the approximation being better for larger numbers of pole pairs:

This recursive filter method and variations to compute both the Gaussian smoothing as well as Gaussian derivatives has been described by several authors. [6] [7] [8] [9] Tan et al. have analyzed and compared some of these approaches, and have pointed out that the Young and van Vliet filters are a cascade (multiplication) of forward and backward filters, while the Deriche and the Jin et al. filters are sums of forward and backward filters. [10]

At fine scales, the recursive filtering approach as well as other separable approaches are not guaranteed to give the best possible approximation to rotational symmetry, so non-separable implementations for 2D images may be considered as an alternative.

When computing several derivatives in the N-jet simultaneously, discrete scale-space smoothing with the discrete analogue of the Gaussian kernel, or with a recursive filter approximation, followed by small support difference operators, may be both faster and more accurate than computing recursive approximations of each derivative operator.

Finite-impulse-response (FIR) smoothers

For small scales, a low-order FIR filter may be a better smoothing filter than a recursive filter. The symmetric 3-kernel [t/2, 1-t, t/2], for t  0.5 smooths to a scale of t using a pair of real zeros at Z < 0, and approaches the discrete Gaussian in the limit of small t. In fact, with infinitesimal t, either this two-zero filter or the two-pole filter with poles at Z = t/2 and Z = 2/t can be used as the infinitesimal generator for the discrete Gaussian kernels described above.

The FIR filter's zeros can be combined with the recursive filter's poles to make a general high-quality smoothing filter. For example, if the smoothing process is to always apply a biquadratic (two-pole, two-zero) filter forward then backwards on each row of data (and on each column in the 2D case), the poles and zeros can each do a part of the smoothing. The zeros limit out at t = 0.5 per pair (zeros at Z = –1), so for large scales the poles do most of the work. At finer scales, the combination makes an excellent approximation to the discrete Gaussian if the poles and zeros each do about half the smoothing. The t values for each portion of the smoothing (poles, zeros, forward and backward multiple applications, etc.) are additive, in accordance with the approximate semi-group property.

Z-plane locations of four poles (X) and four zeros (circles) for a smoothing filter using forward/backward biquad to smooth to a scale t = 2, with half the smoothing from the poles and half from the zeros. The zeros are all at Z = -1; the poles are at Z = 0.172 and Z = 5.83. The poles outside the unit circle are implemented by filtering backwards with the stable poles. Smoothing Filter Pole Zero Plot.svg
Z-plane locations of four poles (X) and four zeros (circles) for a smoothing filter using forward/backward biquad to smooth to a scale t = 2, with half the smoothing from the poles and half from the zeros. The zeros are all at Z = –1; the poles are at Z = 0.172 and Z = 5.83. The poles outside the unit circle are implemented by filtering backwards with the stable poles.

The FIR filter transfer function is closely related to the discrete Gaussian's DTFT, just as was the recursive filter's. For a single pair of zeros, the transfer function is

where the t parameter here is related to the zero positions Z = z via:

and we require t  0.5 to keep the transfer function non-negative.

Furthermore, such filters with N pairs of zeros, are an even better approximation to the exponential and extend to higher values of t :

where the stable zero positions are adjusted by solving:

These FIR and pole-zero filters are valid scale-space kernels, satisfying the same axioms as the all-pole recursive filters.

Real-time implementation within pyramids and discrete approximation of scale-normalized derivatives

Regarding the topic of automatic scale selection based on normalized derivatives, pyramid approximations are frequently used to obtain real-time performance. [11] [12] [13] The appropriateness of approximating scale-space operations within a pyramid originates from the fact that repeated cascade smoothing with generalized binomial kernels leads to equivalent smoothing kernels that under reasonable conditions approach the Gaussian. Furthermore, the binomial kernels (or more generally the class of generalized binomial kernels) can be shown to constitute the unique class of finite-support kernels that guarantee non-creation of local extrema or zero-crossings with increasing scale (see the article on multi-scale approaches for details). Special care may, however, need to be taken to avoid discretization artifacts.

Other multi-scale approaches

For one-dimensional kernels, there is a well-developed theory of multi-scale approaches, concerning filters that do not create new local extrema or new zero-crossings with increasing scales. For continuous signals, filters with real poles in the s-plane are within this class, while for discrete signals the above-described recursive and FIR filters satisfy these criteria. Combined with the strict requirement of a continuous semi-group structure, the continuous Gaussian and the discrete Gaussian constitute the unique choice for continuous and discrete signals.

There are many other multi-scale signal processing, image processing and data compression techniques, using wavelets and a variety of other kernels, that do not exploit or require the same requirements as scale space descriptions do; that is, they do not depend on a coarser scale not generating a new extremum that was not present at a finer scale (in 1D) or non-enhancement of local extrema between adjacent scale levels (in any number of dimensions).

See also

Related Research Articles

The Chebyshev polynomials are two sequences of polynomials related to the cosine and sine functions, notated as and . They can be defined several ways that have the same end result; in this article the polynomials are defined by starting with trigonometric functions:

In mathematics, a Gaussian function, often simply referred to as a Gaussian, is a function of the form

In probability theory and statistics, a Gaussian process is a stochastic process, such that every finite collection of those random variables has a multivariate normal distribution, i.e. every finite linear combination of them is normally distributed. The distribution of a Gaussian process is the joint distribution of all those random variables, and as such, it is a distribution over functions with a continuous domain, e.g. time or space.

Chebyshev filters are analog or digital filters having a steeper roll-off than Butterworth filters, and have passband ripple or stopband ripple. Chebyshev filters have the property that they minimize the error between the idealized and the actual filter characteristic over the range of the filter, but with ripples in the passband. This type of filter is named after Pafnuty Chebyshev because its mathematical characteristics are derived from Chebyshev polynomials. Type I Chebyshev filters are usually referred to as "Chebyshev filters", while type II filters are usually called "inverse Chebyshev filters".

In control theory and signal processing, a linear, time-invariant system is said to be minimum-phase if the system and its inverse are causal and stable.

Sinc function Special mathematical function defined as sin(x)/x

In mathematics, physics and engineering, the sinc function, denoted by sinc(x), has two forms, normalized and unnormalized.

In probability and statistics, a circular distribution or polar distribution is a probability distribution of a random variable whose values are angles, usually taken to be in the range [0, 2π). A circular distribution is often a continuous probability distribution, and hence has a probability density, but such distributions can also be discrete, in which case they are called circular lattice distributions. Circular distributions can be used even when the variables concerned are not explicitly angles: the main consideration is that there is not usually any real distinction between events occurring at the lower or upper end of the range, and the division of the range could notionally be made at any point.

Rice distribution Probability distribution

In probability theory, the Rice distribution or Rician distribution is the probability distribution of the magnitude of a circularly-symmetric bivariate normal random variable, possibly with non-zero mean (noncentral). It was named after Stephen O. Rice.

Arc length Distance along a curve

Arc length is the distance between two points along a section of a curve.

In electronics and signal processing, a Bessel filter is a type of analog linear filter with a maximally flat group/phase delay, which preserves the wave shape of filtered signals in the passband. Bessel filters are often used in audio crossover systems.

In mathematics, and specifically in potential theory, the Poisson kernel is an integral kernel, used for solving the two-dimensional Laplace equation, given Dirichlet boundary conditions on the unit disk. The kernel can be understood as the derivative of the Green's function for the Laplace equation. It is named for Siméon Poisson.

In image processing and computer vision, a scale space framework can be used to represent an image as a family of gradually smoothed images. This framework is very general and a variety of scale space representations exist. A typical approach for choosing a particular type of scale space representation is to establish a set of scale-space axioms, describing basic properties of the desired scale-space representation and often chosen so as to make the representation useful in practical applications. Once established, the axioms narrow the possible scale-space representations to a smaller class, typically with only a few free parameters.

The scale space representation of a signal obtained by Gaussian smoothing satisfies a number of special properties, scale-space axioms, which make it into a special form of multi-scale representation. There are, however, also other types of "multi-scale approaches" in the areas of computer vision, image processing and signal processing, in particular the notion of wavelets. The purpose of this article is to describe a few of these approaches:

Gaussian filter

In electronics and signal processing, a Gaussian filter is a filter whose impulse response is a Gaussian function. Gaussian filters have the properties of having no overshoot to a step function input while minimizing the rise and fall time. This behavior is closely connected to the fact that the Gaussian filter has the minimum possible group delay. It is considered the ideal time domain filter, just as the sinc is the ideal frequency domain filter. These properties are important in areas such as oscilloscopes and digital telecommunication systems.

Clenshaw–Curtis quadrature and Fejér quadrature are methods for numerical integration, or "quadrature", that are based on an expansion of the integrand in terms of Chebyshev polynomials. Equivalently, they employ a change of variables and use a discrete cosine transform (DCT) approximation for the cosine series. Besides having fast-converging accuracy comparable to Gaussian quadrature rules, Clenshaw–Curtis quadrature naturally leads to nested quadrature rules, which is important for both adaptive quadrature and multidimensional quadrature (cubature).

Stochastic approximation methods are a family of iterative methods typically used for root-finding problems or for optimization problems. The recursive update rules of stochastic approximation methods can be used, among other things, for solving linear systems when the collected data is corrupted by noise, or for approximating extreme values of functions which cannot be computed directly, but only estimated via noisy observations.

A ratio distribution is a probability distribution constructed as the distribution of the ratio of random variables having two other known distributions. Given two random variables X and Y, the distribution of the random variable Z that is formed as the ratio Z = X/Y is a ratio distribution.

Kepler orbit

In celestial mechanics, a Kepler orbit is the motion of one body relative to another, as an ellipse, parabola, or hyperbola, which forms a two-dimensional orbital plane in three-dimensional space. A Kepler orbit can also form a straight line. It considers only the point-like gravitational attraction of two bodies, neglecting perturbations due to gravitational interactions with other objects, atmospheric drag, solar radiation pressure, a non-spherical central body, and so on. It is thus said to be a solution of a special case of the two-body problem, known as the Kepler problem. As a theory in classical mechanics, it also does not take into account the effects of general relativity. Keplerian orbits can be parametrized into six orbital elements in various ways.

In fluid dynamics, the Oseen equations describe the flow of a viscous and incompressible fluid at small Reynolds numbers, as formulated by Carl Wilhelm Oseen in 1910. Oseen flow is an improved description of these flows, as compared to Stokes flow, with the (partial) inclusion of convective acceleration.

References

  1. 1 2 3 Lindeberg, T., "Scale-space for discrete signals," PAMI(12), No. 3, March 1990, pp. 234-254.
  2. 1 2 Lindeberg, T., Scale-Space Theory in Computer Vision, Kluwer Academic Publishers, 1994, ISBN   0-7923-9418-6
  3. R.A. Haddad and A.N. Akansu, "A Class of Fast Gaussian Binomial Filters for Speech and Image Processing," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 39, pp 723-727, March 1991.
  4. Campbell, J, 2007, The SMM model as a boundary value problem using the discrete diffusion equation , Theor Popul Biol. 2007 Dec;72(4):539-46.
  5. Lindeberg, T. Discrete derivative approximations with scale-space properties: A basis for low-level feature extraction, J. of Mathematical Imaging and Vision, 3(4), pp. 349--376, 1993.
  6. 1 2 Ian T. Young & Lucas J. van Vliet (1995). "Recursive implementation of the Gaussian filter". Signal Processing. 44 (2): 139–151. CiteSeerX   10.1.1.12.2826 . doi:10.1016/0165-1684(95)00020-E.
  7. Deriche, R: Recursively implementing the Gaussian and its derivatives, INRIA Research Report 1893, 1993.
  8. Richard F. Lyon. "Speech recognition in scale space," Proc. of 1987 ICASSP. San Diego, March, pp. 29.3.14, 1987.
  9. Jin, JS, Gao Y. "Recursive implementation of LoG Filtering". Real-Time Imaging 1997;3:59–65.
  10. . Sovira Tan; Jason L. Dale & Alan Johnston (2003). "Performance of three recursive algorithms for fast space-variant Gaussian filtering". Real-Time Imaging. Vol. 9 no. 3. pp. 215–228. doi:10.1016/S1077-2014(03)00040-8.
  11. Lindeberg, Tony & Bretzner, Lars (2003). Real-time scale selection in hybrid multi-scale representations. Proc. Scale-Space'03, Springer Lecture Notes in Computer Science. Lecture Notes in Computer Science. 2695. pp. 148–163. doi:10.1007/3-540-44935-3_11. ISBN   978-3-540-40368-5.
  12. Crowley, J, Riff O: Fast computation of scale normalised Gaussian receptive fields, Proc. Scale-Space'03, Isle of Skye, Scotland, Springer Lecture Notes in Computer Science, volume 2695, 2003.
  13. Lowe, D. G., “Distinctive image features from scale-invariant keypoints”, International Journal of Computer Vision, 60, 2, pp. 91-110, 2004.