Difference of Gaussians

Last updated

In imaging science, difference of Gaussians (DoG) is a feature enhancement algorithm that involves the subtraction of one Gaussian blurred version of an original image from another, less blurred version of the original. In the simple case of grayscale images, the blurred images are obtained by convolving the original grayscale images with Gaussian kernels having differing width (standard deviations). Blurring an image using a Gaussian kernel suppresses only high-frequency spatial information. Subtracting one image from the other preserves spatial information that lies between the range of frequencies that are preserved in the two blurred images. Thus, the DoG is a spatial band-pass filter that attenuates frequencies in the original grayscale image that are far from the band center. [1]

Contents

Mathematics of difference of Gaussians

Comparison of difference of Gaussian with Mexican hat wavelet DOG vs MHF.png
Comparison of difference of Gaussian with Mexican hat wavelet

Given an n-dimensional gray-scale image The difference of Gaussians (DoG) of the image is the function obtained by subtracting the image convolved with the Gaussian of standard deviation from the image convolved with a Gaussian of narrower standard deviation :

where is a Gaussian with standard deviation :

Equivalently one can write

which represents an image convolved by the difference of two Gaussians, which approximates a Mexican hat function.

The relation between the difference of Gaussians operator and the Laplacian of the Gaussian operator (the Mexican hat wavelet) is explained in appendix A in Lindeberg (2015). [2]

Details and applications

Example before difference of Gaussians Flowers before difference of gaussians.jpg
Example before difference of Gaussians
After difference of Gaussians filtering in black and white Flowers after difference of gaussians grayscale.jpg
After difference of Gaussians filtering in black and white

As a feature enhancement algorithm, the difference of Gaussians can be utilized to increase the visibility of edges and other detail present in a digital image. A wide variety of alternative edge sharpening filters operate by enhancing high frequency detail, but because random noise also has a high spatial frequency, many of these sharpening filters tend to enhance noise, which can be an undesirable artifact. The difference of Gaussians algorithm removes high frequency detail that often includes random noise, rendering this approach one of the most suitable for processing images with a high degree of noise. A major drawback to application of the algorithm is an inherent reduction in overall image contrast produced by the operation. [1]

When utilized for image enhancement, the difference of Gaussians algorithm is typically applied when the size ratio of kernel (2) to kernel (1) is 4:1 or 5:1. In the example images to the right, the sizes of the Gaussian kernels employed to smooth the sample image were 10 pixels and 5 pixels.

The algorithm can also be used to obtain an approximation of the Laplacian of Gaussian when the ratio of size 2 to size 1 is roughly equal to 1.6. [3] The Laplacian of Gaussian is useful for detecting edges that appear at various image scales or degrees of image focus. The exact values of sizes of the two kernels that are used to approximate the Laplacian of Gaussian will determine the scale of the difference image, which may appear blurry as a result.

Differences of Gaussians have also been used for blob detection in the scale-invariant feature transform. In fact, the DoG as the difference of two Multivariate normal distribution has always a total null sum and convolving it with a uniform signal generates no response. It approximates well a second derivate of Gaussian (Laplacian of Gaussian) with K~1.6 and the receptive fields of ganglion cells in the retina with K~5. It may easily be used in recursive schemes and is used as an operator in real-time algorithms for blob detection and automatic scale selection.

More information

In its operation, the difference of Gaussians algorithm is believed to mimic how neural processing in the retina of the eye extracts details from images destined for transmission to the brain. [4] [5] [6]

See also

Related Research Articles

In mathematics, a Gaussian function, often simply referred to as a Gaussian, is a function of the base form

In probability theory and statistics, a Gaussian process is a stochastic process, such that every finite collection of those random variables has a multivariate normal distribution. The distribution of a Gaussian process is the joint distribution of all those random variables, and as such, it is a distribution over functions with a continuous domain, e.g. time or space.

<span class="mw-page-title-main">Canny edge detector</span> Image edge detection algorithm

The Canny edge detector is an edge detection operator that uses a multi-stage algorithm to detect a wide range of edges in images. It was developed by John F. Canny in 1986. Canny also produced a computational theory of edge detection explaining why the technique works.

<span class="mw-page-title-main">Ricker wavelet</span> Wavelet proportional to the second derivative of a Gaussian

In mathematics and numerical analysis, the Ricker wavelet

The scale-invariant feature transform (SIFT) is a computer vision algorithm to detect, describe, and match local features in images, invented by David Lowe in 1999. Applications include object recognition, robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, individual identification of wildlife and match moving.

<span class="mw-page-title-main">Gabor filter</span> Linear filter used for texture analysis

In image processing, a Gabor filter, named after Dennis Gabor, who first proposed it as a 1D filter. The Gabor filter was first generalized to 2D by Gösta Granlund, by adding a reference direction. The Gabor filter is a linear filter used for texture analysis, which essentially means that it analyzes whether there is any specific frequency content in the image in specific directions in a localized region around the point or region of analysis. Frequency and orientation representations of Gabor filters are claimed by many contemporary vision scientists to be similar to those of the human visual system. They have been found to be particularly appropriate for texture representation and discrimination. In the spatial domain, a 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave.

Scale-space theory is a framework for multi-scale signal representation developed by the computer vision, image processing and signal processing communities with complementary motivations from physics and biological vision. It is a formal theory for handling image structures at different scales, by representing an image as a one-parameter family of smoothed images, the scale-space representation, parametrized by the size of the smoothing kernel used for suppressing fine-scale structures. The parameter in this family is referred to as the scale parameter, with the interpretation that image structures of spatial size smaller than about have largely been smoothed away in the scale-space level at scale .

<span class="mw-page-title-main">Gaussian blur</span> Type of image blur produced by a Gaussian function

In image processing, a Gaussian blur is the result of blurring an image by a Gaussian function.

<span class="mw-page-title-main">Corner detection</span> Approach used in computer vision systems

Corner detection is an approach used within computer vision systems to extract certain kinds of features and infer the contents of an image. Corner detection is frequently used in motion detection, image registration, video tracking, image mosaicing, panorama stitching, 3D reconstruction and object recognition. Corner detection overlaps with the topic of interest point detection.

<span class="mw-page-title-main">Gaussian filter</span> Filter in electronics and signal processing

In electronics and signal processing, mainly in digital signal processing, a Gaussian filter is a filter whose impulse response is a Gaussian function. Gaussian filters have the properties of having no overshoot to a step function input while minimizing the rise and fall time. This behavior is closely connected to the fact that the Gaussian filter has the minimum possible group delay. A Gaussian filter will have the best combination of suppression of high frequencies while also minimizing spatial spread, being the critical point of the uncertainty principle. These properties are important in areas such as oscilloscopes and digital telecommunication systems.

In computer vision, blob detection methods are aimed at detecting regions in a digital image that differ in properties, such as brightness or color, compared to surrounding regions. Informally, a blob is a region of an image in which some properties are constant or approximately constant; all the points in a blob can be considered in some sense to be similar to each other. The most common method for blob detection is by using convolution.

Affine shape adaptation is a methodology for iteratively adapting the shape of the smoothing kernels in an affine group of smoothing kernels to the local image structure in neighbourhood region of a specific image point. Equivalently, affine shape adaptation can be accomplished by iteratively warping a local image patch with affine transformations while applying a rotationally symmetric filter to the warped image patches. Provided that this iterative process converges, the resulting fixed point will be affine invariant. In the area of computer vision, this idea has been used for defining affine invariant interest point operators as well as affine invariant texture analysis methods.

Mean shift is a non-parametric feature-space mathematical analysis technique for locating the maxima of a density function, a so-called mode-seeking algorithm. Application domains include cluster analysis in computer vision and image processing.

In computer vision, speeded up robust features (SURF) is a patented local feature detector and descriptor. It can be used for tasks such as object recognition, image registration, classification, or 3D reconstruction. It is partly inspired by the scale-invariant feature transform (SIFT) descriptor. The standard version of SURF is several times faster than SIFT and claimed by its authors to be more robust against different image transformations than SIFT.

In the fields of computer vision and image analysis, the Harris affine region detector belongs to the category of feature detection. Feature detection is a preprocessing step of several algorithms that rely on identifying characteristic points or interest points so to make correspondences between images, recognize textures, categorize objects or build panoramas.

The Hessian affine region detector is a feature detector used in the fields of computer vision and image analysis. Like other feature detectors, the Hessian affine detector is typically used as a preprocessing step to algorithms that rely on identifiable, characteristic interest points.

<span class="mw-page-title-main">Bilateral filter</span> Smoothing filler for images

A bilateral filter is a non-linear, edge-preserving, and noise-reducing smoothing filter for images. It replaces the intensity of each pixel with a weighted average of intensity values from nearby pixels. This weight can be based on a Gaussian distribution. Crucially, the weights depend not only on Euclidean distance of pixels, but also on the radiometric differences. This preserves sharp edges.

<span class="mw-page-title-main">Pyramid (image processing)</span> Type of multi-scale signal representation

Pyramid, or pyramid representation, is a type of multi-scale signal representation developed by the computer vision, image processing and signal processing communities, in which a signal or an image is subject to repeated smoothing and subsampling. Pyramid representation is a predecessor to scale-space representation and multiresolution analysis.

In image processing and computer vision, anisotropic diffusion, also called Perona–Malik diffusion, is a technique aiming at reducing image noise without removing significant parts of the image content, typically edges, lines or other details that are important for the interpretation of the image. Anisotropic diffusion resembles the process that creates a scale space, where an image generates a parameterized family of successively more and more blurred images based on a diffusion process. Each of the resulting images in this family are given as a convolution between the image and a 2D isotropic Gaussian filter, where the width of the filter increases with the parameter. This diffusion process is a linear and space-invariant transformation of the original image. Anisotropic diffusion is a generalization of this diffusion process: it produces a family of parameterized images, but each resulting image is a combination between the original image and a filter that depends on the local content of the original image. As a consequence, anisotropic diffusion is a non-linear and space-variant transformation of the original image.

In signal processing it is useful to simultaneously analyze the space and frequency characteristics of a signal. While the Fourier transform gives the frequency information of the signal, it is not localized. This means that we cannot determine which part of a signal produced a particular frequency. It is possible to use a short time Fourier transform for this purpose, however the short time Fourier transform limits the basis functions to be sinusoidal. To provide a more flexible space-frequency signal decomposition several filters have been proposed. The Log-Gabor filter is one such filter that is an improvement upon the original Gabor filter. The advantage of this filter over the many alternatives is that it better fits the statistics of natural images compared with Gabor filters and other wavelet filters.

References

  1. 1 2 "Molecular Expressions Microscopy Primer: Digital Image Processing – Difference of Gaussians Edge Enhancement Algorithm", Olympus America Inc., and Florida State University Michael W. Davidson, Mortimer Abramowitz
  2. Lindeberg, Tony (2015). "Image Matching Using Generalized Scale-Space Interest Points". Journal of Mathematical Imaging and Vision. 52: 3–36. doi: 10.1007/s10851-014-0541-0 . S2CID   254657377.
  3. D. Marr; E. Hildreth (29 February 1980). "Theory of Edge Detection". Proceedings of the Royal Society of London. Series B, Biological Sciences. 207 (1167): 215–217. Bibcode:1980RSPSB.207..187M. doi:10.1098/rspb.1980.0020. JSTOR   35407. PMID   6102765. S2CID   2150419. — A difference of Gaussians of any scale is an approximation to the laplacian of the Gaussian (see the entry for difference of Gaussians under Blob detection). However, Marr and Hildreth recommend the ratio of 1.6 because of design considerations balancing bandwidth and sensitivity. The url for this reference may only make the first page and abstract of the article available depending on if you are connecting through an academic institution or not.
  4. C. Enroth-Cugell; J. G. Robson (1966). "The Contrast Sensitivity of Retinal Ganglion Cells of the Cat". Journal of Physiology. 187 (3): 517–23. doi:10.1113/jphysiol.1966.sp008107. PMC   1395960 . PMID   16783910.
  5. Matthew J. McMahon; Orin S. Packer; Dennis M. Dacey (April 14, 2004). "The Classical Receptive Field Surround of Primate Parasol Ganglion Cells Is Mediated Primarily by a Non-GABAergic Pathway" (PDF). Journal of Neuroscience. 24 (15): 3736–3745. doi:10.1523/JNEUROSCI.5252-03.2004. PMC   6729348 . PMID   15084653.
  6. Young, Richard (1987). "The Gaussian derivative model for spatial vision: I. Retinal mechanisms". Spatial Vision. 2 (4): 273–293(21). doi:10.1163/156856887X00222. PMID   3154952.

Further reading