Grassfire transform

In image processing, the grassfire transform is the computation of the distance from a pixel to the border of a region. It can be described as "setting fire" to the borders of an image region to yield descriptors such as the region's skeleton or medial axis. Harry Blum introduced the concept in 1967.[1]

Motivation

A region's skeleton can be a useful descriptor, because it describes things such as the symmetry of the region as well as subparts, depressions and protrusions.[2] It also provides a way of relating the interior of a region to the shape of the boundary. In the grassfire transform, the skeleton forms at the points in the region where the "fires" meet. In the literature this is described as the locus of meeting waveforms.[2]

Another advantage of using the outcome of the grassfire transform as a descriptor is that it is invertible: provided that information about when the medial axis or skeleton is created by the meeting waveforms is kept, the original region can be reconstructed by radiating outward from the skeleton.[1]

Example algorithm

The algorithm below is a simple two-pass method for computing the Manhattan distance from the border of a region; several other algorithms for performing the grassfire transform exist.

    for each row in the image, top to bottom:
        for each column, left to right:
            if pixel is in region:
                set pixel to 1 + minimum value of the north and west neighbours
            else:
                set pixel to zero

    for each row, bottom to top:
        for each column, right to left:
            if pixel is in region:
                set pixel to min(value of the pixel, 1 + minimum value of the south and east neighbours)
            else:
                set pixel to zero
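
The two passes translate directly into code. Below is a minimal sketch in Python with NumPy; the function name grassfire and the boolean-mask input are illustrative choices, and it assumes that pixels outside the image count as background, so the image edge acts as part of the border.

    import numpy as np

    def grassfire(region):
        # region: 2D boolean array, True for pixels inside the region.
        # Returns each pixel's Manhattan distance to the nearest point
        # outside the region (0 for background pixels).
        rows, cols = region.shape
        dist = np.zeros((rows, cols), dtype=int)

        # First pass (top to bottom, left to right): propagate from the
        # north and west neighbours. Out-of-image neighbours are taken
        # as 0, i.e. background (an assumption of this sketch).
        for i in range(rows):
            for j in range(cols):
                if region[i, j]:
                    north = dist[i - 1, j] if i > 0 else 0
                    west = dist[i, j - 1] if j > 0 else 0
                    dist[i, j] = 1 + min(north, west)

        # Second pass (bottom to top, right to left): propagate from the
        # south and east neighbours and keep the smaller value.
        for i in range(rows - 1, -1, -1):
            for j in range(cols - 1, -1, -1):
                if region[i, j]:
                    south = dist[i + 1, j] if i < rows - 1 else 0
                    east = dist[i, j + 1] if j < cols - 1 else 0
                    dist[i, j] = min(dist[i, j], 1 + min(south, east))

        return dist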

Below is the result of this transform. Note that the most intense lines make up the skeleton of the region.

Source image: Grassfire example before.png
Result image: Grassfire output example.jpeg
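
Since the skeleton appears as the ridge of the distance map, a rough extraction can be sketched on top of the hypothetical grassfire function above. This is only an approximation: a pixel is kept when none of its four neighbours has a larger distance value, which selects roughly the most intense lines of the result image.

    import numpy as np

    def ridge_pixels(dist):
        # Keep in-region pixels whose distance is not exceeded by any
        # 4-neighbour; these ridge pixels approximate the skeleton.
        padded = np.pad(dist, 1, mode="constant", constant_values=0)
        centre = padded[1:-1, 1:-1]
        neighbours = np.stack([
            padded[:-2, 1:-1],   # north
            padded[2:, 1:-1],    # south
            padded[1:-1, :-2],   # west
            padded[1:-1, 2:],    # east
        ])
        return (centre > 0) & (centre >= neighbours.max(axis=0))

    # e.g. skeleton = ridge_pixels(grassfire(region))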

Applications

The grassfire transform can be abstracted to suit a variety of computing problems. It has been shown that it can be extended beyond the context of images to arbitrary functions.[3] This includes applications in energy minimization problems such as those handled by the Viterbi algorithm, max-product belief propagation, resource allocation, and in optimal control methods.[3]
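
In one dimension this generalization can be sketched with the same two-pass pattern as above (an illustrative sketch, not the authors' implementation): the transform of a sampled function f is D(p) = min over q of (f(q) + |p - q|), which reduces to the grassfire distance when f is zero on the background and effectively infinite inside the region.

    def distance_transform_1d(f):
        # D[p] = min over all q of (f[q] + |p - q|), computed in two sweeps.
        d = list(f)
        for p in range(1, len(d)):            # left-to-right sweep
            d[p] = min(d[p], d[p - 1] + 1)
        for p in range(len(d) - 2, -1, -1):   # right-to-left sweep
            d[p] = min(d[p], d[p + 1] + 1)
        return d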

It can also be used to compute the distance between regions by treating the background as a region.
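
Continuing the earlier sketch, this amounts to inverting the mask before running the transform:

    # Treat the background as the region: each background pixel then holds
    # its Manhattan distance to the nearest foreground region, from which
    # the separation between regions can be read off.
    gap = grassfire(~region)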

Related Research Articles

Digital geometry deals with discrete sets considered to be digitized models or images of objects of the 2D or 3D Euclidean space.

Ray casting is the methodological basis for 3D CAD/CAM solid modeling and image rendering. It is essentially the same as ray tracing for computer graphics, where virtual light rays are "cast" or "traced" on their path from the focal point of a camera through each pixel in the camera sensor to determine what is visible along the ray in the 3D scene. The term "ray casting" was introduced by Scott Roth while at the General Motors Research Labs from 1978–1980. His paper, "Ray Casting for Modeling Solids", describes modeling solid objects by combining primitive solids, such as blocks and cylinders, using the set operators union (+), intersection (&), and difference (-). The general idea of using these binary operators for solid modeling is largely due to Voelcker and Requicha's geometric modelling group at the University of Rochester. See Solid modeling for a broad overview of solid modeling methods.

In digital image processing and computer vision, image segmentation is the process of partitioning a digital image into multiple image segments, also known as image regions or image objects. The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics.

A distance transform, also known as distance map or distance field, is a derived representation of a digital image. The choice of the term depends on the point of view on the object in question: whether the initial image is transformed into another representation, or it is simply endowed with an additional map or field.

The scale-invariant feature transform (SIFT) is a computer vision algorithm to detect, describe, and match local features in images, invented by David Lowe in 1999. Applications include object recognition, robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, individual identification of wildlife and match moving.

In shape analysis, the skeleton of a shape is a thin version of that shape that is equidistant to its boundaries. The skeleton usually emphasizes geometrical and topological properties of the shape, such as its connectivity, topology, length, direction, and width. Together with the distance of its points to the shape boundary, the skeleton can also serve as a representation of the shape.

In computer vision and image processing, a feature is a piece of information about the content of an image; typically about whether a certain region of the image has certain properties. Features may be specific structures in the image such as points, edges or objects. Features may also be the result of a general neighborhood operation or feature detection applied to the image. Other examples of features are related to motion in image sequences, or to shapes defined in terms of curves or boundaries between different image regions.

The medial axis of an object is the set of all points having more than one closest point on the object's boundary. Originally referred to as the topological skeleton, it was introduced in 1967 by Harry Blum as a tool for biological shape recognition. In mathematics the closure of the medial axis is known as the cut locus.

In computer vision, blob detection methods are aimed at detecting regions in a digital image that differ in properties, such as brightness or color, compared to surrounding regions. Informally, a blob is a region of an image in which some properties are constant or approximately constant; all the points in a blob can be considered in some sense to be similar to each other. The most common method for blob detection is convolution.

Connected-component labeling (CCL), connected-component analysis (CCA), blob extraction, region labeling, blob discovery, or region extraction is an algorithmic application of graph theory, where subsets of connected components are uniquely labeled based on a given heuristic. Connected-component labeling is not to be confused with segmentation.

Image rectification is a transformation process used to project images onto a common image plane. This process has several degrees of freedom and there are many strategies for transforming images to the common plane. Image rectification is used in computer stereo vision to simplify the problem of finding matching points between images, and in geographic information systems to merge images taken from multiple perspectives into a common map coordinate system.

The histogram of oriented gradients (HOG) is a feature descriptor used in computer vision and image processing for the purpose of object detection. The technique counts occurrences of gradient orientation in localized portions of an image. This method is similar to that of edge orientation histograms, scale-invariant feature transform descriptors, and shape contexts, but differs in that it is computed on a dense grid of uniformly spaced cells and uses overlapping local contrast normalization for improved accuracy.

In computer vision, maximally stable extremal regions (MSER) are used as a method of blob detection in images. This technique was proposed by Matas et al. to find correspondences between image elements from two images with different viewpoints. This method of extracting a comprehensive number of corresponding image elements contributes to the wide-baseline matching, and it has led to better stereo matching and object recognition algorithms.

Adaptive histogram equalization (AHE) is a computer image processing technique used to improve contrast in images. It differs from ordinary histogram equalization in the respect that the adaptive method computes several histograms, each corresponding to a distinct section of the image, and uses them to redistribute the lightness values of the image. It is therefore suitable for improving the local contrast and enhancing the definitions of edges in each region of an image.

Image segmentation strives to partition a digital image into regions of pixels with similar properties, e.g. homogeneity. The higher-level region representation simplifies image analysis tasks such as counting objects or detecting changes, because region attributes can be compared more readily than raw pixels.

In computer science, the Wagner–Fischer algorithm is a dynamic programming algorithm that computes the edit distance between two strings of characters.

In image processing, line detection is an algorithm that takes a collection of n edge points and finds all the lines on which these edge points lie. The most popular line detectors are the Hough transform and convolution-based techniques.

Discrete Skeleton Evolution (DSE) describes an iterative approach to reducing a morphological or topological skeleton. It is a form of pruning in that it removes noisy or redundant branches (spurs) generated by the skeletonization process, while preserving information-rich "trunk" segments. The value assigned to individual branches varies from algorithm to algorithm, with the general goal being to convey the features of interest of the original contour with a few carefully chosen lines; usually, clarity for human vision is valued as well. DSE algorithms are distinguished by complex, recursive decision-making processes with high computational requirements. Pruning methods such as structuring-element (SE) convolution and the Hough transform are general-purpose algorithms that quickly pass through an image and eliminate all branches shorter than a given threshold. DSE methods are most applicable when detail retention and contour reconstruction are valued.

Semi-global matching (SGM) is a computer vision algorithm for the estimation of a dense disparity map from a rectified stereo image pair, introduced in 2005 by Heiko Hirschmüller while working at the German Aerospace Center. Given its predictable run time, its favourable trade-off between quality of the results and computing time, and its suitability for fast parallel implementation in ASIC or FPGA, it has encountered wide adoption in real-time stereo vision applications such as robotics and advanced driver assistance systems.

References

  1. Blum, Harry (1967). "A transformation for extracting new descriptors of shape". In Wathen-Dunn, Weiant (ed.). Models for the Perception of Speech and Visual Form (PDF). Cambridge, Massachusetts: MIT Press. pp. 362–380.
  2. Leymarie, F.; Levine, M. D. (1992). "Simulating the grassfire transform using an active contour model". IEEE Transactions on Pattern Analysis and Machine Intelligence. 14: 56–75. doi:10.1109/34.107013.
  3. Felzenszwalb, Pedro F.; Huttenlocher, Daniel P. (2012). "Distance Transforms of Sampled Functions". Theory of Computing. 8: 415–428. CiteSeerX 10.1.1.88.1647. doi:10.4086/toc.2012.v008a019.