Statistical region merging


Statistical region merging (SRM) is an algorithm used for image segmentation. [1] [2] The algorithm evaluates the values within a regional span and groups them according to a merging criterion, producing a smaller list. Examples include grouping the members of a population into generations or, in image processing, grouping neighboring pixels whose shades fall within a particular threshold (the qualification criterion).

For example, given 10 values of x (1.7, 1.8, 1.9, 3.2, 4.9, 5.1, 5.3, 5.6, 9, 10) within the range 0 < x ≤ 10, a statistical region-merging algorithm defines a merging criterion that is applied to merge the given values into a smaller number of values.

Suppose the merging criterion is a simple threshold check: values are merged when they lie within a range of 0.3 of one another, and each merged group is replaced by its average. One possible result for the values of x above is:

(1.7 + 1.8 + 1.9) / 3 = 5.4 / 3 = 1.8
3.2 / 1 = 3.2
4.9 / 1 = 4.9
(5.1 + 5.3) / 2 = 10.4 / 2 = 5.2
5.6 / 1 = 5.6
9 / 1 = 9
10 / 1 = 10

Thus, the resulting set is 1.8, 3.2, 4.9, 5.2, 5.6, 9, 10. Note that the result of SRM depends on the order in which the algorithm evaluates the values.
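
The toy example above can be sketched in Python. This is only an illustration under stated assumptions, not the statistical merging predicate from the cited papers: it assumes that "within 0.3 range" means the span (maximum minus minimum) of a merged group may not exceed 0.3, and the names `merge_values` and `pair_order` are hypothetical. The second call shows how a different evaluation order changes the result.

```python
def merge_values(values, threshold, pair_order=None):
    """Toy 1-D region merging (assumed criterion: merged span <= threshold)."""
    vals = sorted(values)
    n = len(vals)
    parent = list(range(n))                      # union-find forest
    lo, hi = list(vals), list(vals)              # per-region min and max
    total, count = [float(v) for v in vals], [1] * n

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]        # path halving
            i = parent[i]
        return i

    if pair_order is None:                       # default: neighbours, left to right
        pair_order = [(i, i + 1) for i in range(n - 1)]

    for a, b in pair_order:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        new_lo, new_hi = min(lo[ra], lo[rb]), max(hi[ra], hi[rb])
        # Small tolerance guards against floating-point noise in the span.
        if new_hi - new_lo <= threshold + 1e-9:
            parent[rb] = ra
            lo[ra], hi[ra] = new_lo, new_hi
            total[ra] += total[rb]
            count[ra] += count[rb]

    roots = sorted({find(i) for i in range(n)}, key=lambda r: lo[r])
    return [round(total[r] / count[r], 2) for r in roots]


xs = [1.7, 1.8, 1.9, 3.2, 4.9, 5.1, 5.3, 5.6, 9, 10]
neighbours = [(i, i + 1) for i in range(len(xs) - 1)]

# Evaluate the pair (5.1, 5.3) first, then the remaining neighbours.
article_order = [(5, 6)] + [p for p in neighbours if p != (5, 6)]
print(merge_values(xs, 0.3, article_order))  # [1.8, 3.2, 4.9, 5.2, 5.6, 9.0, 10.0]

# Plain left-to-right evaluation gives a different grouping.
print(merge_values(xs, 0.3))                 # [1.8, 3.2, 5.0, 5.45, 9.0, 10.0]
```

With the pair (5.1, 5.3) evaluated first, 4.9 and 5.6 can no longer join that group without exceeding the 0.3 span, which reproduces the resulting set given above. Left-to-right evaluation instead pairs 4.9 with 5.1 and 5.3 with 5.6.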

A major use of SRM is in image processing, where an image with a large color palette is converted to one with a smaller palette by merging similar colors. The merging criteria include allowed color ranges, minimum size of a region, maximum size of a region, allowed number of palettes, etc.
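
As a rough illustration of region merging on an image, the sketch below segments a small grayscale grid. It is a simplification under stated assumptions: neighboring pixel pairs (4-connectivity) are visited in order of increasing intensity difference, and two regions are merged when their mean intensities differ by at most a fixed threshold. That fixed threshold stands in for the statistical predicate of the cited papers, and the names `segment_gray` and `max_mean_diff` are hypothetical.

```python
def segment_gray(image, max_mean_diff):
    """Label regions of a 2-D grayscale grid (list of rows) by greedy merging."""
    h, w = len(image), len(image[0])
    parent = list(range(h * w))                  # union-find over pixels
    total = [float(image[r][c]) for r in range(h) for c in range(w)]
    count = [1] * (h * w)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]        # path halving
            i = parent[i]
        return i

    # Collect 4-connected neighbour pairs with their intensity difference.
    pairs = []
    for r in range(h):
        for c in range(w):
            here = r * w + c
            if c + 1 < w:
                pairs.append((abs(image[r][c] - image[r][c + 1]), here, here + 1))
            if r + 1 < h:
                pairs.append((abs(image[r][c] - image[r + 1][c]), here, here + w))
    pairs.sort()                                 # most similar pairs first

    for _, a, b in pairs:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        # Assumed criterion: region means must be close enough.
        if abs(total[ra] / count[ra] - total[rb] / count[rb]) <= max_mean_diff:
            parent[rb] = ra
            total[ra] += total[rb]
            count[ra] += count[rb]

    # Relabel regions 0, 1, 2, ... in raster-scan order.
    labels = {}
    return [[labels.setdefault(find(r * w + c), len(labels)) for c in range(w)]
            for r in range(h)]


image = [
    [10, 12, 200],
    [11, 13, 205],
    [90, 92, 210],
]
for row in segment_gray(image, max_mean_diff=10):
    print(row)
# [0, 0, 1]
# [0, 0, 1]
# [2, 2, 1]
```

Additional criteria mentioned above, such as a minimum or maximum region size, could be added as further conditions inside the merge test.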

Several implementations of SRM for color image segmentation are available: Java, [3] MATLAB, [4] Python, [5] and a demo applet. [3]

SRM has been used in many imaging applications, such as ClickRemoval [6] and Volume Catcher. [7]

References

  1. Nielsen, Frank; Nock, Richard (2003). "On region merging: The statistical soundness of fast sorting, with applications". 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings. Vol. 2. IEEE. pp. II:19–26. doi:10.1109/CVPR.2003.1211447. ISBN 0-7695-1900-8.
  2. Nock, Richard; Nielsen, Frank (November 2004). "Statistical Region Merging" (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. 26 (11): 1452–1458. CiteSeerX 10.1.1.1.1930. doi:10.1109/tpami.2004.110. PMID 15521493. S2CID 595377. Retrieved 19 October 2013.
  3. Nielsen, Frank; Nock, Richard (May 2006). "Statistical Region Merging in Java: SRMj". Laboratoire d'Informatique de l'École polytechnique. Retrieved 19 October 2013.
  4. Boltz, Sylvain. "Image segmentation using statistical region merging". Matlab Central. Retrieved 19 October 2013.
  5. Schwander, Olivier (2012). "Python-SRM — Statistical Region Merging in Python". Laboratoire d'Informatique de l'École polytechnique. Retrieved 19 October 2013.
  6. Nielsen, Frank; Nock, Richard (November 2005). "ClickRemoval: Interactive Pinpoint Image Object Removal" (PDF). MM'05: 315–318. Retrieved 19 October 2013.
  7. Owada, Shigeru; Nielsen, Frank; Igarashi, Takeo (2005). Volume Catcher (PDF). Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games. pp. 111–116. doi:10.1145/1053427.1053445. ISBN 978-1595930132. S2CID 16040481. Retrieved 19 October 2013.