Thresholding (image processing)

Last updated
Original image Pavlovsk Railing of bridge Yellow palace Winter.jpg
Original image
Example of a threshold effect used on an image Pavlovsk Railing of bridge Yellow palace Winter bw threshold.jpg
Example of a threshold effect used on an image

In digital image processing, thresholding is the simplest method of segmenting images. From a grayscale image, thresholding can be used to create binary images. [1]



The simplest thresholding methods replace each pixel in an image with a black pixel if the image intensity is less than some fixed constant T (that is, ), or a white pixel if the image intensity is greater than that constant. In the example image on the right, this results in the dark tree becoming completely black, and the white snow becoming completely white.

Categorizing thresholding methods

To make thresholding completely automated, it is necessary for the computer to automatically select the threshold T. Sezgin and Sankur (2004) categorize thresholding methods into the following six groups based on the information the algorithm manipulates (Sezgin et al., 2004):

Multiband thresholding

Colour images can also be thresholded. One approach is to designate a separate threshold for each of the RGB components of the image and then combine them with an AND operation. This reflects the way the camera works and how the data is stored in the computer, but it does not correspond to the way that people recognize colour. Therefore, the HSL and HSV colour models are more often used; note that since hue is a circular quantity it requires circular thresholding. It is also possible to use the CMYK colour model (Pham et al., 2007).

Probability distributions

Histogram shape-based methods in particular, but also many other thresholding algorithms, make certain assumptions about the image intensity probability distribution. The most common thresholding methods work on bimodal distributions, but algorithms have also been developed for unimodal distributions, multimodal distributions, and circular distributions.

Automatic thresholding

Automatic thresholding is a great way to extract useful information encoded into pixels while minimizing background noise. This is accomplished by utilizing a feedback loop to optimize the threshold value before converting the original grayscale image to binary. The idea is to separate the image into two parts; the background and foreground. [3]

  1. Select initial threshold value, typically the mean 8-bit value of the original image.
  2. Divide the original image into two portions;
    1. Pixel values that are less than or equal to the threshold; background
    2. Pixel values greater than the threshold; foreground
  3. Find the average mean values of the two new images
  4. Calculate the new threshold by averaging the two means.
  5. If the difference between the previous threshold value and the new threshold value are below a specified limit, you are finished. Otherwise apply the new threshold to the original image keep trying.

Note about limits and threshold selection

The limit mentioned above is user definable. A larger limit will allow a greater difference between successive threshold values. Advantages of this can be quicker execution but with a less clear boundary between background and foreground. Picking starting thresholds is often done by taking the mean value of the grayscale image. However, it is also possible to pick out the starting threshold values based on the two well separated peaks of the image histogram and finding the average pixel value of those points. This can allow the algorithm to converge faster; allowing a much smaller limit to be chosen.

Method limitations

Automatic thresholding will work best when a good background to foreground contrast ratio exists. Meaning the picture must be taken in good lighting conditions with minimal glare.

See also

Related Research Articles

Binary image image comprising exactly two colors, typically black and white

A binary image is one that consists of pixels that can have one of exactly two colors, usually black and white. Binary images are also called bi-level or two-level. This means that each pixel is stored as a single bit—i.e., a 0 or 1. The names black-and-white, B&W, monochrome or monochromatic are often used for this concept, but may also designate any images that have only one sample per pixel, such as grayscale images. In Photoshop parlance, a binary image is the same as an image in "Bitmap" mode.

In digital photography, computer-generated imagery, and colorimetry, a grayscale or greyscale image is one in which the value of each pixel is a single sample representing only an amount of light; that is, it carries only intensity information. Grayscale images, a kind of black-and-white or gray monochrome, are composed exclusively of shades of gray. The contrast ranges from black at the weakest intensity to white at the strongest.

In image processing and photography, a color histogram is a representation of the distribution of colors in an image. For digital images, a color histogram represents the number of pixels that have colors in each of a fixed list of color ranges, that span the image's color space, the set of all possible colors.

Noise reduction is the process of removing noise from a signal. Noise reduction techniques exist for audio and images. Noise reduction algorithms tend to alter signals to a greater or lesser degree.

Image segmentation Division of an image into sets of pixels for further processing

In digital image processing and computer vision, image segmentation is the process of partitioning a digital image into multiple segments. The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics.

Color digital images are made of pixels, and pixels are made of combinations of primary colors represented by a series of code. A channel in this context is the grayscale image of the same size as a color image, made of just one of these primary colors. For instance, an image from a standard digital camera will have a red, green and blue channel. A grayscale image has just one channel.

Otsus method in computer vision and image processing

In computer vision and image processing, Otsu's method, named after Nobuyuki Otsu, is used to perform automatic image thresholding. In the simplest form, the algorithm returns a single intensity threshold that separate pixels into two classes, foreground and background. This threshold is determined by minimizing intra-class intensity variance, or equivalently, by maximizing inter-class variance. Otsu's method is a one-dimensional discrete analog of Fisher's Discriminant Analysis, is related to Jenks optimization method, and is equivalent to a globally optimal k-means performed on the intensity histogram. The extension to multi-level thresholding was described in the original paper, and computationally efficient implementations have since been proposed.

In image processing, normalization is a process that changes the range of pixel intensity values. Applications include photographs with poor contrast due to glare, for example. Normalization is sometimes called contrast stretching or histogram stretching. In more general fields of data processing, such as digital signal processing, it is referred to as dynamic range expansion.

Histogram equalization

Histogram equalization is a method in image processing of contrast adjustment using the image's histogram. Histogram equalization is the best method for image enhancement. It provides better quality of images without loss of any information.

As applied in the field of computer vision, graph cut optimization can be employed to efficiently solve a wide variety of low-level computer vision problems, such as image smoothing, the stereo correspondence problem, image segmentation, object co-segmentation, and many other computer vision problems that can be formulated in terms of energy minimization. Many of these energy minimization problems can be approximated by solving a maximum flow problem in a graph. Under most formulations of such problems in computer vision, the minimum energy solution corresponds to the maximum a posteriori estimate of a solution. Although many computer vision algorithms involve cutting a graph, the term "graph cuts" is applied specifically to those models which employ a max-flow/min-cut optimization.

In computer vision, maximally stable extremal regions (MSER) are used as a method of blob detection in images. This technique was proposed by Matas et al. to find correspondences between image elements from two images with different viewpoints. This method of extracting a comprehensive number of corresponding image elements contributes to the wide-baseline matching, and it has led to better stereo matching and object recognition algorithms.

Adaptive histogram equalization (AHE) is a computer image processing technique used to improve contrast in images. It differs from ordinary histogram equalization in the respect that the adaptive method computes several histograms, each corresponding to a distinct section of the image, and uses them to redistribute the lightness values of the image. It is therefore suitable for improving the local contrast and enhancing the definitions of edges in each region of an image.

Region growing is a simple region-based image segmentation method. It is also classified as a pixel-based image segmentation method since it involves the selection of initial seed points.

In various science/engineering applications, such as independent component analysis, image analysis, genetic analysis, speech recognition, manifold learning, evaluation of the status of biological systems and time delay estimation it is useful to estimate the differential entropy of a system or process, given some observations.

In mathematical morphology and digital image processing, top-hat transform is an operation that extracts small elements and details from given images. There exist two types of top-hat transform: the white top-hat transform is defined as the difference between the input image and its opening by some structuring element, while the black top-hat transform is defined dually as the difference between the closing and the input image. Top-hat transforms are used for various image processing tasks, such as feature extraction, background equalization, image enhancement, and others.

Local binary patterns (LBP) is a type of visual descriptor used for classification in computer vision. LBP is the particular case of the Texture Spectrum model proposed in 1990. LBP was first described in 1994. It has since been found to be a powerful feature for texture classification; it has further been determined that when LBP is combined with the Histogram of oriented gradients (HOG) descriptor, it improves the detection performance considerably on some datasets. A comparison of several improvements of the original LBP in the field of background subtraction was made in 2015 by Silva et al. A full survey of the different versions of LBP can be found in Bouwmans et al.

Histogram matching

In image processing, histogram matching or histogram specification is the transformation of an image so that its histogram matches a specified histogram. The well-known histogram equalization method is a special case in which the specified histogram is uniformly distributed.

Color normalization is a topic in computer vision concerned with artificial color vision and object recognition. In general, the distribution of color values in an image depends on the illumination, which may vary depending on lighting conditions, cameras, and other factors. Color normalisation allows for object recognition techniques based on colour to compensate for these variations.

Foreground detection

Foreground detection is one of the major tasks in the field of computer vision and image processing whose aim is to detect changes in image sequences. Background subtraction is any technique which allows an image's foreground to be extracted for further processing.

Circular thresholding

Circular thresholding is an algorithm for automatic image threshold selection in image processing. Most threshold selection algorithms assume that the values lie on a linear scale. However, some quantities such as hue and orientation are a circular quantity, and therefore require circular thresholding algorithms. The example shows that the standard linear version of Otsu's Method when applied to the hue channel of an image of blood cells fails to correctly segment the large white blood cells (leukocytes). In contrast the white blood cells are correctly segmented by the circular version of Otsu's Method.


  1. (Shapiro, et al. 2001:83)
  2. Zhang, Y. (2011). "Optimal multi-level Thresholding based on Maximum Tsallis Entropy via an Artificial Bee Colony Approach". Entropy. 13 (4): 841–859. Bibcode:2011Entrp..13..841Z. doi: 10.3390/e13040841 .
  3. E., Umbaugh, Scott (2017-11-30). Digital Image Processing and Analysis with MATLAB and CVIPtools, Third Edition (3rd ed.). ISBN   9781498766074. OCLC   1016899766.


Further reading