Color normalization

Color normalization is a topic in computer vision concerned with artificial color vision and object recognition. In general, the distribution of color values in an image depends on the illumination, which may vary depending on lighting conditions, cameras, and other factors. Color normalization allows for object recognition techniques based on color to compensate for these variations.

Main concepts

Color constancy

Color constancy is a feature of the human internal model of perception, which provides humans with the ability to assign a relatively constant color to objects even under different illumination conditions. This is helpful for object recognition as well as identification of light sources in an environment. For example, humans perceive an object as approximately the same color whether the sunlight falling on it is bright or dim.

Applications

Color normalization has been used for object recognition on color images in the fields of robotics, bioinformatics and general artificial intelligence, where it is important to remove intensity information from the image while preserving color values. One example is a scene shot by a surveillance camera over the course of a day, where it is important to remove shadows and lighting changes on pixels of the same color in order to recognize the people who passed.[1] Another example is automated screening tools used for the detection of diabetic retinopathy[2] as well as molecular diagnosis of cancer states,[3] where it is important to include color information during classification.


Known issues

The main issue with certain applications of color normalization is that the end result can look unnatural or too distant from the original colors.[4] This can be problematic in cases where subtle variations between important image regions must be preserved. More specifically, a side effect is that pixels may diverge from, and no longer reflect, the actual color values of the image. One way of combating this issue is to use color normalization in combination with thresholding to correctly and consistently segment a colored image.[5]

Transformations and algorithms

There is a vast array of transformations and algorithms for achieving color normalization, and only a limited selection is presented here. The performance of an algorithm depends on the task, and an algorithm that performs better than another on one task may perform worse on another (no free lunch theorem). Additionally, the choice of algorithm depends on the user's preferences for the end result, e.g. they may want a more natural-looking color image.

Grey world

The grey world normalization makes the assumption that changes in the lighting spectrum can be modelled by three constant factors applied to the red, green and blue channels of color.[6] More specifically, a change in illuminant color can be modelled as a scaling α, β and γ of the R, G and B color channels, and as such the grey world algorithm is invariant to illuminant color variations. Therefore, a constancy solution can be achieved by dividing each color channel by its average value, as shown in the following formula:

$$R' = \frac{R}{\operatorname{avg}(R)}, \qquad G' = \frac{G}{\operatorname{avg}(G)}, \qquad B' = \frac{B}{\operatorname{avg}(B)}$$

As mentioned above, grey world color normalization is invariant to illuminant color variations α, β and γ. However, it has one important problem: it does not account for all variations of illumination intensity and it is not dynamic, so it fails when new objects appear in the scene.[6] To address this problem there are several variants of the grey world algorithm.[6] There is also an iterative variation of grey world normalization, but it was not found to perform significantly better.[7]
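
As an illustration, the following is a minimal NumPy sketch of the basic grey world step described above; the function name, the floating-point output and the guard against zero-mean channels are assumptions of this example rather than part of any particular published variant.

```python
import numpy as np

def grey_world(image: np.ndarray) -> np.ndarray:
    """Grey world normalization: divide each channel of an H x W x 3 image
    by its mean, cancelling a per-channel illuminant scaling (alpha, beta, gamma)."""
    img = image.astype(np.float64)
    means = img.reshape(-1, 3).mean(axis=0)       # per-channel averages
    means = np.where(means > 0, means, 1.0)       # avoid division by zero
    return img / means                            # each channel now has mean 1
```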

Histogram equalization

Histogram equalization is a non-linear transform which maintains pixel rank and is capable of normalizing for any monotonically increasing color transform function. It is considered a more powerful normalization transformation than the grey world method. The results of histogram equalization tend to have an exaggerated blue channel and look unnatural, because in most images the distribution of pixel values is closer to a Gaussian than to a uniform distribution.[5]
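
For concreteness, a minimal sketch of per-channel histogram equalization for 8-bit RGB images is given below; the helper names are illustrative and the lookup-table construction is one common way of realizing the rank-preserving mapping.

```python
import numpy as np

def equalize_channel(channel: np.ndarray) -> np.ndarray:
    """Equalize one 8-bit channel by mapping values through its normalized CDF."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]                                  # cumulative distribution in [0, 1]
    lut = np.round(255 * cdf).astype(np.uint8)      # monotone, rank-preserving lookup table
    return lut[channel]

def equalize_rgb(image: np.ndarray) -> np.ndarray:
    """Apply histogram equalization independently to the R, G and B channels."""
    return np.stack([equalize_channel(image[..., c]) for c in range(3)], axis=-1)
```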

Histogram specification

Histogram specification transforms the red, green and blue histograms to match the shapes of three specific histograms, rather than simply equalizing them. It refers to a class of image transforms whose aim is to obtain images whose histograms have a desired shape.[2] First, it is necessary to convert the image so that it has a particular histogram. Assume an image $x$ with histogram $p_x$. The following formula is the equalization transform of this image:

$$y = T(x) = \int_0^x p_x(u)\,du$$

Then assume the wanted image $z$ with histogram $p_z$. The equalization transform of this image is:

$$y' = G(z) = \int_0^z p_z(u)\,du$$

Of course $p_z$ is the histogram of the output image. The formula to find the inverse of the above transform is:

$$z = G^{-1}(y')$$

Therefore, since the images $y$ and $y'$ have the same equalized histogram they are actually the same image, meaning $y = y'$, and the transform from the given image $x$ to the wanted image $z$ is:

$$z = G^{-1}(T(x))$$

Histogram specification has the advantage of producing more realistic looking images, [8] as it does not exaggerate the blue channel like histogram equalization.
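
The transform $z = G^{-1}(T(x))$ can be sketched in NumPy as below for 8-bit images, matching each channel of a source image to the corresponding channel of a template image; the function names and the use of linear interpolation to approximate $G^{-1}$ are assumptions of this example.

```python
import numpy as np

def match_channel(source: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Histogram specification for one 8-bit channel: z = G^{-1}(T(x))."""
    src_hist = np.bincount(source.ravel(), minlength=256).astype(np.float64)
    tmp_hist = np.bincount(template.ravel(), minlength=256).astype(np.float64)
    src_cdf = np.cumsum(src_hist) / src_hist.sum()   # T, the source equalization transform
    tmp_cdf = np.cumsum(tmp_hist) / tmp_hist.sum()   # G, the template equalization transform
    # Approximate G^{-1} by interpolating the template CDF over the 256 intensity levels.
    lut = np.interp(src_cdf, tmp_cdf, np.arange(256)).astype(np.uint8)
    return lut[source]

def match_rgb(source: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Match each channel of the source image to the corresponding template channel."""
    return np.stack(
        [match_channel(source[..., c], template[..., c]) for c in range(3)],
        axis=-1,
    )
```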

Comprehensive Color Normalization

The comprehensive color normalization has been shown to improve localization and object classification results in combination with color indexing.[7] It is an iterative algorithm which works in two stages. The first stage normalizes each pixel in the red, green and blue color space by removing its intensity. The second stage normalizes each color channel separately, so that the sum of its components is equal to one third of the number of pixels. The iterations continue until convergence, meaning no further changes occur. Formally:

Normalize the color image $I$, which consists of the color vectors $(R_i, G_i, B_i)$ for each of its $N$ pixels $i = 1, \dots, N$.

For the first step explained above, compute

$$r_i = \frac{R_i}{R_i + G_i + B_i}, \qquad g_i = \frac{G_i}{R_i + G_i + B_i}, \qquad b_i = \frac{B_i}{R_i + G_i + B_i},$$

which leads to

$$r_i + g_i + b_i = 1$$

and removes the intensity information from every pixel.

For the second step explained above, compute the channel averages, e.g.

$$\bar{r} = \frac{1}{N} \sum_{i=1}^{N} r_i,$$

and normalize

$$r_i' = \frac{r_i}{3\,\bar{r}},$$

so that $\sum_{i=1}^{N} r_i' = N/3$. Of course the same process is done for $b'$ and $g'$. Then these two steps are repeated until the changes between iteration $t$ and $t+2$ are less than some set threshold.
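
A minimal sketch of these two alternating steps in NumPy follows; the convergence tolerance, the iteration cap and the function name are assumptions of this example.

```python
import numpy as np

def comprehensive_color_normalization(image: np.ndarray,
                                      tol: float = 1e-6,
                                      max_iter: int = 100) -> np.ndarray:
    """Alternate pixel-wise and channel-wise normalization until convergence."""
    img = image.reshape(-1, 3).astype(np.float64)
    n_pixels = img.shape[0]
    for _ in range(max_iter):
        previous = img.copy()
        # Step 1: normalize each pixel so that r + g + b = 1.
        pixel_sums = img.sum(axis=1, keepdims=True)
        img = img / np.where(pixel_sums > 0, pixel_sums, 1.0)
        # Step 2: scale each channel so that its components sum to N / 3.
        channel_sums = img.sum(axis=0, keepdims=True)
        img = img * (n_pixels / 3.0) / np.where(channel_sums > 0, channel_sums, 1.0)
        if np.max(np.abs(img - previous)) < tol:
            break
    return img.reshape(image.shape)
```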

Comprehensive color normalization, just like the histogram equalization method previously mentioned, produces results that may look less natural due to the reduction in the number of color values. [4]

References

  1. Maria Vanrell; Felipe Lumbreras; Albert Pujol; Ramon Baldrich; Josep Llados; J. J. Villanueva (2001). "Colour normalisation based on background information". Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205). Vol. 1. Thessaloniki, Greece. pp. 874–877. doi:10.1109/icip.2001.959185. ISBN 978-0-7803-6725-8. S2CID 206810375. INSPEC 7210999.
  2. Keith A. Goatman; A. David Whitwam; A. Manivannan; John A. Olson; Peter F. Sharp. Colour normalisation of retinal images (PDF) (Report).
  3. Mark A. Rubin; Maciej P. Zerkowski; Robert L. Camp; Rainer Kuefer; Matthias D. Hofer; Arul M. Chinnaiyan; David L. Rimm (March 2004). "Quantitative Determination of Expression of the Prostate Cancer Protein α-Methylacyl-CoA Racemase Using Automated Quantitative Analysis (AQUA)". The American Journal of Pathology. 164 (3): 831–840. doi:10.1016/s0002-9440(10)63171-9. PMC 1613273. PMID 14982837.
  4. L. Csink; D. Paulus; U. Ahlrichs; B. Heigl (September 24, 1990). Color Normalization and Object Localization (PDF) (Report).
  5. Burger, Wilhelm; Burge, Mark J. (2008). Digital Image Processing: An Algorithmic Introduction Using Java. Springer. ISBN 978-1846283796.
  6. Jose M. Buenaposada; Luis Baumela. Variations of Grey World for face tracking (PDF) (Report).
  7. Finlayson, Graham D.; Schiele, Bernt; Crowley, James L. (1998). "Comprehensive Colour Image Normalization" (PDF). Burkhard and Neumann: 475–490. OCLC 849180213. Archived from the original (PDF) on March 4, 2016. Retrieved March 10, 2012.
  8. A. Osareh; M. Mirmehdi; B. Thomas; et al. (2002). "Classification and localisation of diabetic-related eye disease". ECCV '02. pp. 502–516. ISBN 9783540437482.

Bibliography