Grayscale

Last updated

Grayscale image of a parrot Grayscale 8bits palette sample image.png
Grayscale image of a parrot

In digital photography, computer-generated imagery, and colorimetry, a grayscale image is one in which the value of each pixel is a single sample representing only an amount of light; that is, it carries only intensity information. Grayscale images, a kind of black-and-white or gray monochrome, are composed exclusively of shades of gray. The contrast ranges from black at the weakest intensity to white at the strongest. [1]

Contents

Grayscale images are distinct from one-bit bi-tonal black-and-white images, which, in the context of computer imaging, are images with only two colors: black and white (also called bilevel or binary images ). Grayscale images have many shades of gray in between.

Grayscale images can be the result of measuring the intensity of light at each pixel according to a particular weighted combination of frequencies (or wavelengths), and in such cases they are monochromatic proper when only a single frequency (in practice, a narrow band of frequencies) is captured. The frequencies can in principle be from anywhere in the electromagnetic spectrum (e.g. infrared, visible light, ultraviolet, etc.).

A colorimetric (or more specifically photometric) grayscale image is an image that has a defined grayscale colorspace, which maps the stored numeric sample values to the achromatic channel of a standard colorspace, which itself is based on measured properties of human vision.

If the original color image has no defined colorspace, or if the grayscale image is not intended to have the same human-perceived achromatic intensity as the color image, then there is no unique mapping from such a color image to a grayscale image.

Numerical representations

The intensity of a pixel is expressed within a given range between a minimum and a maximum, inclusive. This range is represented in an abstract way as a range from 0 (or 0%) (total absence, black) and 1 (or 100%) (total presence, white), with any fractional values in between. This notation is used in academic papers, but this does not define what "black" or "white" is in terms of colorimetry. Sometimes the scale is reversed, as in printing where the numeric intensity denotes how much ink is employed in halftoning, with 0% representing the paper white (no ink) and 100% being a solid black (full ink).

In computing, although the grayscale can be computed through rational numbers, image pixels are usually quantized to store them as unsigned integers, to reduce the required storage and computation. Some early grayscale monitors can only display up to sixteen different shades, which would be stored in binary form using 4 bits.[ citation needed ] But today grayscale images intended for visual display are commonly stored with 8 bits per sampled pixel. This pixel depth allows 256 different intensities (i.e., shades of gray) to be recorded, and also simplifies computation as each pixel sample can be accessed individually as one full byte. However, if these intensities were spaced equally in proportion to the amount of physical light they represent at that pixel (called a linear encoding or scale), the differences between adjacent dark shades could be quite noticeable as banding artifacts, while many of the lighter shades would be "wasted" by encoding a lot of perceptually-indistinguishable increments. Therefore, the shades are instead typically spread out evenly on a gamma-compressed nonlinear scale, which better approximates uniform perceptual increments for both dark and light shades, usually making these 256 shades enough to avoid noticeable increments. [2]

Technical uses (e.g. in medical imaging or remote sensing applications) often require more levels, to make full use of the sensor accuracy (typically 10 or 12 bits per sample) and to reduce rounding errors in computations. Sixteen bits per sample (65,536 levels) is often a convenient choice for such uses, as computers manage 16-bit words efficiently. The TIFF and PNG (among other) image file formats support 16-bit grayscale natively, although browsers and many imaging programs tend to ignore the low order 8 bits of each pixel. Internally for computation and working storage, image processing software typically uses integer or floating-point numbers of size 16 or 32 bits.

Converting color to grayscale

Examples of conversion from a full-color image to grayscale using Adobe Photoshop's Channel Mixer, compared to the original image and colorimetric conversion to grayscale Auschwitz channel mixer.jpg
Examples of conversion from a full-color image to grayscale using Adobe Photoshop's Channel Mixer, compared to the original image and colorimetric conversion to grayscale

Conversion of an arbitrary color image to grayscale is not unique in general; different weighting of the color channels effectively represent the effect of shooting black-and-white film with different-colored photographic filters on the cameras.

Colorimetric (perceptual luminance-preserving) conversion to grayscale

A common strategy is to use the principles of photometry or, more broadly, colorimetry to calculate the grayscale values (in the target grayscale colorspace) so as to have the same luminance (technically relative luminance) as the original color image (according to its colorspace). [3] [4] In addition to the same (relative) luminance, this method also ensures that both images will have the same absolute luminance when displayed, as can be measured by instruments in its SI units of candelas per square meter, in any given area of the image, given equal whitepoints. Luminance itself is defined using a standard model of human vision, so preserving the luminance in the grayscale image also preserves other perceptual lightness measures, such as L* (as in the 1976 CIE Lab color space) which is determined by the linear luminance Y itself (as in the CIE 1931 XYZ color space) which we will refer to here as Ylinear to avoid any ambiguity.

To convert a color from a colorspace based on a typical gamma-compressed (nonlinear) RGB color model to a grayscale representation of its luminance, the gamma compression function must first be removed via gamma expansion (linearization) to transform the image to a linear RGB colorspace, so that the appropriate weighted sum can be applied to the linear color components () to calculate the linear luminance Ylinear, which can then be gamma-compressed back again if the grayscale result is also to be encoded and stored in a typical nonlinear colorspace. [5]

For the common sRGB color space, gamma expansion is defined as

where Csrgb represents any of the three gamma-compressed sRGB primaries (Rsrgb, Gsrgb, and Bsrgb, each in range [0,1]) and Clinear is the corresponding linear-intensity value (Rlinear, Glinear, and Blinear, also in range [0,1]). Then, linear luminance is calculated as a weighted sum of the three linear-intensity values. The sRGB color space is defined in terms of the CIE 1931 linear luminance Ylinear, which is given by [6]

These three particular coefficients represent the intensity (luminance) perception of typical trichromat humans to light of the precise Rec. 709 additive primary colors (chromaticities) that are used in the definition of sRGB. Human vision is most sensitive to green, so this has the greatest coefficient value (0.7152), and least sensitive to blue, so this has the smallest coefficient (0.0722). To encode grayscale intensity in linear RGB, each of the three color components can be set to equal the calculated linear luminance (replacing by the values to get this linear grayscale), which then typically needs to be gamma compressed to get back to a conventional non-linear representation. [7] For sRGB, each of its three primaries is then set to the same gamma-compressed Ysrgb given by the inverse of the gamma expansion above as

Because the three sRGB components are then equal, indicating that it is actually a gray image (not color), it is only necessary to store these values once, and we call this the resulting grayscale image. This is how it will normally be stored in sRGB-compatible image formats that support a single-channel grayscale representation, such as JPEG or PNG. Web browsers and other software that recognizes sRGB images should produce the same rendering for such a grayscale image as it would for a "color" sRGB image having the same values in all three color channels.

Luma coding in video systems

For images in color spaces such as Y'UV and its relatives, which are used in standard color TV and video systems such as PAL, SECAM, and NTSC, a nonlinear luma component (Y) is calculated directly from gamma-compressed primary intensities as a weighted sum, which, although not a perfect representation of the colorimetric luminance, can be calculated more quickly without the gamma expansion and compression used in photometric/colorimetric calculations. In the Y'UV and Y'IQ models used by PAL and NTSC, the rec601 luma (Y) component is computed as

where we use the prime to distinguish these nonlinear values from the sRGB nonlinear values (discussed above) which use a somewhat different gamma compression formula, and from the linear RGB components. The ITU-R BT.709 standard used for HDTV developed by the ATSC uses different color coefficients, computing the luma component as

Although these are numerically the same coefficients used in sRGB above, the effect is different because here they are being applied directly to gamma-compressed values rather than to the linearized values. The ITU-R BT.2100 standard for HDR television uses yet different coefficients, computing the luma component as

Normally these colorspaces are transformed back to nonlinear R'G'B' before rendering for viewing. To the extent that enough precision remains, they can then be rendered accurately.

But if the luma component Y' itself is instead used directly as a grayscale representation of the color image, luminance is not preserved: two colors can have the same luma Y but different CIE linear luminance Y (and thus different nonlinear Ysrgb as defined above) and therefore appear darker or lighter to a typical human than the original color. Similarly, two colors having the same luminance Y (and thus the same Ysrgb) will in general have different luma by either of the Y luma definitions above. [8]

Grayscale as single channels of multichannel color images

Color images are often built of several stacked color channels, each of them representing value levels of the given channel. For example, RGB images are composed of three independent channels for red, green and blue primary color components; CMYK images have four channels for cyan, magenta, yellow and black ink plates, etc.

Here is an example of color channel splitting of a full RGB color image. The column at left shows the isolated color channels in natural colors, while at right there are their grayscale equivalences:

Composition of RGB from three grayscale images Beyoglu 4671 tricolor.png
Composition of RGB from three grayscale images

The reverse is also possible: to build a full-color image from their separate grayscale channels. By mangling channels, using offsets, rotating and other manipulations, artistic effects can be achieved instead of accurately reproducing the original image.

See also

Related Research Articles

<span class="mw-page-title-main">Alpha compositing</span> Operation in computer graphics

In computer graphics, alpha compositing or alpha blending is the process of combining one image with a background to create the appearance of partial or full transparency. It is often useful to render picture elements (pixels) in separate passes or layers and then combine the resulting 2D images into a single, final image called the composite. Compositing is used extensively in film when combining computer-rendered image elements with live footage. Alpha blending is also used in 2D computer graphics to put rasterized foreground elements over a background.

<span class="mw-page-title-main">RGB color model</span> Color model based on red, green, and blue

The RGB color model is an additive color model in which the red, green and blue primary colors of light are added together in various ways to reproduce a broad array of colors. The name of the model comes from the initials of the three additive primary colors, red, green, and blue.

Gamma correction or gamma is a nonlinear operation used to encode and decode luminance or tristimulus values in video or still image systems. Gamma correction is, in the simplest cases, defined by the following power-law expression:

<span class="mw-page-title-main">Y′UV</span> Mathematical color model

Y′UV, also written YUV, is the color model found in the PAL analogue color TV standard. A color is described as a Y′ component (luma) and two chroma components U and V. The prime symbol (') denotes that the luma is calculated from gamma-corrected RGB input and that it is different from true luminance. Today, the term YUV is commonly used in the computer industry to describe colorspaces that are encoded using YCbCr.

<span class="mw-page-title-main">Chroma subsampling</span> Practice of encoding images

Chroma subsampling is the practice of encoding images by implementing less resolution for chroma information than for luma information, taking advantage of the human visual system's lower acuity for color differences than for luminance.

<span class="mw-page-title-main">HSL and HSV</span> Alternative representations of the RGB color model

HSL and HSV are the two most common cylindrical-coordinate representations of points in an RGB color model. The two representations rearrange the geometry of RGB in an attempt to be more intuitive and perceptually relevant than the cartesian (cube) representation. Developed in the 1970s for computer graphics applications, HSL and HSV are used today in color pickers, in image editing software, and less commonly in image analysis and computer vision.

<span class="mw-page-title-main">CIELAB color space</span> Standard color space with color-opponent values

The CIELAB color space, also referred to as L*a*b*, is a color space defined by the International Commission on Illumination in 1976. It expresses color as three values: L* for perceptual lightness and a* and b* for the four unique colors of human vision: red, green, blue and yellow. CIELAB was intended as a perceptually uniform space, where a given numerical change corresponds to a similar perceived change in color. While the LAB space is not truly perceptually uniform, it nevertheless is useful in industry for detecting small differences in color.

The RGB chromaticity space, two dimensions of the normalized RGB space, is a chromaticity space, a two-dimensional color space in which there is no intensity information.

<span class="mw-page-title-main">YCbCr</span> Family of digital colour spaces

YCbCr, Y′CbCr, or Y Pb/Cb Pr/Cr, also written as YCBCR or Y′CBCR, is a family of color spaces used as a part of the color image pipeline in video and digital photography systems. Y′ is the luma component and CB and CR are the blue-difference and red-difference chroma components. Y′ is distinguished from Y, which is luminance, meaning that light intensity is nonlinearly encoded based on gamma corrected RGB primaries.

sRGB Standard RGB color space

sRGB is a standard RGB color space that HP and Microsoft created cooperatively in 1996 to use on monitors, printers, and the World Wide Web. It was subsequently standardized by the International Electrotechnical Commission (IEC) as IEC 61966-2-1:1999. sRGB is the current defined standard colorspace for the web, and it is usually the assumed colorspace for images that are neither tagged for a colorspace nor have an embedded color profile.

Color digital images are made of pixels, and pixels are made of combinations of primary colors represented by a series of code. A channel in this context is the grayscale image of the same size as a color image, made of just one of these primary colors. For instance, an image from a standard digital camera will have a red, green and blue channel. A grayscale image has just one channel.

Relative luminance follows the photometric definition of luminance including spectral weighting for human vision, but while luminance is a measure of light in units such as , Relative luminance values are normalized as 0.0 to 1.0, with 1.0 being a theoretical perfect reflector of 100% reference white. Like the photometric definition, it is related to the luminous flux density in a particular direction, which is radiant flux density weighted by the luminous efficiency function y(λ) of the CIE Standard Observer.

<span class="mw-page-title-main">Histogram equalization</span> Method in image processing of contrast adjustment using the images histogram

Histogram equalization is a method in image processing of contrast adjustment using the image's histogram.

In video, luma represents the brightness in an image. Luma is typically paired with chrominance. Luma represents the achromatic image, while the chroma components represent the color information. Converting R′G′B′ sources into luma and chroma allows for chroma subsampling: because human vision has finer spatial sensitivity to luminance differences than chromatic differences, video systems can store and transmit chromatic information at lower resolution, optimizing perceived detail at a particular bandwidth.

<span class="mw-page-title-main">Lightness</span> Property of a color

Lightness is a visual perception of the luminance of an object. It is often judged relative to a similarly lit object. In colorimetry and color appearance models, lightness is a prediction of how an illuminated color will appear to a standard observer. While luminance is a linear measurement of light, lightness is a linear prediction of the human perception of that light.

<span class="mw-page-title-main">Rec. 709</span> Standard for HDTV image encoding and signal characteristics

Rec. 709, also known as Rec.709, BT.709, and ITU 709, is a standard developed by ITU-R for image encoding and signal characteristics of high-definition television.

<span class="mw-page-title-main">Blend modes</span> How two or more digital photo layers are mixed together

Blend modes in digital image editing and computer graphics are used to determine how two layers are blended with each other. The default blend mode in most applications is simply to obscure the lower layer by covering it with whatever is present in the top layer ; because each pixel has numerical values, there also are many other ways to blend two layers.

<span class="mw-page-title-main">Color gradient</span> Specifies a range of position-dependent colors

In color science, a color gradient specifies a range of position-dependent colors, usually used to fill a region.

Apple Video is a lossy video compression and decompression algorithm (codec) developed by Apple Inc. and first released as part of QuickTime 1.0 in 1991. The codec is also known as QuickTime Video, by its FourCC RPZA and the name Road Pizza. When used in the AVI container, the FourCC AZPR is also used.

<i>ICtCp</i>

ICTCP, ICtCp, or ITP is a color representation format specified in the Rec. ITU-R BT.2100 standard that is used as a part of the color image pipeline in video and digital photography systems for high dynamic range (HDR) and wide color gamut (WCG) imagery. It was developed by Dolby Laboratories from the IPT color space by Ebner and Fairchild. The format is derived from an associated RGB color space by a coordinate transformation that includes two matrix transformations and an intermediate nonlinear transfer function that is informally known as gamma pre-correction. The transformation produces three signals called I, CT, and CP. The ICTCP transformation can be used with RGB signals derived from either the perceptual quantizer (PQ) or hybrid log–gamma (HLG) nonlinearity functions, but is most commonly associated with the PQ function.

References

  1. Johnson, Stephen (2006). Stephen Johnson on Digital Photography. O'Reilly. ISBN   0-596-52370-X.
  2. Poynton, Charles (2012). Digital Video and HD: Algorithms and Interfaces (2nd ed.). Morgan Kaufmann. pp. 31–35, 65–68, 333, 337. ISBN   978-0-12-391926-7 . Retrieved 2022-03-31.
  3. Poynton, Charles A. (2022-03-14). Written at San Jose, Calif.. Rogowitz, B. E.; Pappas, T. N. (eds.). Rehabilitation of Gamma (PDF). SPIE/IS&T Conference 3299: Human Vision and Electronic Imaging III; January 26–30, 1998. Bellingham, Wash.: SPIE. doi:10.1117/12.320126. Archived (PDF) from the original on 2023-04-23.
  4. Poynton, Charles A. (2004-02-25). "Constant Luminance". Video Engineering. Archived from the original on 2023-03-16.
  5. Lindbloom, Bruce (2017-04-06). "RGB Working Space Information". Archived from the original on 2023-06-01.
  6. Stokes, Michael; Anderson, Matthew; Chandrasekar, Srinivasan; Motta, Ricardo (1996-11-05). "A Standard Default Color Space for the Internet – sRGB". World Wide Web Consortium – Graphics on the Web. Part 2, matrix in equation 1.8. Archived from the original on 2023-05-24.
  7. Burger, Wilhelm; Burge, Mark J. (2010). Principles of Digital Image Processing Core Algorithms. Springer Science & Business Media. pp. 110–111. ISBN   978-1-84800-195-4.
  8. Poynton, Charles A. (1997-07-15). "The Magnitude of Nonconstant Luminance Errors" (PDF).