Oversampled binary image sensor

An oversampled binary image sensor is an image sensor with non-linear response capabilities reminiscent of traditional photographic film. [1] [2] Each pixel in the sensor has a binary response, giving only a one-bit quantized measurement of the local light intensity. The response function of the image sensor is non-linear and similar to a logarithmic function, which makes the sensor suitable for high dynamic range imaging. [1]

Working principle

Before the advent of digital image sensors, photography used film to record light information for most of its history. At the heart of every photographic film are a large number of light-sensitive grains of silver-halide crystals. [3] During exposure, each micron-sized grain has a binary fate: either it is struck by some incident photons and becomes "exposed", or it is missed by the photon bombardment and remains "unexposed". In the subsequent film development process, exposed grains, due to their altered chemical properties, are converted to silver metal, contributing to opaque spots on the film; unexposed grains are washed away in a chemical bath, leaving behind the transparent regions on the film. Thus, in essence, photographic film is a binary imaging medium, using the local density of opaque silver grains to encode the original light intensity information. Thanks to the small size and large number of these grains, one hardly notices this quantized nature of film when viewing it at a distance, observing only a continuous gray tone.

The oversampled binary image sensor is reminiscent of photographic film. Each pixel in the sensor has a binary response, giving only a one-bit quantized measurement of the local light intensity. At the start of the exposure period, all pixels are set to 0. A pixel is then set to 1 if the number of photons reaching it during the exposure is at least equal to a given threshold $q$. One way to build such binary sensors is to modify standard memory chip technology, where each memory bit cell is designed to be sensitive to visible light. [4] With current CMOS technology, the level of integration of such systems can exceed $10^9$ to $10^{10}$ (i.e., 1 giga to 10 giga) pixels per chip. In this case, the corresponding pixel sizes (around 50 nm [5]) are far below the diffraction limit of light, and thus the image sensor is oversampling the optical resolution of the light field. Intuitively, this spatial redundancy can be exploited to compensate for the information loss due to one-bit quantization, as is classic in oversampling delta-sigma converters. [6]
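A minimal sketch of this pixel model, in Python with numpy (the threshold and intensity values below are illustrative assumptions, not taken from the references): each pixel receives a Poisson-distributed number of photons and outputs 1 exactly when that count reaches the threshold $q$.

```python
import numpy as np

rng = np.random.default_rng(0)

q = 2                       # quantization threshold (illustrative)
intensity = 3.0             # expected photons per pixel (illustrative)
photons = rng.poisson(intensity, size=(1000, 1000))
bits = (photons >= q).astype(np.uint8)

# Individually the one-bit pixels look like noise, but the local density
# of ones encodes the light intensity, much like silver grains on film.
print(bits.mean())          # fraction of "exposed" pixels, ~0.80 here
```

Averaging the bits over a neighborhood recovers a continuous gray tone, just as viewing film grain from a distance does.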

Building a binary sensor that emulates the photographic film process was first envisioned by Fossum, [7] who coined the name digital film sensor (now referred to as a quanta image sensor [8]). The original motivation arose mainly from technical necessity. The miniaturization of camera systems calls for the continuous shrinking of pixel sizes. At a certain point, however, the limited full-well capacity (i.e., the maximum number of photoelectrons a pixel can hold) of small pixels becomes a bottleneck, yielding very low signal-to-noise ratios (SNRs) and poor dynamic range. In contrast, a binary sensor whose pixels need to detect only a few photoelectrons around a small threshold $q$ places far lower demands on full-well capacity, allowing pixel sizes to shrink further.

Imaging model

Lens

Fig. 1 The imaging model: the simplified architecture of a diffraction-limited imaging system. The incident light field $\lambda_0(x)$ passes through an optical lens, which acts like a linear system with a diffraction-limited point spread function (PSF). The result is a smoothed light field $\lambda(x)$, which is subsequently captured by the image sensor.

Consider the simplified camera model shown in Fig. 1. The incoming light intensity field is denoted $\lambda_0(x)$. By assuming that light intensities remain constant within a short exposure period, the field can be modeled as a function of the spatial variable $x$ only. After passing through the optical system, the original light field $\lambda_0(x)$ is filtered by the lens, which acts like a linear system with a given impulse response. Due to imperfections (e.g., aberrations) in the lens, the impulse response, a.k.a. the point spread function (PSF) of the optical system, cannot be a Dirac delta, thus imposing a limit on the resolution of the observable light field. However, a more fundamental physical limit is due to light diffraction. [9] As a result, even if the lens is ideal, the PSF is still unavoidably a small blurry spot. In optics, such a diffraction-limited spot is often called the Airy disk, [9] whose radius $R_a$ can be computed as

$$R_a \approx 1.22 \, w f,$$

where $w$ is the wavelength of the light and $f$ is the F-number of the optical system. Due to the lowpass (smoothing) nature of the PSF, the resulting light field $\lambda(x)$ has a finite spatial resolution, i.e., it has a finite number of degrees of freedom per unit space.
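A small numerical sketch of this formula (the wavelength and F-number are assumed example values: green light behind an f/2.8 lens) shows why 50 nm pixels oversample the optical resolution:

```python
# Airy disk radius R_a ~= 1.22 * w * f (distance from the center of the
# diffraction pattern to its first zero in the focal plane).
w = 550e-9                          # wavelength: 550 nm (green light)
f = 2.8                             # F-number of the lens
radius = 1.22 * w * f
print(f"Airy disk radius: {radius * 1e6:.2f} um")        # ~1.88 um
print(f"Ratio to a 50 nm pixel: {radius / 50e-9:.0f}x")  # ~38x
```

Even an ideal lens thus blurs a point source over tens of pixel widths, so many binary pixels sample each resolvable spot of the light field.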

Sensor

Fig. 2 The model of the binary image sensor. The pixels (shown as "buckets") collect photons, the numbers of which are compared against a quantization threshold $q$. The figure illustrates the case $q = 2$. The pixel outputs are binary: $b_m = 1$ (i.e., white pixels) if at least two photons are received by the pixel; otherwise $b_m = 0$ (i.e., gray pixels).

Fig. 2 illustrates the binary sensor model. The $s_m$ denote the exposure values accumulated by the sensor pixels. Depending on the local values of $s_m$, each pixel (depicted as a "bucket" in the figure) collects a different number of photons hitting its surface. $y_m$ is the number of photons impinging on the surface of the $m$-th pixel during an exposure period. The relation between $s_m$ and the photon count $y_m$ is stochastic. More specifically, $y_m$ can be modeled as the realization of a Poisson random variable whose intensity parameter is equal to $s_m$:

$$\mathbb{P}(y_m = k \,;\, s_m) = \frac{s_m^k \, e^{-s_m}}{k!}, \qquad k = 0, 1, 2, \ldots$$
As a photosensitive device, each pixel in the image sensor converts photons to electrical signals, whose amplitude is proportional to the number of photons impinging on that pixel. In a conventional sensor design, the analog electrical signals are then quantized by an A/D converter into 8 to 14 bits (usually, the more bits the better). In the binary sensor, however, the quantizer is 1-bit. In Fig. 2, $b_m$ is the quantized output of the $m$-th pixel, namely

$$b_m = \begin{cases} 0, & y_m < q, \\ 1, & y_m \ge q. \end{cases}$$

Since the photon counts $y_m$ are realizations of random variables, so are the binary sensor outputs $b_m$.
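Under this Poisson model, the probability of a pixel firing has the closed form $\mathbb{P}(b_m = 1) = \mathbb{P}(y_m \ge q)$. A short sketch using scipy (the exposure values are illustrative) evaluates this response curve for $q = 2$ and exhibits the saturating, film-like nonlinearity mentioned in the lead:

```python
from scipy.stats import poisson

# P(b_m = 1) = P(y_m >= q) = 1 - P(y_m <= q - 1) under the Poisson model.
q = 2
for s in (0.5, 2.0, 8.0):                  # illustrative exposure values
    p_one = 1.0 - poisson.cdf(q - 1, s)
    print(f"s = {s:4.1f}  ->  P(b_m = 1) = {p_one:.3f}")
# s = 0.5 -> 0.090, s = 2.0 -> 0.594, s = 8.0 -> 0.997: the response rises
# steeply and then saturates, like the characteristic curve of film.
```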

Spatial and temporal oversampling

If temporal oversampling is allowed, i.e., taking multiple consecutive and independent frames without changing the total exposure time, then under certain conditions the performance of the binary sensor is equivalent to that of a sensor with the same degree of spatial oversampling. [2] This means that one can trade off spatial oversampling against temporal oversampling, which matters in practice because fabrication technology places limits on both pixel size and exposure time; the sketch below illustrates the trade-off.
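A minimal simulation of this trade-off, under simplifying assumptions (threshold $q = 1$, a fixed total photon budget, and the closed-form maximum-likelihood estimate $\hat{s} = -KJ \ln(1 - \bar{b})$, which holds only for $q = 1$): the estimation error depends essentially on the product of the spatial factor $K$ and the temporal factor $J$, not on how the oversampling is split between them.

```python
import numpy as np

rng = np.random.default_rng(1)

def rmse(K, J, s=5.0, trials=20000):
    # K spatial sub-pixels x J temporal frames; the light s is divided
    # among K sub-pixels and the exposure among J frames, so each one-bit
    # sample sees intensity s / (K * J).  Threshold q = 1 is assumed.
    bits = rng.poisson(s / (K * J), size=(trials, K * J)) >= 1
    p_hat = np.clip(bits.mean(axis=1), 1e-6, 1 - 1e-6)
    s_hat = -K * J * np.log1p(-p_hat)   # ML estimate of s from K*J bits
    return np.sqrt(np.mean((s_hat - s) ** 2))

# Same product K * J -> essentially the same accuracy, whatever the split.
print(rmse(K=64, J=1), rmse(K=8, J=8), rmse(K=1, J=64))
```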

Advantages over traditional sensors

Due to the limited full-well capacity of a conventional image pixel, the pixel saturates when the light intensity is too strong; this is why the dynamic range of a single pixel is low. For the oversampled binary image sensor, the dynamic range is defined not over a single pixel but over a group of pixels, which makes the achievable dynamic range high. [2] A rough numerical illustration follows.
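In this sketch all numbers are assumed for illustration: a conventional pixel with a 200-electron full well clips and cannot distinguish bright levels, while a 16×16 group of binary pixels (threshold $q = 1$, estimated as in the trade-off sketch above) keeps tracking the intensity.

```python
import numpy as np

rng = np.random.default_rng(2)

full_well = 200                     # assumed full-well capacity (e-)
K = 256                             # a 16 x 16 group of binary pixels
for s in (50, 200, 800):            # expected photons over the pixel area
    conventional = min(rng.poisson(s), full_well)   # clips at 200
    bits = rng.poisson(s / K, size=K) >= 1          # q = 1 threshold
    p_hat = np.clip(bits.mean(), 1e-6, 1 - 1e-6)
    group = -K * np.log1p(-p_hat)   # ML estimate from the group of bits
    print(s, conventional, round(float(group)))
# The conventional pixel reads ~200 for both s = 200 and s = 800, whereas
# the binary group keeps responding past the full-well limit.
```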

Reconstruction

Fig. 4 Reconstructing an image from the binary measurements taken by a SPAD sensor, with a spatial resolution of 32×32 pixels. The final image (lower-right corner) is obtained by incorporating 4096 consecutive frames, 11 of which are shown in the figure.

One of the most important challenges with the use of an oversampled binary image sensor is the reconstruction of the light intensity $\lambda(x)$ from the binary measurements $b_m$. Maximum likelihood estimation can be used to solve this problem. [2] Fig. 4 shows the results of reconstructing the light intensity from 4096 binary images taken by a single-photon avalanche diode (SPAD) camera. [10] Better reconstruction quality, with fewer temporal measurements and a faster, hardware-friendly implementation, can be achieved by more sophisticated algorithms. [11] A minimal sketch of the maximum-likelihood approach is given below.
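This sketch assumes threshold $q = 1$, for which the per-pixel maximum-likelihood estimate has the closed form $\hat{\lambda} = -\ln(1 - \bar{b})$ per frame; the 32×32 scene and the 4096 frames merely echo the SPAD experiment in Fig. 4, and the scene values themselves are synthetic.

```python
import numpy as np

def reconstruct(binary_frames):
    """Per-pixel ML intensity estimate from a stack of one-bit frames of
    shape (num_frames, H, W), assuming threshold q = 1; for q > 1 the
    likelihood must be maximized numerically instead."""
    frac_ones = np.clip(binary_frames.mean(axis=0), 0.0, 1.0 - 1e-6)
    return -np.log1p(-frac_ones)        # -ln(1 - fraction of ones)

# Toy usage with a synthetic scene (all values assumed for illustration).
rng = np.random.default_rng(0)
scene = rng.uniform(0.05, 2.0, size=(32, 32))     # photons/pixel/frame
frames = (rng.poisson(scene, size=(4096, 32, 32)) >= 1).astype(float)
estimate = reconstruct(frames)
print(np.abs(estimate - scene).mean())            # small mean error
```

Averaging thousands of one-bit frames in this way is what turns the noisy binary snapshots of Fig. 4 into a continuous-tone image.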


References

  1. L. Sbaiz, F. Yang, E. Charbon, S. Süsstrunk and M. Vetterli, "The gigavision camera," Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1093-1096, 2009.
  2. F. Yang, Y. M. Lu, L. Sbaiz and M. Vetterli, "Bits from photons: Oversampled image acquisition using binary Poisson statistics," IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 1421-1436, 2012.
  3. T. H. James, The Theory of the Photographic Process, 4th ed., New York: Macmillan Publishing Co., Inc., 1977.
  4. S. A. Ciarcia, "A 64K-bit dynamic RAM chip is the visual sensor in this digital image camera," Byte Magazine, pp. 21-31, Sep. 1983.
  5. Y. K. Park, S. H. Lee, J. W. Lee et al., "Fully integrated 56nm DRAM technology for 1Gb DRAM," in IEEE Symposium on VLSI Technology, Kyoto, Japan, Jun. 2007.
  6. J. C. Candy and G. C. Temes, Oversampling Delta-Sigma Data Converters: Theory, Design and Simulation, New York, NY: IEEE Press, 1992.
  7. E. R. Fossum, "What to do with sub-diffraction-limit (SDL) pixels? A proposal for a gigapixel digital film sensor (DFS)," in IEEE Workshop on Charge-Coupled Devices and Advanced Image Sensors, Nagano, Japan, Jun. 2005, pp. 214-217.
  8. E. R. Fossum, J. Ma, S. Masoodian, L. Anzagira, and R. Zizza, "The quanta image sensor: Every photon counts," MDPI Sensors, vol. 16, no. 8, 1260, August 2016. doi:10.3390/s16081260 (Special Issue on Photon-Counting Image Sensors).
  9. M. Born and E. Wolf, Principles of Optics, 7th ed., Cambridge: Cambridge University Press, 1999.
  10. L. Carrara, C. Niclass, N. Scheidegger, H. Shea, and E. Charbon, "A gamma, X-ray and high energy proton radiation-tolerant CMOS image sensor for space applications," in IEEE International Solid-State Circuits Conference, Feb. 2009, pp. 40-41.
  11. O. Litany, T. Remez, and A. Bronstein, "Image reconstruction from dense binary pixels," in Signal Processing with Adaptive Sparse Structured Representations (SPARS 2015), 2015. arXiv:1512.01774.