Event camera

An event camera, also known as a neuromorphic camera, [1] silicon retina [2] or dynamic vision sensor, [3] is an imaging sensor that responds to local changes in brightness. Event cameras do not capture images using a shutter as conventional (frame) cameras do. Instead, each pixel inside an event camera operates independently and asynchronously, reporting changes in brightness as they occur, and staying silent otherwise.

Functional description

Event camera pixels independently respond to changes in brightness as they occur. [4] Each pixel stores a reference brightness level, and continuously compares it to the current brightness level. If the difference in brightness exceeds a threshold, that pixel resets its reference level and generates an event: a discrete packet that contains the pixel address and timestamp. Events may also contain the polarity (increase or decrease) of a brightness change, or an instantaneous measurement of the illumination level, [5] depending on the specific sensor model. Thus, event cameras output an asynchronous stream of events triggered by changes in scene illumination.
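The per-pixel behaviour described above can be sketched in a few lines of Python. This is a simplified illustration, not a model of any specific sensor: it assumes that brightness is compared on a logarithmic scale, that the scene is sampled as a stack of frames, and that a pixel emits at most one event per sample. The function name `generate_events` and the threshold value are hypothetical.

```python
import numpy as np

def generate_events(frames, timestamps, threshold=0.2):
    """Emit (t, x, y, polarity) tuples from a stack of brightness frames.

    frames: array of shape (T, H, W) with strictly positive brightness values.
    timestamps: length-T sequence of sample times.
    threshold: contrast threshold on log brightness (assumed value).
    """
    log_frames = np.log(np.asarray(frames, dtype=float))
    reference = log_frames[0].copy()  # per-pixel stored reference level
    events = []
    for t, current in zip(timestamps[1:], log_frames[1:]):
        diff = current - reference
        # Pixels whose change exceeds the threshold fire independently.
        ys, xs = np.nonzero(np.abs(diff) > threshold)
        for y, x in zip(ys, xs):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((t, int(x), int(y), polarity))
            reference[y, x] = current[y, x]  # reset the reference level
    return events

# Toy scene: one pixel brightens and stays bright, another darkens later.
frames = np.ones((3, 2, 2))
frames[1:, 0, 1] = 2.0  # pixel (x=1, y=0) brightens at t=1
frames[2, 1, 0] = 0.5   # pixel (x=0, y=1) darkens at t=2
events = generate_events(frames, timestamps=[0, 1, 2])
```

The brightened pixel fires only once: after its reference level is reset, the unchanged brightness at the next sample produces no further event, which is the "staying silent otherwise" behaviour.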

Comparison of the data produced by an event camera and a conventional camera.

Event cameras typically report timestamps with microsecond resolution and offer a dynamic range of about 120 dB, with less under/overexposure and motion blur [4] [6] than frame cameras. This allows them to track object and camera movement (optical flow) more accurately. They yield grey-scale information. Initially (2014), resolution was limited to 100 pixels. A later entry reached 640×480 resolution in 2019. Because individual pixels fire independently, event cameras appear suitable for integration with asynchronous computing architectures such as neuromorphic computing. Pixel independence allows these cameras to cope with scenes containing both brightly and dimly lit regions without having to average across them. [7] Although the camera timestamps events with microsecond resolution, the actual temporal resolution (or, alternatively, the sensing bandwidth) is on the order of tens of microseconds to a few milliseconds, depending on signal contrast, lighting conditions and sensor design. [8]

Typical image sensor characteristics

| Sensor | Dynamic range (dB) | Equivalent framerate (fps) | Spatial resolution (MP) | Power consumption (mW) |
| --- | --- | --- | --- | --- |
| Human eye | 30–40 | 200–300* | - | 10 [9] |
| High-end DSLR camera (Nikon D850) | 44.6 [10] | 120 | 2–8 | - |
| Ultrahigh-speed camera (Phantom v2640) [11] | 64 | 12,500 | 0.3–4 | - |
| Event camera [12] | 120 | 50,000–300,000** | 0.1–1 | 30 |

* Indicates human perception temporal resolution, including cognitive processing time. ** Refers to change recognition rates, and varies according to signal and sensor model.

Types

Temporal contrast sensors (such as DVS [4] (Dynamic Vision Sensor), or sDVS [13] (sensitive-DVS)) produce events that indicate polarity (increase or decrease in brightness), while temporal image sensors [5] indicate the instantaneous intensity with each event. The DAVIS [14] (Dynamic and Active-pixel Vision Sensor) contains a global shutter active pixel sensor (APS) in addition to the dynamic vision sensor (DVS) that shares the same photosensor array. Thus, it has the ability to produce image frames alongside events. Many event cameras additionally carry an inertial measurement unit (IMU).

Retinomorphic sensors

Left: schematic cross-sectional diagram of photosensitive capacitor. Center: circuit diagram of retinomorphic sensor, with photosensitive capacitor at top. Right: Expected transient response of retinomorphic sensor to application of constant illumination.

Another class of event sensors is the so-called retinomorphic sensor. While the term retinomorphic has been used to describe event sensors generally, [15] [16] in 2020 it was adopted as the name for a specific sensor design based on a resistor and photosensitive capacitor in series. [17] These capacitors are distinct from photocapacitors, which are used to store solar energy, [18] and are instead designed to change capacitance under illumination. They charge or discharge slightly when the capacitance changes, but otherwise remain in equilibrium. When a photosensitive capacitor is placed in series with a resistor, and an input voltage is applied across the circuit, the result is a sensor that outputs a voltage when the light intensity changes, but otherwise does not.
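The transient behaviour of such a series circuit can be illustrated with a small numerical sketch. It assumes an ideal resistor and capacitor, takes the output as the voltage across the resistor, and integrates the charge equation dQ/dt = (V_in − Q/C(t))/R with a forward Euler step. The component values and the function name `simulate_retinomorphic` are illustrative assumptions, not figures from the cited work.

```python
import numpy as np

def simulate_retinomorphic(capacitance, dt, v_in=1.0, resistance=1e6):
    """Return the voltage across the resistor at each time step.

    capacitance: array of capacitance values over time (farads); it changes
                 when the illumination changes.
    dt: time step in seconds (should be much smaller than R * C).
    """
    capacitance = np.asarray(capacitance, dtype=float)
    charge = capacitance[0] * v_in  # start in equilibrium: capacitor fully charged
    v_out = np.empty_like(capacitance)
    for i, c in enumerate(capacitance):
        v_cap = charge / c                # voltage currently held by the capacitor
        v_out[i] = v_in - v_cap           # remainder is dropped across the resistor
        current = v_out[i] / resistance
        charge += current * dt            # forward Euler update of the stored charge
    return v_out

# Light is switched on at step 500 and stays on (assumed doubling of capacitance).
n_steps, dt = 2000, 1e-5
c_dark, c_light = 1e-9, 2e-9
capacitance = np.where(np.arange(n_steps) >= 500, c_light, c_dark)
v_out = simulate_retinomorphic(capacitance, dt)
```

In this sketch the output is zero in the dark, jumps when the light switches on, and decays back towards zero with time constant R·C under constant illumination, matching the qualitative description above.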

Unlike other event sensors (which typically consist of a photodiode and some other circuit elements), these sensors produce the signal inherently. They can hence be considered a single device that produces the same result as a small circuit in other event cameras. Retinomorphic sensors have to date only been studied in a research environment. [19] [20] [21] [22]

Algorithms

A pedestrian runs in front of car headlights at night. Left: image taken with a conventional camera exhibits severe motion blur and underexposure. Right: image reconstructed by combining the left image with events from an event camera.

Image reconstruction

Image reconstruction from events has the potential to create images and video with high dynamic range, high temporal resolution and reduced motion blur. Image reconstruction can be achieved using temporal smoothing, e.g. a high-pass or complementary filter. [23] Alternative methods include optimization [24] and gradient estimation [25] followed by Poisson integration.
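As a rough illustration of the temporal-smoothing idea, the sketch below applies a per-pixel high-pass filter to an event stream: each event adds a fixed contrast step to a log-intensity estimate, and the estimate decays exponentially between events. It is a simplified sketch in the spirit of such filters rather than a reproduction of any published implementation; the event layout `(t, x, y, polarity)`, the contrast step and the cutoff value are assumptions.

```python
import numpy as np

def reconstruct_high_pass(events, shape, contrast=0.2, cutoff=5.0):
    """Estimate a log-intensity image from time-sorted events.

    events: iterable of (t, x, y, polarity), t in seconds, polarity +1 or -1.
    shape: (height, width) of the sensor.
    contrast: assumed log-intensity step represented by one event.
    cutoff: assumed decay rate (1/s) of the high-pass filter.
    """
    log_intensity = np.zeros(shape)
    last_update = np.zeros(shape)  # time at which each pixel was last updated
    t_end = 0.0
    for t, x, y, polarity in events:
        # Decay the old estimate for the time elapsed at this pixel only.
        decay = np.exp(-cutoff * (t - last_update[y, x]))
        log_intensity[y, x] = decay * log_intensity[y, x] + contrast * polarity
        last_update[y, x] = t
        t_end = t
    # Bring every pixel forward to the time of the last event.
    log_intensity *= np.exp(-cutoff * (t_end - last_update))
    return log_intensity

events = [(0.0, 1, 0, 1), (0.1, 1, 0, 1), (0.1, 0, 1, -1)]
image = reconstruct_high_pass(events, shape=(2, 2))
```

Because only the pixel that fired is touched for each event, the update cost is proportional to the number of events rather than to the number of pixels, which suits the asynchronous output of the sensor.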

Spatial convolutions

The concept of spatial event-driven convolution was postulated in 1999 [26] (before the DVS), and was later generalized during the EU project CAVIAR [27] (during which the DVS was invented) by projecting, event by event, an arbitrary convolution kernel around the event coordinate in an array of integrate-and-fire pixels. [28] Extension to multi-kernel event-driven convolutions [29] allows for event-driven deep convolutional neural networks. [30]
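The event-by-event projection can be sketched as follows. For each input event the kernel is added to a state array around the event coordinate, and any pixel whose state reaches a threshold emits an output event and is reset. This is a minimal software sketch under several assumptions (no leak, no refractory period, positive output events only, kernel clipped at the borders); the function name and parameters are hypothetical and do not describe the cited hardware.

```python
import numpy as np

def event_driven_convolution(events, shape, kernel, threshold=1.0):
    """Project a kernel around each event into integrate-and-fire pixels.

    events: iterable of (t, x, y, polarity) with integer coordinates.
    Returns (output_events, state).
    """
    state = np.zeros(shape)
    kh, kw = kernel.shape
    out = []
    for t, x, y, polarity in events:
        # Region of the array covered by the kernel, clipped at the borders.
        y0, x0 = y - kh // 2, x - kw // 2
        ys = slice(max(y0, 0), min(y0 + kh, shape[0]))
        xs = slice(max(x0, 0), min(x0 + kw, shape[1]))
        ky = slice(ys.start - y0, ys.stop - y0)
        kx = slice(xs.start - x0, xs.stop - x0)
        state[ys, xs] += polarity * kernel[ky, kx]
        # Pixels that reach the threshold fire an output event and reset.
        region = state[ys, xs]  # a view, so the reset below updates `state`
        fired_y, fired_x = np.nonzero(region >= threshold)
        for dy, dx in zip(fired_y, fired_x):
            out.append((t, xs.start + int(dx), ys.start + int(dy), 1))
        region[region >= threshold] = 0.0
    return out, state

kernel = np.full((3, 3), 0.4)
events = [(0, 2, 2, 1), (1, 2, 2, 1), (2, 2, 2, 1)]
out_events, state = event_driven_convolution(events, shape=(5, 5), kernel=kernel)
```

In this toy run the first two events only accumulate; the third pushes the nine pixels under the kernel over the threshold, so all output events carry the timestamp of the third input event.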

Motion detection and tracking

Segmentation and detection of moving objects viewed by an event camera can seem to be a trivial task, as it is done by the sensor on-chip. However, these tasks are difficult, because events carry little information [31] and do not contain useful visual features like texture and color. [32] They become more challenging still with a moving camera, [31] because events are then triggered everywhere on the image plane, produced both by moving objects and by the static scene (whose apparent motion is induced by the camera's ego-motion). Recent approaches to this problem include the incorporation of motion-compensation models [33] [34] and traditional clustering algorithms. [35] [36] [32] [37]
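The intuition behind motion compensation can be sketched with a brute-force example: events are warped back to a reference time along a candidate image-plane velocity, and the candidate that yields the sharpest (highest-variance) image of accumulated events is taken as the motion estimate. This is a deliberately simplified sketch that assumes a single constant velocity and a small hand-picked candidate set; the published methods cited above fit richer motion models and segment several motions. The function names and the `(t, x, y)` column layout are assumptions.

```python
import numpy as np

def warped_event_image(events, velocity, shape, t_ref=0.0):
    """Accumulate events after warping them to t_ref along `velocity`.

    events: array with columns (t, x, y); velocity: (vx, vy) in pixels/second.
    """
    t, x, y = events[:, 0], events[:, 1], events[:, 2]
    xw = np.rint(x - velocity[0] * (t - t_ref)).astype(int)
    yw = np.rint(y - velocity[1] * (t - t_ref)).astype(int)
    inside = (xw >= 0) & (xw < shape[1]) & (yw >= 0) & (yw < shape[0])
    image = np.zeros(shape)
    # np.add.at accumulates correctly when several events land on one pixel.
    np.add.at(image, (yw[inside], xw[inside]), 1.0)
    return image

def best_velocity(events, candidates, shape):
    """Pick the candidate velocity whose warped image has the highest variance."""
    scores = [warped_event_image(events, v, shape).var() for v in candidates]
    return candidates[int(np.argmax(scores))]

# Synthetic vertical edge (rows 3..6) moving right at 10 pixels/second.
k = np.arange(10)
rows = np.arange(3, 7)
t = np.repeat(k / 10.0, rows.size)
x = np.repeat(2 + k, rows.size).astype(float)
y = np.tile(rows, k.size).astype(float)
events = np.column_stack([t, x, y])
candidates = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0), (15.0, 0.0)]
estimate = best_velocity(events, candidates, shape=(10, 20))
```

With the correct velocity all events from the edge collapse onto the same four pixels, whereas wrong candidates smear them across the image; events that remain smeared after compensating for the dominant motion are candidates for independently moving objects.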

Potential applications

Potential applications include most tasks classically suited to conventional cameras, but with an emphasis on machine vision tasks such as object recognition, autonomous vehicles, and robotics. [21] The US military is considering infrared and other event cameras because of their lower power consumption and reduced heat generation. [7]

Given its advantages over conventional image sensors, the event camera is considered suitable for applications that require low power consumption and low latency, or in which the camera's line of sight is difficult to stabilize. These applications include the aforementioned autonomous systems, but also space imaging, security, defense and industrial monitoring. Although research into color sensing with event cameras is underway, [38] event cameras are not yet convenient for applications that require color sensing.

References

  1. Li, Hongmin; Liu, Hanchao; Ji, Xiangyang; Li, Guoqi; Shi, Luping (2017). "CIFAR10-DVS: An Event-Stream Dataset for Object Classification". Frontiers in Neuroscience. 11: 309. doi:10.3389/fnins.2017.00309. ISSN 1662-453X. PMC 5447775. PMID 28611582.
  2. Sarmadi, Hamid; Muñoz-Salinas, Rafael; Olivares-Mendez, Miguel A.; Medina-Carnicer, Rafael (2021). "Detection of Binary Square Fiducial Markers Using an Event Camera". IEEE Access. 9: 27813–27826. arXiv:2012.06516. Bibcode:2021IEEEA...927813S. doi:10.1109/ACCESS.2021.3058423. ISSN 2169-3536. S2CID 228375825.
  3. Liu, Min; Delbruck, Tobi (May 2017). "Block-matching optical flow for dynamic vision sensors: Algorithm and FPGA implementation". 2017 IEEE International Symposium on Circuits and Systems (ISCAS). pp. 1–4. arXiv:1706.05415. doi:10.1109/ISCAS.2017.8050295. ISBN 978-1-4673-6853-7. S2CID 2283149. Retrieved 27 June 2021.
  4. Lichtsteiner, P.; Posch, C.; Delbruck, T. (February 2008). "A 128×128 120 dB 15μs Latency Asynchronous Temporal Contrast Vision Sensor" (PDF). IEEE Journal of Solid-State Circuits. 43 (2): 566–576. Bibcode:2008IJSSC..43..566L. doi:10.1109/JSSC.2007.914337. ISSN 0018-9200. S2CID 6119048. Archived from the original (PDF) on 2021-05-03. Retrieved 2019-12-06.
  5. Posch, C.; Matolin, D.; Wohlgenannt, R. (January 2011). "A QVGA 143 dB Dynamic Range Frame-Free PWM Image Sensor With Lossless Pixel-Level Video Compression and Time-Domain CDS". IEEE Journal of Solid-State Circuits. 46 (1): 259–275. Bibcode:2011IJSSC..46..259P. doi:10.1109/JSSC.2010.2085952. ISSN 0018-9200. S2CID 21317717.
  6. Longinotti, Luca. "Product Specifications". iniVation. Archived from the original on 2019-04-02. Retrieved 2019-04-21.
  7. "A new type of camera". The Economist. 2022-01-29. ISSN 0013-0613. Retrieved 2022-02-02.
  8. Hu, Yuhuang; Liu, Shih-Chii; Delbruck, Tobi (2021-04-19). "v2e: From Video Frames to Realistic DVS Events". arXiv:2006.07722 [cs.CV].
  9. Skorka, Orit (2011-07-01). "Toward a digital camera to rival the human eye". Journal of Electronic Imaging. 20 (3): 033009–033009–18. Bibcode:2011JEI....20c3009S. doi:10.1117/1.3611015. ISSN 1017-9909. S2CID 9340738.
  10. DxO. "Nikon D850 : Tests and Reviews | DxOMark". www.dxomark.com. Retrieved 2019-04-22.
  11. "Phantom v2640". www.phantomhighspeed.com. Retrieved 2019-04-22.
  12. Longinotti, Luca. "Product Specifications". iniVation. Archived from the original on 2019-04-02. Retrieved 2019-04-22.
  13. Serrano-Gotarredona, T.; Linares-Barranco, B. (March 2013). "A 128x128 1.5% Contrast Sensitivity 0.9% FPN 3μs Latency 4mW Asynchronous Frame-Free Dynamic Vision Sensor Using Transimpedance Amplifiers" (PDF). IEEE Journal of Solid-State Circuits. 48 (3): 827–838. Bibcode:2013IJSSC..48..827S. doi:10.1109/JSSC.2012.2230553. ISSN 0018-9200. S2CID 6686013.
  14. Brandli, C.; Berner, R.; Yang, M.; Liu, S.; Delbruck, T. (October 2014). "A 240 × 180 130 dB 3 µs Latency Global Shutter Spatiotemporal Vision Sensor". IEEE Journal of Solid-State Circuits. 49 (10): 2333–2341. Bibcode:2014IJSSC..49.2333B. doi:10.1109/JSSC.2014.2342715. ISSN 0018-9200.
  15. Boahen, K. (1996). "Retinomorphic vision systems". Proceedings of Fifth International Conference on Microelectronics for Neural Networks. pp. 2–14. doi:10.1109/MNNFS.1996.493766. ISBN 0-8186-7373-7. S2CID 62609792.
  16. Posch, Christoph; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabe; Delbruck, Tobi (2014). "Retinomorphic Event-Based Vision Sensors: Bioinspired Cameras With Spiking Output". Proceedings of the IEEE. 102 (10): 1470–1484. doi:10.1109/JPROC.2014.2346153. hdl:11441/102353. ISSN 1558-2256. S2CID 11513955.
  17. Trujillo Herrera, Cinthya; Labram, John G. (2020-12-07). "A perovskite retinomorphic sensor". Applied Physics Letters. 117 (23): 233501. Bibcode:2020ApPhL.117w3501T. doi:10.1063/5.0030097. ISSN 0003-6951. S2CID 230546095.
  18. Miyasaka, Tsutomu; Murakami, Takurou N. (2004-10-25). "The photocapacitor: An efficient self-charging capacitor for direct storage of solar energy". Applied Physics Letters. 85 (17): 3932–3934. Bibcode:2004ApPhL..85.3932M. doi:10.1063/1.1810630. ISSN 0003-6951.
  19. "Perovskite sensor sees more like the human eye". Physics World. 2021-01-18. Retrieved 2021-10-28.
  20. "Simple Eyelike Sensors Could Make AI Systems More Efficient". Inside Science. 8 December 2020. Retrieved 2021-10-28.
  21. Hambling, David. "AI vision could be improved with sensors that mimic human eyes". New Scientist. Retrieved 2021-10-28.
  22. "An eye for an AI: Optic device mimics human retina". BBC Science Focus Magazine. Retrieved 2021-10-28.
  23. Scheerlinck, Cedric; Barnes, Nick; Mahony, Robert (2019). "Continuous-Time Intensity Estimation Using Event Cameras". Computer Vision – ACCV 2018. Lecture Notes in Computer Science. Vol. 11365. Springer International Publishing. pp. 308–324. arXiv:1811.00386. doi:10.1007/978-3-030-20873-8_20. ISBN 9783030208738. S2CID 53182986.
  24. Pan, Liyuan; Scheerlinck, Cedric; Yu, Xin; Hartley, Richard; Liu, Miaomiao; Dai, Yuchao (June 2019). "Bringing a Blurry Frame Alive at High Frame-Rate With an Event Camera". 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA, USA: IEEE. pp. 6813–6822. arXiv:1811.10180. doi:10.1109/CVPR.2019.00698. ISBN 978-1-7281-3293-8. S2CID 53749928.
  25. Scheerlinck, Cedric; Barnes, Nick; Mahony, Robert (April 2019). "Asynchronous Spatial Image Convolutions for Event Cameras". IEEE Robotics and Automation Letters. 4 (2): 816–822. arXiv:1812.00438. doi:10.1109/LRA.2019.2893427. ISSN 2377-3766. S2CID 59619729.
  26. Serrano-Gotarredona, T.; Andreou, A.; Linares-Barranco, B. (Sep 1999). "AER Image Filtering Architecture for Vision Processing Systems". IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications. 46 (9): 1064–1071. doi:10.1109/81.788808. hdl:11441/76405. ISSN 1057-7122.
  27. Serrano-Gotarredona, R.; et al. (Sep 2009). "CAVIAR: A 45k-Neuron, 5M-Synapse, 12G-connects/sec AER Hardware Sensory-Processing-Learning-Actuating System for High Speed Visual Object Recognition and Tracking". IEEE Transactions on Neural Networks. 20 (9): 1417–1438. doi:10.1109/TNN.2009.2023653. hdl:10261/86527. ISSN 1045-9227. PMID 19635693. S2CID 6537174.
  28. Serrano-Gotarredona, R.; Serrano-Gotarredona, T.; Acosta-Jimenez, A.; Linares-Barranco, B. (Dec 2006). "A Neuromorphic Cortical-Layer Microchip for Spike-Based Event Processing Vision Systems". IEEE Transactions on Circuits and Systems I: Regular Papers. 53 (12): 2548–2566. doi:10.1109/TCSI.2006.883843. hdl:10261/7823. ISSN 1549-8328. S2CID 8287877.
  29. Camuñas-Mesa, L.; et al. (Feb 2012). "An Event-Driven Multi-Kernel Convolution Processor Module for Event-Driven Vision Sensors". IEEE Journal of Solid-State Circuits. 47 (2): 504–517. Bibcode:2012IJSSC..47..504C. doi:10.1109/JSSC.2011.2167409. hdl:11441/93004. ISSN 0018-9200. S2CID 23238741.
  30. Pérez-Carrasco, J.A.; Zhao, B.; Serrano, C.; Acha, B.; Serrano-Gotarredona, T.; Chen, S.; Linares-Barranco, B. (November 2013). "Mapping from Frame-Driven to Frame-Free Event-Driven Vision Systems by Low-Rate Rate-Coding and Coincidence Processing. Application to Feed-Forward ConvNets". IEEE Transactions on Pattern Analysis and Machine Intelligence. 35 (11): 2706–2719. doi:10.1109/TPAMI.2013.71. hdl:11441/79657. ISSN 0162-8828. PMID 24051730. S2CID 170040.
  31. Gallego, Guillermo; Delbruck, Tobi; Orchard, Garrick Michael; Bartolozzi, Chiara; Taba, Brian; Censi, Andrea; Leutenegger, Stefan; Davison, Andrew; Conradt, Jorg; Daniilidis, Kostas; Scaramuzza, Davide (2020). "Event-based Vision: A Survey". IEEE Transactions on Pattern Analysis and Machine Intelligence. PP (1): 154–180. arXiv:1904.08405. doi:10.1109/TPAMI.2020.3008413. ISSN 1939-3539. PMID 32750812. S2CID 234740723.
  32. Mondal, Anindya; R, Shashant; Giraldo, Jhony H.; Bouwmans, Thierry; Chowdhury, Ananda S. (2021). "Moving Object Detection for Event-based Vision using Graph Spectral Clustering". 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). pp. 876–884. arXiv:2109.14979. doi:10.1109/ICCVW54120.2021.00103. ISBN 978-1-6654-0191-3. S2CID 238227007 via IEEE Xplore.
  33. Mitrokhin, Anton; Fermuller, Cornelia; Parameshwara, Chethan; Aloimonos, Yiannis (October 2018). "Event-Based Moving Object Detection and Tracking". 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Madrid: IEEE. pp. 1–9. arXiv:1803.04523. doi:10.1109/IROS.2018.8593805. ISBN 978-1-5386-8094-0. S2CID 3845250.
  34. Stoffregen, Timo; Gallego, Guillermo; Drummond, Tom; Kleeman, Lindsay; Scaramuzza, Davide (2019). "Event-Based Motion Segmentation by Motion Compensation". 2019 IEEE/CVF International Conference on Computer Vision (ICCV). pp. 7244–7253. arXiv:1904.01293. doi:10.1109/ICCV.2019.00734. ISBN 978-1-7281-4803-8. S2CID 91183976.
  35. Piątkowska, Ewa; Belbachir, Ahmed Nabil; Schraml, Stephan; Gelautz, Margrit (June 2012). "Spatiotemporal multiple persons tracking using Dynamic Vision Sensor". 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. pp. 35–40. doi:10.1109/CVPRW.2012.6238892. ISBN 978-1-4673-1612-5. S2CID 310741.
  36. Chen, Guang; Cao, Hu; Aafaque, Muhammad; Chen, Jieneng; Ye, Canbo; Röhrbein, Florian; Conradt, Jörg; Chen, Kai; Bing, Zhenshan; Liu, Xingbo; Hinz, Gereon (2018-12-02). "Neuromorphic Vision Based Multivehicle Detection and Tracking for Intelligent Transportation System". Journal of Advanced Transportation. 2018: e4815383. doi:10.1155/2018/4815383. ISSN 0197-6729.
  37. Mondal, Anindya; Das, Mayukhmali (2021-11-08). "Moving Object Detection for Event-based Vision using k-means Clustering". 2021 IEEE 8th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON). pp. 1–6. arXiv:2109.01879. doi:10.1109/UPCON52273.2021.9667636. ISBN 978-1-6654-0962-9. S2CID 237420620.
  38. "CED: Color Event Camera Dataset". rpg.ifi.uzh.ch. Retrieved 2024-04-08.