Sensor fusion is the process of combining sensor data or data derived from disparate sources such that the resulting information has less uncertainty than would be possible when these sources were used individually. For instance, one could potentially obtain a more accurate location estimate of an indoor object by combining multiple data sources such as video cameras and WiFi localization signals. The term uncertainty reduction in this case can mean more accurate, more complete, or more dependable, or refer to the result of an emerging view, such as stereoscopic vision (calculation of depth information by combining two-dimensional images from two cameras at slightly different viewpoints). [1] [2]
The data sources for a fusion process are not specified to originate from identical sensors. One can distinguish direct fusion, indirect fusion and fusion of the outputs of the former two. Direct fusion is the fusion of sensor data from a set of heterogeneous or homogeneous sensors, soft sensors, and history values of sensor data, while indirect fusion uses information sources like a priori knowledge about the environment and human input.
Sensor fusion is also known as (multi-sensor) data fusion and is a subset of information fusion .
Sensor fusion is a term that covers a number of methods and algorithms, including:
Two example sensor fusion calculations are illustrated below.
Let and denote two sensor measurements with noise variances and , respectively. One way of obtaining a combined measurement is to apply inverse-variance weighting, which is also employed within the Fraser-Potter fixed-interval smoother, namely [6]
where is the variance of the combined estimate. It can be seen that the fused result is simply a linear combination of the two measurements weighted by their respective noise variances.
Another (equivalent) method to fuse two measurements is to use the optimal Kalman filter. Suppose that the data is generated by a first-order system and let denote the solution of the filter's Riccati equation. By applying Cramer's rule within the gain calculation it can be found that the filter gain is given by:[ citation needed ]
By inspection, when the first measurement is noise free, the filter ignores the second measurement and vice versa. That is, the combined estimate is weighted by the quality of the measurements.
In sensor fusion, centralized versus decentralized refers to where the fusion of the data occurs. In centralized fusion, the clients simply forward all of the data to a central location, and some entity at the central location is responsible for correlating and fusing the data. In decentralized, the clients take full responsibility for fusing the data. "In this case, every sensor or platform can be viewed as an intelligent asset having some degree of autonomy in decision-making." [7]
Multiple combinations of centralized and decentralized systems exist.
Another classification of sensor configuration refers to the coordination of information flow between sensors. [8] [9] These mechanisms provide a way to resolve conflicts or disagreements and to allow the development of dynamic sensing strategies. Sensors are in redundant (or competitive) configuration if each node delivers independent measures of the same properties. This configuration can be used in error correction when comparing information from multiple nodes. Redundant strategies are often used with high level fusions in voting procedures. [10] [11] Complementary configuration occurs when multiple information sources supply different information about the same features. This strategy is used for fusing information at raw data level within decision-making algorithms. Complementary features are typically applied in motion recognition tasks with Neural network, [12] [13] Hidden Markov model, [14] [15] Support-vector machine, [16] clustering methods and other techniques. [16] [15] Cooperative sensor fusion uses the information extracted by multiple independent sensors to provide information that would not be available from single sensors. For example, sensors connected to body segments are used for the detection of the angle between them. Cooperative sensor strategy gives information impossible to obtain from single nodes. Cooperative information fusion can be used in motion recognition, [17] gait analysis, motion analysis, [18] [19] ,. [20]
There are several categories or levels of sensor fusion that are commonly used. [21] [22] [23] [24] [25] [26]
Sensor fusion level can also be defined basing on the kind of information used to feed the fusion algorithm. [27] More precisely, sensor fusion can be performed fusing raw data coming from different sources, extrapolated features or even decision made by single nodes.
One application of sensor fusion is GPS/INS, where Global Positioning System and inertial navigation system data is fused using various different methods, e.g. the extended Kalman filter. This is useful, for example, in determining the attitude of an aircraft using low-cost sensors. [32] Another example is using the data fusion approach to determine the traffic state (low traffic, traffic jam, medium flow) using road side collected acoustic, image and sensor data. [33] In the field of autonomous driving, sensor fusion is used to combine the redundant information from complementary sensors in order to obtain a more accurate and reliable representation of the environment. [34]
Although technically not a dedicated sensor fusion method, modern Convolutional neural network based methods can simultaneously process many channels of sensor data (such as Hyperspectral imaging with hundreds of bands [35] ) and fuse relevant information to produce classification results.
Gait analysis is the systematic study of animal locomotion, more specifically the study of human motion, using the eye and the brain of observers, augmented by instrumentation for measuring body movements, body mechanics, and the activity of the muscles. Gait analysis is used to assess and treat individuals with conditions affecting their ability to walk. It is also commonly used in sports biomechanics to help athletes run more efficiently and to identify posture-related or movement-related problems in people with injuries.
Simultaneous localization and mapping (SLAM) is the computational problem of constructing or updating a map of an unknown environment while simultaneously keeping track of an agent's location within it. While this initially appears to be a chicken or the egg problem, there are several algorithms known to solve it in, at least approximately, tractable time for certain environments. Popular approximate solution methods include the particle filter, extended Kalman filter, covariance intersection, and GraphSLAM. SLAM algorithms are based on concepts in computational geometry and computer vision, and are used in robot navigation, robotic mapping and odometry for virtual reality or augmented reality.
Wireless sensor networks (WSNs) refer to networks of spatially dispersed and dedicated sensors that monitor and record the physical conditions of the environment and forward the collected data to a central location. WSNs can measure environmental conditions such as temperature, sound, pollution levels, humidity and wind.
The scale-invariant feature transform (SIFT) is a computer vision algorithm to detect, describe, and match local features in images, invented by David Lowe in 1999. Applications include object recognition, robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, individual identification of wildlife and match moving.
A recurrent neural network (RNN) is one of the two broad types of artificial neural network, characterized by direction of the flow of information between its layers. In contrast to the uni-directional feedforward neural network, it is a bi-directional artificial neural network, meaning that it allows the output from some nodes to affect subsequent input to the same nodes. Their ability to use internal state (memory) to process arbitrary sequences of inputs makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition. The term "recurrent neural network" is used to refer to the class of networks with an infinite impulse response, whereas "convolutional neural network" refers to the class of finite impulse response. Both classes of networks exhibit temporal dynamic behavior. A finite impulse recurrent network is a directed acyclic graph that can be unrolled and replaced with a strictly feedforward neural network, while an infinite impulse recurrent network is a directed cyclic graph that can not be unrolled.
In computer networking, linear network coding is a program in which intermediate nodes transmit data from source nodes to sink nodes by means of linear combinations.
Machine olfaction is the automated simulation of the sense of smell. An emerging application in modern engineering, it involves the use of robots or other automated systems to analyze air-borne chemicals. Such an apparatus is often called an electronic nose or e-nose. The development of machine olfaction is complicated by the fact that e-nose devices to date have responded to a limited number of chemicals, whereas odors are produced by unique sets of odorant compounds. The technology, though still in the early stages of development, promises many applications, such as: quality control in food processing, detection and diagnosis in medicine, detection of drugs, explosives and other dangerous or illegal substances, disaster response, and environmental monitoring.
A wireless ad hoc network (WANET) or mobile ad hoc network (MANET) is a decentralized type of wireless network. The network is ad hoc because it does not rely on a pre-existing infrastructure, such as routers or wireless access points. Instead, each node participates in routing by forwarding data for other nodes. The determination of which nodes forward data is made dynamically on the basis of network connectivity and the routing algorithm in use.
Data fusion is the process of integrating multiple data sources to produce more consistent, accurate, and useful information than that provided by any individual data source.
GPS/INS is the use of GPS satellite signals to correct or calibrate a solution from an inertial navigation system (INS). The method is applicable for any GNSS/INS system.
An indoor positioning system (IPS) is a network of devices used to locate people or objects where GPS and other satellite technologies lack precision or fail entirely, such as inside multistory buildings, airports, alleys, parking garages, and underground locations.
Activity recognition aims to recognize the actions and goals of one or more agents from a series of observations on the agents' actions and the environmental conditions. Since the 1980s, this research field has captured the attention of several computer science communities due to its strength in providing personalized support for many different applications and its connection to many different fields of study such as medicine, human-computer interaction, or sociology.
In wireless communication, spatial correlation is the correlation between a signal's spatial direction and the average received signal gain. Theoretically, the performance of wireless communication systems can be improved by having multiple antennas at the transmitter and the receiver. The idea is that if the propagation channels between each pair of transmit and receive antennas are statistically independent and identically distributed, then multiple independent channels with identical characteristics can be created by precoding and be used for either transmitting multiple data streams or increasing the reliability. In practice, the channels between different antennas are often correlated and therefore the potential multi antenna gains may not always be obtainable.
Location estimation in wireless sensor networks is the problem of estimating the location of an object from a set of noisy measurements. These measurements are acquired in a distributed manner by a set of sensors.
Covariance intersection (CI) is an algorithm for combining two or more estimates of state variables in a Kalman filter when the correlation between them is unknown.
The Brooks–Iyengar algorithm or FuseCPA Algorithm or Brooks–Iyengar hybrid algorithm is a distributed algorithm that improves both the precision and accuracy of the interval measurements taken by a distributed sensor network, even in the presence of faulty sensors. The sensor network does this by exchanging the measured value and accuracy value at every node with every other node, and computes the accuracy range and a measured value for the whole network from all of the values collected. Even if some of the data from some of the sensors is faulty, the sensor network will not malfunction. The algorithm is fault-tolerant and distributed. It could also be used as a sensor fusion method. The precision and accuracy bound of this algorithm have been proved in 2016.
Wireless sensor networks (WSN) are a spatially distributed network of autonomous sensors used for monitoring an environment. Energy cost is a major limitation for WSN requiring the need for energy efficient networks and processing. One of major energy costs in WSN is the energy spent on communication between nodes and it is sometimes desirable to only send data to a gateway node when an event of interest is triggered at a sensor. Sensors will then only open communication during a probable event, saving on communication costs. Fields interested in this type of network include surveillance, home automation, disaster relief, traffic control, health care and more.
Fusion adaptive resonance theory (fusion ART) is a generalization of self-organizing neural networks known as the original Adaptive Resonance Theory models for learning recognition categories across multiple pattern channels. There is a separate stream of work on fusion ARTMAP, that extends fuzzy ARTMAP consisting of two fuzzy ART modules connected by an inter-ART map field to an extended architecture consisting of multiple ART modules.
Data augmentation is a statistical technique which allows maximum likelihood estimation from incomplete data. Data augmentation has important applications in Bayesian analysis, and the technique is widely used in machine learning to reduce overfitting when training machine learning models, achieved by training models on several slightly-modified copies of existing data.
Video super-resolution (VSR) is the process of generating high-resolution video frames from the given low-resolution video frames. Unlike single-image super-resolution (SISR), the main goal is not only to restore more fine details while saving coarse ones, but also to preserve motion consistency.
{{cite conference}}
: CS1 maint: multiple names: authors list (link)