Sensor fusion is a process of combining sensor data or data derived from disparate sources so that the resulting information has less uncertainty than would be possible if these sources were used individually. For instance, one could potentially obtain a more accurate location estimate of an indoor object by combining multiple data sources such as video cameras and WiFi localization signals. The term uncertainty reduction in this case can mean more accurate, more complete, or more dependable, or refer to the result of an emerging view, such as stereoscopic vision (calculation of depth information by combining two-dimensional images from two cameras at slightly different viewpoints). [1] [2]
The data sources for a fusion process are not specified to originate from identical sensors. One can distinguish direct fusion, indirect fusion and fusion of the outputs of the former two. Direct fusion is the fusion of sensor data from a set of heterogeneous or homogeneous sensors, soft sensors, and history values of sensor data, while indirect fusion uses information sources like a priori knowledge about the environment and human input.
Sensor fusion is also known as (multi-sensor) data fusion and is a subset of information fusion .
Sensor fusion is a term that covers a number of methods and algorithms, including:
Two example sensor fusion calculations are illustrated below.
Let and denote two estimates from two independent sensor measurements, with noise variances and , respectively. One way of obtaining a combined estimate is to apply inverse-variance weighting, which is also employed within the Fraser-Potter fixed-interval smoother, namely [6]
where is the variance of the combined estimate. It can be seen that the fused result is simply a linear combination of the two measurements weighted by their respective information.
It is worth noting that if is a random variable. The estimates and will be correlated through common process noise, which will cause the estimate to lose conservativeness. [7]
Another (equivalent) method to fuse two measurements is to use the optimal Kalman filter. Suppose that the data is generated by a first-order system and let denote the solution of the filter's Riccati equation. By applying Cramer's rule within the gain calculation it can be found that the filter gain is given by:[ citation needed ]
By inspection, when the first measurement is noise free, the filter ignores the second measurement and vice versa. That is, the combined estimate is weighted by the quality of the measurements.
In sensor fusion, centralized versus decentralized refers to where the fusion of the data occurs. In centralized fusion, the clients simply forward all of the data to a central location, and some entity at the central location is responsible for correlating and fusing the data. In decentralized, the clients take full responsibility for fusing the data. "In this case, every sensor or platform can be viewed as an intelligent asset having some degree of autonomy in decision-making." [8]
Multiple combinations of centralized and decentralized systems exist.
Another classification of sensor configuration refers to the coordination of information flow between sensors. [9] [10] These mechanisms provide a way to resolve conflicts or disagreements and to allow the development of dynamic sensing strategies. Sensors are in redundant (or competitive) configuration if each node delivers independent measures of the same properties. This configuration can be used in error correction when comparing information from multiple nodes. Redundant strategies are often used with high level fusions in voting procedures. [11] [12] Complementary configuration occurs when multiple information sources supply different information about the same features. This strategy is used for fusing information at raw data level within decision-making algorithms. Complementary features are typically applied in motion recognition tasks with neural network, [13] [14] hidden Markov model, [15] [16] support vector machine, [17] clustering methods and other techniques. [17] [16] Cooperative sensor fusion uses the information extracted by multiple independent sensors to provide information that would not be available from single sensors. For example, sensors connected to body segments are used for the detection of the angle between them. Cooperative sensor strategy gives information impossible to obtain from single nodes. Cooperative information fusion can be used in motion recognition, [18] gait analysis, motion analysis, [19] [20] ,. [21]
There are several categories or levels of sensor fusion that are commonly used. [22] [23] [24] [25] [26] [27]
Sensor fusion level can also be defined basing on the kind of information used to feed the fusion algorithm. [28] More precisely, sensor fusion can be performed fusing raw data coming from different sources, extrapolated features or even decision made by single nodes.
One application of sensor fusion is GPS/INS, where Global Positioning System and inertial navigation system data is fused using various different methods, e.g. the extended Kalman filter. This is useful, for example, in determining the attitude of an aircraft using low-cost sensors. [33] Another example is using the data fusion approach to determine the traffic state (low traffic, traffic jam, medium flow) using road side collected acoustic, image and sensor data. [34] In the field of autonomous driving, sensor fusion is used to combine the redundant information from complementary sensors in order to obtain a more accurate and reliable representation of the environment. [35]
Although technically not a dedicated sensor fusion method, modern convolutional neural network based methods can simultaneously process many channels of sensor data (such as hyperspectral imaging with hundreds of bands [36] ) and fuse relevant information to produce classification results.
Nonlinear dimensionality reduction, also known as manifold learning, is any of various related techniques that aim to project high-dimensional data, potentially existing across non-linear manifolds which cannot be adequately captured by linear decomposition methods, onto lower-dimensional latent manifolds, with the goal of either visualizing the data in the low-dimensional space, or learning the mapping itself. The techniques described below can be understood as generalizations of linear decomposition methods used for dimensionality reduction, such as singular value decomposition and principal component analysis.
Quantization, in mathematics and digital signal processing, is the process of mapping input values from a large set to output values in a (countable) smaller set, often with a finite number of elements. Rounding and truncation are typical examples of quantization processes. Quantization is involved to some degree in nearly all digital signal processing, as the process of representing a signal in digital form ordinarily involves rounding. Quantization also forms the core of essentially all lossy compression algorithms.
Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close in meaning will occur in similar pieces of text. A matrix containing word counts per document is constructed from a large piece of text and a mathematical technique called singular value decomposition (SVD) is used to reduce the number of rows while preserving the similarity structure among columns. Documents are then compared by cosine similarity between any two columns. Values close to 1 represent very similar documents while values close to 0 represent very dissimilar documents.
Wireless sensor networks (WSNs) refer to networks of spatially dispersed and dedicated sensors that monitor and record the physical conditions of the environment and forward the collected data to a central location. WSNs can measure environmental conditions such as temperature, sound, pollution levels, humidity and wind.
The scale-invariant feature transform (SIFT) is a computer vision algorithm to detect, describe, and match local features in images, invented by David Lowe in 1999. Applications include object recognition, robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, individual identification of wildlife and match moving.
In graph theory and network analysis, indicators of centrality assign numbers or rankings to nodes within a graph corresponding to their network position. Applications include identifying the most influential person(s) in a social network, key infrastructure nodes in the Internet or urban networks, super-spreaders of disease, and brain networks. Centrality concepts were first developed in social network analysis, and many of the terms used to measure centrality reflect their sociological origin.
Recurrent neural networks (RNNs) are a class of artificial neural network commonly used for sequential data processing. Unlike feedforward neural networks, which process data in a single pass, RNNs process data across multiple time steps, making them well-adapted for modelling and processing text, speech, and time series.
In computer networking, linear network coding is a program in which intermediate nodes transmit data from source nodes to sink nodes by means of linear combinations.
The structural similarityindex measure (SSIM) is a method for predicting the perceived quality of digital television and cinematic pictures, as well as other kinds of digital images and videos. It is also used for measuring the similarity between two images. The SSIM index is a full reference metric; in other words, the measurement or prediction of image quality is based on an initial uncompressed or distortion-free image as reference.
Machine olfaction is the automated simulation of the sense of smell. An emerging application in modern engineering, it involves the use of robots or other automated systems to analyze air-borne chemicals. Such an apparatus is often called an electronic nose or e-nose. The development of machine olfaction is complicated by the fact that e-nose devices to date have responded to a limited number of chemicals, whereas odors are produced by unique sets of odorant compounds. The technology, though still in the early stages of development, promises many applications, such as: quality control in food processing, detection and diagnosis in medicine, detection of drugs, explosives and other dangerous or illegal substances, disaster response, and environmental monitoring.
Long short-term memory (LSTM) is a type of recurrent neural network (RNN) aimed at mitigating the vanishing gradient problem commonly encountered by traditional RNNs. Its relative insensitivity to gap length is its advantage over other RNNs, hidden Markov models, and other sequence learning methods. It aims to provide a short-term memory for RNN that can last thousands of timesteps. The name is made in analogy with long-term memory and short-term memory and their relationship, studied by cognitive psychologists since the early 20th century.
Mean shift is a non-parametric feature-space mathematical analysis technique for locating the maxima of a density function, a so-called mode-seeking algorithm. Application domains include cluster analysis in computer vision and image processing.
Compressed sensing is a signal processing technique for efficiently acquiring and reconstructing a signal, by finding solutions to underdetermined linear systems. This is based on the principle that, through optimization, the sparsity of a signal can be exploited to recover it from far fewer samples than required by the Nyquist–Shannon sampling theorem. There are two conditions under which recovery is possible. The first one is sparsity, which requires the signal to be sparse in some domain. The second one is incoherence, which is applied through the isometric property, which is sufficient for sparse signals. Compressed sensing has applications in, for example, MRI where the incoherence condition is typically satisfied.
Location estimation in wireless sensor networks is the problem of estimating the location of an object from a set of noisy measurements. These measurements are acquired in a distributed manner by a set of sensors.
In the mathematical theory of artificial neural networks, universal approximation theorems are theorems of the following form: Given a family of neural networks, for each function from a certain function space, there exists a sequence of neural networks from the family, such that according to some criterion. That is, the family of neural networks is dense in the function space.
The Brooks–Iyengar algorithm or FuseCPA Algorithm or Brooks–Iyengar hybrid algorithm is a distributed algorithm that improves both the precision and accuracy of the interval measurements taken by a distributed sensor network, even in the presence of faulty sensors. The sensor network does this by exchanging the measured value and accuracy value at every node with every other node, and computes the accuracy range and a measured value for the whole network from all of the values collected. Even if some of the data from some of the sensors is faulty, the sensor network will not malfunction. The algorithm is fault-tolerant and distributed. It could also be used as a sensor fusion method. The precision and accuracy bound of this algorithm have been proved in 2016.
In graph theory, betweenness centrality is a measure of centrality in a graph based on shortest paths. For every pair of vertices in a connected graph, there exists at least one shortest path between the vertices such that either the number of edges that the path passes through or the sum of the weights of the edges is minimized. The betweenness centrality for each vertex is the number of these shortest paths that pass through the vertex.
Extreme learning machines are feedforward neural networks for classification, regression, clustering, sparse approximation, compression and feature learning with a single layer or multiple layers of hidden nodes, where the parameters of hidden nodes need to be tuned. These hidden nodes can be randomly assigned and never updated, or can be inherited from their ancestors without being changed. In most cases, the output weights of hidden nodes are usually learned in a single step, which essentially amounts to learning a linear model.
Wireless sensor networks (WSN) are a spatially distributed network of autonomous sensors used for monitoring an environment. Energy cost is a major limitation for WSN requiring the need for energy efficient networks and processing. One of major energy costs in WSN is the energy spent on communication between nodes and it is sometimes desirable to only send data to a gateway node when an event of interest is triggered at a sensor. Sensors will then only open communication during a probable event, saving on communication costs. Fields interested in this type of network include surveillance, home automation, disaster relief, traffic control, health care and more.
A graph neural network (GNN) belongs to a class of artificial neural networks for processing data that can be represented as graphs.
{{cite conference}}
: CS1 maint: multiple names: authors list (link)