Information integration

Last updated

Information integration (II) is the merging of information from heterogeneous sources with differing conceptual, contextual and typographical representations. It is used in data mining and consolidation of data from unstructured or semi-structured resources. Typically, information integration refers to textual representations of knowledge but is sometimes applied to rich-media content. Information fusion, which is a related term, involves the combination of information into a new set of information towards reducing redundancy and uncertainty. [1]

Contents

Examples of technologies available to integrate information include deduplication, and string metrics which allow the detection of similar text in different data sources by fuzzy matching. A host of methods for these research areas are available such as those presented in the International Society of Information Fusion. Other methods rely on causal estimates of the outcomes based on a model of the sources. [2]

See also

Books

Related Research Articles

Computer vision is an interdisciplinary scientific field that deals with how computers can gain high-level understanding from digital images or videos. From the perspective of engineering, it seeks to understand and automate tasks that the human visual system can do.

Graph drawing

Graph drawing is an area of mathematics and computer science combining methods from geometric graph theory and information visualization to derive two-dimensional depictions of graphs arising from applications such as social network analysis, cartography, linguistics, and bioinformatics.

Wireless sensor network

Wireless sensor network (WSN) refers to a group of spatially dispersed and dedicated sensors for monitoring and recording the physical conditions of the environment and organizing the collected data at a central location. WSNs measure environmental conditions like temperature, sound, pollution levels, humidity, wind, and so on.

Sensor fusion

Sensor fusion is combining of sensory data or data derived from disparate sources such that the resulting information has less uncertainty than would be possible when these sources were used individually. The term uncertainty reduction in this case can mean more accurate, more complete, or more dependable, or refer to the result of an emerging view, such as stereoscopic vision.

Linear discriminant analysis (LDA), normal discriminant analysis (NDA), or discriminant function analysis is a generalization of Fisher's linear discriminant, a method used in statistics and other fields, to find a linear combination of features that characterizes or separates two or more classes of objects or events. The resulting combination may be used as a linear classifier, or, more commonly, for dimensionality reduction before later classification.

Multimodal interaction provides the user with multiple modes of interacting with a system. A multimodal interface provides several distinct tools for input and output of data. For example, a multimodal question answering system employs multiple modalities at both question (input) and answer (output) level.

Software visualization or software visualisation refers to the visualization of information of and related to software systems—either the architecture of its source code or metrics of their runtime behavior—and their development process by means of static, interactive or animated 2-D or 3-D visual representations of their structure, execution, behavior, and evolution.

In predictive analytics and machine learning, the concept drift means that the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways. This causes problems because the predictions become less accurate as time passes.

Multivariate analysis (MVA) is based on the principles of multivariate statistics, which involves observation and analysis of more than one statistical outcome variable at a time. Typically, MVA is used to address the situations where multiple measurements are made on each experimental unit and the relations among these measurements and their structures are important. A modern, overlapping categorization of MVA includes:

Vasant G. Honavar is an Indian born American computer scientist, and artificial intelligence, machine learning, big data, data science, causality, knowledge representation, bioinformatics and health informatics researcher and educator.

In computer science and operations research, a memetic algorithm (MA) is an extension of the traditional genetic algorithm. It uses a local search technique to reduce the likelihood of the premature convergence.

Data fusion

Data fusion is the process of integrating multiple data sources to produce more consistent, accurate, and useful information than that provided by any individual data source.

The image fusion process is defined as gathering all the important information from multiple images, and their inclusion into fewer images, usually a single one. This single image is more informative and accurate than any single source image, and it consists of all the necessary information. The purpose of image fusion is not only to reduce the amount of data but also to construct images that are more appropriate and understandable for the human and machine perception. In computer vision, multisensor image fusion is the process of combining relevant information from two or more images into a single image. The resulting image will be more informative than any of the input images.

Activity recognition aims to recognize the actions and goals of one or more agents from a series of observations on the agents' actions and the environmental conditions. Since the 1980s, this research field has captured the attention of several computer science communities due to its strength in providing personalized support for many different applications and its connection to many different fields of study such as medicine, human-computer interaction, or sociology.

In radar technology and similar fields, track-before-detect (TBD) is a concept according to which a signal is tracked before declaring it a target. In this approach, the sensor data about a tentative target are integrated over time and may yield detection in cases when signals from any particular time instance are too weak against clutter to register a detected target.

Multilinear subspace learning Approach to dimensionality reduction

Multilinear subspace learning is an approach to dimensionality reduction. Dimensionality reduction can be performed on a data tensor whose observations have been vectorized and organized into a data tensor, or whose observations are matrices that are concatenated into a data tensor. Here are some examples of data tensors whose observations are vectorized or whose observations are matrices concatenated into data tensor images (2D/3D), video sequences (3D/4D), and hyperspectral cubes (3D/4D).

Data mining, the process of discovering patterns in large data sets, has been used in many applications.

VR positional tracking

In virtual reality (VR), positional tracking detects the precise position of the head-mounted displays, controllers, other objects or body parts within Euclidean space. Because the purpose of VR is to emulate perceptions of reality, it is paramount that positional tracking be both accurate and precise so as not to break the illusion of three-dimensional space. Several methods of tracking the position and orientation of the display and any associated objects or devices have been developed to achieve this. All of said methods utilize sensors which repeatedly record signals from transmitters on or near the tracked object(s), and then send that data to the computer in order to maintain an approximation of their physical locations. By and large, these physical locations are identified and defined using one or more of three coordinate systems: the Cartesian rectilinear system, the spherical polar system, and the cylindrical system. Many interfaces have also been designed to monitor and control one’s movement within and interaction with the virtual 3D space; such interfaces must work closely with positional tracking systems to provide a seamless user experience.

Multimodal sentiment analysis is a new dimension of the traditional text-based sentiment analysis, which goes beyond the analysis of texts, and includes other modalities such as audio and visual data. It can be bimodal, which includes different combinations of two modalities, or trimodal, which incorporates three modalities. With the extensive amount of social media data available online in different forms such as videos and images, the conventional text-based sentiment analysis has evolved into more complex models of multimodal sentiment analysis, which can be applied in the development of virtual assistants, analysis of YouTube movie reviews, analysis of news videos, and emotion recognition such as depression monitoring, among others.

References

  1. Haghighat, Mohammad; Abdel-Mottaleb, Mohamed; Alhalabi, Wadee (2016). "Discriminant Correlation Analysis: Real-Time Feature Level Fusion for Multimodal Biometric Recognition". IEEE Transactions on Information Forensics and Security. 11 (9): 1984–1996. doi:10.1109/TIFS.2016.2569061. S2CID   15624506.
  2. P.K. Davis, D. Manheim, W.L. Perry, J. Hollywood (2015). In Proceedings of the 2015 Winter Simulation Conference (WSC '15). IEEE Press, Piscataway, NJ, USA, 2586-2597.