Knowledge retrieval seeks to return information in a structured form consistent with human cognitive processes, rather than as simple lists of data items. It draws on a range of fields including epistemology (theory of knowledge), cognitive psychology, cognitive neuroscience, logic and inference, machine learning and knowledge discovery, linguistics, and information technology.
In the field of retrieval systems, the two established approaches are data retrieval and information retrieval.
Both approaches require a user to read and analyze often long lists of data sets or documents in order to extract meaning.
The goal of knowledge retrieval systems is to reduce the burden of those processes by improved search and representation. This improvement is needed to leverage the increasing data volumes available on the Internet. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]
Data Retrieval and Information Retrieval are earlier and more basic forms of information access. [12]
Aspect | Data Retrieval | Information Retrieval | Knowledge Retrieval |
---|---|---|---|
Match | Boolean match | partial match, best match | partial match, best match |
Inference | deductive inference | inductive inference | deductive inference, inductive inference, associative reasoning, analogical reasoning |
Model | deterministic model | statistical and probabilistic model | semantic model, inference model |
Query | artificial language | natural language | knowledge structure, natural language |
Organization | table, index | table, index | knowledge unit, knowledge structure |
Representation | number, rule | natural language, markup language | concept graph, predicate logic, production rule, frame, semantic network, ontology |
Storage | database | document collections | knowledge base |
Retrieved Results | data set | sections or documents | a set of knowledge units |
Knowledge retrieval focuses on the knowledge level: how to extract, represent, and use the knowledge contained in data and information. [13] Knowledge retrieval systems provide knowledge to users in a structured way. Compared to data retrieval and information retrieval, they use different inference models, retrieval methods, and result organization. Table 1, which extends van Rijsbergen's comparison of data retrieval and information retrieval, [14] summarizes the main characteristics of data retrieval, information retrieval, and knowledge retrieval. [15] The core of both data retrieval and information retrieval is the retrieval subsystem. Data retrieval obtains results through Boolean match. [16] Information retrieval uses partial match and best match. Knowledge retrieval is also based on partial match and best match.
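The difference between Boolean match and best match can be illustrated with a minimal sketch. The toy corpus, the function names, and the choice of cosine similarity over raw term counts are all assumptions made for illustration; real systems use more elaborate weighting and indexing.

```python
import math
from collections import Counter

# Hypothetical toy corpus; documents and ids are illustrative only.
DOCS = {
    "d1": "knowledge retrieval uses semantic models",
    "d2": "information retrieval ranks documents by partial match",
    "d3": "data retrieval uses boolean match on database tables",
}

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def boolean_match(query, docs):
    """Return ids of documents that contain every query term (strict AND)."""
    terms = set(tokenize(query))
    return sorted(d for d, text in docs.items() if terms <= set(tokenize(text)))

def best_match(query, docs):
    """Rank documents sharing at least one query term by cosine similarity."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(c * c for c in q.values()))
    scored = []
    for doc_id, text in docs.items():
        v = Counter(tokenize(text))
        dot = sum(q[t] * v[t] for t in q)
        v_norm = math.sqrt(sum(c * c for c in v.values()))
        if dot > 0 and q_norm > 0 and v_norm > 0:
            scored.append((doc_id, dot / (q_norm * v_norm)))
    # Highest score first; ties broken by document id for determinism.
    return [d for d, _ in sorted(scored, key=lambda p: (-p[1], p[0]))]

print(boolean_match("semantic match", DOCS))  # no document has both terms
print(best_match("semantic match", DOCS))     # partial matches are still ranked
```

In this sketch a query whose terms never co-occur in one document yields nothing under Boolean match, while best match still returns every partially matching document in ranked order.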
From an inference perspective, data retrieval uses deductive inference, and information retrieval uses inductive inference. [14] Given the limitations imposed by the assumptions of different logics, traditional logic systems (e.g., the Horn subset of first-order logic) cannot reason efficiently. [17] Associative reasoning, analogical reasoning, and the idea of unifying reasoning and search may be effective methods of reasoning at web scale. [17] [18]
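As a rough illustration of deductive inference over Horn-style rules, the following sketch applies naive forward chaining to propositional rules. The facts, the rules, and the function name are hypothetical, and the approach is a textbook simplification rather than the method of any system cited here.

```python
def forward_chain(facts, rules):
    """Derive every fact entailed by propositional Horn rules.

    Each rule is a (premises, conclusion) pair: when all premises are
    known, the conclusion is added. The loop stops at a fixed point.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known

# Hypothetical facts and rules, for illustration only.
FACTS = {"is_database", "has_index"}
RULES = [
    (("is_database",), "stores_data"),
    (("stores_data", "has_index"), "supports_lookup"),
    (("supports_lookup", "has_ontology"), "supports_knowledge_query"),
]

derived = forward_chain(FACTS, RULES)
print(sorted(derived))
```

Every conclusion here follows with certainty from the premises, which is what makes the inference deductive. This naive loop also rescans the full rule set on each pass, so its cost grows with the size of the rule base.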
From the retrieval perspective, knowledge retrieval systems focus on semantics and better organization of information. Data retrieval and information retrieval organize data and documents by indexing, while knowledge retrieval organizes information by indicating connections between elements in those documents.
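One way to picture organization by connections rather than by index is a small graph of subject, predicate, object triples. The sketch below is an assumption-laden illustration: the triples, the predicate names, and the helper functions are hypothetical, and the traversal is a plain breadth-first walk.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object), for illustration only.
TRIPLES = [
    ("knowledge retrieval", "extends", "information retrieval"),
    ("information retrieval", "uses", "best match"),
    ("knowledge retrieval", "uses", "semantic model"),
    ("semantic model", "represented_as", "ontology"),
]

def build_index(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index

def connections(index, start, max_hops=2):
    """Collect edges reachable from `start` within `max_hops`, breadth first."""
    seen = {start}
    frontier = [start]
    found = []
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for predicate, obj in index.get(node, []):
                found.append((node, predicate, obj))
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return found

index = build_index(TRIPLES)
for edge in connections(index, "knowledge retrieval"):
    print(edge)
```

The result is a set of linked statements about the starting concept instead of a flat list of matching documents, which is the kind of structured answer the comparison above attributes to knowledge retrieval.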
From a computer science perspective, a logic framework concentrating on the fuzziness of knowledge queries has been proposed and investigated in detail. [19] Markup languages for knowledge reasoning and relevant strategies have been investigated, which may serve as possible logic reasoning foundations for text-based knowledge retrieval. [3]
From a cognitive science perspective, especially that of cognitive psychology and cognitive neuroscience, the neurobiological basis for knowledge retrieval in the human brain has been investigated, and may serve as a cognitive model for knowledge retrieval. [20] [21]
Knowledge retrieval can draw results from the following related theories and technologies: [12]
Topics listed under each entry serve as examples and do not form a complete list. Many related disciplines should be added as the field matures.