Completeness (knowledge bases)

Last updated

The term completeness as applied to knowledge bases refers to two different concepts.

Contents

Formal logic

In formal logic, a knowledge base KB is complete if there is no formula α such that KB ⊭ α and KB ⊭ ¬α.

Example of knowledge base with incomplete knowledge:

KB := { A ∨ B }

Then we have KB ⊭ A and KB ⊭ ¬A.

In some cases, a consistent knowledge base can be made complete with the closed world assumption—that is, adding all not-entailed literals as negations to the knowledge base. In the above example though, this would not work because it would make the knowledge base inconsistent:

KB' = { A ∨ B, ¬A, ¬B }

In the case where KB := { P(a), Q(a), Q(b) }, KB ⊭ P(b) and KB ⊭ ¬P(b), so, with the closed world assumption, KB' = { P(a), ¬P(b), Q(a), Q(b) }, where KB' ⊨ ¬P(b).

Data management

In data management, completeness is metaknowledge that can be asserted for parts of the KB via completeness assertions. [1] [2]

As example, a knowledge base may contain complete information for predicates R and S, while nothing is asserted for predicate T. Then consider the following queries:

 Q1 :- R(x), S(x)  Q2 :- R(x), T(x)

For Query 1, the knowledge base would return a complete answer, as only predicates that are themselves complete are intersected. For Query 2, no such conclusion could be made, as predicate T is potentially incomplete.

See also

Related Research Articles

<span class="mw-page-title-main">Cyc</span> Artificial intelligence project

Cyc is a long-term artificial intelligence project that aims to assemble a comprehensive ontology and knowledge base that spans the basic concepts and rules about how the world works. Hoping to capture common sense knowledge, Cyc focuses on implicit knowledge that other AI platforms may take for granted. This is contrasted with facts one might find somewhere on the internet or retrieve via a search engine or Wikipedia. Cyc enables semantic reasoners to perform human-like reasoning and be less "brittle" when confronted with novel situations.

Logic programming is a programming paradigm which is largely based on formal logic. Any program written in a logic programming language is a set of sentences in logical form, expressing facts and rules about some problem domain. Major logic programming language families include Prolog, answer set programming (ASP) and Datalog. In all of these languages, rules are written in the form of clauses:

Prolog is a logic programming language associated with artificial intelligence and computational linguistics.

<i>Principia Mathematica</i> Book on the foundations of mathematics

The Principia Mathematica is a three-volume work on the foundations of mathematics written by mathematician–philosophers Alfred North Whitehead and Bertrand Russell and published in 1910, 1912, and 1913. In 1925–1927, it appeared in a second edition with an important Introduction to the Second Edition, an Appendix A that replaced ✸9 and all-new Appendix B and Appendix C. PM is not to be confused with Russell's 1903 The Principles of Mathematics. PM was originally conceived as a sequel volume to Russell's 1903 Principles, but as PM states, this became an unworkable suggestion for practical and philosophical reasons: "The present work was originally intended by us to be comprised in a second volume of Principles of Mathematics... But as we advanced, it became increasingly evident that the subject is a very much larger one than we had supposed; moreover on many fundamental questions which had been left obscure and doubtful in the former work, we have now arrived at what we believe to be satisfactory solutions."

The relational model (RM) is an approach to managing data using a structure and language consistent with first-order predicate logic, first described in 1969 by English computer scientist Edgar F. Codd, where all data is represented in terms of tuples, grouped into relations. A database organized in terms of the relational model is a relational database.

Gödel's incompleteness theorems are two theorems of mathematical logic that are concerned with the limits of provability in formal axiomatic theories. These results, published by Kurt Gödel in 1931, are important both in mathematical logic and in the philosophy of mathematics. The theorems are widely, but not universally, interpreted as showing that Hilbert's program to find a complete and consistent set of axioms for all mathematics is impossible.

Datalog is a declarative logic programming language. While it is syntactically a subset of Prolog, Datalog generally uses a bottom-up rather than top-down evaluation model. This difference yields significantly different behavior and properties from Prolog. It is often used as a query language for deductive databases. In recent years, Datalog has found new application in data integration, information extraction, networking, program analysis, security, cloud computing and machine learning.

Belief revision is the process of changing beliefs to take into account a new piece of information. The logical formalization of belief revision is researched in philosophy, in databases, and in artificial intelligence for the design of rational agents.

In propositional logic, a propositional formula is a type of syntactic formula which is well formed and has a truth value. If the values of all variables in a propositional formula are given, it determines a unique truth value. A propositional formula may also be called a propositional expression, a sentence, or a sentential formula.

<span class="mw-page-title-main">Null (SQL)</span> Marker used in SQL databases to indicate a value does not exist

In SQL, null or NULL is a special marker used to indicate that a data value does not exist in the database. Introduced by the creator of the relational database model, E. F. Codd, SQL null serves to fulfil the requirement that all true relational database management systems (RDBMS) support a representation of "missing information and inapplicable information". Codd also introduced the use of the lowercase Greek omega (ω) symbol to represent null in database theory. In SQL, NULL is a reserved word used to identify this marker.

In artificial intelligence research, commonsense knowledge consists of facts about the everyday world, such as "Lemons are sour", or "Cows say moo", that all humans are expected to know. It is currently an unsolved problem in Artificial General Intelligence. The first AI program to address common sense knowledge was Advice Taker in 1959 by John McCarthy.

Answer set programming (ASP) is a form of declarative programming oriented towards difficult search problems. It is based on the stable model semantics of logic programming. In ASP, search problems are reduced to computing stable models, and answer set solvers—programs for generating stable models—are used to perform search. The computational process employed in the design of many answer set solvers is an enhancement of the DPLL algorithm and, in principle, it always terminates.

Negation as failure is a non-monotonic inference rule in logic programming, used to derive from failure to derive . Note that can be different from the statement of the logical negation of , depending on the completeness of the inference algorithm and thus also on the formal logic system.

The closed-world assumption (CWA), in a formal system of logic used for knowledge representation, is the presumption that a statement that is true is also known to be true. Therefore, conversely, what is not currently known to be true, is false. The same name also refers to a logical formalization of this assumption by Raymond Reiter. The opposite of the closed-world assumption is the open-world assumption (OWA), stating that lack of knowledge does not imply falsity. Decisions on CWA vs. OWA determine the understanding of the actual semantics of a conceptual expression with the same notations of concepts. A successful formalization of natural language semantics usually cannot avoid an explicit revelation of whether the implicit logical backgrounds are based on CWA or OWA.

Circumscription is a non-monotonic logic created by John McCarthy to formalize the common sense assumption that things are as expected unless otherwise specified. Circumscription was later used by McCarthy in an attempt to solve the frame problem. To implement circumscription in its initial formulation, McCarthy augmented first-order logic to allow the minimization of the extension of some predicates, where the extension of a predicate is the set of tuples of values the predicate is true on. This minimization is similar to the closed-world assumption that what is not known to be true is false.

In a formal system of logic used for knowledge representation, the open-world assumption is the assumption that the truth value of a statement may be true irrespective of whether or not it is known to be true. It is the opposite of the closed-world assumption, which holds that any statement that is true is also known to be true.

The concept of a stable model, or answer set, is used to define a declarative semantics for logic programs with negation as failure. This is one of several standard approaches to the meaning of negation in logic programming, along with program completion and the well-founded semantics. The stable model semantics is the basis of answer set programming.

Expected shortfall (ES) is a risk measure—a concept used in the field of financial risk measurement to evaluate the market risk or credit risk of a portfolio. The "expected shortfall at q% level" is the expected return on the portfolio in the worst of cases. ES is an alternative to value at risk that is more sensitive to the shape of the tail of the loss distribution.

In database theory, Imieliński–Lipski algebra is an extension of relational algebra onto tables with different types of null values. It is used to operate on relations with incomplete information.

In database theory and knowledge representation, the one of the certain answers is the set of answers to a given query consisting of the intersection of all the complete databases that are consistent with a given knowledge base. The notion of certain answer, investigated in database theory since the 1970s, is indeed defined in the context of open world assumption, where the given knowledge base is assumed to be incomplete.

References

  1. "Integrity = Validity + Completeness". 1989.{{cite journal}}: Cite journal requires |journal= (help)
  2. Levy, Alon (1996). "Obtaining complete answers from incomplete databases".{{cite journal}}: Cite journal requires |journal= (help)