Reification (information retrieval)

In information retrieval and natural language processing, reification is the process by which an abstract idea about a person, place, or thing is turned into an explicit data model or other object created in a programming language, such as a feature set of demographic [1] or psychographic [2] attributes, or both. By means of reification, something that was previously implicit, unexpressed, and possibly inexpressible is explicitly formulated and made available to conceptual (logical or computational) manipulation.
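As a minimal sketch of this idea, the abstract notion of "a kind of person" can be made explicit as a small data object whose fields can be manipulated computationally. The class name `PersonProfile`, its fields, and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PersonProfile:
    """Hypothetical reified profile: an implicit idea of a person made explicit."""
    age_band: str                      # demographic attribute (assumed field)
    region: str                        # demographic attribute (assumed field)
    interests: frozenset = field(default_factory=frozenset)  # psychographic attribute

    def feature_set(self) -> set:
        """Flatten the profile into a set of (name, value) features."""
        features = {("age_band", self.age_band), ("region", self.region)}
        features |= {("interest", interest) for interest in self.interests}
        return features


# Illustrative values only; they do not come from any dataset.
profile = PersonProfile("25-34", "EU", frozenset({"cycling", "jazz"}))
```

Once the profile exists as an object, its features can be compared, indexed, or passed to a model like any other data.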

The process by which a natural language statement is transformed so that the actions and events in it become quantifiable variables is semantic parsing. [3] For example, "John chased the duck furiously" can be transformed into something like

(Exists e)(chasing(e) & past_tense(e) & actor(e,John) & furiously(e) & patient(e,duck)).
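The formula above can be assembled mechanically once the event is treated as a variable. The following sketch uses two hypothetical helpers, `atom` and `exists`, that only build strings in the notation used here; it does not parse English.

```python
def atom(predicate, *args):
    """Render one atomic formula, e.g. atom("actor", "e", "John") -> "actor(e,John)"."""
    return f"{predicate}({','.join(args)})"


def exists(variables, atoms):
    """Wrap a conjunction of atoms in an existential quantifier prefix."""
    return f"(Exists {','.join(variables)})({' & '.join(atoms)})"


# The chasing event is the variable e; every other fact is a predicate about e.
formula = exists(
    ["e"],
    [
        atom("chasing", "e"),
        atom("past_tense", "e"),
        atom("actor", "e", "John"),
        atom("furiously", "e"),
        atom("patient", "e", "duck"),
    ],
)
print(formula)
```

The adverb and the tense become ordinary one-place predicates of the event, which is what keeps the result first-order.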

Another example would be "Sally said John is mean", which could be expressed as something like

(Exists u,v)(saying(u) & past_tense(u) & actor(u,Sally) & that(u,v) & is(v) & actor(v,John) & mean(v)).
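Because the reported proposition is itself a variable (v), it can be queried like any other argument. The sketch below assumes the formula is written exactly in the notation above (without the trailing period); `parse_atoms` is a hypothetical helper, not part of any library.

```python
import re

FORMULA = (
    "(Exists u,v)(saying(u) & past_tense(u) & actor(u,Sally) & "
    "that(u,v) & is(v) & actor(v,John) & mean(v))"
)


def parse_atoms(formula):
    """Split the quantified conjunction into (predicate, args) tuples."""
    body = formula[formula.index(")(") + 2 : -1]  # drop the prefix and final ")"
    atoms = []
    for part in body.split(" & "):
        match = re.fullmatch(r"(\w+)\(([^()]*)\)", part.strip())
        atoms.append((match.group(1), tuple(match.group(2).split(","))))
    return atoms


atoms = parse_atoms(FORMULA)
# The second argument of that(u,v) names the proposition Sally expressed.
content_var = next(args[1] for pred, args in atoms if pred == "that")
# Collect every atom whose first argument is that proposition.
reported = [(pred, args) for pred, args in atoms if args[0] == content_var]
```

Here `reported` holds the three atoms about v, that is, the content of what Sally said.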

Such formal meaning representations allow one to use the tools of classical first-order predicate calculus even for statements that, due to their use of tense, modality, adverbial constructions, propositional arguments (e.g. "Sally said that X"), etc., would have seemed intractable. This is an advantage because predicate calculus is better understood and simpler than the alternatives (higher-order logics, modal logics, temporal logics, etc.), and there exist better automated tools (e.g. automated theorem provers and model checkers) for manipulating it.
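One consequence can be shown with a toy check: since the adverb is just another conjunct, "John chased the duck furiously" entails "John chased the duck" by dropping a conjunct. The sketch below assumes both formulas use the same variable name for the event, so entailment reduces to a subset test; a real theorem prover would also handle variable renaming and other inference rules.

```python
# Each atom is a tuple: (predicate, argument, ...). The variable "e" is shared
# by assumption, which is what makes the simple subset test sufficient here.
premise = frozenset({
    ("chasing", "e"),
    ("past_tense", "e"),
    ("actor", "e", "John"),
    ("furiously", "e"),
    ("patient", "e", "duck"),
})

hypothesis = frozenset({
    ("chasing", "e"),
    ("past_tense", "e"),
    ("actor", "e", "John"),
    ("patient", "e", "duck"),
})


def entails(premise_atoms, hypothesis_atoms):
    """Conjunction elimination: a conjunction entails any subset of its conjuncts."""
    return hypothesis_atoms <= premise_atoms
```

The check succeeds in one direction only: the plain chasing statement does not entail that the chasing was furious.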

Meaning representations can be used for other purposes besides the application of first-order logic; one example is the automatic discovery of synonymous phrases. [4] [5]
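A toy illustration of the idea: once statements are reduced to predicates with argument slots, two predicates that tend to take the same arguments are candidates for synonymy. The sketch below uses plain Jaccard overlap on made-up data as a deliberately simplified stand-in; it is not the similarity measure used in the cited papers.

```python
from collections import defaultdict

# Hypothetical (predicate, actor, patient) triples, invented for illustration.
events = [
    ("wrote", "Austen", "Emma"),
    ("authored", "Austen", "Emma"),
    ("wrote", "Orwell", "1984"),
    ("authored", "Orwell", "1984"),
    ("read", "Sally", "Emma"),
]


def argument_sets(triples):
    """Map each predicate to the set of (actor, patient) pairs it occurs with."""
    fillers = defaultdict(set)
    for predicate, actor, patient in triples:
        fillers[predicate].add((actor, patient))
    return fillers


def jaccard(a, b):
    """Overlap of two non-empty sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)


fillers = argument_sets(events)
score_same = jaccard(fillers["wrote"], fillers["authored"])
score_diff = jaccard(fillers["wrote"], fillers["read"])
```

In this data "wrote" and "authored" share all their argument pairs, while "wrote" and "read" share none.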

The meaning representations are sometimes called quasi-logical forms, and the existential variables are sometimes treated as Skolem constants. [5]
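Treating the existential variables as Skolem constants amounts to replacing each one with a fresh name. Because the quantifiers in these forms are outermost, with no enclosing universal quantifier, constants suffice and no Skolem functions are needed. The helper `skolemize` and the `sk` naming scheme below are hypothetical.

```python
from itertools import count


def skolemize(variables, atoms, prefix="sk"):
    """Replace each existential variable with a fresh constant name."""
    fresh = count(1)
    mapping = {var: f"{prefix}{next(fresh)}" for var in variables}
    ground = [
        (pred, tuple(mapping.get(arg, arg) for arg in args))
        for pred, args in atoms
    ]
    return mapping, ground


# Atoms are (predicate, args) pairs; "e" is the existentially bound event.
mapping, ground = skolemize(
    ["e"],
    [("chasing", ("e",)), ("actor", ("e", "John")), ("patient", ("e", "duck"))],
)
```

The resulting atoms contain no variables, so they can be stored and matched as plain facts.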

Not all natural language constructs admit a uniform translation to first-order logic. See donkey sentence for examples and a discussion.

Related Research Articles

First-order logic—also called predicate logic, predicate calculus, quantificational logic—is a collection of formal systems used in mathematics, philosophy, linguistics, and computer science. First-order logic uses quantified variables over non-logical objects, and allows the use of sentences that contain variables, so that rather than propositions such as "Socrates is a man", one can have expressions in the form "there exists x such that x is Socrates and x is a man", where "there exists" is a quantifier, while x is a variable. This distinguishes it from propositional logic, which does not use quantifiers or relations; in this sense, propositional logic is the foundation of first-order logic.

Knowledge representation and reasoning is the field of artificial intelligence (AI) dedicated to representing information about the world in a form that a computer system can use to solve complex tasks such as diagnosing a medical condition or having a dialog in a natural language. Knowledge representation incorporates findings from psychology about how humans solve problems and represent knowledge, in order to design formalisms that will make complex systems easier to design and build. Knowledge representation and reasoning also incorporates findings from logic to automate various kinds of reasoning.

Prolog is a logic programming language that has its origins in artificial intelligence, automated theorem proving and computational linguistics.

Saul Kripke, American philosopher and logician (1940–2022)

Saul Aaron Kripke was an American analytic philosopher and logician. He was Distinguished Professor of Philosophy at the Graduate Center of the City University of New York and emeritus professor at Princeton University. From the 1960s, he was a central figure in a number of fields related to mathematical and modal logic, philosophy of language and mathematics, metaphysics, epistemology, and recursion theory.

In logic and proof theory, natural deduction is a kind of proof calculus in which logical reasoning is expressed by inference rules closely related to the "natural" way of reasoning. This contrasts with Hilbert-style systems, which instead use axioms as much as possible to express the logical laws of deductive reasoning.

Natural language understanding (NLU) or natural language interpretation (NLI) is a subset of natural language processing in artificial intelligence that deals with machine reading comprehension. NLU has been considered an AI-hard problem.

Metalogic is the metatheory of logic. Whereas logic studies how logical systems can be used to construct valid and sound arguments, metalogic studies the properties of logical systems. Logic concerns the truths that may be derived using a logical system; metalogic concerns the truths that may be derived about the languages and systems that are used to express truths.

Montague grammar is an approach to natural language semantics, named after American logician Richard Montague. The Montague grammar is based on mathematical logic, especially higher-order predicate logic and lambda calculus, and makes use of the notions of intensional logic, via Kripke models. Montague pioneered this approach in the 1960s and early 1970s.

Calculus in its most general sense is any method or system of calculation.

Intensional logic is an approach to predicate logic that extends first-order logic, which has quantifiers that range over the individuals of a universe (extensions), by additional quantifiers that range over terms that may have such individuals as their value (intensions). The distinction between intensional and extensional entities is parallel to the distinction between sense and reference.

Existential graph, a type of diagrammatic or visual notation for logical expressions

An existential graph is a type of diagrammatic or visual notation for logical expressions, created by Charles Sanders Peirce, who wrote on graphical logic as early as 1882, and continued to develop the method until his death in 1914. They include both a separate graphical notation for logical statements and a logical calculus, a formal system of rules of inference that can be used to derive new theorems from known ones.

In artificial intelligence, a fluent is a condition that can change over time. In logical approaches to reasoning about actions, fluents can be represented in first-order logic by predicates having an argument that depends on time. For example, the condition "the box is on the table", if it can change over time, cannot be represented by a two-argument predicate such as on(box, table); a third argument is necessary to the predicate to specify the time: a predicate such as on(box, table, t) means that the box is on the table at time t. This representation of fluents is modified in the situation calculus by using the sequence of the past actions in place of the current time.

Formal ethics is a formal logical system for describing and evaluating the "form" as opposed to the "content" of ethical principles. Formal ethics was introduced by Harry J. Gensler, in part in his 1990 logic textbook Symbolic Logic: Classical and Advanced Systems, but was more fully developed and justified in his 1996 book Formal Ethics.

The Semantics of Business Vocabulary and Business Rules (SBVR) is an adopted standard of the Object Management Group (OMG) intended to be the basis for formal and detailed natural language declarative description of a complex entity, such as a business. SBVR is intended to formalize complex compliance rules, such as operational rules for an enterprise, security policy, standard compliance, or regulatory compliance rules. Such formal vocabularies and rules can be interpreted and used by computer systems. SBVR is an integral part of the OMG's model-driven architecture (MDA).

Philosophy of logic is the area of philosophy that studies the scope and nature of logic. It investigates the philosophical problems raised by logic, such as the presuppositions often implicitly at work in theories of logic and in their application. This involves questions about how logic is to be defined and how different logical systems are connected to each other. It includes the study of the nature of the fundamental concepts used by logic and the relation of logic to other disciplines. According to a common characterisation, philosophical logic is the part of the philosophy of logic that studies the application of logical methods to philosophical problems, often in the form of extended logical systems like modal logic. But other theorists draw the distinction between the philosophy of logic and philosophical logic differently or not at all. Metalogic is closely related to the philosophy of logic as the discipline investigating the properties of formal logical systems, like consistency and completeness.

In information technology, a reasoning system is a software system that generates conclusions from available knowledge using logical techniques such as deduction and induction. Reasoning systems play an important role in the implementation of artificial intelligence and knowledge-based systems.

Formal semantics is the study of grammatical meaning in natural languages using formal tools from logic, mathematics and theoretical computer science. It is an interdisciplinary field, sometimes regarded as a subfield of both linguistics and philosophy of language. It provides accounts of what linguistic expressions mean and how their meanings are composed from the meanings of their parts. The enterprise of formal semantics can be thought of as that of reverse-engineering the semantic components of natural languages' grammars.

Dynamic Syntax (DS) is a grammar formalism and linguistic theory whose overall aim is to explain the real-time processes of language understanding and production, and describe linguistic structures as happening step-by-step over time. Under the DS approach, syntactic knowledge is understood as the ability to incrementally analyse the structure and content of spoken and written language in context and in real-time. While it posits representations similar to those used in Combinatory categorial grammars (CCG), it builds those representations left-to-right going word-by-word. Thus it differs from other syntactic models which generally abstract away from features of everyday conversation such as interruption, backtracking, and self-correction. Moreover, it differs from other approaches in that it does not postulate an independent level of syntactic structure over words.

Semantic parsing

Semantic parsing is the task of converting a natural language utterance to a logical form: a machine-understandable representation of its meaning. Semantic parsing can thus be understood as extracting the precise meaning of an utterance. Applications of semantic parsing include machine translation, question answering, ontology induction, automated reasoning, and code generation. The phrase was first used in the 1970s by Yorick Wilks as the basis for machine translation programs working with only semantic representations. Semantic parsing is one of the important tasks in computational linguistics and natural language processing.

References

  1. http://cs.iit.edu/~culotta/pubs/culotta15predicting.pdf
  2. Salvi, Francesco; Manoel Horta Ribeiro; Gallotti, Riccardo; West, Robert (2024). "On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial". arXiv: 2403.14380 [cs.CY].
  3. https://cs.stanford.edu/~pliang/papers/executable-cacm2016.pdf
  4. Dekang Lin and Patrick Pantel, "DIRT – Discovery of Inference Rules from Text", (2001) KDD01-Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
  5. Hoifung Poon and Pedro Domingos, "Unsupervised Semantic Parsing", (2009) EMNLP09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing