Semantic parameterization

Semantic parameterization is a conceptual modeling process for expressing natural language descriptions of a domain in first-order predicate logic. [1] [2] [3] The process yields a formalization of natural language sentences in Description Logic to answer the who, what, and where questions in the Inquiry-Cycle Model (ICM) developed by Colin Potts and his colleagues at the Georgia Institute of Technology. [4] The parameterization process complements the Knowledge Acquisition in autOmated Specification (KAOS) method, [5] which formalizes answers to the when, why, and how ICM questions in Temporal Logic, to complete the ICM formalization. The artifacts used in the parameterization process include a dictionary that aligns the domain lexicon with unique concepts, distinguishing between synonyms and polysemes, and several natural language patterns that aid in mapping common domain descriptions to formal specifications.

Relationship to other theories

Semantic parameterization defines a meta-model consisting of eight roles that are domain-independent and reusable. Seven of these roles correspond to Jeffrey Gruber's thematic relations [6] and to case roles in Charles Fillmore's case grammar: [7]

Meta-model Mapping to Case Frames and Thematic Relations

| Breaux's Meta-model | Fillmore's Case Roles | Thematic Relations |
|---|---|---|
| Subject | Agentive | Agent |
| Action | | |
| Object | Objective/Factitive | Theme/Patient |
| Target | Dative | Goal |
| Source | Source | Source |
| Instrument | Instrumental | Instrument |
| Purpose | | Purposive |
| Location | Locative | Location |
| | Comitative | Accompaniment |

The Inquiry-Cycle Model (ICM) was introduced to drive elicitation between engineers and stakeholders in requirements engineering. [4] The ICM consists of who, what, where, why, how and when questions. All but the when questions, which require a Temporal Logic to represent such phenomena, have been aligned with the meta-model in semantic parameterization using Description Logic (DL).

Mapping from DL roles to questions in the Inquiry-Cycle Model

| DL Role in Meta-model | ICM Question |
|---|---|
| isSubjectOf.Activity | Who performs the action? |
| isObjectOf.Activity | Upon what is the action performed? |
| isTargetOf.Activity | With whom is the transaction performed? |
| isPurposeOf.Activity | Why is the action performed? |
| isInstrumentOf.Activity | How is the action performed? |
| isLocationOf.Activity | Where is the action performed? |
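The mapping above can be read as a simple lookup table. The following is a minimal sketch in Python, not part of the published method: the names `ICM_QUESTIONS` and `question_for` are hypothetical, and the role names are copied from the table with the `.Activity` filler dropped for brevity.

```python
# Hypothetical lookup table: DL role in the meta-model -> ICM question.
# Entries are copied from the mapping table; "when" questions are absent
# because they are not aligned with the meta-model.
ICM_QUESTIONS = {
    "isSubjectOf": "Who performs the action?",
    "isObjectOf": "Upon what is the action performed?",
    "isTargetOf": "With whom is the transaction performed?",
    "isPurposeOf": "Why is the action performed?",
    "isInstrumentOf": "How is the action performed?",
    "isLocationOf": "Where is the action performed?",
}


def question_for(role):
    """Return the ICM question aligned with a DL role, or None if unmapped."""
    return ICM_QUESTIONS.get(role)


if __name__ == "__main__":
    for role, question in ICM_QUESTIONS.items():
        print(f"{role}.Activity: {question}")
```

A role that has no aligned question, such as a hypothetical temporal role, simply returns `None`.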

Introduction with Example

The semantic parameterization process is based on Description Logic, wherein the TBox is composed of words in a dictionary, including nouns, verbs, and adjectives, and the ABox is partitioned into two sets of assertions: 1) those assertions that come from words in the natural language statement, called the grounding, and 2) those assertions that are inferred by the (human) modeler, called the meta-model. Consider the following unstructured natural language statement (UNLS) (see Breaux et al. [3] for an extended discussion):

UNLS1.0
The customer_{1,1} must not share_{2,2} the access-code_{3,3} of the customer_{1,1} with someone_{4,4} who is not the provider_{5,4}.

The modeler first identifies intensional and extensional polysemes and synonyms, which are denoted by the subscripts. The first subscript is the intensional index: words that share the same first index refer to the same concept in the TBox. The second subscript is the extensional index: words that share the same second index refer to the same individual in the ABox. For example, "someone" and "the provider" carry different intensional indices (4 and 5) but the same extensional index (4), so they name different concepts but the same individual. This indexing step aligns words in the statement with concepts in the dictionary. Next, the modeler identifies concepts from the dictionary to compose the meta-model. The following table illustrates the complete DL expression that results from applying semantic parameterization.

The grounding G and meta-model M derived from UNLS1.0

| Grounding (G) | Meta-model (M) |
|---|---|
| Customer(p1) | Activity(p5) |
| ⨅ Share(p2) | ⨅ hasSubject(p5, p1) |
| ⨅ isAccessCodeOf(p3, p1) | ⨅ hasAction(p5, p2) |
| ⨅ Someone(p4) | ⨅ hasObject(p5, p3) |
| ⨅ Provider(p4) | ⨅ hasTarget(p5, p4) |
| | ⨅ isRefrainmentOf(p5, p1) |
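To make the indexing step and the resulting assertions concrete, the sketch below encodes them as plain Python data. It is an illustration under stated assumptions, not the authors' tooling: the tuple encoding and the helper names `coreferent_words`, `role_filler`, and `concepts_of` are hypothetical, while the indices and assertions are copied from UNLS1.0 and the table above.

```python
from collections import defaultdict

# Indexed words from UNLS1.0: (word, intensional index, extensional index).
TOKENS = [
    ("customer", 1, 1),
    ("share", 2, 2),
    ("access-code", 3, 3),
    ("customer", 1, 1),
    ("someone", 4, 4),
    ("provider", 5, 4),
]

# Grounding G and meta-model M as (predicate, arguments) tuples,
# copied from the table (assumed encoding, for illustration only).
GROUNDING = [
    ("Customer", ("p1",)),
    ("Share", ("p2",)),
    ("isAccessCodeOf", ("p3", "p1")),
    ("Someone", ("p4",)),
    ("Provider", ("p4",)),
]
META_MODEL = [
    ("Activity", ("p5",)),
    ("hasSubject", ("p5", "p1")),
    ("hasAction", ("p5", "p2")),
    ("hasObject", ("p5", "p3")),
    ("hasTarget", ("p5", "p4")),
    ("isRefrainmentOf", ("p5", "p1")),
]


def coreferent_words(tokens):
    """Group distinct words by extensional index (same ABox individual)."""
    groups = defaultdict(list)
    for word, _intensional, extensional in tokens:
        if word not in groups[extensional]:
            groups[extensional].append(word)
    return dict(groups)


def role_filler(meta_model, role, activity):
    """Return the individual that fills `role` for `activity`, or None."""
    for predicate, args in meta_model:
        if predicate == role and len(args) == 2 and args[0] == activity:
            return args[1]
    return None


def concepts_of(grounding, individual):
    """Return the unary concepts asserted for `individual` in the grounding."""
    return [pred for pred, args in grounding if args == (individual,)]


if __name__ == "__main__":
    # Words sharing an extensional index denote the same individual.
    print(coreferent_words(TOKENS))
    # "Who performs the action?" is answered by the subject of activity p5.
    subject = role_filler(META_MODEL, "hasSubject", "p5")
    print(subject, concepts_of(GROUNDING, subject))
```

Keeping the grounding and the meta-model in separate lists mirrors the partition of the ABox described above: the first list holds what the words of the statement assert, the second what the modeler infers.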

References

  1. Travis D. Breaux and Annie I. Antón (2004). "Deriving Semantic Models from Privacy Policies". North Carolina State University Computer Science Technical Report TR-2004-36.
  2. Travis D. Breaux and Annie I. Antón (2008). "Mining Rule Semantics to Understand Legislative Compliance". North Carolina State University Computer Science Technical Report TR-2005-31.
  3. T.D. Breaux, A.I. Anton, J. Doyle, "Semantic parameterization: a process for modeling domain descriptions", ACM Transactions on Software Engineering and Methodology, vol. 18, no. 2, Article 5, 2008.
  4. C. Potts, K. Takahashi, and A.I. Anton, "Inquiry-based requirements analysis", IEEE Software 11(2): 21–32, 1994.
  5. A. Dardenne, A. van Lamsweerde and S. Fickas, "Goal-Directed Requirements Acquisition", Science of Computer Programming v. 20, North Holland, 1993, pp. 3–50.
  6. J. Gruber, Lexical Structures in Syntax and Semantics, North Holland, New York, 1976.
  7. C. Fillmore, "The Case for Case", Universals in Linguistic Theory, Holt, Rinehart and Winston, New York, 1968.