SPL notation

SPL (Sentence Plan Language) is an abstract notation representing the semantics of a sentence in natural language. [1] In a classical Natural Language Generation (NLG) workflow, an initial text plan (hierarchically or sequentially organized factoids, often modelled in accordance with Rhetorical Structure Theory) is transformed by a sentence planner (generator) component into a sequence of sentence plans modelled in a sentence plan language. A surface generator can then transform the SPL notation into natural language sentences.
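
As a sketch of what such a plan looks like, SPL expressions in the Penman tradition are written as nested terms, each consisting of a variable, a concept from the underlying ontology, and a list of attribute-value pairs specifying semantic roles and grammatical features. The following example is adapted from the Penman literature; the concept and role names are illustrative rather than taken from any particular grammar:

    (s1 / sail
        :actor (s2 / ship)
        :tense past)

A surface generator would realize this plan as a sentence such as "The ship sailed."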

Probably the most widely used sentence plan language today (2022) is AMR (Abstract Meaning Representation; see the dedicated article for further references), though it owes part of its popularity to its application to NLP problems other than NLG, e.g., machine translation and semantic parsing. [2]
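
For comparison, a standard example from the AMR literature represents the sentence "The boy wants to go" as follows (frame labels such as want-01 come from PropBank):

    (w / want-01
       :ARG0 (b / boy)
       :ARG1 (g / go-01
                :ARG0 b))

The reused variable b expresses that the boy is both the wanter and the goer, a reentrancy the notation shares with Penman-style SPL.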

Related Research Articles

Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves.

A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.

Natural-language understanding (NLU) or natural-language interpretation (NLI) is a subtopic of natural-language processing in artificial intelligence that deals with machine reading comprehension. Natural-language understanding is considered an AI-hard problem.

In computer science, formal methods are mathematically rigorous techniques for the specification, development, analysis, and verification of software and hardware systems. The use of formal methods for software and hardware design is motivated by the expectation that, as in other engineering disciplines, performing appropriate mathematical analysis can contribute to the reliability and robustness of a design.

Natural language generation (NLG) is a software process that produces natural language output. A widely cited survey of NLG methods describes NLG as "the subfield of artificial intelligence and computational linguistics that is concerned with the construction of computer systems that can produce understandable texts in English or other human languages from some underlying non-linguistic representation of information".

Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term parsing comes from Latin pars (orationis), meaning part.

A modeling language is any artificial language that can be used to express information, knowledge, or systems in a structure that is defined by a consistent set of rules. The rules are used for interpretation of the meaning of components in the structure.

Interlingual machine translation is one of the classic approaches to machine translation. In this approach, the source language, i.e., the text to be translated, is transformed into an interlingua, i.e., an abstract language-independent representation. The target language is then generated from the interlingua. Within the rule-based machine translation paradigm, the interlingual approach is an alternative to the direct approach and the transfer approach.

Computational humor is a branch of computational linguistics and artificial intelligence which uses computers in humor research. It is a relatively new area, with the first dedicated conference organized in 1996.

What you see is what you meant (WYSIWYM) is a text-editing interaction technique that emerged from two projects at the University of Brighton. It allows users to create abstract knowledge representations, such as those required by the Semantic Web, using a natural language interface. Natural language understanding (NLU) technology is not employed; instead, natural language generation (NLG) is used in a highly interactive manner.

The RISE Editor is a free information modeling tool for information system development based on model-driven development. Functionality includes automatic interface composition, database generation and updates, data insertion, programming interface publishing, and web service generation. The modeling takes place in entity-relationship diagrams (ERD). The layout for these diagrams can be changed to Relational Database or Unified Modeling Language (UML), though the functionality stays the same.

Behavior trees are a formal, graphical modelling language used primarily in systems and software engineering. Behavior trees employ a well-defined notation to unambiguously represent the hundreds or even thousands of natural language requirements that are typically used to express the stakeholder needs for a large-scale software-integrated system.

In linguistics, realization is the process by which some kind of surface representation is derived from its underlying representation; that is, the way in which some abstract object of linguistic analysis comes to be produced in actual language. Phonemes are often said to be realized by speech sounds. The different sounds that can realize a particular phoneme are called its allophones.

Referring expression generation (REG) is the subtask of natural language generation (NLG) that has received the most scholarly attention. While NLG is concerned with the conversion of non-linguistic information into natural language, REG focuses only on the creation of referring expressions that identify specific entities, called targets.

Junction Grammar is a descriptive model of language developed during the 1960s by Dr. Eldon G. Lytle (1936–2010).

In linguistics, aggregation is a subtask of natural language generation that involves merging syntactic constituents: for instance, "John went to the shop" and "John bought an apple" can be aggregated into "John went to the shop and bought an apple." Sometimes aggregation can be done at a conceptual level.

Document structuring is a subtask of natural language generation, which involves deciding the order and grouping of sentences in a generated text. It is closely related to the content determination NLG task.

In natural language processing (NLP), a word embedding is a representation of a word. The embedding is used in text analysis. Typically, the representation is a real-valued vector that encodes the meaning of the word in such a way that words that are closer in the vector space are expected to be similar in meaning. Word embeddings can be obtained using language modeling and feature learning techniques, where words or phrases from the vocabulary are mapped to vectors of real numbers.

Abstract Meaning Representation (AMR) is a semantic representation language. AMR graphs are rooted, labeled, directed, acyclic graphs (DAGs), comprising whole sentences. They are intended to abstract away from syntactic representations, in the sense that sentences which are similar in meaning should be assigned the same AMR, even if they are not identically worded. By nature, the AMR language is biased towards English – it is not meant to function as an international auxiliary language.

Semantic parsing is the task of converting a natural language utterance to a logical form: a machine-understandable representation of its meaning. Semantic parsing can thus be understood as extracting the precise meaning of an utterance. Applications of semantic parsing include machine translation, question answering, ontology induction, automated reasoning, and code generation. The phrase was first used in the 1970s by Yorick Wilks as the basis for machine translation programs working with only semantic representations.
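
To sketch the idea, a semantic parser targeting AMR as its logical form would map the sentence "The girl read the book" to a graph along these lines (read-01 and the :ARG roles come from PropBank; other target representations, such as lambda calculus or SQL, yield different forms):

    (r / read-01
       :ARG0 (g / girl)
       :ARG1 (b / book))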

References

  1. Kasper, Robert T. (1989). "A flexible interface for linking applications to Penman's sentence generator". Proceedings of the Workshop on Speech and Natural Language (HLT '89). Philadelphia, Pennsylvania: Association for Computational Linguistics. pp. 153–158. doi:10.3115/100964.100979.
  2. Banarescu, L.; Bonial, C.; Cai, S.; Georgescu, M.; Griffitt, K.; Hermjakob, U.; Knight, K.; Koehn, P.; Palmer, M.; Schneider, N. (2013). "Abstract Meaning Representation for Sembanking". Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse.