Affix grammar

An affix grammar is a two-level grammar formalism used to describe the syntax of languages, mainly computer languages, using an approach based on how natural language is typically described. [1]

The formalism was invented in 1962 by Lambert Meertens while developing a grammar for generating English sentences. [2] Meertens also applied affix grammars to the description and composition of music, and obtained a special prize from the jury at the 1968 International Federation for Information Processing (IFIP) Congress in Edinburgh for his computer-generated string quartet, Quartet No. 1 in C major for 2 violins, viola and violoncello, based on the first non-context-free affix grammar. [3] [4] The string quartet was published in 1968, as Mathematical Centre Report MR 96. [5]

The grammatical rules of an affix grammar are those of a context-free grammar, except that certain parts of the nonterminals (the affixes) are used as arguments. If the same affix occurs more than once in a rule, its values must agree, i.e. the affix must have the same value at every occurrence. In some types of affix grammar, more complex relationships between affix values are possible.

Example

We can describe an extremely simple fragment of English in the following manner:

Sentence → Subject Predicate
Subject → Noun
Predicate → Verb Object
Object → Noun
Noun → John
Noun → Mary
Noun → children
Noun → parents
Verb → like
Verb → likes
Verb → help
Verb → helps

This context-free grammar describes simple sentences such as

John likes children
Mary helps John
children help parents
parents like John

With more nouns and verbs, and more rules to introduce other parts of speech, a large range of English sentences can be described; so this is a promising approach for describing the syntax of English.

However, the given grammar also describes sentences such as

John like children
children helps parents

These sentences are wrong: in English, subject and verb have a grammatical number, which must agree.
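The overgeneration can be checked mechanically. The following is a minimal Python sketch, not part of the original formalism: it encodes the context-free fragment above as a dictionary and enumerates every sentence it derives. The names `CFG` and `generate` are illustrative assumptions.

```python
from itertools import product

# Plain context-free fragment: each nonterminal maps to its alternatives.
CFG = {
    "Sentence": [["Subject", "Predicate"]],
    "Subject": [["Noun"]],
    "Predicate": [["Verb", "Object"]],
    "Object": [["Noun"]],
    "Noun": [["John"], ["Mary"], ["children"], ["parents"]],
    "Verb": [["like"], ["likes"], ["help"], ["helps"]],
}

def generate(symbol):
    """Yield every terminal word sequence derivable from `symbol`."""
    if symbol not in CFG:  # anything that is not a nonterminal is a word
        yield [symbol]
        return
    for alternative in CFG[symbol]:
        # Combine every expansion of each right-hand-side symbol.
        expansions = [list(generate(s)) for s in alternative]
        for parts in product(*expansions):
            yield [word for part in parts for word in part]

sentences = {" ".join(words) for words in generate("Sentence")}
print(len(sentences))                     # 4 nouns * 4 verbs * 4 nouns = 64
print("John like children" in sentences)  # True: the grammar overgenerates
```

Because nothing links the choice of noun to the choice of verb, every combination is derived, including the ones with mismatched number.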

An affix grammar can express this directly:

Sentence → Subject + number Predicate + number
Subject + number → Noun + number
Predicate + number → Verb + number Object
Object → Noun + number
Noun + singular → John
Noun + singular → Mary
Noun + plural → children
Noun + plural → parents
Verb + singular → likes
Verb + plural → like
Verb + singular → helps
Verb + plural → help
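As a rough illustration of how agreement restricts derivations, the Python sketch below encodes the rules above and enumerates the sentences they derive. It assumes a single affix variable, `number`, ranging over a finite domain; the representation (`nt`, `RULES`, `instantiate`, `generate`) is hypothetical and chosen only for brevity.

```python
from itertools import product

NUMBERS = ("singular", "plural")  # finite domain of the affix
VAR = "number"                    # the only affix variable in this fragment

def nt(name, affix=None):
    """A nonterminal: a name plus an optional affix (variable or value)."""
    return (name, affix)

RULES = [
    (nt("Sentence"), [nt("Subject", VAR), nt("Predicate", VAR)]),
    (nt("Subject", VAR), [nt("Noun", VAR)]),
    (nt("Predicate", VAR), [nt("Verb", VAR), nt("Object")]),
    (nt("Object"), [nt("Noun", VAR)]),
    (nt("Noun", "singular"), ["John"]),
    (nt("Noun", "singular"), ["Mary"]),
    (nt("Noun", "plural"), ["children"]),
    (nt("Noun", "plural"), ["parents"]),
    (nt("Verb", "singular"), ["likes"]),
    (nt("Verb", "plural"), ["like"]),
    (nt("Verb", "singular"), ["helps"]),
    (nt("Verb", "plural"), ["help"]),
]

def instantiate(symbol, number):
    """Replace the affix variable in a symbol by a concrete value."""
    if isinstance(symbol, tuple) and symbol[1] == VAR:
        return (symbol[0], number)
    return symbol

def generate(symbol):
    """Yield word sequences derivable from a variable-free symbol."""
    if isinstance(symbol, str):  # terminal word
        yield [symbol]
        return
    for lhs, rhs in RULES:
        # Within one rule application, every occurrence of the variable
        # receives the same value; that is the agreement condition.
        for number in NUMBERS:
            if instantiate(lhs, number) != symbol:
                continue
            expansions = [list(generate(instantiate(s, number))) for s in rhs]
            for parts in product(*expansions):
                yield [word for part in parts for word in part]

# Rules without the variable match once per value; the set removes repeats.
sentences = {" ".join(words) for words in generate(nt("Sentence"))}
print(len(sentences))                     # 32
print("John like children" in sentences)  # False
```

The object noun is unconstrained because the rule for Object introduces its own, independent value of `number`, so half of the 64 sentences of the context-free fragment remain.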

This grammar only describes correct English sentences, although it could be argued that

John likes John

is still incorrect and should instead read

John likes himself

This, too, can be incorporated using affixes, if the means of describing the relationships between different affix values are powerful enough. As remarked above, these means depend on the type of affix grammar chosen.

Types

In the simplest type of affix grammar, affixes can only take values from a finite domain, and affix values can only be related through agreement, as in the example. Applied in this way, affixes increase compactness of grammars, but do not add expressive power.
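The claim about expressive power can be made concrete: when the affix domain is finite, each rule can be copied once per affix value, which yields an ordinary context-free grammar. The Python sketch below shows this expansion for part of the example grammar; the string encoding of rules and the function name `expand` are assumptions made for illustration.

```python
NUMBERS = ("singular", "plural")  # finite domain of the affix variable

# Affix rules written as strings; "+number" marks the affix variable.
AFFIX_RULES = [
    ("Sentence", ["Subject+number", "Predicate+number"]),
    ("Subject+number", ["Noun+number"]),
    ("Predicate+number", ["Verb+number", "Object"]),
    ("Object", ["Noun+number"]),
    ("Noun+singular", ["John"]),
    ("Verb+plural", ["like"]),
]

def expand(rules, values):
    """Instantiate the variable with every value, giving plain CFG rules."""
    cfg = []
    for lhs, rhs in rules:
        for value in values:
            rule = (
                lhs.replace("+number", "+" + value),
                tuple(s.replace("+number", "+" + value) for s in rhs),
            )
            if rule not in cfg:  # rules without the variable yield one copy
                cfg.append(rule)
    return cfg

cfg_rules = expand(AFFIX_RULES, NUMBERS)
for lhs, rhs in cfg_rules:
    print(lhs, "->", " ".join(rhs))
```

In the result, a name such as `Subject+singular` is an atomic nonterminal with no internal structure. The four rules containing the variable become eight, which shows the trade-off: the affix version is more compact, but it describes the same language.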

Another approach is to allow affixes to take arbitrary strings as values and allow concatenations of affixes to be used in rules. The ranges of allowable values for affixes can be described with context-free grammar rules. This produces the formalism of two-level grammars, also known as Van Wijngaarden grammars or 2VW grammars. These have been successfully used to describe complicated languages, in particular, the syntax of the Algol 68 programming language. However, it turns out that, even though affix values can only be manipulated with string concatenation, this formalism is Turing complete; hence, even the most basic questions about the language described by an arbitrary 2VW grammar are undecidable in general.

Extended Affix Grammars, developed in the 1980s, are a more restricted version of the same idea. They were mainly applied to describe the grammar of natural language, e.g. English.

Another possibility is to allow the values of affixes to be computed by code written in some programming language. Two basic approaches have been used:

References

  1. Koster, Cornelis H. A. (1991). "Affix grammars for natural languages". Attribute Grammars, Applications and Systems. Berlin, Heidelberg: Springer.
  2. Koster, C. H. A. (1965). On the construction of ALGOL-procedures for generating, analysing and translating sentences in natural languages (Technical report). Mathematical Centre. MR72.
  3. Kassler, Michael (1969). "Report from Edinburgh". Perspectives of New Music. 7 (2): 175–177. doi:10.2307/832302. JSTOR 832302.
  4. Quartet No. 1 in C major for 2 violins, viola and violoncello. Score and links to mp3 sound files of a performance by the Amsterdam String Quartet (1968).
  5. Meertens, Lambert (1968). Quartet no. 1 in C major for 2 violins, viola and violoncello (Technical report). Mathematical Centre. MR96.