Bracketing (linguistics)

In linguistics, particularly linguistic morphology, bracketing is a term of art that refers to how an utterance can be represented as a hierarchical tree of constituent parts. Analysis techniques based on bracketing are used at different levels of grammar, but are particularly associated with morphologically complex words.

To give an example of bracketing in English, consider the word uneventful. This word is made of three parts: the prefix un-, the root event, and the suffix -ful. An English speaker should have no trouble parsing this word as "lacking in significant events". [1] However, imagine a foreign linguist with access to a dictionary of English roots and affixes, but only a superficial understanding of English grammar. Conceivably, he or she could understand uneventful as one of:

  1. "not eventful", with un- attached to the adjective eventful
  2. "full of unevents", with -ful attached to a hypothetical noun unevent [2]

We can represent these two understandings of uneventful with the bracketings [un-[event-ful]] and [[un-event]-ful], respectively. Here, bracketing gives the linguist a convenient technique for representing the different ways to parse the word, and for forming hypotheses about why the word is parsed the way it is by speakers of the language.
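The possible bracketings of a word can be enumerated mechanically. The following Python sketch is illustrative only: it assumes strictly binary branching, and the function names `bracketings` and `render`, as well as the hyphenated bracket notation, are choices made for this example rather than an established convention.

```python
def bracketings(morphemes):
    """Return every binary bracketing of a morpheme sequence as nested tuples."""
    if len(morphemes) == 1:
        return [morphemes[0]]
    results = []
    # Try every position at which the sequence can be cut into two constituents.
    for split in range(1, len(morphemes)):
        for left in bracketings(morphemes[:split]):
            for right in bracketings(morphemes[split:]):
                results.append((left, right))
    return results


def render(tree):
    """Write a nested tuple in square-bracket notation."""
    if isinstance(tree, str):
        return tree
    return "[" + "".join(render(child) for child in tree) + "]"


for tree in bracketings(["un-", "event", "-ful"]):
    print(render(tree))
# [un-[event-ful]]
# [[un-event]-ful]
```

A word of three morphemes has exactly two binary bracketings. The count grows quickly with the number of morphemes (it follows the Catalan numbers), which is one reason a parser needs further rules to choose among them.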

Since bracketing represents a hierarchical tree, it is associated to some extent with generative grammar. Some theories in cognitive linguistics rely on the idea that bracketing represents, with some degree of accuracy, how listeners parse complex utterances (e.g. level ordering). In computational linguistics, rules for how a program should parse a word can be represented in terms of possible bracketings.
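As a rough illustration of how parsing rules can be stated over bracketings, the Python sketch below checks whether a given bracketing is licensed by a small rule table. The table is a deliberate simplification assumed for this example: it treats un- as attaching only to adjectives and -ful as turning nouns into adjectives, and it ignores the less productive use of un- with nouns. The names `ROOTS`, `PREFIXES`, `SUFFIXES` and `category` are hypothetical.

```python
# Toy lexicon (an assumption for this sketch): N = noun, A = adjective.
ROOTS = {"event": "N"}
PREFIXES = {"un-": ("A", "A")}    # attaches to an adjective, yields an adjective
SUFFIXES = {"-ful": ("N", "A")}   # attaches to a noun, yields an adjective


def category(tree):
    """Return the category of a bracketing, or None if no rule licenses it."""
    if isinstance(tree, str):
        return ROOTS.get(tree)
    left, right = tree
    if isinstance(left, str) and left in PREFIXES:
        needs, gives = PREFIXES[left]
        return gives if category(right) == needs else None
    if isinstance(right, str) and right in SUFFIXES:
        needs, gives = SUFFIXES[right]
        return gives if category(left) == needs else None
    return None


print(category(("un-", ("event", "-ful"))))   # A
print(category((("un-", "event"), "-ful")))   # None
```

Under these assumed rules only the bracketing in which -ful attaches first is accepted, because un- cannot combine with the noun event. A fuller rule table would change which bracketings survive.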

It is not completely clear that bracketing accurately represents the structure of utterances. In particular, there are bracketing paradoxes that challenge this idea. However, there is some evidence for bracketing, such as the creation of new words via rebracketing.

A bracketing paradox arises when a morphologically complex word has more than one analysis, or bracketing (for example, one motivated by phonology and one by semantics), and the two are incompatible because their brackets do not align.

Rebracketing

Rebracketing is a type of folk etymology that can result in the creation of new words. An often-cited example in English is the set of common nicknames that begin with N although the given name does not (e.g. Ned for Edward, Nelly for Ellen). In Old English, the first-person possessive pronoun was mīn. Old English speakers commonly addressed family and close friends with "min <Name>", for example, "min Ed". Over time, the pronoun shifted from min to mi [3], and children learning the language rebracketed the utterance /mined/ from the original "min Ed" ([min][Ed]) to "mi Ned" ([mi][Ned]). A similar process is responsible for the word "nickname".
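The ambiguity that makes this rebracketing possible can be shown with a short Python sketch. It is only an illustration: the spelling "mined" stands in for the spoken sequence, the function name `segmentations` is hypothetical, and the set of pronoun forms is limited to the two mentioned above.

```python
def segmentations(utterance, known_first_words):
    """Return every two-way split whose first part is a known word form."""
    return [
        (utterance[:i], utterance[i:])
        for i in range(1, len(utterance))
        if utterance[:i] in known_first_words
    ]


# Once both "min" and "mi" exist as pronoun forms, the same sequence
# can be divided in two ways.
print(segmentations("mined", {"min", "mi"}))
# [('mi', 'ned'), ('min', 'ed')]

# With only "min" available, there is a single division.
print(segmentations("mined", {"min"}))
# [('min', 'ed')]
```

The second call reflects the situation before the pronoun shifted; the first shows why a learner who knows the form mi could assign the n to the name.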

Notes

  1. Definition from American Heritage Dictionary
  2. un- is usually attached to adjectives, but less productively can also attach to nouns (e.g. un-Cola, The Ungame)
  3. This shift was never complete. In Modern English, we still have mine in addition to my, used in different contexts.
