Language complexity


Language complexity is a topic in linguistics that can be divided into several sub-topics, such as phonological, morphological, syntactic, and semantic complexity. [1] [2] The subject is also relevant to the study of language evolution. [3]


Language complexity has been studied less than many other traditional fields of linguistics. While the consensus is turning towards recognizing complexity as a suitable research area, much of the discussion has centred on methodological choices. Some languages, particularly pidgins and creoles, are considered simpler than most other languages, but there is no direct ranking and no universal method of measurement, although several possibilities have been proposed within different schools of analysis. [4]

History

Throughout the 19th century, differential complexity was taken for granted. The classical languages Latin and Greek, as well as Sanskrit, were considered to possess qualities that the rising European national languages could achieve only through an elaboration giving them the structural and lexical complexity needed to meet the requirements of an advanced civilization. At the same time, languages described as 'primitive' were naturally considered to reflect the simplicity of their speakers. [4] [5] On the other hand, Friedrich Schlegel noted that the languages of some nations "which appear to be at the very lowest grade of intellectual culture", such as Basque, Sámi and some Native American languages, possess a striking degree of elaborateness. [4]

Equal complexity hypothesis

During the 20th century, linguists and anthropologists adopted a standpoint that rejected any nationalist ideas about the superiority of the languages of the establishment. The first known quote putting forward the idea that all languages are equally complex comes from Rulon S. Wells III in 1954, who attributes it to Charles F. Hockett. [6] While laymen never ceased to consider certain languages simple and others complex, such a view was erased from official contexts. For instance, the 1971 edition of the Guinness Book of World Records featured Saramaccan, a creole language, as "the world's least complex language". According to linguists, this claim was "not founded on any serious evidence", and it was removed from later editions. [7] Apparent complexity differences in certain areas were explained by a balancing force, whereby simplicity in one area would be compensated by complexity in another; e.g. David Crystal, 1987:

All languages have a complex grammar: there may be relative simplicity in one respect (e.g., no word-endings), but there seems always to be relative complexity in another (e.g., word-position). [8]

In 2001, creolist John McWhorter argued against the compensation hypothesis. McWhorter contended that it would be absurd if, as languages change, each had a mechanism that calibrated it according to the complexity of all the other 6,000 or so languages around the world. He underscored that linguistics has no knowledge of any such mechanism. [8] Revisiting the idea of differential complexity, McWhorter argued that it is indeed creole languages, such as Saramaccan, that are structurally "much simpler than all but very few older languages". In McWhorter's view, this does not undermine the equality of creole languages, because simpler structures convey logical meanings in the most straightforward manner, while increased language complexity is largely a matter of features that may add little to the functionality or usefulness of the language. Examples of such features are inalienable possessive marking, switch-reference marking, syntactic asymmetries between matrix and subordinate clauses, grammatical gender, and other secondary features that are typically absent in creoles. [8] McWhorter's notion that "unnatural" language contact in pidgins, creoles and other contact varieties inevitably destroys "natural" accretions in complexity perhaps represents a recapitulation of 19th-century ideas about the relationship between language contact and complexity. [9]

During the years following McWhorter's article, several books and dozens of articles were published on the topic. [10] To date, there have been research projects on language complexity, and several workshops for researchers have been organised by various universities. [1] There is still no consensus among the linguists who study the issue.

Complexity metrics

At a general level, language complexity can be characterized as the number and variety of elements, and the elaborateness of their interrelational structure. [11] [12] This general characterisation can be broken down into sub-areas:

Measuring complexity is considered difficult, and the comparison of whole natural languages a daunting task. On a more detailed level, it is possible to demonstrate that some structures are more complex than others. Phonology and morphology are areas where such comparisons have traditionally been made. For instance, linguistics has tools for assessing the phonological system of any given language. As for the study of syntactic complexity, grammatical rules have been proposed as a basis, [8] but generative frameworks, such as the minimalist program and the Simpler Syntax framework, have been less successful in defining complexity and its predictions than non-formal ways of description. [13]

Many researchers suggest that several different concepts may be needed when approaching complexity: entropy, size, description length, effective complexity, information, connectivity, irreducibility, low probability, syntactic depth etc. Research suggests that while methodological choices affect the results, even rather crude analytic tools may provide a feasible starting point for measuring grammatical complexity. [12]
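Two of the concepts listed above, entropy and description length, can be approximated very crudely on raw text. The following Python sketch is only an illustration and is not taken from any of the cited studies: the function names `char_entropy` and `compression_ratio` are hypothetical, character-level Shannon entropy stands in for "entropy", and the zlib-compressed size stands in for "description length".

```python
import math
import zlib
from collections import Counter


def char_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # H = -sum(p * log2(p)) over the observed characters
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def compression_ratio(text: str) -> float:
    """Compressed size divided by raw size (UTF-8 bytes).

    Assumption: the zlib output length is used as a rough stand-in for
    description length; lower values mean more redundancy in the text.
    """
    raw = text.encode("utf-8")
    if not raw:
        return 0.0
    return len(zlib.compress(raw, 9)) / len(raw)


if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the log"
    print(f"entropy: {char_entropy(sample):.3f} bits/char")
    print(f"compression ratio: {compression_ratio(sample):.3f}")
```

Such figures depend heavily on orthography, text length and the choice of compressor, so they are at best a starting point for comparison, for instance between translations of the same text.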

Computational tools


References

  1. Miestamo, Matti; Sinnemäki, Kaius; Karlsson, Fred, eds. (2008). Language Complexity: Typology, Contact, Change. Studies in Language Companion Series. Vol. 94. Amsterdam: John Benjamins. p. 356. doi:10.1075/slcs.94. ISBN 978-90-272-3104-8.
  2. Wurzel, Wolfgang Ullrich (2001). "Creoles, complexity, and linguistic change". Linguistic Typology. 5 (2/3): 377–387. ISSN 1430-0532.
  3. Sampson, Geoffrey; Gil, David; Trudgill, Peter, eds. (2009). Language Complexity as an Evolving Variable. Studies in the Evolution of Language. Vol. 13. Oxford; New York: Oxford University Press. ISBN 9780199545216. OCLC 227962299.
  4. Joseph, John E.; Newmeyer, Frederick J. (2012). "'All Languages Are Equally Complex': The rise and fall of a consensus". Historiographia Linguistica. 39 (2–3): 341–368. doi:10.1075/hl.39.2-3.08jos.
  5. Arkadiev, Peter; Gardani, Francesco (2020). The complexities of morphology. Oxford: Oxford University Press. pp. 1–2. ISBN 978-0-19-260551-1. OCLC 1197563838.
  6. Joseph, John E.; Newmeyer, Frederick J. (2012). "'All Languages Are Equally Complex': The rise and fall of a consensus". Historiographia Linguistica. 39 (2–3): 341–368. doi:10.1075/hl.39.2-3.08jos. ISSN 0302-5160.
  7. Arends, Jacques (2001). "Simple grammars, complex languages". Linguistic Typology. 5 (2/3): 180–182. ISSN 1430-0532.
  8. McWhorter, John H. (2001). "The world's simplest grammars are creole grammars". Linguistic Typology. 5 (2/3): 125–166. doi:10.1515/lity.2001.001. ISSN 1430-0532. S2CID 16297093.
  9. McElvenny, James (2021). "Language Complexity in Historical Perspective: The Enduring Tropes of Natural Growth and Abnormal Contact". Frontiers in Communication. 6. doi:10.3389/fcomm.2021.621712. ISSN 2297-900X.
  10. Newmeyer, Frederick J.; Preston, Laurel B., eds. (2014). Measuring Grammatical Complexity. Oxford Linguistics. Oxford; New York: Oxford University Press. doi:10.1093/acprof:oso/9780199685301.001.0001. ISBN 9780199685301. OCLC 869852316.
  11. Rescher, Nicholas (1998). Complexity: A Philosophical Overview. New Brunswick: Transaction Publishers. ISBN 978-1560003779.
  12. Sinnemäki, Kaius (2011). Language universals and linguistic complexity: Three case studies in core argument marking (Thesis). University of Helsinki. Retrieved 2016-04-28.
  13. Hawkins, John A. (2014). "Major contributions from formal linguistics to the complexity debate". In Newmeyer, Frederick J.; Preston, Laurel B. (eds.). Measuring Grammatical Complexity. Oxford: Oxford University Press. pp. 14–36. doi:10.1093/acprof:oso/9780199685301.003.0002. ISBN 9780199685301.

Bibliography