Bergen Corpus of London Teenage Language

The Bergen Corpus of London Teenage Language (COLT) is a corpus of spoken English compiled in 1993 from tape-recorded and transcribed conversations among teenagers aged 13 to 17 in schools throughout London, England. [1] [2] The corpus has been tagged for part of speech using the CLAWS 6 tagset and is one of the linguistic research projects housed at the University of Bergen in Norway. [3]

Resultant research

Linguistic analysis based on COLT has appeared in the book Trends in Teenage Talk [4] and in subsequent journal articles, [5] [6] including studies of innit, [7] cos, [8] degree modifiers, [9] general extenders, [10] taboo words, [11] and negation. [12]

References

  1. "COLT Summary" (PDF). Retrieved March 29, 2015.
  2. González-Díaz, Victorina (2008). English Adjective Comparison: A Historical Perspective. John Benjamins. p. 9.
  3. "COLT: The Bergen Corpus of London Teenage Talk". November 20, 2003. Retrieved March 29, 2015.
  4. Stenström, Anna-Brita; et al. (2002). Trends in Teenage Talk. John Benjamins.
  5. Palacios Martínez, Ignacio M. (2011). "The Language of British Teenagers. A Preliminary Study of its Main Grammatical Features" (PDF). Journal of the Spanish Association of Anglo-American Studies. Archived from the original (PDF) on March 4, 2016. Retrieved March 29, 2015.
  6. Palacios Martínez, Ignacio M. (2010). "The Expression of Negation in British Teenagers' Language: A Preliminary Study". Journal of English Linguistics.
  7. Palacios Martínez, Ignacio M. (2015). "Variation, development and pragmatic uses of innit in the language of British adults and teenagers". English Language & Linguistics.
  8. Stenström, Anna-Brita (1998). "From sentence to discourse: Cos (because) in teenage talk". Pragmatics and Beyond New Series.
  9. Paradis, Carita (2000). "It's well weird: Degree modifiers of adjectives revisited: The nineties". Language and Computers.
  10. Palacios Martínez, Ignacio M. (2011). "'I might, I might go I mean it depends on money things and stuff'. A preliminary analysis of general extenders in British teenagers' discourse". Journal of Pragmatics.
  11. Stenström, Anna-Brita (2006). "Taboo words in teenage talk: London and Madrid girls' conversations compared". Spanish in Context. 3 (1): 115–138.
  12. Palacios Martínez, Ignacio M. (April 19, 2010). "The Expression of Negation in British Teenagers' Language: A Preliminary Study". Journal of English Linguistics. 39: 4–35. doi:10.1177/0075424210366905.