Academic Word List

The Academic Word List (AWL) is a list of 570 English word families [1] that appear with high frequency across a broad range of academic texts. It is aimed at students of English as a second or foreign language who intend to enter English-medium higher education, and at their teachers. The AWL was developed by Averil Coxhead at the School of Linguistics and Applied Language Studies at Victoria University of Wellington, New Zealand. It replaced the previously widely used University Word List, developed by Xue and Nation in 1984. Words were selected for the AWL on the basis of their range (breadth of academic areas covered), frequency, and dispersion (uniformity of frequency), [2] and were divided into ten sublists in decreasing order of frequency: the first nine contain 60 word families each and the tenth contains 30. The AWL excludes words from the General Service List (the 2000 highest-frequency words in general texts). Many words in the AWL are general vocabulary not restricted to an academic domain, such as area, approach, create, similar, and occur, all found in Sublist One, and the AWL accounts for only a small percentage of the actual word occurrences in academic texts. [3]
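The selection criteria described above can be illustrated with a small sketch. It is not Coxhead's actual procedure: the function name `select_families`, the thresholds `min_total` and `min_areas`, and the toy data are all hypothetical, and dispersion is left out for brevity. The sketch only shows how frequency (total occurrences of a family) and range (number of academic areas in which it occurs) might be counted per word family.

```python
def select_families(subcorpora, families, min_total=3, min_areas=2):
    """Return (headword, frequency, range) for families passing both thresholds.

    subcorpora: dict mapping an academic area name to a list of tokens
    families:   dict mapping a headword to the set of its member word forms
    The thresholds are illustrative assumptions, not the AWL's real cut-offs.
    """
    # Map each word form back to its family headword.
    form_to_head = {form: head for head, forms in families.items() for form in forms}

    selected = []
    for head in families:
        # Count occurrences of any member of the family in each area.
        per_area = {
            area: sum(1 for tok in tokens if form_to_head.get(tok.lower()) == head)
            for area, tokens in subcorpora.items()
        }
        total = sum(per_area.values())                       # frequency
        areas = sum(1 for n in per_area.values() if n > 0)   # range
        if total >= min_total and areas >= min_areas:
            selected.append((head, total, areas))
    # Most frequent families first, mirroring the ordering of the sublists.
    return sorted(selected, key=lambda item: (-item[1], item[0]))


# Toy data, invented for illustration only.
subcorpora = {
    "arts": "the approach created a similar area".split(),
    "science": "researchers approach the area and create similar data".split(),
    "law": "the court will approach this".split(),
}
families = {
    "approach": {"approach", "approaches", "approached"},
    "create": {"create", "created", "creates"},
    "court": {"court", "courts"},
}
result = select_families(subcorpora, families, min_total=2, min_areas=2)
```

In this toy run, "court" is dropped because it occurs in only one area, which is the kind of narrowly distributed word a range criterion is meant to filter out.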

In the second decade of the twenty-first century, a revised list, called the New Academic Word List (NAWL), was developed and made public. [4] This list is available on the Simple English Wiktionary.

The Academic Vocabulary List (AVL), based on the Academic Word List and drawing on the Corpus of Contemporary American English (COCA), was developed by Gardner and Davies in 2013. Rather than relying on word families, as the AWL does, the AVL is composed of 3000 English lemmas and provides broader coverage of academic English. [5]

Related Research Articles

Corpus linguistics is the study of a language as that language is expressed in its text corpus, its body of "real world" text. Corpus linguistics proposes that a reliable analysis of a language is more feasible with corpora collected in the field—the natural context ("realia") of that language—with minimal experimental interference. The large collections of text allow linguists to run quantitative analyses on linguistic concepts, otherwise harder to quantify.

Idiolect is an individual's unique use of language, including speech. This unique usage encompasses vocabulary, grammar, and pronunciation. This differs from a dialect, a common set of linguistic characteristics shared among a group of people.

A vocabulary is a set of words, typically the set in a language or the set known to an individual. The word vocabulary originated from the Latin vocabulum, meaning "a word, name". It forms an essential component of language and communication, helping convey thoughts, ideas, emotions, and information. Vocabulary can be oral, written, or signed and can be categorized into two main types: active vocabulary and passive vocabulary. An individual's vocabulary continually evolves through various methods, including direct instruction, independent reading, and natural language exposure, but it can also shrink due to forgetting, trauma, or disease. Furthermore, vocabulary is a significant focus of study across various disciplines, like linguistics, education, psychology, and artificial intelligence. Vocabulary is not limited to single words; it also encompasses multi-word units known as collocations, idioms, and other types of phraseology. Acquiring an adequate vocabulary is one of the largest challenges in learning a second language.

English as a second or foreign language: Use of English by speakers with different native languages

English as a second or foreign language is the use of English by speakers with different native languages, typically students whose native language is not English and who are learning to speak and write it. Language education for people learning English may be known as English as a foreign language (EFL), English as a second language (ESL), English for speakers of other languages (ESOL), English as an additional language (EAL), or English as a new language (ENL). These programs, especially ESL, are usually an academic subject, course, or program designed to teach English to students who are not yet proficient in the language. While some people use the term only for learning in an English-speaking country, English can also be learned in a country where it is not the dominant language.

Brown Corpus: Data set of American English in 1961

The Brown University Standard Corpus of Present-Day American English, better known as simply the Brown Corpus, is an electronic collection of text samples of American English, the first major structured corpus of varied genres. This corpus first set the bar for the scientific study of the frequency and distribution of word categories in everyday language use. Compiled by Henry Kučera and W. Nelson Francis at Brown University, in Rhode Island, it is a general language corpus containing 500 samples of English, totaling roughly one million words, compiled from works published in the United States in 1961.

In corpus linguistics, a key word is a word that occurs in a text more often than we would expect it to occur by chance alone. Key words are identified by a statistical test that compares the word frequencies in a text against their expected frequencies derived from a much larger corpus, which acts as a reference for general language use. Keyness is then the quality a word or phrase has of being "key" in its context. Combinations of nouns with parts of speech that human readers would not likely notice, such as prepositions, time adverbs, and pronouns, can be a relevant part of keyness. Even separate pronouns can constitute keywords.
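The passage above does not name a particular statistical test. As one commonly used option, the sketch below computes a log-likelihood (G2) score for a single word from its frequency in a study text and in a reference corpus. The function name and the example counts are hypothetical, and the choice of log-likelihood is an assumption made for illustration.

```python
import math


def log_likelihood(freq_text, size_text, freq_ref, size_ref):
    """Log-likelihood (G2) keyness score for one word.

    freq_text / size_text: occurrences of the word in, and total size of, the study text
    freq_ref / size_ref:   the same two figures for the reference corpus
    """
    total_freq = freq_text + freq_ref
    total_size = size_text + size_ref
    # Expected frequencies if the word were equally likely in both corpora.
    expected_text = size_text * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    score = 0.0
    for observed, expected in ((freq_text, expected_text), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Invented counts: same relative frequency in both corpora versus a clear overuse.
same_rate = log_likelihood(10, 1000, 100, 10000)
overused = log_likelihood(50, 1000, 100, 10000)
```

A score above 3.84 corresponds to p < 0.05 on a chi-square distribution with one degree of freedom. The score itself does not indicate direction, so the relative frequencies still need to be compared to tell overuse from underuse.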

Studies that estimate and rank the most common words in English examine texts written in English. Perhaps the most comprehensive such analysis is one that was conducted against the Oxford English Corpus (OEC), a massive text corpus that is written in the English language.

The British National Corpus (BNC) is a 100-million-word text corpus of samples of written and spoken English from a wide range of sources. The corpus covers British English of the late 20th century from a wide variety of genres, with the intention that it be a representative sample of spoken and written British English of that time. It is used in corpus linguistics for analysis of corpora.

The General Service List (GSL) is a list of roughly 2,000 words published by Michael West in 1953. The words were selected to represent the most frequent words of English and were taken from a corpus of written English. The target audience was English language learners and ESL teachers. To maximize the utility of the list, some frequent words that overlapped broadly in meaning with words already on the list were omitted. In the original publication the relative frequencies of various senses of the words were also included.

The Corpus of Contemporary American English (COCA) is a one-billion-word corpus of contemporary American English. It was created by Mark Davies, retired professor of corpus linguistics at Brigham Young University (BYU).

Linguistics is the scientific study of language. Linguistics is based on a theoretical as well as a descriptive study of language and is also interlinked with the applied fields of language studies and language learning, which entails the study of specific languages. Before the 20th century, linguistics evolved in conjunction with literary study and did not employ scientific methods. Modern-day linguistics is considered a science because it entails a comprehensive, systematic, objective, and precise analysis of all aspects of language – i.e., the cognitive, the social, the cultural, the psychological, the environmental, the biological, the literary, the grammatical, the paleographical, and the structural.

A word list is a list of a language's lexicon within some given text corpus, serving the purpose of vocabulary acquisition. A lexicon sorted by frequency "provides a rational basis for making sure that learners get the best return for their vocabulary learning effort", but is mainly intended for course writers, not directly for learners. Frequency lists are also made for lexicographical purposes, serving as a sort of checklist to ensure that common words are not left out. Some major pitfalls are the corpus content, the corpus register, and the definition of "word". Although word counting is about a thousand years old, and very large analyses were still being done by hand in the mid-20th century, electronic natural language processing of large corpora such as movie subtitles has accelerated research in the field.
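As a minimal sketch of how a frequency-sorted word list might be built electronically, the following counts word forms in a text with Python's standard library. The function name and the tokenisation rule (lowercase alphabetic strings, with an optional apostrophe part) are illustrative assumptions, and they show why the definition of "word" matters: a different rule would give different counts.

```python
import re
from collections import Counter


def frequency_list(text, top_n=None):
    """Return (word, count) pairs sorted from most to least frequent."""
    # Assumed definition of "word": a run of letters, optionally with an
    # apostrophe part (e.g. "don't"); everything is lowercased first.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(tokens).most_common(top_n)


# Invented sample text for illustration.
sample = "The cat saw the dog. The dog saw nothing."
ranked = frequency_list(sample, top_n=3)
```

This counts surface forms only. Grouping forms into lemmas or word families, as the lists discussed in this article do, would need an additional mapping step.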

Paul Nation is an internationally recognized scholar in the field of linguistics and teaching methodology. As a professor in the field of applied linguistics with a specialization in pedagogical methodology, he has been able to create a language teaching framework to identify key areas of language teaching focus. Paul Nation is best known for this framework, which has been labelled The Four Strands. He has also made notable contributions through his research in the field of language acquisition that focuses on the benefits of extensive reading and repetition as well as intensive reading. Nation's numerous contributions to the linguistics research community through his published work have allowed him to share his knowledge and experience so that others may adopt and adapt it. He is credited with bringing "legitimization to second language vocabulary researches" in 1990.

Mark Davies (linguist): American linguist (born 1963)

Mark E. Davies is an American linguist. He specializes in corpus linguistics and language variation and change. He is the creator of most of the text corpora from English-Corpora.org as well as the Corpus del español and the Corpus do português. He has also created large datasets of word frequency, collocates, and n-grams data, which have been used by many large companies in the fields of technology and also language learning.

Norbert Schmitt is an American applied linguist and Emeritus Professor of Applied Linguistics at the University of Nottingham in the United Kingdom. He is known for his work on second-language vocabulary acquisition and second-language vocabulary teaching. He has published numerous books and papers on vocabulary acquisition.

The following outline is provided as an overview of and topical guide to second-language acquisition.

The New General Service List (NGSL) is a list of 2,809 words (lemmas) claimed to be a list of words that second language learners of the English language are most likely to meet in their daily lives. It was published by Dr. Charles Browne, Dr. Brent Culligan and Joseph Phillips in March 2013 and updated in 2016 and 2023.

Sketch Engine: Corpus manager and text analysis software

Sketch Engine is a corpus manager and text analysis software developed by Lexical Computing CZ s.r.o. since 2003. Its purpose is to enable people studying language behaviour to search large text collections according to complex and linguistically motivated queries. Sketch Engine gained its name after one of the key features, word sketches: one-page, automatic, corpus-derived summaries of a word's grammatical and collocational behaviour. Currently, it supports and provides corpora in 90+ languages.

Vocabulary learning is the process of acquiring building blocks in second language acquisition (Restrepo Ramos, 2015). The impact of vocabulary on proficiency in second language performance "has become […] an object of considerable interest among researchers, teachers, and materials developers". From being a "neglected aspect of language learning", vocabulary gained recognition in the literature and reclaimed its position in teaching. Educators shifted their attention from accuracy to fluency by moving from the grammar-translation method to communicative approaches to teaching. As a result, incidental vocabulary teaching and learning became one of the two major types of teaching programs, along with the deliberate approach.

Averil Jean Coxhead is a New Zealand academic, and is a full professor at Victoria University of Wellington, specialising in applied linguistics. She is known for creating the Academic Word List, which is a list of 570 English word families that appear with great frequency in a broad range of academic texts. She has also created wordlists for other uses, such as rugby terms for referees and players, and building terms for Tongan tradespeople.

References

  1. Folse, Keith S. (2004). Vocabulary myths: applying second language research to classroom teaching. University of Michigan Press.
  2. Hedgcock, John S.; Ferris, Dana R. (2018). Teaching readers of English: students, texts, and contexts (2nd ed.). Routledge.
  3. Chen, Qi; Ge, Guang-chun (2007-01-01). "A corpus-based lexical study on frequency and distribution of Coxhead's AWL word families in medical research articles (RAs)". English for Specific Purposes. 26 (4): 502–514. doi:10.1016/j.esp.2007.04.003. ISSN 0889-4906.
  4. "New Academic Word List (NAWL)".
  5. Gardner, D.; Davies, M. (2013-08-02). "A New Academic Vocabulary List". Applied Linguistics. 35 (3): 305–327. doi:10.1093/applin/amt015. ISSN 0142-6001.
  1. Coxhead, A. (2000). A New Academic Word List. TESOL Quarterly, Vol. 34, No. 2 (Summer, 2000), pp. 213-238 https://www.jstor.org/stable/3587951
  2. Coxhead, A. (2012). Academic Vocabulary, Writing and English for Academic Purposes: Perspectives from Second Language Learners. RELC Journal, 43(1), 137–145. https://doi.org/10.1177/0033688212439323
  3. Green, C. (2019). Enriching the academic wordlist and Secondary Vocabulary Lists with lexicogrammar: Toward a pattern grammar of academic vocabulary. System, 87, 102158. https://doi.org/10.1016/j.system.2019.102158
  4. Hyland, K., & Tse, P. (June 2007). Is there an "Academic Vocabulary"? TESOL Quarterly, Volume 41, Number 2, pp. 235-253.
  5. Hancioglu, N., Neufeld, S., & Eldridge, J. (2008). Through the looking glass and into the land of lexico-grammar. English for Specific Purposes 27/4, 459-479. doi:10.1016/j.esp.2008.08.001