Most common words in Spanish

Below are two estimates of the most common words in Modern Spanish. Each estimate comes from an analysis of a different text corpus. A text corpus is a large collection of samples of written and/or spoken language that has been carefully prepared for linguistic analysis. To determine which words are the most common, researchers create a database of all the words found in the corpus and categorize them based on the context in which they are used.
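The counting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by either research group: the sample sentence, the function name `count_word_forms`, and the simple regex tokenizer are all assumptions made for the example.

```python
import re
from collections import Counter

def count_word_forms(text: str) -> Counter:
    """Count each distinct word form (surface string) in the text."""
    # \w+ matches runs of word characters; for Python 3 str patterns this
    # includes accented letters such as "á" and "ñ" (and also digits and "_").
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)

# Made-up sample text standing in for a corpus.
sample = "La casa de la familia y la casa de la ciudad de todos."
counts = count_word_forms(sample)

# Print the three most frequent word forms with their counts.
for form, n in counts.most_common(3):
    print(form, n)
# la 4
# de 3
# casa 2
```

A real corpus would need more careful tokenization and some way to record context, but the ranking itself is just a sort of these counts.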

The first table lists the 100 most common word forms from the Corpus de Referencia del Español Actual (CREA), a text corpus compiled by the Real Academia Española (RAE). The RAE is Spain's official institution for documenting, planning, and standardising the Spanish language. A word form is any of the grammatical variations of a word.

The second table lists the 100 most common lemmas found in a text corpus compiled by Mark Davies and other language researchers at Brigham Young University in the United States. A lemma is the primary form of a word, the one that would appear in a dictionary. The Spanish infinitive tener ("to have") is a lemma, while tiene ("has"), which is a conjugation of tener, is a word form.
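The difference between counting word forms and counting lemmas can be shown with a small Python sketch. The mapping `FORM_TO_LEMMA` and the counts below are hypothetical and hand-written for illustration; real lemmatization relies on dictionaries or trained taggers.

```python
from collections import Counter

# Hypothetical, hand-written mapping from word form to lemma (illustration only).
FORM_TO_LEMMA = {
    "tener": "tener", "tiene": "tener", "tienen": "tener",
    "año": "año", "años": "año",
}

def lemma_counts(form_counts: dict) -> Counter:
    """Add up word-form counts under their lemma."""
    totals = Counter()
    for form, n in form_counts.items():
        # Forms missing from the mapping are treated as their own lemma.
        lemma = FORM_TO_LEMMA.get(form, form)
        totals[lemma] += n
    return totals

# Made-up word-form counts, not taken from either corpus.
form_counts = {"tiene": 5, "tienen": 2, "tener": 1, "años": 4, "año": 3}
totals = lemma_counts(form_counts)
print(totals["tener"])  # 8
print(totals["año"])    # 7
```

In a word-form list, tiene and tener would each get their own rank; in a lemma list, they share one.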

Real Academia Española

The list below comes from "1000 formas más frecuentes" (transl. "1000 most frequent word forms"), a list published by the Real Academia Española (RAE) from an analysis of more than 160 million word forms found in the Corpus de Referencia del Español Actual (transl. Reference Corpus of Current Spanish), or CREA. CREA is a computerised corpus of texts written in Spanish, and of transcripts of spoken Spanish. It includes books, magazines, and newspapers with a wide variety of content, as well as transcripts of spoken language from radio and television broadcasts and other sources. All the works in the collection are from 1975 to 2004. CREA includes samples from all Spanish-speaking countries. [1]

The list of "1000 most frequent word forms" comes from an analysis of CREA version 3.2. [2] Plurals, verb conjugations, and other inflections are ranked separately. Homonyms, however, are not distinguished from one another. CREA 3.2 was published in June 2008. [1]

Most frequent word forms out of ~160 million words
(RAE 2008)
Rank | Word form | Occurrences | Part of speech | Translation
1 | de | 9,999,518 | preposition | of; from
2 | la | 6,277,560 | article, pronoun | the; third person feminine singular pronoun
3 | que | 4,681,839 | conjunction | that, which
4 | el | 4,569,652 | article | the
5 | en | 4,234,281 | preposition | in, on
6 | y | 4,180,279 | conjunction | and
7 | a | 3,260,939 | preposition | to, at
8 | los | 2,618,657 | article, pronoun | the; third person masculine direct object
9 | se | 2,022,514 | pronoun | -self, oneself (reflexive)
10 | del | 1,857,225 | preposition | from the
11 | las | 1,686,741 | article, pronoun | the; third person feminine direct object
12 | un | 1,659,827 | article | a, an
13 | por | 1,561,904 | preposition | by, for, through
14 | con | 1,481,607 | preposition | with
15 | no | 1,465,503 | adverb | no; not
16 | una | 1,347,603 | article | a, an, one
17 | su | 1,103,617 | possessive | his/her/its/your
18 | para | 1,062,152 | preposition | for, to, in order to
19 | es | 1,019,669 | verb | is
20 | al | 951,054 | preposition | to the
21 | lo | 866,955 | article, pronoun | the; third person masculine direct object
22 | como | 773,465 | conjunction | like, as
23 | más | 661,696 | adjective | more
24 | o | 542,284 | conjunction | or
25 | pero | 450,512 | conjunction | but
26 | sus | 449,870 | possessive | his/her/its/your
27 | le | 413,241 | pronoun | third person indirect object
28 | ha | 380,339 | verb | he/she/it has [done something]; you (formal) have [done something]
29 | me | 374,368 | pronoun | me
30 | si | 327,480 | conjunction | if, whether
31 | sin | 298,383 | preposition | without
32 | sobre | 289,704 | preposition | on top of, over, about
33 | este | 285,461 | adjective | this
34 | ya | 274,177 | adverb | already; still
35 | entre | 267,493 | preposition | between
36 | cuando | 257,272 | conjunction | when
37 | todo | 247,340 | adjective | all, every
38 | esta | 238,841 | adjective | this
39 | ser | 232,924 | verb | to be
40 | son | 232,415 | verb | they are, you (pl.) are
41 | dos | 228,439 | number | two
42 | también | 227,411 | adverb | too, also, as well
43 | fue | 223,791 | verb | was
44 | había | 223,430 | verb | I/he/she/it/there was (or used to be)
45 | era | 219,933 | verb | was
46 | muy | 208,540 | adverb | very
47 | años | 203,027 | noun (masculine) | years
48 | hasta | 202,935 | preposition | until
49 | desde | 198,647 | preposition | from; since
50 | está | 194,168 | verb | is
51 | mi | 186,360 | possessive | my
52 | porque | 185,700 | conjunction | because
53 | qué | 184,956 | pronoun | what?; which?; how [adjective]
54 | sólo | 170,552 | adverb | only, solely
55 | han | 169,718 | verb | they/you (pl.) have [done something]
56 | yo | 167,684 | pronoun | I
57 | hay | 164,940 | verb | there is/are
58 | vez | 163,538 | noun (feminine) | time, instance
59 | puede | 161,219 | verb | can
60 | todos | 158,168 | adjective | all; every
61 | así | 155,645 | adverb | like that
62 | nos | 154,412 | pronoun | us
63 | ni | 153,451 | conjunction, adverb | neither; nor; not even
64 | parte | 148,750 | noun (masculine / feminine) | part; message
65 | tiene | 147,274 | verb | has
66 | él | 139,080 | pronoun (masculine) | he, it
67 | uno | 136,020 | number | one
68 | donde | 132,077 | preposition | where
69 | bien | 130,957 | adjective | fine, well
70 | tiempo | 130,896 | noun (masculine) | time; weather
71 | mismo | 130,746 | adjective | same
72 | ese | 127,976 | pronoun | that
73 | ahora | 125,661 | adverb | now
74 | cada | 124,558 | determiner | each; every
75 | e | 123,729 | conjunction | and
76 | vida | 123,491 | noun (feminine) | life
77 | otro | 121,983 | adjective | other, another
78 | después | 121,746 | preposition | after
79 | te | 120,052 | pronoun | to you, for you; yourself
80 | otros | 119,500 | pronoun | others
81 | aunque | 115,556 | conjunction | though, although, even though
82 | esa | 115,377 | adjective | that
83 | eso | 114,523 | pronoun | that
84 | hace | 114,507 | verb | he/she/it does/makes
85 | otra | 113,982 | adjective, pronoun | other; another
86 | gobierno | 113,011 | noun (masculine) | government
87 | tan | 112,471 | adverb | so
88 | durante | 112,020 | preposition | during
89 | siempre | 111,557 | adverb | always
90 | día | 110,921 | noun (masculine) | day
91 | tanto | 110,679 | adjective, adverb | so much
92 | ella | 110,620 | pronoun | she, her; it
93 | tres | 109,542 | number | three
94 | sí | 108,631 | noun, pronoun | yes, if; reflexive pronoun
95 | dijo | 108,471 | verb | said; told
96 | sido | 107,352 | past participle | been
97 | gran | 106,991 | adjective | large, great, big
98 | país | 104,568 | noun (masculine) | country
99 | según | 104,204 | preposition | as; according to
100 | menos | 103,498 | adjective | less; fewer

Mark Davies

In 2006, Mark Davies, an associate professor of linguistics at Brigham Young University, published his estimate of the 5000 most common words in Modern Spanish. To make this list, he compiled samples only from 20th-century sources, especially from the years 1970 to 2000. Most of the sources are from the 1990s. Of the 20 million words in the corpus, about one-third (~6,750,000 words) come from transcripts of spoken Spanish: conversations, interviews, lectures, sermons, press conferences, sports broadcasts, and so on. Among the written sources are novels, plays, short stories, letters, essays, newspapers, and the encyclopedia Encarta. The samples, written and spoken, come from Spain and at least 10 Latin American countries. Most of the samples were previously compiled for the Corpus del Español (2001), a 100-million-word corpus that includes works from the 13th century through the 20th. [3] [4]

The 5000 words in Davies' list are lemmas. [5] A lemma is the form of the word as it would appear in a dictionary. [6] Singular nouns and plurals, for example, are treated as the same word, as are infinitives and verb conjugations. The table below includes the top 100 words from Davies' list of 5000. [7] [8] This list distinguishes between the definite articles lo and la and the pronouns lo and la; all are ranked individually. The adjectives ese and esa are ranked together (as are este and esta), but the pronoun eso is separate. All conjugations of a verb are ranked together.
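One way to picture these grouping rules is to count by the pair (lemma, part of speech) rather than by spelling alone. The Python sketch below is only an illustration under that assumption: the tagged tokens are invented, and nothing here is taken from Davies' actual method or data.

```python
from collections import Counter

# Hypothetical tagged tokens: (word form, lemma, part of speech).
tagged = [
    ("lo", "lo", "article"),
    ("lo", "lo", "pronoun"),
    ("esa", "ese", "adjective"),
    ("ese", "ese", "adjective"),
    ("eso", "eso", "pronoun"),
    ("lo", "lo", "pronoun"),
]

# Keying on (lemma, part of speech) keeps the article "lo" apart from the
# pronoun "lo", while "ese" and "esa" fall under the single lemma "ese".
counts = Counter((lemma, pos) for _form, lemma, pos in tagged)

for (lemma, pos), n in sorted(counts.items()):
    print(lemma, pos, n)
```

With this key, the pronoun eso keeps its own entry because it has a different lemma from the adjective ese.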

A highlighted row indicates that the word was found to occur especially frequently in samples of spoken Spanish. [9]

Most frequent lemmas out of ~20 million words
(Davies 2006)
Rank | Lemma | Occurrences | Part of speech | Translation
1 | el / la | 2,037,803 | article | the
2 | de | 1,319,834 | preposition | of, from
3 | que | 662,653 | conjunction | that, which
4 | y | 562,162 | conjunction | and
5 | a | 529,899 | preposition | to, at
6 | en | 507,233 | preposition | in, on
7 | un | 434,022 | article | a, an
8 | ser | 374,194 | verb | to be
9 | se | 329,012 | pronoun | -self, oneself (reflexive)
10 | no | 257,365 | adverb | no
11 | haber | 196,962 | verb | to have
12 | por | 190,975 | preposition | by, for, through
13 | con | 184,597 | preposition | with
14 | su | 187,810 | adjective | his, her, their, your
15 | para | 126,061 | preposition | for, to, in order to
16 | como | 106,840 | conjunction | like, as
17 | estar | 106,429 | verb | to be
18 | tener | 106,642 | verb | to have
19 | le | 98,211 | pronoun | third person indirect object
20 | lo | 91,035 | article | the
21 | lo | 92,519 | pronoun | third person masculine direct object
22 | todo | 88,057 | adjective | all, every
23 | pero | 82,435 | conjunction | but, yet, except
24 | más | 92,352 | adjective | more
25 | hacer | 81,619 | verb | to do; to make
26 | o | 82,444 | conjunction | or
27 | poder | 76,738 | verb | to be able to, can
28 | decir | 79,343 | verb | to tell, say
29 | este / esta | 80,544 | adjective | this
30 | ir | 70,352 | verb | to go
31 | otro | 61,726 | adjective | other, another
32 | ese / esa | 60,989 | adjective | that
33 | la | 55,523 | pronoun | third person feminine direct object
34 | si | 53,608 | conjunction | if, whether
35 | me | 95,577 | pronoun | me
36 | ya | 46,778 | adverb | already, still
37 | ver | 45,854 | verb | to see
38 | porque | 44,500 | conjunction | because
39 | dar | 40,233 | verb | to give
40 | cuando | 39,726 | conjunction | when
41 | él | 38,597 | pronoun | he
42 | muy | 39,558 | adverb | very, really
43 | sin | 40,432 | preposition | without
44 | vez | 35,286 | noun (feminine) | time, occurrence
45 | mucho | 36,391 | adjective | much, many, a lot
46 | saber | 37,092 | verb | to know
47 | qué | 42,000 | pronoun | what?; which?; how [adjective]
48 | sobre | 35,038 | preposition | on top of, over, about
49 | mi | 45,636 | adjective | my
50 | alguno | 30,485 | adjective / pronoun | some; someone
51 | mismo | 29,569 | adjective | same
52 | yo | 54,635 | pronoun | I
53 | también | 33,348 | adverb | also
54 | hasta | 29,506 | preposition / adverb | until, up to; even
55 | año | 33,053 | noun (masculine) | year
56 | dos | 27,733 | number | two
57 | querer | 28,696 | verb | to want, love
58 | entre | 30,756 | preposition | between
59 | así | 24,832 | adverb | like that
60 | primero | 26,553 | adjective | first
61 | desde | 25,288 | preposition | from, since
62 | grande | 25,963 | adjective | large, great, big
63 | eso | 31,636 | pronoun (neuter gender) | that
64 | ni | 24,261 | conjunction | not even, neither, nor
65 | nos | 26,349 | pronoun | us
66 | llegar | 22,878 | verb | to arrive
67 | pasar | 22,466 | verb | to pass; to happen; to spend time
68 | tiempo | 22,432 | noun (masculine) | time, weather
69 | ella(s) | 24,770 | pronoun | she; (plural) them
70 | sí | 33,828 | adverb | yes
71 | día | 24,715 | noun (masculine) | day
72 | uno | 21,407 | number | one
73 | bien | 21,589 | adverb | well
74 | poco | 20,986 | adjective / adverb | little, few; a little bit
75 | deber | 22,232 | verb | should, ought to; to owe
76 | entonces | 23,548 | adverb | so, then
77 | poner | 20,330 | verb | to put (on); to get [adjective]
78 | cosa | 23,943 | noun (feminine) | thing
79 | tanto | 20,531 | adjective | much
80 | hombre | 20,292 | noun (masculine) | man, mankind, husband
81 | parecer | 19,964 | verb | to seem, to look like
82 | nuestro | 20,666 | adjective | our
83 | tan | 19,002 | adverb | such, a, too, so
84 | donde | 18,852 | conjunction | where
85 | ahora | 21,030 | adverb | now
86 | parte | 20,319 | noun (feminine) | part, portion
87 | después | 20,229 | adverb | after
88 | vida | 18,045 | noun (feminine) | life
89 | quedar | 18,152 | verb | to remain, to stay
90 | siempre | 17,689 | adverb | always
91 | creer | 21,257 | verb | to believe
92 | hablar | 19,006 | verb | to speak, to talk
93 | llevar | 17,062 | verb | to take, to carry
94 | dejar | 18,185 | verb | to let, to leave
95 | nada | 19,365 | pronoun | nothing
96 | cada | 17,155 | adjective | each, every
97 | seguir | 16,104 | verb | to follow
98 | menos | 15,527 | adjective | less, fewer
99 | nuevo | 17,381 | adjective | new
100 | encontrar | 15,556 | verb | to find

Notes

  1. "CREA". RAE.es (in Spanish). Real Academia Española. Retrieved 2017-07-13.
  2. "Corpus de Referencia del Español Actual (CREA) — Listado de frecuencias". RAE.es (in Spanish). Real Academia Española . Retrieved 2017-07-13.
  3. Davies (2006), pp. 2–3
  4. "El Corpus del Español". corpusdelespanol.org. Retrieved 2017-07-13.
  5. Davies (2006), pp. 4–6
  6. Davies (2006), p. 4
  7. Davies (2006), pp. 12–14
  8. "Top Spanish Vocabulary". Vistawide World Languages & Cultures. Retrieved 2017-07-13.
  9. Davies (2006), p. 9

References