Finnish orthography is based on the Latin script, and uses an alphabet derived from the Swedish alphabet, officially comprising twenty-nine letters but also including two additional letters found in some loanwords. The Finnish orthography strives to represent all morphemes phonologically and, roughly speaking, the sound value of each letter tends to correspond with its value in the International Phonetic Alphabet (IPA) – although some discrepancies do exist.
The following table describes how each letter in the Finnish alphabet (Finnish: suomen aakkoset) is spelled and pronounced separately. If the name of a consonant begins with a vowel (usually ⟨ä⟩ [æ]), it can be pronounced and spelled as either a monosyllabic or a bisyllabic word. [1] In practice, the names of the letters are rarely spelled out, as people usually just type the (uppercase or lowercase) glyph when they want to refer to a particular letter.
The pronunciation instructions enclosed in slashes are broad transcriptions based on the IPA system. In notes, more narrow transcriptions are enclosed in square brackets.
Glyphs | Name | Name pronunciation | Notes on usage (for more, see Finnish phonology) |
---|---|---|---|
A, a | aa | /ɑː/ | |
B, b | bee | /beː/ | Occurs in relatively new loanwords, such as banaani 'banana' and bussi 'bus'. Typically represents [b̥] or [p]. |
C, c | see | /seː/ | Occurs in unestablished loanwords, such as curry and cesium. Typically represents [k] or [s]. |
D, d | dee | /deː/ | In the present standard language, ⟨d⟩ stands for [d], but it may be realized as [d̥] or [t̪], and the pronunciation in dialects varies greatly. Natively used in Western dialects as [ɾ] and not at all in Eastern dialects. |
E, e | ee | /eː/ | The precise pronunciation tends to be between [e] and [ɛ]. |
F, f | äf, äffä | /æf/, /ˈæf.fæ/, occasionally /ef/ | Occurs in relatively new loanwords, such as asfaltti 'asphalt' or uniformu 'uniform'. Historically and in dialectal pronunciation (apart from some Western dialects), /f/ is typically replaced with /ʋ/ or medially /hʋ/ (e.g. kahvi /ˈkah.ʋi/ ← Swedish kaffe 'coffee'). Even newer loanwords may have an alternative spelling where ⟨v⟩ has replaced ⟨f⟩ (asvaltti, univormu). Note that the names of the country, language, and nationality beginning with F (Finland, Finnish, Finn) are non-native, the native ones being Suomi, suomi, and suomalainen. |
G, g | gee | /ɡeː/ | Occurs natively in the digraph ⟨ng⟩, which marks the long velar nasal [ŋː] (with no [ɡ] sound). Otherwise ⟨g⟩ only occurs in relatively new loanwords, such as gaala 'gala' and geeni 'gene'. Typically represents [ɡ̊] or [k]. |
H, h | hoo | /hoː/ | Normally a voiceless fricative, but the precise pronunciation depends on the preceding vowel; between two vowels may be pronounced as breathy-voiced [ɦ]. |
I, i | ii | /iː/ | [i] |
J, j | jii | /jiː/ | Without exception [j] (English consonant ⟨y⟩), as in German and Swedish, never fricative or affricate as in French or English. |
K, k | koo | /koː/ | |
L, l | äl, ällä | /æl/, /ˈæl.læ/, occasionally /el/ | |
M, m | äm, ämmä | /æm/, /ˈæm.mæ/, occasionally /em/ | |
N, n | än, ännä | /æn/, /ˈæn.næ/, occasionally /en/ | |
O, o | oo | /oː/ | The precise pronunciation tends to be between [o] and [ɔ]. |
P, p | pee | /peː/ | |
Q, q | kuu | /kuː/ | Mainly occurs in foreign proper names (in loanwords digraph ⟨qu⟩ has often been replaced with ⟨kv⟩). Typically represents [k], though some speakers mispronounce it as [ɡ]. |
R, r | är, ärrä | /ær/, /ˈær.ræ/, occasionally /er/ | |
S, s | äs, ässä | /æs/, /ˈæs.sæ/, occasionally /es/ | |
T, t | tee | /teː/ | The precise pronunciation tends to be dental [t̪] rather than alveolar [t]. |
U, u | uu | /uː/ | The precise pronunciation tends to be between [u] and [o]. |
V, v | vee | /ʋeː/ | Typically represents approximant [ʋ] rather than fricative [v]. |
W, w | vee, kaksois-vee, tupla-vee | /ʋeː/, /ˈkɑk.soisˌʋeː/, /ˈtup.lɑˌʋeː/ | The "double-v" may occur natively as an archaic variant of ⟨v⟩, but otherwise in unestablished loanwords and foreign proper names only. It occurs in some rare surnames such as Waltari (e.g. Mika Waltari, a world-famous author) or in some rare first names such as Werner (e.g. Werner Söderström, a well-known publisher). In collation the letter ⟨w⟩ is treated mostly like ⟨v⟩. Typically represents [ʋ]. |
X, x | äks, äksä | /æks/, /ˈæk.sæ/, occasionally /eks/ | Occurs in unestablished loanwords, such as taxi or fax, but there is often a preferred alternative where ⟨x⟩ has been replaced with digraph ⟨ks⟩ (taksi, faksi). Typically represents [ks]. |
Y, y | yy | /yː/ | The precise pronunciation tends to be between [y] and [ø]. |
Z, z | tset, tseta | /tset/, /ˈtse.tɑ/, /zet/, /ˈze.tɑ/ | Occurs in unestablished loanwords, such as zeniitti /tse.niːt.ti/ 'zenith' or pizza, but there may be an alternative spelling with ⟨ts⟩ (e.g. pitsa). Typically represents [ts] (like in German), but sometimes [dz] or [z]. |
Å, å | ruotsalainen oo | /oː/, /ˈruot.sɑˌlɑi.nen oː/ | The "Swedish ⟨o⟩", carried over from the Swedish alphabet and redundant in Finnish; retained especially for writing Finland-Swedish proper names (such as Ståhlberg). All Finnish words containing ⟨å⟩ are names; there it represents [oː] (identically to ⟨oo⟩). |
Ä, ä | ää | /æː/ | |
Ö, ö | öö | /øː/ | The precise pronunciation tends to be between [ø] and [œ]. |
In addition, ⟨w⟩ is sometimes listed separately and after ⟨v⟩, although officially it is merely a variant of the latter and can be alphabetized as ⟨v⟩. [2] Similarly, ⟨š⟩ and ⟨ž⟩ are variants of ⟨s⟩ and ⟨z⟩, but they are often overlooked, as they are only used in some relatively new loanwords and foreign names, and may be replaced with ⟨sh⟩ and ⟨zh⟩, respectively, [3] if it is technically impossible to reproduce ⟨š⟩ and ⟨ž⟩. [4] The Finnish keyboard layout on Microsoft Windows does not include ⟨š⟩ or ⟨ž⟩; thus, in practice, only highly formal sources such as official texts, encyclopedias or Helsingin Sanomat use them.
Glyphs | Name | Name pronunciation | Notes on usage (for more, see Finnish phonology) |
---|---|---|---|
Š, š | hattu-äs, hattu-ässä; suhu-äs, suhu-ässä | /ˈhat.tu.æs/, /ˈhat.tu.æs.sæ/; /ˈsu.hu.æs/, /ˈsu.hu.æs.sæ/ | The "⟨s⟩ with caron" is a rare variant of ⟨s⟩. It occurs in some relatively new loanwords, such as šakki 'chess' and šillinki 'shilling', but is often replaced with digraph ⟨sh⟩ (šampoo → shampoo) or, in more established loanwords, with plain ⟨s⟩ (sampoo). In theory it represents [ʃ], but the actual pronunciation may vary. |
Ž, ž | hattu-tset, hattu-tseta | /ˈhat.tuˌtset/, /ˈhat.tuˌtse.tɑ/ | The "⟨z⟩ with caron" is a rare variant of ⟨z⟩. It occurs in some unestablished loanwords, such as džonkki 'junk', and foreign proper names, but is often replaced with digraph ⟨zh⟩. In theory it represents [ʒ], but the actual pronunciation may vary. |
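The fallback described above, replacing ⟨š⟩ and ⟨ž⟩ with ⟨sh⟩ and ⟨zh⟩ when the caron letters cannot be reproduced, can be sketched in a few lines of Python. This is only an illustration: the function name `caron_fallback` is hypothetical, and the title-case handling of capitals (⟨Š⟩ → ⟨Sh⟩) is an assumption that would not suit all-caps text.

```python
import unicodedata

# Fallback digraphs named in the text: š -> sh, ž -> zh.
# The capital forms map to title case, which is an assumption of this sketch.
FALLBACKS = str.maketrans({"š": "sh", "ž": "zh", "Š": "Sh", "Ž": "Zh"})


def caron_fallback(text):
    """Replace š/ž with the digraphs sh/zh (hypothetical helper)."""
    # Normalize first so that "s" + combining caron is treated like "š".
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(FALLBACKS)


print(caron_fallback("šakki"))    # shakki
print(caron_fallback("džonkki"))  # dzhonkki
```

The replacement is one-way: the digraph form cannot be reliably converted back, because ⟨sh⟩ and ⟨zh⟩ may also be ordinary letter sequences.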
The main peculiarities in the Finnish alphabet are the two extra vowels ⟨ä⟩ and ⟨ö⟩ (and Swedish ⟨å⟩, which is not actually needed in Finnish). In Finnish, these extra letters are collectively called ääkköset when they need to be distinguished from the ISO basic Latin alphabet; the word is a somewhat playful modification of aakkoset, which is Finnish for "alphabet". Another informal term is skandit or skandimerkit, short for skandinaaviset merkit "Scandinavian characters" (however, the Danish and Norwegian ⟨æ⟩ and ⟨ø⟩ are usually not taken into account).
In Finnish, ⟨ä⟩, ⟨ö⟩ and ⟨y⟩ are the "front vowel" counterparts to the "back vowels" ⟨a⟩, ⟨o⟩ and ⟨u⟩; grammatical endings and suffixes that contain these letters take either the front or the back form, depending on the vowel harmony of the word. The glyphs for ⟨ä⟩ and ⟨ö⟩ are derived from the similar-looking German umlauted letters, but as with ⟨y⟩ versus ⟨u⟩, they are considered letters in their own right and thus alphabetized separately (after ⟨z⟩).
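As a rough illustration of how a suffix picks its front or back form, the following Python sketch attaches one of two suffix variants to a stem. It assumes a simple, non-compound native word and a simplified rule (any ⟨a⟩, ⟨o⟩ or ⟨u⟩ in the stem selects the back variant; otherwise the front variant is used). The function name `harmonize` is hypothetical, and stem changes, compounds and loanwords with mixed vowels are not handled.

```python
BACK_VOWELS = set("aou")


def harmonize(stem, suffix_back, suffix_front):
    """Attach the back or front variant of a suffix (simplified rule)."""
    # Assumption: one back vowel anywhere in the stem selects the back form.
    has_back_vowel = any(ch in BACK_VOWELS for ch in stem.lower())
    return stem + (suffix_back if has_back_vowel else suffix_front)


print(harmonize("talo", "ssa", "ssä"))  # talossa
print(harmonize("kylä", "ssa", "ssä"))  # kylässä
```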
The Germanic umlaut or convention of considering digraph ⟨ae⟩ equivalent to ⟨ä⟩, and ⟨oe⟩ equivalent to ⟨ö⟩ is inapplicable in Finnish. Moreover, in Finnish, both ⟨ae⟩ and ⟨oe⟩ are vowel sequences, not single letters, and they have independent meanings (e.g. haen "I seek" vs. hän "he, she").
In handwritten text, the actual form of the extra marking may vary from a pair of dots to a pair of short vertical bars, to a single horizontal bar, or to a wavy line resembling a tilde. In practice, almost any diacritic situated above the base glyph (such as á, ā or ã) would probably be interpreted as a carelessly written pair of dots (ä). However, in computerized character sets, these alternatives are incorrect. The use of ⟨y⟩ rather than ⟨ü⟩ as the front-vowel counterpart of ⟨u⟩ is carried over from Swedish, and it also avoids confusion in cursive script with ⟨ii⟩, which is common in Finnish.
In the Finnish writing system, some basic Latin letters are considered redundant, and other letters generally represent sounds that are not inherent in the Finnish language. Thus, they are not used in established Finnish words, but they may occur in newer loanwords as well as in foreign proper names, and they are included in the Finnish alphabet in order to maintain interlingual compatibility. The pronunciation of these letters varies quite a lot.
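In Finnish, words are ordered alphabetically according to the collation rules specified in the official standard SFS 4600. [2] There are a few cases where Finnish collation differs from the rules applied in English; for example, ⟨å⟩, ⟨ä⟩ and ⟨ö⟩ are regarded as distinct letters and collated after ⟨z⟩, and ⟨w⟩ is generally treated as equivalent to ⟨v⟩.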
Diacritics are never added to letters in native Finnish words (as the dots above the Finnish graphemes ⟨ä⟩ and ⟨ö⟩ are not considered diacritics). Generally, diacritics are retained in foreign-language proper names, e.g. Vilén, if possible, but when arranging words alphabetically, diacritics are usually ignored (this also applies to ⟨š⟩ and ⟨ž⟩, despite them being an officially recognized part of Finnish orthography). There are, however, some exceptions:
The standard does not specify how one should alphabetize the letter ⟨ü⟩ when used in other languages than German, but at least as regards the Estonian or Hungarian ⟨ü⟩, it seems consistent to treat it as equivalent to ⟨y⟩ (and even more so, since ⟨ü⟩ in Estonian and Hungarian is not considered a mere variant of ⟨u⟩, as it is in German). It would seem problematic, however, to apply the same principle to e.g. ⟨ü⟩ (u-diaeresis) as used in Spanish or ⟨õ⟩ (nasal vowel) as used in Portuguese, as these letters represent quite different orthographic traditions.
Other special cases:
Ligatures are alphabetized as two individual letters:
Letters and characters taken from other alphabets (e.g. Σ 'Greek capital letter sigma') or writing systems are collated after Latin letters.
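The collation behaviour described in this section can be approximated with a sort key in Python. The sketch below is not an implementation of SFS 4600: it only places ⟨å⟩, ⟨ä⟩ and ⟨ö⟩ after ⟨z⟩, folds ⟨w⟩ into ⟨v⟩, ignores other diacritics, and ranks every remaining character (including non-Latin letters, spaces and punctuation) after the Latin letters. The names `ALPHABET`, `RANK` and `finnish_sort_key` are hypothetical, and exceptions such as ⟨ü⟩ are not handled.

```python
import unicodedata

# Collation order used by this sketch; w is folded into v below.
ALPHABET = "abcdefghijklmnopqrstuvxyzåäö"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}


def _base_letter(ch):
    """Return the letter that ch is collated as (simplified)."""
    if ch in "åäö":
        return ch  # letters in their own right, not diacritic variants
    # Strip other diacritics: é -> e, š -> s, and so on.
    base = unicodedata.normalize("NFD", ch)[0]
    return "v" if base == "w" else base


def finnish_sort_key(word):
    """Build a list of ranks usable as a key for sorted()."""
    folded = unicodedata.normalize("NFC", word).casefold()
    # Anything outside the alphabet is ranked after all Latin letters.
    return [RANK.get(_base_letter(ch), len(ALPHABET)) for ch in folded]


words = ["öljy", "zeniitti", "aamu", "äiti", "Waltari", "vaara", "vyö"]
print(sorted(words, key=finnish_sort_key))
# ['aamu', 'vaara', 'Waltari', 'vyö', 'zeniitti', 'äiti', 'öljy']
```

Because the key compares letters only, words that differ solely in ⟨v⟩ versus ⟨w⟩ or in an ignored diacritic compare as equal and keep their input order.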
When writing Finnish, the foundational principle is that each letter stands for one sound and each sound is always represented by the same letter, within the bounds of a single morpheme. The most notable exception to this rule is the velar nasal [ŋ], which does not have an allotted letter.
In Finnish, both vowels and consonants may be either short or long. A short sound is written with a single letter, and a long sound is written with a double letter (digraph). It is necessary to recognize the difference between such words as tuli /ˈtu.li/ 'fire', tuuli /ˈtuː.li/ 'wind' and tulli /ˈtul.li/ 'customs'. However, long consonants are sometimes written as short consonants at morpheme boundaries (see Finnish phonology#Sandhi for this phenomenon). Thus, /ˈhɑ.kel.lɑ.vɑ/ is written as hakelava "open-box bed for wood chips" instead of the expected *hakellava, and /ˈtu.let.tæn.ne/ is tule tänne "come here" instead of *tule ttänne or *tulet tänne.
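The doubling convention can be illustrated with a small Python sketch that splits a spelling into sounds and marks every doubled letter as long with ⟨ː⟩. The function name `mark_length` is hypothetical. The output keeps the ordinary letters rather than IPA symbols, and the sketch does not model the unwritten lengthening at morpheme boundaries or the digraph ⟨ng⟩.

```python
def mark_length(word):
    """Split a spelling into sounds, marking doubled letters as long."""
    sounds = []
    i = 0
    while i < len(word):
        ch = word[i]
        if i + 1 < len(word) and word[i + 1] == ch:
            sounds.append(ch + "ː")  # double letter: one long sound
            i += 2
        else:
            sounds.append(ch)        # single letter: one short sound
            i += 1
    return sounds


for example in ("tuli", "tuuli", "tulli"):
    print(example, mark_length(example))
# tuli ['t', 'u', 'l', 'i']
# tuuli ['t', 'uː', 'l', 'i']
# tulli ['t', 'u', 'lː', 'i']
```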
In syllabification, a long consonant is always regarded as having a syllable break in the middle (as in /ˈtɑp.pɑː/), but a long vowel (or a diphthong) is regarded as a single unit that functions as the nucleus of a syllable. Either a long or short vowel may occur in a stressed as well as unstressed syllable. The phonetic quality of a vowel remains the same regardless of whether the vowel is long or short, or whether it is stressed or unstressed.
The velar nasal /ŋ/ (generally referred to as äng-äänne 'the eng sound') does not have a letter of its own. Natively, a short /ŋ/ only occurs before /k/, and it is simply written with ⟨n⟩, as in kenkä /ˈkeŋ.kæ/ 'shoe'. Since the alveolar nasal /n/ cannot occur in such a position, /ŋ/ can be seen as an allophone of /n/. However, if the /k/ is weakened (because of a phenomenon called consonant gradation that occurs when the word is inflected), the result is a long, or geminated, velar nasal /ŋː/ that is written with digraph ⟨ng⟩, as in kengät /ˈkeŋ.ŋæt/ 'shoes'. The geminated /ŋː/ is not an allophone of geminated /nː/, since minimal pairs do exist: kangas /ˈkɑŋ.ŋɑs/ 'textile' vs. kannas /ˈkɑn.nɑs/ 'isthmus'.
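The two native spellings of the velar nasal can be made explicit with a short Python sketch. It assumes native, non-compound words only, and the function name `mark_velar_nasal` is hypothetical. Only the nasal is rewritten; all other letters are left as spelled, so the output is not a full phonemic transcription.

```python
def mark_velar_nasal(word):
    """Make the velar nasal visible in a native Finnish spelling."""
    lowered = word.lower()
    # The digraph ng stands for the long velar nasal (no [g] sound);
    # n before k stands for the short velar nasal.
    return lowered.replace("ng", "ŋŋ").replace("nk", "ŋk")


for example in ("kenkä", "kengät", "kangas", "kannas"):
    print(example, "->", mark_velar_nasal(example))
# kenkä -> keŋkä
# kengät -> keŋŋät
# kangas -> kaŋŋas
# kannas -> kannas
```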
The treatment of the velar nasal in loanwords is highly inconsistent, often mixing the original spelling of the word with an applied Finnish pronunciation pattern. Englanti "England" is pronounced /ˈeŋ.lɑn.ti/ (with a short /ŋ/ but no /ɡ/), and even magneetti "magnet" is pronounced /ˈmɑŋ.neːt.ti/ (with plain ⟨g⟩ being pronounced as /ŋ/ when followed by ⟨n⟩, as in classical Latin) – cf. a more specialized term diagnoosi /di.aɡ.noː.si/ 'diagnosis', and in a word-initial position gnuu /ɡnuː/ "gnu". Following the typical Finnish pronunciation pattern, kongestio "congestion" is often pronounced /ˈkoŋ.ŋes.ti.o/, but /ˈkoŋ.ɡes.ti.o/ may also occur.
Traditionally, /b/ and /ɡ/ are not counted as Finnish phonemes, since they only appear in loanwords. However, these borrowings being relatively common, they are nowadays considered part of the educated norm. The failure to use them correctly is sometimes ridiculed, e.g. if a news reporter or a high official consistently and publicly pronounces Belgia 'Belgium' as /ˈpel.ki.a/. Even many educated speakers, however, still make no distinction between voiced and voiceless plosives in regular speech, although minimal pairs exist: /ˈbus.si/ 'bus' vs. /ˈpus.si/ 'bag', /ˈɡo.ril.lɑ/ 'gorilla' vs. /ˈko.ril.lɑ/ 'with/at a basket'.
The status of /d/ is somewhat different from /b/ and /ɡ/, since it appears in native Finnish words, too, as a regular "weak" correspondence of the voiceless /t/ (as a result of consonant gradation), and even in the infinitives of many verbs, such as syödä, "to eat". At the time when Mikael Agricola, the "father" of literary Finnish, devised a system for writing the language, this sound still had the value of the voiced dental fricative /ð/, as in English "then". Since neither Swedish nor German of that time had a separate sign for this sound, Agricola chose to mark it with ⟨d⟩ or ⟨dh⟩.
Later on, the /ð/ sound developed in a variety of ways in different Finnish dialects: it was deleted, or became a hiatus, a flap consonant, or any of ⟨r⟩, ⟨l⟩, ⟨j⟩, ⟨jj⟩. For example, historical and rare dialectal meiðän, käðen "our" (gen.), "hand" (gen.) could be:
In the middle of the 19th century, a significant portion of the Swedish-speaking upper class in Finland decided that Finnish had to be made equal in usage to Swedish. [citation needed] They even started using Finnish as their home language, even though very few of them had really mastered it. Since the historical */ð/ no longer had a common pronunciation across the different Finnish dialects and since it was usually written as ⟨d⟩, many started using the Swedish pronunciation [d], which eventually became the educated norm.
Initially, few native speakers of Finnish acquired the foreign plosive realisation of the native phoneme. Only a few decades ago it was not entirely exceptional to hear loanwords like deodorantti 'deodorant' pronounced as /teotorantti/, while native Finnish words with a /d/ were pronounced in the usual dialectal way. Nowadays, the Finnish spoken by native Swedish speakers is no longer considered paradigmatic, but as a result of their long-lasting prestige, many people, particularly in the capital district, acquired the new [d] sound. Due to the diffusion of the standard language through mass media and basic education, and due to the dialectal prestige of the capital area, the plosive [d] can now be heard in all parts of the country, at least in loanwords and in formal speech. Nowadays replacing /d/ with /t/ is considered rustic, for example /nyttarvittaisuuttatirektiiviä/ instead of /nyttarvittaisiinuuttadirektiiviä/ 'now we could use a new directive'.
In Helsinki slang, which is nowadays heard more rarely, the voiced stops are found in native words even in positions that are not the result of consonant gradation, e.g. /dallas/ 's/he walked' (← native verb root talla-), /bonjata/ 'to understand' (← Russian /ponʲiˈmatʲ/ понимать). In the Southwestern dialects of the Rauma-Eurajoki-Laitila area, /b/, /d/ and /ɡ/ are commonplace, since the voicing of nasals spread to the phonemes /p/, /t/ and /k/, making them half-voiced, e.g. /sendä/ ← sentään or /ninɡo/ ← niin kuin. They are also found in those coastal areas where Swedish influenced the speech.
Letter | Spelling name |
---|---|
A, a | Aarne |
B, b | Bertta |
C, c | Celsius |
D, d | Daavid |
E, e | Eemeli |
F, f | Faarao |
G, g | Gideon |
H, h | Heikki |
I, i | Iivari |
J, j | Jussi |
K, k | Kalle |
L, l | Lauri |
M, m | Matti |
N, n | Niilo |
O, o | Otto |
P, p | Paavo |
Q, q | Kuu |
R, r | Risto |
S, s | Sakari |
T, t | Tyyne |
U, u | Urho |
V, v | Vihtori |
W, w | Wiski |
X, x | Äksä |
Y, y | Yrjö |
Z, z | Tseta |
Å, å | Åke |
Ä, ä | Äiti |
Ö, ö | Öljy |
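As a usage illustration, the table above can be turned into a lookup for spelling a word out letter by letter. This Python sketch copies the names from the table; the identifiers `SPELLING_NAMES` and `spell_out` are hypothetical, and characters not listed in the table are passed through unchanged.

```python
# Spelling names copied from the table above, keyed by lowercase letter.
SPELLING_NAMES = {
    "a": "Aarne", "b": "Bertta", "c": "Celsius", "d": "Daavid",
    "e": "Eemeli", "f": "Faarao", "g": "Gideon", "h": "Heikki",
    "i": "Iivari", "j": "Jussi", "k": "Kalle", "l": "Lauri",
    "m": "Matti", "n": "Niilo", "o": "Otto", "p": "Paavo",
    "q": "Kuu", "r": "Risto", "s": "Sakari", "t": "Tyyne",
    "u": "Urho", "v": "Vihtori", "w": "Wiski", "x": "Äksä",
    "y": "Yrjö", "z": "Tseta", "å": "Åke", "ä": "Äiti", "ö": "Öljy",
}


def spell_out(word):
    """Return the spelling name for each letter of the word."""
    # Characters without an entry (digits, hyphens) are kept as they are.
    return [SPELLING_NAMES.get(ch, ch) for ch in word.lower()]


print(spell_out("Suomi"))  # ['Sakari', 'Urho', 'Otto', 'Matti', 'Iivari']
print(spell_out("Yö"))     # ['Yrjö', 'Öljy']
```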