English terms with diacritical marks

è: the letter e with a grave accent

English rarely uses diacritics, which are symbols indicating the modification of a letter's sound when spoken. [1] Most of the affected words are terms imported from other languages. [2] The two-dot mark (diaeresis or umlaut), the grave accent and the acute accent are the only diacritics native to Modern English, and their usage has declined except in certain publications and particular cases. [3] [4]

Proper nouns are not generally counted as English terms except when accepted into the language as an eponym – such as Geiger–Müller tube, or the English terms roentgen after Wilhelm Röntgen, and biro after László Bíró, in which case any diacritical mark is often lost.

Unlike the orthographies of continental European languages, English orthography tends to use digraphs (like "sh", "oo", and "ea") rather than diacritics to indicate more sounds than can be accommodated by the letters of the Latin alphabet. Unlike other systems (such as Spanish orthography) where the spelling indicates the pronunciation, English spelling is highly varied, and diacritics alone would be insufficient to make it reliably phonetic. (See English orthography § History.)

Types of diacritical marks

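Though limited, the following diacritical marks may be encountered in English, particularly for marking in poetry: the acute (née) and grave (changèd) accents, modifying vowels or marking stresses; the circumflex (entrepôt), indicating an omitted s; the diaeresis (naïf), preventing a diphthong; the umlaut (über), altering Germanic vowels; the cedilla (soupçon), softening c; the tittle (frō), indicating an omitted n or m; the macron (statūs), lengthening vowels; the tilde (Señor), indicating a palatalised n; and the breve (drŏll), shortening vowels. [5]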

In representing European personal names (anthroponyms) and place names (toponyms), the following are often encountered:

For a more complete list see diacritical marks.

Special characters

Some sources distinguish "diacritical marks" (marks upon standard letters in the A–Z 26-letter alphabet) from "special characters" (letters not marked but radically modified from the standard 26-letter alphabet) such as Old English and Icelandic eth (Ð, ð) and thorn (uppercase Þ, lowercase þ), and ligatures such as Latin and Anglo-Saxon Æ (minuscule: æ), and German eszett (ß; final -ß, often -ss even in German and always in Swiss Standard German). [6] [7] [8]
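The distinction drawn above between marked letters and special characters can be illustrated, roughly, with Unicode decomposition. The following is a minimal sketch using Python's standard `unicodedata` module; the function name `classify` and its three labels are assumptions made for this illustration, and Unicode's notion of decomposability will not match every source's definition of a "special character".

```python
import unicodedata


def classify(ch: str) -> str:
    """Label one character as 'plain', 'diacritic' or 'special' (hypothetical labels)."""
    if ch.isascii():
        return "plain"
    # NFD splits a precomposed letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", ch)
    if any(unicodedata.combining(c) for c in decomposed):
        return "diacritic"
    # Letters such as eth, thorn, ash and eszett have no such decomposition.
    return "special"


for ch in "eéïñðþæß":
    print(ch, classify(ch))
```

Under this test, é, ï and ñ decompose into a standard letter plus a mark, whereas ð, þ, æ and ß do not.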

The reverse of "special characters" is when foreign digraphs, such as Welsh ll in Llanelli, Dutch ij, or Croatian nj (same in Serbian and Bosnian) are simply treated as two standard A–Z characters.

Native English words

In some cases, the diacritic is not borrowed from any foreign language but is purely of English origin. The second of two vowels in a hiatus can be marked with a diaeresis (or "tréma") – as in words such as coöperative, daïs and reëlect – but its use has become less common, sometimes being replaced by the use of a hyphen. [9] The New Yorker [9] [10] [11] and MIT Technology Review under Jason Pontin have maintained such usage as house styles.

The diaeresis mark is also in rare cases used over a single vowel to show that it is pronounced separately (as in Brontë). It is often omitted in printed works because the sign is missing on modern keyboards. [3]

The acute and grave accents are occasionally used in poetry and lyrics: the acute to indicate stress overtly where it might be ambiguous (rébel vs. rebél) or nonstandard for metrical reasons (caléndar); the grave to indicate that an ordinarily silent or elided syllable is pronounced (warnèd, parlìament).

In historical versions of English

The Old English Latin alphabet began to replace the Runic alphabet in the 8th century, due to the influence of Celtic Christian missionaries to the Anglo-Saxon kingdoms. Orthography of Old English – which was entirely handwritten in its own time – was not well standardized, though it did not use all the Latin letters, and included several letters not present in the modern alphabet. When reprinted in modern times, an overdot is occasionally used with two Latin letters to differentiate sounds for the reader:

Some modern printings also apply diacritics to vowels following the rules of Old Norse normalized spelling developed in the 19th century.

In the Late Middle English period, the shape of the English letter þ (thorn), which was derived from the Runic alphabet, evolved in some handwritten and blackletter texts to resemble the Latin letter y. The þ shape survived into the era of printing presses only as far as the press of William Caxton. In later publications, thorn was represented by "y", or by ẏ (y with an overdot) to distinguish thorn from y. By the end of the Early Modern English period, thorn had been completely replaced in contemporary usage by the digraph "th" (reviving a practice from early Old English), and the overdot was no longer needed outside of printings of very old texts. The overdot is missing from the only surviving usage of a Y-shaped thorn, in the archaic stock phrase ye olde (from "þe olde", pronounced "the old", but "ye olde" is often misanalyzed and pronounced with the modern "y" sound).

Words imported from other languages

Loanwords, sometimes more precisely called borrowed words, have entered the English language from foreign languages by a process of naturalisation, or specifically anglicisation, which is carried out mostly unconsciously (a similar process occurs in all other languages). [12] During this process, there is a tendency to adapt the original word: this includes dropping accents and other diacritics (for example French hôtel and French rôle becoming "hotel" and "role" respectively in English, or French à propos, which lost both the accent and space to become English "apropos"). [13]
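The dropping of accents described above has a mechanical analogue in text processing. The sketch below is an illustration only, not a model of how naturalisation happens in a language; it assumes Python's standard `unicodedata` module, and the helper name `strip_diacritics` is hypothetical. It decomposes each letter and discards the combining marks.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return text with combining marks removed (hypothetical helper)."""
    # NFD separates each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


for word in ["hôtel", "rôle", "à propos", "soupçon"]:
    print(word, "->", strip_diacritics(word))
```

The helper turns "hôtel" into "hotel" and "rôle" into "role", but it leaves the space in "à propos" untouched, so further changes such as the closing-up seen in "apropos" are outside its scope.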

In many cases, imported words can be found in print in both their accented and unaccented versions. Since modern dictionaries are mostly descriptive and no longer prescribe outdated forms, they increasingly list unaccented forms, though some dictionaries, such as the Oxford English Dictionary, do not list the unaccented variants of particular words (e.g., soupçon).

Words that retain their accents often do so to help indicate pronunciation (e.g. frappé, naïve, soufflé), or to help distinguish them from an unaccented English word (e.g. exposé vs. expose, résumé vs. resume, rosé vs. rose). Technical terms or those associated with specific fields (especially cooking or musical terms) are less likely to lose their accents (such as the French soupçon, façade and entrée).

Some Spanish words with the Spanish letter ñ have been naturalised by substituting English ny (e.g., Spanish cañón is now usually English canyon, Spanish piñón is now usually English pinyon pine). Certain words, like piñata, jalapeño and quinceañera, are usually kept intact. In many instances the ñ is replaced with the plain letter n.

In words of German origin (e.g. doppelgänger), the letters with umlauts ä, ö, ü may be written ae, oe, ue. [14] This could be seen in many newspapers during World War II, which printed Fuehrer for Führer. However, today umlauts are usually either left out, with no e following the previous letter, or included as written in German (as in The New York Times or The Economist). Zurich is an exception since it is not a case of a "dropped umlaut", but is a genuine English exonym, used also in French (from Latin Turicum); therefore it may be seen written without the umlaut even alongside other German and Swiss names that retain the umlaut. [15] The German letter ß is usually replaced in English by "ss", as in names such as Pascal Groß (written Gross).
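The two conventions for German umlauts mentioned above (adding an e, or simply dropping the mark) can be sketched as simple character mappings. This is a minimal illustration in Python; the table names `WITH_E` and `DROPPED` and the function `anglicise_german` are hypothetical, and the sketch does not handle all-capitals words or exonyms such as Zurich.

```python
import unicodedata

# Older newspaper convention: write the umlaut as a following e.
WITH_E = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})

# Common modern convention: drop the umlaut with no e added.
DROPPED = str.maketrans({
    "ä": "a", "ö": "o", "ü": "u",
    "Ä": "A", "Ö": "O", "Ü": "U",
    "ß": "ss",
})


def anglicise_german(text: str, add_e: bool = True) -> str:
    """Respell umlauted letters and eszett using one of the two conventions."""
    # NFC ensures each umlauted letter is a single precomposed character.
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(WITH_E if add_e else DROPPED)


print(anglicise_german("Führer"))                     # Fuehrer
print(anglicise_german("doppelgänger", add_e=False))  # doppelganger
print(anglicise_german("Groß"))                       # Gross
```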

Accent-addition and accent-removal

As words are naturalized into English, sometimes diacritics are added to imported words that originally did not have any, often to distinguish them from common English words or to otherwise assist in proper pronunciation. In the cases of maté from Spanish mate (/ˈmɑːteɪ/; Spanish: [ˈmate]), animé from Japanese anime, and latté or even lattè from Italian latte (/ˈlɑːteɪ/; Italian pronunciation: [ˈlatte]), an accent on the final e indicates that the word is pronounced with /eɪ/ at the end, rather than the e being silent. Examples of a partial removal include resumé (from the French résumé) and haček (from the Czech háček) because of the change in pronunciation of the initial vowels. Complete naturalization stripping all diacritics also has occurred, in words such as canyon, from the Spanish cañón. For accurate readings, some speech writers use diacritics to differentiate homographs, such as lēad (pronounced /liːd/) and lĕad (pronounced /lɛd/). In poetic usage are adjectives such as learnèd and belovèd, which are pronounced with two and three syllables respectively, unlike the past participles learned and beloved, which are each pronounced with one fewer syllable.

Regional differences

Canada

In Canadian English, words of French origin retain their original orthography more often than in other English-speaking countries, as with the é (e with acute) in café, Montréal, née, Québec, and résumé. This is due to the large influence of French, which is one of Canada's two official languages at the federal government level as well as at the provincial level in New Brunswick and Manitoba, and the majority and sole official language in Québec.

New Zealand

In New Zealand, loanwords of Māori origin typically include the macron that is used in current written Māori to indicate vowel length. This practice has become increasingly widespread since 2000, with both of the main newspaper chains adopting macrons in their print and online editions in May 2018. [16] [17] This shift primarily reflects changing social attitudes as part of the ongoing Māori Renaissance, and is sometimes followed in English outside New Zealand when writing about New Zealand topics. In some areas, such as the Waikato, the macron is replaced by a double vowel (e.g. "Maaori" instead of "Māori"), typically where local Māori dialects prefer a double vowel. [18] A diaeresis has sometimes been used ("Mäori") in place of a macron where the technical capacity to display a macron is limited. [19]
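The two alternatives to the macron described above, the double vowel and the diaeresis fallback, amount to character substitutions. The following is a small Python sketch of both; the names `DOUBLE_VOWEL`, `DIAERESIS`, `TABLES` and `replace_macrons` are hypothetical, and the sketch makes no attempt to decide which convention a given region or publication actually prefers.

```python
import unicodedata

# Double-vowel convention, e.g. "Maaori".
DOUBLE_VOWEL = str.maketrans({
    "ā": "aa", "ē": "ee", "ī": "ii", "ō": "oo", "ū": "uu",
    "Ā": "Aa", "Ē": "Ee", "Ī": "Ii", "Ō": "Oo", "Ū": "Uu",
})

# Diaeresis fallback for systems that cannot display a macron, e.g. "Mäori".
DIAERESIS = str.maketrans({
    "ā": "ä", "ē": "ë", "ī": "ï", "ō": "ö", "ū": "ü",
    "Ā": "Ä", "Ē": "Ë", "Ī": "Ï", "Ō": "Ö", "Ū": "Ü",
})

TABLES = {"double": DOUBLE_VOWEL, "diaeresis": DIAERESIS}


def replace_macrons(text: str, style: str = "double") -> str:
    """Respell macron vowels; style is 'double' or 'diaeresis'."""
    # NFC ensures each macron vowel is a single precomposed character.
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(TABLES[style])


print(replace_macrons("te reo Māori"))            # te reo Maaori
print(replace_macrons("Māori", style="diaeresis"))  # Mäori
```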

Regional dialects

Diacritics have been employed in the orthographies of some regional dialects in England.

Names with diacritics

Diacritics are used in the names of some English-speaking people:

Typographical limitations

Early metal-type printing quickly ran into problems, not only with simple diacritical marks for English and accents for French and German, but also with musical notation (for sheet-music printing) and the Greek and Hebrew alphabets (for Bible printing). [22] However, problems with the representation of diacritical marks continued even in scholarly publishing and dissertations up to the word-processor era. [23] Mechanical typewriter keyboards manufactured for English-speaking countries seldom include diacritics.

The first generation of word processors also had character set limitations, [24] and confusion due to typesetting convention was exacerbated in the character-coded environment by the limitations of the ASCII character set. [25]

Notes

  1. háč is cognate to hook, and -ek is a diminutive suffix which can also be seen in the Polish word ogonek.
  2. actually, a simultaneous [ʒ] and trilled [r]; devoiced to [r̝̊] in contact with a voiceless consonant. See Czech phonology for more details.

References

  1. Ambrose, Gavin; Harris, Paul (2007). The Fundamentals of Typography. AVA. p. 92. ISBN 9782940373451. OCLC 842600469. Diacritical marks are a range of accents and other symbols, which indicate that the sound of a letter is modified during pronunciation. These are rare in English but relatively common in other languages.
  2. Garner, Bryan A. (2002). The Oxford Dictionary of American Usage and Style. p. 100. Diacritical Marks, also known as 'diacritics', are orthographical characters that indicate a special phonetic quality for a given character. They occur mostly in foreign languages. But in English a fair number of imported terms have diacritical marks.
  3. Burchfield, R.W. (1996). Fowler's Modern English Usage (3 ed.). Oxford University Press. p. 210. ISBN 0-19-869126-2.
  4. On Diacritics and Archaïsm. Flakery.org, June 18, 2006.
  5. Lennard, John (2006). The Poetry Handbook. p. 57. Though limited in English the following may be encountered: acute (née) and grave (changèd) accents, modifying vowels or marking stresses; the circumflex (entrepôt), indicating omitted s; the diaeresis (naïf), preventing a diphthong, or umlaut (über), altering Germanic vowels; the cedilla (soupçon), softening c; the tittle (frō [sic]), indicating omitted n or m, or macron (statūs), lengthening vowels; the tilde (Señor), indicating palatalised n; and the breve (drŏll = 'drol', not 'drowle'), shortening vowels....
  6. Cheng, Karen (2006). Designing Type. p. 212. The eszett (also spelled esszett or referred to as a 'sharp s') is not a diacritic, but a ligature that occurs only in the German language. In general, the eszett signifies an 'ss' letter combination. The use of the eszett has declined significantly over ...
  7. "Diacritics & Special Characters". University of North Carolina. The following diacritics and special characters display: Diacritics: acute Á; circumflex Â; grave À; tilde Ã; umlaut Ä. Special characters: thorn, lowercase þ; thorn, uppercase Þ.
  8. Korpela, Jukka K. (2006). Unicode Explained. p. 195. Many other scripts use ligatures far more often. Ligatures as discussed here should not be confused with characters that originate from ligatures. For example, capital Latin letter "ae" ae (U+00E6) is an independent letter in Norwegian and ..
  9. diaeresis: December 9, 1998. The Mavens' Word of the Day. Random House.
  10. Umlauts in English?. General Questions. Straight Dope Message Board.
  11. Norris, Mary (2012-04-26). "The Curse of the Diaeresis". The New Yorker. The special tool we use here at The New Yorker for punching out the two dots that we then center carefully over the second vowel in such words as "naïve" and "Laocoön" will be getting a workout this year, as the Democrats coöperate to reëlect the President.
  12. Davidson, Misty (11 April 2021). "Borrowed Words–How English Borrows From Other Languages". Common Ground International Language Services. Retrieved 2 March 2023.
  13. Garner, Bryan A. (2009). Garner's Modern American Usage. p. 248. Sometimes they survive indefinitely, but often they fall into disuse as terms are fully naturalised. Nobody today, for example, writes hôtel or rôle.
  14. Knapp, Robbin D. (2005). German English Words: A Popular Dictionary of German Words Used in English. p. 108. When German words with umlauts are assimilated into the English language, they sometimes keep their umlauts (e.g., doppelgänger, Flügelhorn, föhn, Der Freischütz, führer, jäger, kümmel, Künstlerroman, schweizerkäse, über-), but often are ...
  15. Bewes, Diccon (2012). Swiss Watching. In English, the most daring thing we do now is leave the umlaut off Zürich; not that any British ear would hear the difference anyway. For other official names, such as the houses of parliament, I have given only the German version, as it's the one used most often.
  16. Crewdson, Patrick (11 September 2017). "Why Stuff is introducing macrons for te reo Māori words". Stuff.co.nz. Retrieved 29 May 2018.
  17. "Official language to receive our best efforts". New Zealand Herald. 9 May 2018. Archived from the original on 7 April 2020. Retrieved 29 May 2018.
  18. "Use of the double vowel in te reo Maaori at CM Health". Te Whatu Ora Counties Manukau. 20 June 2023. Retrieved 20 July 2023.
  19. Keane, Basil (11 March 2010). "Mātauranga hangarau – information technology - Māori language on the internet". Te Ara – Encyclopedia of New Zealand. Retrieved 29 May 2018.
  20. Gibson, Alexander Craig (1869). The Folk-speech of Cumberland and Some Districts Adjacent: Being Short Stories and Rhymes in the Dialects of the West Border Counties. J.R. Smith.
  21. Peacock, Edward (1889). A Glossary of Words Used in the Wapentakes of Manley and Corringham, Lincolnshire Volumes 1-2. English dialect society.
  22. Eliot, Simon; Rose, Jonathan (2011). A Companion to the History of the Book. p. 210. Within a short time, pages in metal type were combined with woodcut illustrations, later to be followed by metal engravings. Hebrew and Greek, with their vowel points and accents, and music posed problems of vertical as well as horizontal ..
  23. "(Unknown)". Scholarly Publishing. ? (?): 335. 1982. ... after printed copies of the dissertation – printed by the traditional letterpress process, from metal type – had been deposited in ... The original languages often required diacritical marks not used in English or an alphabet other than the Roman.
  24. Sassoon, Rosemary (1993). Computers and Typography. p. 59. character set limitations
  25. Bunke, Horst; Shen-pei Wang, Patrick (1997). Handbook of character recognition and document image analysis. p. 276. Confusion due to typesetting convention is exacerbated in the character coded environment due to the unfortunate limitations of the ubiquitous ASCII character set and the lack of a single widely accepted international standard for representation of characters with diacritics