Y with diaeresis | |
---|---|
Ÿ ÿ | |
Development | Y y |
ÿ is a Latin script character composed of the letter Y and the diaeresis diacritical mark. It occurs in French as a variant of ⟨ ï ⟩ in a few proper nouns, as in the name of the Parisian suburb of L'Haÿ-les-Roses [la.ileʁoz] and in the surname of the house of Croÿ [kʁu.i]. [1] It occurs in a few Hungarian names as well, such as Lajos Méhelÿ and Margit Danÿ.
Because a diaeresis is never used on the first letter of a word, and all-caps text typically omitted accents, the designers of early computer character sets such as CP437 and ISO 8859-1 assumed there was no need for an uppercase ⟨Ÿ⟩. However, much software assumes that converting text from lowercase to uppercase and back again is lossless, so ⟨Ÿ⟩ was added to many later character sets, including CP1252, ISO 8859-15, and Unicode. The same thing happened to a more prominent character, the German ß.
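The round-trip concern can be illustrated with a minimal Python sketch. It relies only on Python's built-in case mapping and codecs; the codec names are Python's aliases for the character sets named above, and the helper `can_encode` is a hypothetical name introduced here for illustration.

```python
# Sketch: case round-tripping for ÿ, and which character sets can hold Ÿ.
lower_y = "\u00ff"             # ÿ
upper_y = lower_y.upper()      # Ÿ, which Unicode places at U+0178

# Lowercase -> uppercase -> lowercase returns the original character.
round_trip_y = upper_y.lower() == lower_y


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of `text` exists in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# The two older sets lack Ÿ; the two later ones include it.
support = {
    codec: can_encode(upper_y, codec)
    for codec in ("cp437", "latin-1", "cp1252", "iso8859-15")
}

# For comparison: Python's default mapping upper-cases ß to "SS",
# so the same round trip does not give ß back.
sharp_s_round_trip = "\u00df".upper().lower()

print(hex(ord(upper_y)), round_trip_y, support, sharp_s_round_trip)
```

In a character set without ⟨Ÿ⟩, the uppercase step either fails or has to substitute another character, which is the lossy behaviour the later sets avoid.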
IPA uses ⟨ÿ⟩ to transcribe the close central compressed vowel, a type of vowel sound used in some spoken languages.
The character has also found use as a metal umlaut.
The lowercase ÿ has the Unicode code point U+00FF (decimal 255), which matches the byte value 0xFF in ISO 8859-1. As a result, it often appears when binary files are opened as text files.
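A short Python sketch of this effect follows. The byte string `data` is made-up sample content, not taken from any real file format.

```python
# Sketch: bytes of value 0xFF show up as ÿ when decoded as single-byte text.
data = bytes([0xFF, 0xFF, 0x00, 0x41, 0xFF])   # hypothetical binary content

as_latin1 = data.decode("latin-1")   # ISO 8859-1 maps byte 0xFF to ÿ
as_cp1252 = data.decode("cp1252")    # CP1252 also maps byte 0xFF to ÿ

# The same bytes are not valid UTF-8, because 0xFF never occurs in UTF-8.
try:
    data.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(repr(as_latin1), ord("\u00ff"), valid_utf8)
```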
A diacritic is a glyph added to a letter or to a basic glyph. The term derives from the Ancient Greek διακριτικός, from διακρίνω. The word diacritic is a noun, though it is sometimes used in an attributive sense, whereas diacritical is only an adjective. Some diacritics, such as the acute ⟨ó⟩, grave ⟨ò⟩, and circumflex ⟨ô⟩, are often called accents. Diacritics may appear above or below a letter or in some other position such as within the letter or between two letters.
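In Unicode terms, a letter with a diacritic can often be written either as one precomposed character or as a base letter followed by a combining mark. The following is a minimal sketch using Python's standard `unicodedata` module, with ÿ as the example.

```python
import unicodedata

composed = "\u00ff"                                   # ÿ as a single code point
decomposed = unicodedata.normalize("NFD", composed)   # base letter + combining mark

# List the name of each code point in the decomposed form.
parts = [unicodedata.name(ch) for ch in decomposed]

# NFC normalization recombines the pair into the precomposed character.
recomposed = unicodedata.normalize("NFC", decomposed)

print(parts, recomposed == composed)
```

The two forms render identically but compare unequal as strings, so software that compares or sorts text usually normalizes it first.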
ISO/IEC 8859-1:1998, Information technology—8-bit single-byte coded graphic character sets—Part 1: Latin alphabet No. 1, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings; its first edition was published in 1987. ISO/IEC 8859-1 encodes what it refers to as "Latin alphabet no. 1", consisting of 191 characters from the Latin script. This character-encoding scheme is used throughout the Americas, Western Europe, Oceania, and much of Africa. It is the basis for some popular 8-bit character sets and the first two blocks of characters in Unicode.
ISO/IEC 8859 is a joint ISO and IEC series of standards for 8-bit character encodings. The series of standards consists of numbered parts, such as ISO/IEC 8859-1, ISO/IEC 8859-2, etc. There are 15 parts, excluding the abandoned ISO/IEC 8859-12. The ISO working group maintaining this series of standards has been disbanded.
The double acute accent is a diacritic mark of the Latin and Cyrillic scripts. It is used primarily in Hungarian or Chuvash, and consequently it is sometimes referred to by typographers as hungarumlaut. The signs formed with a regular umlaut are letters in their own right in the Hungarian alphabet—for instance, they are separate letters for the purpose of collation. Letters with the double acute, however, are considered variants of their equivalents with the umlaut, being thought of as having both an umlaut and an acute accent.
Ø is a letter used in the Danish, Norwegian, Faroese, and Southern Sámi languages. It is mostly used to represent the mid front rounded vowels, such as [ø] and [œ], except in Southern Sámi, where it is used as a diphthong.
Mojibake is the garbled or gibberish text that is the result of text being decoded using an unintended character encoding. The result is a systematic replacement of symbols with completely unrelated ones, often from a different writing system.
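As a small illustration, the Python sketch below encodes a name containing ÿ as UTF-8 and then decodes it with the wrong encoding. The choice of ISO 8859-1 as the mistaken decoder is an assumption made for the example.

```python
# Sketch: producing and reversing mojibake for a string containing ÿ.
original = "L'Ha\u00ff-les-Roses"

# UTF-8 stores ÿ as the two bytes C3 BF; reading them as ISO 8859-1
# yields two unrelated characters instead of one.
garbled = original.encode("utf-8").decode("latin-1")

# This particular mix-up is reversible because ISO 8859-1 maps every byte.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled, repaired)
```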
Ö, or ö, is a character that represents either a letter from several extended Latin alphabets, or the letter "o" modified with an umlaut or diaeresis. Ö, or ö, is a variant of the letter O. In many languages, the letter "ö", or the "o" modified with an umlaut, is used to denote the close- or open-mid front rounded vowels [ø] or [œ]. In languages without such vowels, the character is known as an "o with diaeresis" and denotes a syllable break, wherein its pronunciation remains an unmodified [o].
Ä is a character that represents either a letter from several extended Latin alphabets, or the letter A with an umlaut mark or diaeresis. In the International Phonetic Alphabet, it represents the open central unrounded vowel.
Œ is a Latin alphabet grapheme, a ligature of o and e. In medieval and early modern Latin, it was used in borrowings from Greek that originally contained the diphthong οι, and in a few non-Greek words. These usages continue in English and French. In French, the words that were borrowed from Latin and contained the Latin diphthong written as œ now generally have é or è; but œ is still used in some non-learned French words, representing open-mid front rounded vowels, such as œil ("eye") and sœur ("sister").
In German orthography, the letter ß, called Eszett or scharfes S, represents the /s/ phoneme in Standard German when following long vowels and diphthongs. The letter-name Eszett combines the names of the letters ⟨s⟩ and ⟨z⟩ in German. The character's Unicode names in English are double s, sharp s and eszett. The Eszett letter is currently used only in German, and can be typographically replaced with the double-s digraph ⟨ss⟩ if the ß character is unavailable. In the 20th century, the ß character was replaced with ss in the spelling of Swiss Standard German, while remaining standard in the spelling of other varieties of the German language.
Ü is a Latin script character composed of the letter U and the diaeresis diacritical mark. In some alphabets such as those of a number of Romance languages or Guarani it denotes an instance of regular U to be construed in isolation from adjacent characters with which it would usually form a larger unit; other alphabets like the Azerbaijani, Estonian, German, Hungarian and Turkish ones treat it as a letter in its own right. In those cases it typically represents a close front rounded vowel.
Ï, lowercase ï, is a symbol used in various languages written with the Latin alphabet; it can be read as the letter I with diaeresis, I-umlaut or I-trema.
The Polish alphabet is the script of the Polish language, the basis for the Polish system of orthography. It is based on the Latin alphabet but includes certain letters (9) with diacritics: the acute accent – kreska: ⟨ć, ń, ó, ś, ź⟩; the overdot – kropka: ⟨ż⟩; the tail or ogonek – ⟨ą, ę⟩; and the stroke – ⟨ł⟩. The letters ⟨q⟩, ⟨v⟩, and ⟨x⟩, which are used only in foreign words, are usually absent from the Polish alphabet. Additionally, before the standardization of Polish spelling, ⟨qu⟩ was sometimes used in place of ⟨kw⟩, and ⟨x⟩ in place of ⟨ks⟩.
The Romanian alphabet is a variant of the Latin alphabet used for writing the Romanian language. It is a modification of the classical Latin alphabet and consists of 31 letters, five of which have been modified from their Latin originals for the phonetic requirements of the language.
Several 8-bit character sets (encodings) were designed for binary representation of common Western European languages, which use the Latin alphabet, a few additional letters and ones with precomposed diacritics, some punctuation, and various symbols. These character sets also happen to support many other languages such as Malay, Swahili, and Classical Latin.
Diacritical marks of two dots ⟨¨⟩, placed side-by-side over or under a letter, are used in several languages for several different purposes. The most familiar to English-language speakers are the diaeresis and the umlaut, though there are numerous others. For example, in Albanian, ë represents a schwa. Such diacritics are also sometimes used for stylistic reasons.
Many Unicode characters are used to control the interpretation or display of text, but these characters themselves have no visual or spatial representation. For example, the null character is used in C-programming application environments to indicate the end of a string of characters. In this way, these programs only require a single starting memory address for a string, since the string ends once the program reads the null character.
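The null-terminated convention can be observed from Python through the standard `ctypes` module, which exposes C-style character buffers. This is only a sketch, and the sample bytes are invented for the example.

```python
import ctypes

raw = b"abc\x00def"                      # hypothetical bytes with an embedded NUL
buf = ctypes.create_string_buffer(raw)   # C char array; ctypes appends a final NUL

c_view = buf.value   # read as a C string: stops at the first null character
full = buf.raw       # the whole buffer, including both null characters

print(c_view, full, len(full))
```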
The Latin-1 Supplement is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1, bytes 80 to FF, as U+0080 to U+00FF. The block contains 128 characters: the C1 controls (U+0080–U+009F), which are not graphic characters, Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters, and 2 mathematical operators.
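These figures can be checked with a short Python sketch using the standard `unicodedata` module. The variable names are illustrative, and the pair count depends on the Unicode case mappings built into the Python version in use.

```python
import unicodedata

# All code points of the Latin-1 Supplement block, U+0080 through U+00FF.
block = [chr(cp) for cp in range(0x80, 0x100)]

# Every ISO 8859-1 byte decodes to the code point with the same number.
same_as_latin1 = all(
    bytes([cp]).decode("latin-1") == chr(cp) for cp in range(0x100)
)

# C1 controls: characters in the block with general category "Cc".
c1_controls = [c for c in block if unicodedata.category(c) == "Cc"]

# Lowercase letters whose single-character uppercase form is also in the block.
case_pairs = [
    (c.upper(), c)
    for c in block
    if c.upper() != c and len(c.upper()) == 1 and 0x80 <= ord(c.upper()) <= 0xFF
]

# ÿ is left out because its uppercase form lies outside the block.
y_has_pair = any(lower == "\u00ff" for _, lower in case_pairs)

print(len(block), same_as_latin1, len(c1_controls), len(case_pairs), y_has_pair)
```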
The ISO basic Latin alphabet is an international standard for a Latin-script alphabet that consists of two sets of 26 letters, codified in various national and international standards and used widely in international communication. They are the same letters that comprise the current English alphabet. Since medieval times, they are also the same letters of the modern Latin alphabet. The order is also important for sorting words into alphabetical order.