Latin Extended-C

Range: U+2C60..U+2C7F (32 code points)
Plane: BMP
Scripts: Latin
Major alphabets: Uighur, UPA
Assigned: 32 code points
Unused: 0 reserved code points
Unicode version history:
5.0 (2006): 17 (+17)
5.1 (2008): 29 (+12)
5.2 (2009): 32 (+3)
Note: [1] [2]

Latin Extended-C is a Unicode block containing Latin characters for Uighur New Script, the Uralic Phonetic Alphabet, Shona, Claudian Latin and the Swedish Dialect Alphabet.

Block

Latin Extended-C [1]
Official Unicode Consortium code chart (PDF)
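|        | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
|--------|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| U+2C6x | Ⱡ | ⱡ | Ɫ | Ᵽ | Ɽ | ⱥ | ⱦ | Ⱨ | ⱨ | Ⱪ | ⱪ | Ⱬ | ⱬ | Ɑ | Ɱ | Ɐ |
| U+2C7x | Ɒ | ⱱ | Ⱳ | ⱳ | ⱴ | Ⱶ | ⱶ | ⱷ | ⱸ | ⱹ | ⱺ | ⱻ | ⱼ | ⱽ | Ȿ | Ɀ |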
Notes
1. As of Unicode version 15.1
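
The code chart above can be regenerated from the block range alone. The following is a minimal Python sketch, assuming only the range U+2C60..U+2C7F stated in this article; the helper names `chart_rows` and `names` are illustrative, and the character names come from whatever Unicode version the local Python build ships.

```python
import unicodedata

# Range stated in the article: U+2C60..U+2C7F (32 code points).
BLOCK_START, BLOCK_END = 0x2C60, 0x2C7F


def chart_rows(start=BLOCK_START, end=BLOCK_END):
    """Return {row label: [16 characters]} laid out like a 16-column code chart.

    Assumes the range is aligned to 16-code-point rows, as this block is.
    """
    rows = {}
    for base in range(start & ~0xF, end + 1, 16):
        label = f"U+{base >> 4:03X}x"  # e.g. 0x2C60 -> "U+2C6x"
        rows[label] = [chr(cp) for cp in range(base, base + 16)]
    return rows


def names(start=BLOCK_START, end=BLOCK_END):
    """Map each code point to its character name, or a placeholder if unassigned."""
    return {
        cp: unicodedata.name(chr(cp), "<unassigned>")
        for cp in range(start, end + 1)
    }
```

Calling `chart_rows()` yields one list per chart row, and `names()[0x2C7F]` looks up the name of the last character in the block.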

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Latin Extended-C block:

| Version | Final code points [a] | Count | L2 ID | WG2 ID | Document |
|---|---|---|---|---|---|
| 5.0 | U+2C60..2C64 | 5 | L2/04-372R | N2847R | Priest, Lorna (2004-12-09), Proposal to Encode Additional Latin Orthographic Characters |
| | U+2C65..2C66 | 2 | L2/05-076 | | Davis, Mark (2005-02-10), Stability of Case Folding |
| | | | | N2942 | Freytag, Asmus; Whistler, Ken (2005-08-12), Proposal to add nine lowercase characters |
| | | | L2/05-108R | | Moore, Lisa (2005-08-26), "Stability of Case Folding (B.14.2)", UTC #103 Minutes |
| | | | L2/05-270 | | Whistler, Ken (2005-09-21), "C. Code points for already approved Latin characters", WG2 Consent Docket (Sophia Antipolis) |
| | | | L2/05-279 | | Moore, Lisa (2005-11-10), "Consensus 105-C29", UTC #105 Minutes |
| | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "M47.5k, M47.5l", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15 |
| | U+2C67..2C6C | 6 | L2/05-029R | N2931 | Priest, Lorna (2005-01-28), Proposal to encode additional Latin orthographic characters for Uighur Latin Alphabet |
| | | | L2/05-026 | | Moore, Lisa (2005-05-16), "Latin Orthographic Characters for Uighur (C.2)", UTC #102 Minutes |
| | | | L2/05-263 | N2992 | Supporting references for N2931: Proposal to Encode Additional Latin Characters for Uighur and Kazak Latin Alphabet, 2005-09-21 |
| | | | L2/05-270 | | Whistler, Ken (2005-09-21), "C. Code points for already approved Latin characters", WG2 Consent Docket (Sophia Antipolis) |
| | | | L2/05-279 | | Moore, Lisa (2005-11-10), "Consensus 105-C29", UTC #105 Minutes |
| | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "7.2.7", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15 |
| | U+2C74 | 1 | L2/04-246R | | Priest, Lorna (2004-07-26), Revised Proposal for Additional Latin Phonetic and Orthographic Characters |
| | | | L2/04-316 | | Moore, Lisa (2004-08-19), "C.6", UTC #100 Minutes |
| | | | L2/04-348 | N2906 | Priest, Lorna (2004-08-23), Revised Proposal for Additional Latin Phonetic and Orthographic Characters |
| | | | L2/05-180 | | Moore, Lisa (2005-08-17), "C.16", UTC #104 Minutes |
| | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "M47.5a", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15 |
| | U+2C75..2C76 | 2 | L2/05-183 | N2957 | Everson, Michael; Haugen, Odd Einar; Emiliano, António; Pedro, Susana; Grammel, Florian; Baker, Peter; Stötzner, Andreas; Dohnicht, Marcus; Luft, Diana (2005-08-02), Preliminary proposal to add medievalist characters to the UCS |
| | | | L2/05-191 | | Whistler, Ken (2005-08-02), Proposal for dealing with lowercase Claudian letters |
| | | | L2/05-193R2 | N2960R | Everson, Michael (2005-08-12), Proposal to add Claudian Latin letters to the UCS |
| | | | L2/05-180 | | Moore, Lisa (2005-08-17), "Claudian (C.15)", UTC #104 Minutes |
| | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "7.4.6, 8.2.3", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15 |
| | U+2C77 | 1 | L2/05-189 | N2958 | Lehtiranta, Juhani; Ruppel, Klaas; Suutari, Toni; Trosterud, Trond (2005-07-22), Report on progress in implementing the Uralic Phonetic Alphabet with indication of the need for additional characters and symbols |
| | | | L2/05-261 | N2989 | Ruppel, Klaas; Kolehmainen, Erkki I.; Everson, Michael; Freytag, Asmus; Whistler, Ken (2005-09-13), Proposal to add six additional Uralicist characters to the UCS |
| | | | L2/05-270 | | Whistler, Ken (2005-09-21), "A. Uralicist character additions", WG2 Consent Docket (Sophia Antipolis) |
| | | | L2/05-279 | | Moore, Lisa (2005-11-10), "Consensus 105-C29", UTC #105 Minutes |
| | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "7.4.7", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15 |
| 5.1 | U+2C6D..2C6E, 2C72..2C73 | 4 | | N2945 | Priest, Lorna; Constable, Peter (2005-08-09), Proposal to Encode Additional Latin Phonetic and Orthographic Characters |
| | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "7.2.8", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15 |
| | | | | N3103 | Umamaheswaran, V. S. (2006-08-25), "M48.1", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27 |
| | U+2C6F | 1 | L2/06-266 | N3122 | Everson, Michael (2006-08-06), Proposal to add Latin letters and a Greek symbol to the UCS |
| | | | L2/06-231 | | Moore, Lisa (2006-08-17), "C.16", UTC #108 Minutes |
| | | | | N3153 | Umamaheswaran, V. S. (2007-02-16), "M49.3", Unconfirmed minutes of WG 2 meeting 49 AIST, Akihabara, Tokyo, Japan; 2006-09-25/29 |
| | U+2C71 | 1 | L2/05-208 | | Constable, Peter; Esling, John (2005-08-02), Approval of new IPA sound: the labiodental flap |
| | | | | N2945 | Priest, Lorna; Constable, Peter (2005-08-09), Proposal to Encode Additional Latin Phonetic and Orthographic Characters |
| | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "7.2.8", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15 |
| | | | | N3103 | Umamaheswaran, V. S. (2006-08-25), "M48.1", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27 |
| | | | L2/18-324 | | Chan, Eiso (2018-11-02), Annotation additions for U+2C71 |
| | | | L2/19-047 | | Anderson, Deborah; et al. (2019-01-13), "1", Recommendations to UTC #158 January 2019 on Script Proposals |
| | | | L2/19-008 | | Moore, Lisa (2019-02-08), "C.5.1 Annotation additions for U+2C71 LATIN SMALL LETTER V WITH HOOK", UTC #158 Minutes |
| | U+2C78..2C7A | 3 | L2/06-036 | N3031, N3031-1, N3031-2 | Lemonen, Therese; Ruppel, Klaas; Kolehmainen, Erkki I.; Sandström, Caroline (2006-01-26), Proposal to encode characters for Ordbok över Finlands svenska folkmål in the UCS |
| | | | L2/06-008R2 | | Moore, Lisa (2006-02-13), "C.17", UTC #106 Minutes |
| | | | L2/06-108 | | Moore, Lisa (2006-05-25), "Consensus 107-C31", UTC #107 Minutes, Accept the name change for U+2C7A LATIN SMALL LETTER O WITH LOW RING INSIDE. |
| | | | | N3103 | Umamaheswaran, V. S. (2006-08-25), "M48.21", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27 |
| | | | L2/07-118R2 | | Moore, Lisa (2007-05-23), "111-C17", UTC #111 Minutes, Approve 1 character name change: 2C78 LATIN SMALL LETTER E WITH NOTCH |
| | | | L2/07-268 | N3253 | Umamaheswaran, V. S. (2007-07-26), "M50.4b", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27, 2C78 is renamed LATIN SMALL LETTER E WITH NOTCH |
| | U+2C7B..2C7D | 3 | L2/06-215 | N3070 | Ruppel, Klaas; Rueter, Jack; Kolehmainen, Erkki I. (2006-04-07), Proposal for Encoding 3 Additional Characters of the Uralic Phonetic Alphabet |
| | | | L2/06-108 | | Moore, Lisa (2006-05-25), "Consensus 107-C45", UTC #107 Minutes |
| | | | | N3103 | Umamaheswaran, V. S. (2006-08-25), "M48.19", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27 |
| | | | L2/11-043 | | Freytag, Asmus; Karlsson, Kent (2011-02-02), Proposal to correct mistakes and inconsistencies in certain property assignments for super and subscripted letters |
| | | | L2/11-016 | | Moore, Lisa (2011-02-15), "Correct mistakes in property assignments for super and subscripted letters (B.13.4) [U+2C7C]", UTC #126 / L2 #223 Minutes |
| | | | L2/11-160 | | PRI #181 Changing General Category of Twelve Characters, 2011-05-02 |
| 5.2 | U+2C70 | 1 | L2/07-334R2 | N3447 | Priest, Lorna (2007-10-15), Proposal to encode two phonetic characters and two Shona characters |
| | | | L2/07-345 | | Moore, Lisa (2007-10-25), "C.4", UTC #113 Minutes |
| | | | L2/08-318 | N3453 | Umamaheswaran, V. S. (2008-08-13), "M52.20f", Unconfirmed minutes of WG 2 meeting 52 |
| | U+2C7E..2C7F | 2 | L2/03-190R | | Constable, Peter (2003-06-08), Proposal to Encode Additional Phonetic Symbols in the UCS |
| | | | L2/07-334R2 | N3447 | Priest, Lorna (2007-10-15), Proposal to encode two phonetic characters and two Shona characters |
| | | | L2/07-345 | | Moore, Lisa (2007-10-25), "C.4", UTC #113 Minutes |
| | | | L2/08-318 | N3453 | Umamaheswaran, V. S. (2008-08-13), "M52.20f", Unconfirmed minutes of WG 2 meeting 52 |

  a. Proposed code points and character names may differ from final code points and names.
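
The per-version counts in the history table can be checked against the version history given at the top of the article (17, then 29, then 32 code points). The sketch below is one way to do that tally in Python; the `HISTORY` dictionary is transcribed by hand from the "Final code points" column, and the helper names `expand`, `tally` and `all_code_points` are illustrative rather than part of any library.

```python
# Final code points per Unicode version, transcribed from the history table above.
HISTORY = {
    "5.0": ["2C60..2C64", "2C65..2C66", "2C67..2C6C", "2C74", "2C75..2C76", "2C77"],
    "5.1": ["2C6D..2C6E", "2C72..2C73", "2C6F", "2C71", "2C78..2C7A", "2C7B..2C7D"],
    "5.2": ["2C70", "2C7E..2C7F"],
}


def expand(spec):
    """Expand '2C60..2C64' or a single '2C74' into a list of integer code points."""
    first, _, last = spec.partition("..")
    lo = int(first, 16)
    hi = int(last, 16) if last else lo
    return list(range(lo, hi + 1))


def tally(history=HISTORY):
    """Return [(version, code points added, running total)] in table order."""
    result, total = [], 0
    for version, specs in history.items():
        added = sum(len(expand(spec)) for spec in specs)
        total += added
        result.append((version, added, total))
    return result


def all_code_points(history=HISTORY):
    """Return every code point listed in the table, sorted ascending."""
    return sorted(
        cp for specs in history.values() for spec in specs for cp in expand(spec)
    )
```

`tally()` gives the added and cumulative counts per version, and comparing `all_code_points()` with `range(0x2C60, 0x2C80)` confirms that the listed ranges cover the block exactly once.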

Related Research Articles

D: 4th letter of the Latin alphabet

D, or d, is the fourth letter of the Latin alphabet, used in the modern English alphabet, the alphabets of other western European languages and others worldwide. Its name in English is dee, plural dees.

Ezh: Letter of the Latin alphabet

Ezh, also called the "tailed z", is a letter notable for its use in the International Phonetic Alphabet (IPA) to represent the voiced postalveolar fricative consonant, for example the sound of "si" in vision and precision, or the ⟨s⟩ in treasure. See also the letter ⟨Ž⟩ as used in many Slavic languages, the Persian alphabet letter ⟨ژ⟩, the Cyrillic letter ⟨Ж⟩, the Devanagari letter (झ़) and the Esperanto letter ⟨Ĵ⟩.

Unicode has subscripted and superscripted versions of a number of characters including a full set of Arabic numerals. These characters allow any polynomial, chemical and certain other equations to be represented in plain text without using any form of markup like HTML or TeX.

Letterlike Symbols is a Unicode block containing 80 characters which are constructed mainly from the glyphs of one or more letters. In addition to this block, Unicode includes full styled mathematical alphabets, although Unicode does not explicitly categorize these characters as being "letterlike."

Over a thousand characters from the Latin script are encoded in the Unicode Standard, grouped in several basic and extended Latin blocks. The extended ranges contain mainly precomposed letters plus diacritics that are equivalently encoded with combining diacritics, as well as some ligatures and distinct letters, used for example in the orthographies of various African languages and the Vietnamese alphabet. Latin Extended-C contains additions for Uighur and the Claudian letters. Latin Extended-D comprises characters that are mostly of interest to medievalists. Latin Extended-E mostly comprises characters used for German dialectology (Teuthonista). Latin Extended-F and -G contain characters for phonetic transcription.

Unicode supports several phonetic scripts and notations through its existing scripts and the addition of extra blocks with phonetic characters. These phonetic characters are derived from an existing script, usually Latin, Greek or Cyrillic. Apart from the International Phonetic Alphabet (IPA), extensions to the IPA and obsolete and nonstandard IPA symbols, these blocks also contain characters from the Uralic Phonetic Alphabet and the Americanist Phonetic Alphabet.

Phonetic Extensions is a Unicode block containing phonetic characters used in the Uralic Phonetic Alphabet, Old Irish phonetic notation, the Oxford English Dictionary and American dictionaries, and Americanist and Russianist phonetic notations. Its character set is continued in the following Unicode block, Phonetic Extensions Supplement.

Phonetic Extensions Supplement is a Unicode block containing characters for specialized and deprecated forms of the International Phonetic Alphabet.

Ȼ: Letter of the Latin alphabet

Ȼ is a letter of the Latin alphabet, formed from C with the addition of a stroke through the letter. Its minuscule form is used in certain phonetic transcription systems for the indigenous languages of Mexico, and its majuscule form is used in the Saanich alphabet. It also appears in Unifon, a phonemic transcription system for American English, where it represents the sound of Ч.

The Basic Latin Unicode block, sometimes informally called C0 Controls and Basic Latin, is the first block of the Unicode standard, and the only block which is encoded in one byte in UTF-8. The block contains all the letters and control codes of the ASCII encoding. It ranges from U+0000 to U+007F, contains 128 characters and includes the C0 controls, ASCII punctuation and symbols, ASCII digits, both the uppercase and lowercase of the English alphabet and a control character.
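
The one-byte claim is easy to check by encoding a sample character from several Latin blocks. The sketch below is illustrative only: the sample code points are assumptions chosen for this example (one per block, using the block ranges mentioned in these summaries), and `utf8_length` is a hypothetical helper name.

```python
# One sample code point per block; the choice of samples is an assumption
# made for illustration, not something specified by the article.
SAMPLES = {
    "Basic Latin": 0x0041,         # within U+0000..U+007F
    "Latin-1 Supplement": 0x00E9,  # within U+0080..U+00FF
    "Latin Extended-C": 0x2C60,    # within U+2C60..U+2C7F
    "Latin Extended-F": 0x10780,   # outside the BMP
}


def utf8_length(code_point):
    """Return the number of bytes UTF-8 uses for a single code point."""
    return len(chr(code_point).encode("utf-8"))


lengths = {block: utf8_length(cp) for block, cp in SAMPLES.items()}
```

Under these assumptions, `lengths` maps Basic Latin to 1 byte, Latin-1 Supplement to 2, Latin Extended-C to 3 and Latin Extended-F to 4, which matches the general UTF-8 rule that only U+0000..U+007F fit in a single byte.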

The Latin-1 Supplement is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1: 80 (U+0080) - FF (U+00FF). C1 Controls (0080–009F) are not graphic. This block ranges from U+0080 to U+00FF, contains 128 characters and includes the C1 controls, Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters and 2 mathematical operators.

Latin Extended-A is a Unicode block and is the third block of the Unicode standard. It encodes Latin letters from the Latin ISO character sets other than Latin-1 and also legacy characters from the ISO 6937 standard.

Latin Extended-B is the fourth block (0180-024F) of the Unicode Standard. It has been included since version 1.0, where it was only allocated to the code points 0180-01FF and contained 113 characters. During unification with ISO 10646 for version 1.1, the block range was extended by 80 code points and another 35 characters were assigned. In version 3.0 and later, the last 60 available code points in the block were assigned. Its block name in Unicode 1.0 was Extended Latin.

IPA Extensions is a block (U+0250–U+02AF) of the Unicode standard that contains full size letters used in the International Phonetic Alphabet (IPA). Both modern and historical characters are included, as well as former and proposed IPA signs and non-IPA phonetic letters. Additional characters employed for phonetics, like the palatalization sign, are encoded in the blocks Phonetic Extensions (1D00–1D7F) and Phonetic Extensions Supplement (1D80–1DBF). Diacritics are found in the Spacing Modifier Letters (02B0–02FF) and Combining Diacritical Marks (0300–036F) blocks. Its block name in Unicode 1.0 was Standard Phonetic.

Latin Extended Additional is a Unicode block.


Superscripts and Subscripts is a Unicode block containing superscript and subscript numerals, mathematical operators, and letters used in mathematics and phonetics. The use of subscripts and superscripts in Unicode allows any polynomial, chemical and certain other equations to be represented in plain text without using any form of markup like HTML or TeX. Other superscript letters can be found in the Spacing Modifier Letters, Phonetic Extensions and Phonetic Extensions Supplement blocks, while the superscript 1, 2, and 3, inherited from ISO 8859-1, were included in the Latin-1 Supplement block.

Latin Extended-E is a Unicode block containing Latin script characters used in German dialectology (Teuthonista), Anthropos alphabet, Sakha and Americanist usage.

Latin Extended-F is a Unicode block containing modifier letters, nearly all IPA and extIPA, for phonetic transcription. The Latin Extended-F and -G blocks contain the first Latin characters defined outside of the Basic Multilingual Plane (BMP). They were added to the free Gentium Plus and Andika fonts with version 6.2 in February 2023. On some computers, the Calibri font also supports 𐞃, 𐞎 and 𐞥.

Latin Extended-G is a Unicode block containing additional characters for phonetic transcription. The Latin Extended-F and -G blocks contain the first Latin characters defined outside of the Basic Multilingual Plane (BMP).

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.