Latin Extended-C

Range: U+2C60..U+2C7F (32 code points)
Plane: BMP
Scripts: Latin
Major alphabets: Uighur, UPA
Assigned: 32 code points
Unused: 0 reserved code points
Unicode version history:
5.0 (2006): 17 (+17)
5.1 (2008): 29 (+12)
5.2 (2009): 32 (+3)
Note: [1] [2]

Latin Extended-C is a Unicode block containing Latin characters for Uighur New Script, the Uralic Phonetic Alphabet, Shona, Claudian Latin and the Swedish Dialect Alphabet.

Block

Latin Extended-C [1]
Official Unicode Consortium code chart (PDF)
        0 1 2 3 4 5 6 7 8 9 A B C D E F
U+2C6x  Ⱡ ⱡ Ɫ Ᵽ Ɽ ⱥ ⱦ Ⱨ ⱨ Ⱪ ⱪ Ⱬ ⱬ Ɑ Ɱ Ɐ
U+2C7x  Ɒ ⱱ Ⱳ ⱳ ⱴ Ⱶ ⱶ ⱷ ⱸ ⱹ ⱺ ⱻ ⱼ ⱽ Ȿ Ɀ
Notes
1. As of Unicode version 14.0

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Latin Extended-C block:

Version | Final code points [a] | Count | L2 ID | WG2 ID | Document
5.0 | U+2C60..2C64 | 5 | L2/04-372R | N2847R | Priest, Lorna (2004-12-09), Proposal to Encode Additional Latin Orthographic Characters
5.0 | U+2C65..2C66 | 2 | L2/05-076 | | Davis, Mark (2005-02-10), Stability of Case Folding
 | | | | N2942 | Freytag, Asmus; Whistler, Ken (2005-08-12), Proposal to add nine lowercase characters
 | | | L2/05-108R | | Moore, Lisa (2005-08-26), "Stability of Case Folding (B.14.2)", UTC #103 Minutes
 | | | L2/05-270 | | Whistler, Ken (2005-09-21), "C. Code points for already approved Latin characters", WG2 Consent Docket (Sophia Antipolis)
 | | | L2/05-279 | | Moore, Lisa (2005-11-10), "Consensus 105-C29", UTC #105 Minutes
 | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "M47.5k, M47.5l", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15
5.0 | U+2C67..2C6C | 6 | L2/05-029R | N2931 | Priest, Lorna (2005-01-28), Proposal to encode additional Latin orthographic characters for Uighur Latin Alphabet
 | | | L2/05-026 | | Moore, Lisa (2005-05-16), "Latin Orthographic Characters for Uighur (C.2)", UTC #102 Minutes
 | | | L2/05-263 | N2992 | Supporting references for N2931: Proposal to Encode Additional Latin Characters for Uighur and Kazak Latin Alphabet, 2005-09-21
 | | | L2/05-270 | | Whistler, Ken (2005-09-21), "C. Code points for already approved Latin characters", WG2 Consent Docket (Sophia Antipolis)
 | | | L2/05-279 | | Moore, Lisa (2005-11-10), "Consensus 105-C29", UTC #105 Minutes
 | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "7.2.7", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15
5.0 | U+2C74 | 1 | L2/04-246R | | Priest, Lorna (2004-07-26), Revised Proposal for Additional Latin Phonetic and Orthographic Characters
 | | | L2/04-316 | | Moore, Lisa (2004-08-19), "C.6", UTC #100 Minutes
 | | | L2/04-348 | N2906 | Priest, Lorna (2004-08-23), Revised Proposal for Additional Latin Phonetic and Orthographic Characters
 | | | L2/05-180 | | Moore, Lisa (2005-08-17), "C.16", UTC #104 Minutes
 | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "M47.5a", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15
5.0 | U+2C75..2C76 | 2 | L2/05-183 | N2957 | Everson, Michael; Haugen, Odd Einar; Emiliano, António; Pedro, Susana; Grammel, Florian; Baker, Peter; Stötzner, Andreas; Dohnicht, Marcus; Luft, Diana (2005-08-02), Preliminary proposal to add medievalist characters to the UCS
 | | | L2/05-191 | | Whistler, Ken (2005-08-02), Proposal for dealing with lowercase Claudian letters
 | | | L2/05-193R2 | N2960R | Everson, Michael (2005-08-12), Proposal to add Claudian Latin letters to the UCS
 | | | L2/05-180 | | Moore, Lisa (2005-08-17), "Claudian (C.15)", UTC #104 Minutes
 | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "7.4.6, 8.2.3", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15
5.0 | U+2C77 | 1 | L2/05-189 | N2958 | Lehtiranta, Juhani; Ruppel, Klaas; Suutari, Toni; Trosterud, Trond (2005-07-22), Report on progress in implementing the Uralic Phonetic Alphabet with indication of the need for additional characters and symbols
 | | | L2/05-261 | N2989 | Ruppel, Klaas; Kolehmainen, Erkki I.; Everson, Michael; Freytag, Asmus; Whistler, Ken (2005-09-13), Proposal to add six additional Uralicist characters to the UCS
 | | | L2/05-270 | | Whistler, Ken (2005-09-21), "A. Uralicist character additions", WG2 Consent Docket (Sophia Antipolis)
 | | | L2/05-279 | | Moore, Lisa (2005-11-10), "Consensus 105-C29", UTC #105 Minutes
 | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "7.4.7", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15
5.1 | U+2C6D..2C6E, 2C72..2C73 | 4 | | N2945 | Priest, Lorna; Constable, Peter (2005-08-09), Proposal to Encode Additional Latin Phonetic and Orthographic Characters
 | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "7.2.8", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15
 | | | | N3103 | Umamaheswaran, V. S. (2006-08-25), "M48.1", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27
5.1 | U+2C6F | 1 | L2/06-266 | N3122 | Everson, Michael (2006-08-06), Proposal to add Latin letters and a Greek symbol to the UCS
 | | | L2/06-231 | | Moore, Lisa (2006-08-17), "C.16", UTC #108 Minutes
 | | | | N3153 | Umamaheswaran, V. S. (2007-02-16), "M49.3", Unconfirmed minutes of WG 2 meeting 49 AIST, Akihabara, Tokyo, Japan; 2006-09-25/29
5.1 | U+2C71 | 1 | L2/05-208 | | Constable, Peter; Esling, John (2005-08-02), Approval of new IPA sound: the labiodental flap
 | | | | N2945 | Priest, Lorna; Constable, Peter (2005-08-09), Proposal to Encode Additional Latin Phonetic and Orthographic Characters
 | | | | N2953 | Umamaheswaran, V. S. (2006-02-16), "7.2.8", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15
 | | | | N3103 | Umamaheswaran, V. S. (2006-08-25), "M48.1", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27
 | | | L2/18-324 | | Chan, Eiso (2018-11-02), Annotation additions for U+2C71
 | | | L2/19-047 | | Anderson, Deborah; et al. (2019-01-13), "1", Recommendations to UTC #158 January 2019 on Script Proposals
 | | | L2/19-008 | | Moore, Lisa (2019-02-08), "C.5.1 Annotation additions for U+2C71 LATIN SMALL LETTER V WITH HOOK", UTC #158 Minutes
5.1 | U+2C78..2C7A | 3 | L2/06-036 | N3031, N3031-1, N3031-2 | Lemonen, Therese; Ruppel, Klaas; Kolehmainen, Erkki I.; Sandström, Caroline (2006-01-26), Proposal to encode characters for Ordbok över Finlands svenska folkmål in the UCS
 | | | L2/06-008R2 | | Moore, Lisa (2006-02-13), "C.17", UTC #106 Minutes
 | | | L2/06-108 | | Moore, Lisa (2006-05-25), "Consensus 107-C31", UTC #107 Minutes, Accept the name change for U+2C7A LATIN SMALL LETTER O WITH LOW RING INSIDE.
 | | | | N3103 | Umamaheswaran, V. S. (2006-08-25), "M48.21", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27
 | | | L2/07-118R2 | | Moore, Lisa (2007-05-23), "111-C17", UTC #111 Minutes, Approve 1 character name change: 2C78 LATIN SMALL LETTER E WITH NOTCH
 | | | L2/07-268 | N3253 | Umamaheswaran, V. S. (2007-07-26), "M50.4b", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27, 2C78 is renamed LATIN SMALL LETTER E WITH NOTCH
5.1 | U+2C7B..2C7D | 3 | L2/06-215 | N3070 | Ruppel, Klaas; Rueter, Jack; Kolehmainen, Erkki I. (2006-04-07), Proposal for Encoding 3 Additional Characters of the Uralic Phonetic Alphabet
 | | | L2/06-108 | | Moore, Lisa (2006-05-25), "Consensus 107-C45", UTC #107 Minutes
 | | | | N3103 | Umamaheswaran, V. S. (2006-08-25), "M48.19", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27
 | | | L2/11-043 | | Freytag, Asmus; Karlsson, Kent (2011-02-02), Proposal to correct mistakes and inconsistencies in certain property assignments for super and subscripted letters
 | | | L2/11-016 | | Moore, Lisa (2011-02-15), "Correct mistakes in property assignments for super and subscripted letters (B.13.4) [U+2C7C]", UTC #126 / L2 #223 Minutes
 | | | L2/11-160 | | PRI #181 Changing General Category of Twelve Characters, 2011-05-02
5.2 | U+2C70 | 1 | L2/07-334R2 | N3447 | Priest, Lorna (2007-10-15), Proposal to encode two phonetic characters and two Shona characters
 | | | L2/07-345 | | Moore, Lisa (2007-10-25), "C.4", UTC #113 Minutes
 | | | L2/08-318 | N3453 | Umamaheswaran, V. S. (2008-08-13), "M52.20f", Unconfirmed minutes of WG 2 meeting 52
5.2 | U+2C7E..2C7F | 2 | L2/03-190R | | Constable, Peter (2003-06-08), Proposal to Encode Additional Phonetic Symbols in the UCS
 | | | L2/07-334R2 | N3447 | Priest, Lorna (2007-10-15), Proposal to encode two phonetic characters and two Shona characters
 | | | L2/07-345 | | Moore, Lisa (2007-10-25), "C.4", UTC #113 Minutes
 | | | L2/08-318 | N3453 | Umamaheswaran, V. S. (2008-08-13), "M52.20f", Unconfirmed minutes of WG 2 meeting 52
  a. Proposed code points and character names may differ from final code points and names.

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.