Arabic Extended-C | |
---|---|
Range | U+10EC0..U+10EFF (64 code points) |
Plane | SMP |
Scripts | Arabic |
Assigned | 7 code points |
Unused | 57 reserved code points |
Unicode version history | |
15.0 (2022) | 3 (+3) |
16.0 (2024) | 7 (+4) |
Unicode documentation | |
Code chart ∣ Web page | |
Note: [1] [2] |
Arabic Extended-C is a Unicode block encoding Qur'anic marks used in Turkey. [3] [4]
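As a small illustration of the range listed in the table above (U+10EC0..U+10EFF), the following Python sketch tests whether a character falls inside the block and derives the block size. The names `BLOCK_START`, `BLOCK_END`, `BLOCK_SIZE`, and `in_arabic_extended_c` are hypothetical and chosen only for this example; the check covers the whole range, so it also returns true for reserved code points.

```python
# Boundaries of the Arabic Extended-C block, taken from the range above.
BLOCK_START = 0x10EC0
BLOCK_END = 0x10EFF

# Number of code points in the block (both ends inclusive).
BLOCK_SIZE = BLOCK_END - BLOCK_START + 1


def in_arabic_extended_c(ch: str) -> bool:
    """Return True if the single character `ch` lies in U+10EC0..U+10EFF.

    This tests block membership only; it does not say whether the
    code point is assigned or reserved.
    """
    return BLOCK_START <= ord(ch) <= BLOCK_END


if __name__ == "__main__":
    print(BLOCK_SIZE)
    print(in_arabic_extended_c("\U00010EFD"))  # a code point inside the block
    print(in_arabic_extended_c("\u0627"))      # ARABIC LETTER ALEF, in the BMP
```

A range comparison is used here rather than a lookup in the `unicodedata` module because the Python standard library does not expose block names directly.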
Arabic Extended-C [1] [2] Official Unicode Consortium code chart (PDF)
Code points | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
U+10ECx | | | | | | | | | | | | | | | | |
U+10EDx | | | | | | | | | | | | | | | | |
U+10EEx | | | | | | | | | | | | | | | | |
U+10EFx | | | | | | | | | | | | | | 𐻽 | 𐻾 | 𐻿 |
The following Unicode-related documents record the purpose and process of defining specific characters in the Arabic Extended-C block:
Version | Final code points [lower-alpha 1] | Count | L2 ID | Document |
---|---|---|---|---|
15.0 | U+10EFD..10EFF | 3 | L2/21-133 | Shaikh, Lateef Sagar (2021-06-24), Proposal to encode Quranic marks used in Turkey |
| | | | L2/21-130 | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Liang, Hai (2021-07-26), "6a. Quranic Marks used in Turkey", Recommendations to UTC #168 July 2021 on Script Proposals |
| | | | L2/21-123 | Cummings, Craig (2021-08-03), "Consensus 168-C22", Draft Minutes of UTC Meeting 168 |
| | | | L2/21-181 | Pournader, Roozbeh (2021-08-25), Allocating Arabic Extended-C in SMP and Arabic code point changes |
| | | | L2/21-174 | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Liang, Hai (2021-10-01), "4a Arabic Allocation in SMP", Recommendations to UTC #169 October 2021 on Script Proposals |
| | | | L2/21-167 | Cummings, Craig (2022-01-27), "Consensus 169-C2", Approved Minutes of UTC Meeting 169, Move the following characters: U+0895 ARABIC SMALL LOW WORD SAKTA to U+10EFD; U+0896 ARABIC SMALL LOW WORD QASR to U+10EFE; U+0897 ARABIC SMALL LOW WORD MADDA to U+10EFF |
16.0 | U+10EC2..10EC4 | 3 | L2/22-116 | Sh., Rikza F. (2022-05-22), Proposal to Encode Four Pegon Characters |
| | | | L2/22-128 | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Constable, Peter (2022-07-20), "4b Pegon", Recommendations to UTC #172 July 2022 on Script Proposals |
| | | | L2/22-121 | Constable, Peter (2022-08-01), "D.1.4b Four Arabic Pegon Characters", Draft Minutes of UTC Meeting 172 |
| | U+10EFC | 1 | L2/21-204 | Shaikh, Lateef Sagar (2021-08-11), Proposal to encode Quranic Superscript Alef Motahafar used in Quran published in Libya |
| | | | L2/21-174 | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Liang, Hai (2021-10-01), "4b Quranic Superscript Alef Motahafar", Recommendations to UTC #169 October 2021 on Script Proposals |
| | | | L2/22-023 | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Constable, Peter (2022-01-22), "6e Quranic Superscript Alef Motahafar", Recommendations to UTC #170 January 2022 on Script Proposals |
| | | | L2/22-047 | Shaikh, Lateef Sagar (2022-02-15), Proposal to encode Arabic Combining Alef Overlay used in Quran published in Libya |
| | | | L2/22-068 | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Constable, Peter (2022-04-15), "4b Combining Alef Overlay", Recommendations to UTC #171 April 2022 on Script Proposals |
| | | | L2/22-061 | Constable, Peter (2022-07-27), "D.1 Section 4b", Approved Minutes of UTC Meeting 171 |
Unicode, formally The Unicode Standard, is a text encoding standard maintained by the Unicode Consortium and designed to support the use of text in all of the world's writing systems that can be digitized. Version 16.0 of the standard defines 154,998 characters and 168 scripts used in various ordinary, literary, academic, and technical contexts.
A Unicode block is one of several contiguous ranges of numeric character codes of the Unicode character set that are defined by the Unicode Consortium for administrative and documentation purposes. Typically, proposals such as the addition of new glyphs are discussed and evaluated by considering the relevant block or blocks as a whole.
In Unicode, a Private Use Area (PUA) is a range of code points that, by definition, will not be assigned characters by the Unicode Consortium. Three private use areas are defined: one in the Basic Multilingual Plane, and one each in, and nearly covering, planes 15 and 16. The code points in these areas cannot be considered as standardized characters in Unicode itself. They are intentionally left undefined so that third parties may define their own characters without conflicting with Unicode Consortium assignments. Under the Unicode Stability Policy, the Private Use Areas will remain allocated for that purpose in all future Unicode versions.
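The three areas described above can be expressed as simple range checks. The sketch below is illustrative only: the ranges are the standard ones (U+E000..U+F8FF in the BMP, U+F0000..U+FFFFD in plane 15, and U+100000..U+10FFFD in plane 16), while the names `PRIVATE_USE_AREAS` and `is_private_use` are hypothetical.

```python
# The three Private Use Areas as inclusive (first, last) code point pairs.
PRIVATE_USE_AREAS = (
    (0xE000, 0xF8FF),      # Private Use Area in the Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(code_point: int) -> bool:
    """Return True if `code_point` falls inside any Private Use Area."""
    return any(first <= code_point <= last for first, last in PRIVATE_USE_AREAS)


if __name__ == "__main__":
    for first, last in PRIVATE_USE_AREAS:
        # Size of each area, counting both end points.
        print(f"U+{first:04X}..U+{last:04X}: {last - first + 1} code points")
    print(is_private_use(0xE000))
    print(is_private_use(0x10EFD))
```

The last two code points of planes 15 and 16 are excluded because they are noncharacters, which is why those areas "nearly" cover their planes.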
Over a thousand characters from the Latin script are encoded in the Unicode Standard, grouped in several basic and extended Latin blocks. The extended ranges contain mainly precomposed letters plus diacritics that are equivalently encoded with combining diacritics, as well as some ligatures and distinct letters, used for example in the orthographies of various African languages and the Vietnamese alphabet. Latin Extended-C contains additions for Uighur and the Claudian letters. Latin Extended-D comprises characters that are mostly of interest to medievalists. Latin Extended-E mostly comprises characters used for German dialectology (Teuthonista). Latin Extended-F and -G contain characters for phonetic transcription.
Specials is a short Unicode block of characters allocated at the very end of the Basic Multilingual Plane, at U+FFF0–FFFF. Of these 16 code points, five have been assigned since Unicode 3.0:
Many scripts in Unicode, such as Arabic, have special orthographic rules that require certain combinations of letterforms to be combined into special ligature forms. In English, the common ampersand (&) developed from a ligature in which the handwritten Latin letters e and t were combined. The rules governing ligature formation in Arabic can be quite complex, requiring special script-shaping technologies such as the Arabic Calligraphic Engine by Thomas Milo's DecoType.
In the Unicode standard, a plane is a contiguous group of 65,536 (2¹⁶) code points. There are 17 planes, identified by the numbers 0 to 16, which correspond to the possible values 00–10₁₆ of the first two positions in the six-position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly used characters. The higher planes 1 through 16 are called "supplementary planes". The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 16.0, five of the planes have assigned code points (characters), and seven are named.
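Because each plane holds 65,536 code points, the plane number is the code point shifted right by 16 bits, which is the same as the first two digits of the six-position hexadecimal form. The following Python sketch shows both views; `plane_of` and `six_position_form` are hypothetical helper names used only for this example.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point in plane 16


def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) containing `code_point`."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    # Each plane spans 2**16 code points, so the plane is the high bits.
    return code_point >> 16


def six_position_form(code_point: int) -> str:
    """Format a code point as U+hhhhhh, padded to six hexadecimal digits."""
    return f"U+{code_point:06X}"


if __name__ == "__main__":
    for cp in (0x0041, 0x10EFD, 0x10FFFF):
        print(six_position_form(cp), "is in plane", plane_of(cp))
```

For U+10EFD, a code point in Arabic Extended-C, the six-position form is U+010EFD, and the leading `01` matches plane 1, the Supplementary Multilingual Plane listed for this block.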
Latin Extended-A is a Unicode block and is the third block of the Unicode standard. It encodes Latin letters from the Latin ISO character sets other than Latin-1 and also legacy characters from the ISO 6937 standard.
IPA Extensions is a block (U+0250–U+02AF) of the Unicode standard that contains full size letters used in the International Phonetic Alphabet (IPA). Both modern and historical characters are included, as well as former and proposed IPA signs and non-IPA phonetic letters. Additional characters employed for phonetics, like the palatalization sign, are encoded in the blocks Phonetic Extensions (1D00–1D7F) and Phonetic Extensions Supplement (1D80–1DBF). Diacritics are found in the Spacing Modifier Letters (02B0–02FF) and Combining Diacritical Marks (0300–036F) blocks. Its block name in Unicode 1.0 was Standard Phonetic.
Enclosed Alphanumerics is a Unicode block of typographical symbols, each consisting of an alphanumeric within a circle, within brackets or another non-closed enclosure, or followed by a full stop.
Arabic Presentation Forms-A is a Unicode block encoding contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. This block also allocates 32 noncharacters in Unicode, designed specifically for internal use.
Arabic Extended-A is a Unicode block encoding Qur'anic annotations and letter variants used for various non-Arabic languages.
Arabic Presentation Forms-B is a Unicode block encoding spacing forms of Arabic diacritics and contextual letter forms. The block also contains the special code point ZWNBSP (U+FEFF), which is meant to be used only as a byte order mark. The block name in Unicode 1.0 was Basic Glyphs for Arabic Language; its characters were re-ordered in the process of merging with ISO 10646 in Unicode 1.0.1 and 1.1.
Syriac is a Unicode block containing characters for all forms of the Syriac alphabet, including the Estrangela, Serto, Eastern Syriac, and the Christian Palestinian Aramaic variants. It is used in Literary Syriac, Neo-Aramaic, and Arabic among Syriac-speaking Christians. It was used historically to write Armenian, Persian, Ottoman Turkish, and Malayalam.
Myanmar is a Unicode block containing characters for the Burmese, Mon, Shan, Palaung, and the Karen languages of Myanmar, as well as the Aiton and Phake languages of Northeast India. It is also used to write Pali and Sanskrit in Myanmar.
Dingbats is a Unicode block containing dingbats. Most of its characters were taken from Zapf Dingbats, so the block imported characters from a specific typeface. Unicode later adopted a policy that excluded symbols with "no demonstrated need or strong desire to exchange in plain text", and thus no further dingbat typefaces were encoded until Webdings and Wingdings were encoded in Version 7.0. Some ornaments are also emoji, having optional presentation variants.
Combining Diacritical Marks Extended is a Unicode block containing diacritical marks used in German dialectology (Teuthonista).
Coptic Epact Numbers is a Unicode block containing Old Coptic number forms.
Indic Siyaq Numbers is a Unicode block containing a specialized subset of the Arabic script that was used for accounting in India under the Mughals by the 17th century through the middle of the 20th century.
Arabic Extended-B is a Unicode block encoding Qur'anic annotations and letter variants used for various non-Arabic languages. The block also includes currency symbols and an abbreviation mark.