Arabic Presentation Forms-A | |
---|---|
Range | U+FB50..U+FDFF (688 code points) |
Plane | BMP |
Scripts | Arabic (629 char.), Common (2 char.) |
Major alphabets | Central Asian languages, Pashto, Persian, Kurdish, Sindhi, Urdu |
Symbol sets | contextual forms, multi-letter and word ligatures |
Assigned | 631 code points |
Unused | 25 reserved code points, 32 noncharacters |
Unicode version history | |
1.1 (1993) | 593 (+593) |
3.2 (2002) | 594 (+1) |
4.0 (2003) | 595 (+1) |
6.0 (2010) | 611 (+16) |
14.0 (2021) | 631 (+20) |
Unicode documentation | |
Code chart ∣ Web page | |
Note: [1] [2] This range was initially part of the Private Use Area in Unicode 1.0.0, [3] and removed from it in Unicode 1.0.1. [4] |
Arabic Presentation Forms-A is a Unicode block encoding contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. The block also contains 32 noncharacters (U+FDD0..U+FDEF), code points that are reserved for internal use and are never assigned to characters.
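As a rough illustration of the block layout, the following Python sketch tests whether a code point falls inside the block and whether it is one of the 32 noncharacters. The block boundaries come from the infobox and the noncharacter range U+FDD0..U+FDEF from the history table; the function and constant names are hypothetical and chosen only for this example.

```python
# Block boundaries as listed in the infobox; the noncharacter range is the
# one the history table lists for Unicode 3.1. All names are illustrative.
BLOCK_START, BLOCK_END = 0xFB50, 0xFDFF
NONCHAR_START, NONCHAR_END = 0xFDD0, 0xFDEF


def in_presentation_forms_a(cp: int) -> bool:
    """Return True if the code point lies in Arabic Presentation Forms-A."""
    return BLOCK_START <= cp <= BLOCK_END


def is_block_noncharacter(cp: int) -> bool:
    """Return True if the code point is one of the block's noncharacters."""
    return NONCHAR_START <= cp <= NONCHAR_END


# Size of the block and number of noncharacters inside it.
block_size = BLOCK_END - BLOCK_START + 1
nonchar_count = sum(
    is_block_noncharacter(cp) for cp in range(BLOCK_START, BLOCK_END + 1)
)
print(block_size, nonchar_count)  # 688 32
```

Software that validates text for open interchange would typically reject or replace code points for which `is_block_noncharacter` returns true, since they are meant for internal use only.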
The presentation forms are present only for compatibility with older standards such as codepage 864 used in DOS, and are typically used in visual rather than logical order. [5] It has been agreed that no further presentation forms will be encoded, although the block has still received other additions, including a contiguous range of 32 noncharacters. [6]
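Because the presentation forms are compatibility characters, they carry compatibility decompositions to ordinary Arabic letters. A minimal Python sketch, assuming the standard library `unicodedata` module, shows how NFKC normalization folds a presentation form back to its base letter. The helper name `fold_presentation_forms` is hypothetical. Normalization replaces code points only; it does not convert text stored in visual order into logical order.

```python
import unicodedata


def fold_presentation_forms(text: str) -> str:
    """Replace compatibility characters with their NFKC equivalents."""
    return unicodedata.normalize("NFKC", text)


isolated = "\uFB50"  # ARABIC LETTER ALEF WASLA ISOLATED FORM
final = "\uFB51"     # ARABIC LETTER ALEF WASLA FINAL FORM

# The decomposition field records the contextual form and the base letter.
print(unicodedata.decomposition(isolated))  # <isolated> 0671

# Both contextual forms fold to the same base letter, U+0671.
print(fold_presentation_forms(isolated) == "\u0671")  # True
print(fold_presentation_forms(final) == "\u0671")     # True
```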
Arabic Presentation Forms-A [1] [2] [3] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
U+FB5x | ﭐ | ﭑ | ﭒ | ﭓ | ﭔ | ﭕ | ﭖ | ﭗ | ﭘ | ﭙ | ﭚ | ﭛ | ﭜ | ﭝ | ﭞ | ﭟ |
U+FB6x | ﭠ | ﭡ | ﭢ | ﭣ | ﭤ | ﭥ | ﭦ | ﭧ | ﭨ | ﭩ | ﭪ | ﭫ | ﭬ | ﭭ | ﭮ | ﭯ |
U+FB7x | ﭰ | ﭱ | ﭲ | ﭳ | ﭴ | ﭵ | ﭶ | ﭷ | ﭸ | ﭹ | ﭺ | ﭻ | ﭼ | ﭽ | ﭾ | ﭿ |
U+FB8x | ﮀ | ﮁ | ﮂ | ﮃ | ﮄ | ﮅ | ﮆ | ﮇ | ﮈ | ﮉ | ﮊ | ﮋ | ﮌ | ﮍ | ﮎ | ﮏ |
U+FB9x | ﮐ | ﮑ | ﮒ | ﮓ | ﮔ | ﮕ | ﮖ | ﮗ | ﮘ | ﮙ | ﮚ | ﮛ | ﮜ | ﮝ | ﮞ | ﮟ |
U+FBAx | ﮠ | ﮡ | ﮢ | ﮣ | ﮤ | ﮥ | ﮦ | ﮧ | ﮨ | ﮩ | ﮪ | ﮫ | ﮬ | ﮭ | ﮮ | ﮯ |
U+FBBx | ﮰ | ﮱ | ﮲ | ﮳ | ﮴ | ﮵ | ﮶ | ﮷ | ﮸ | ﮹ | ﮺ | ﮻ | ﮼ | ﮽ | ﮾ | ﮿ |
U+FBCx | ﯀ | ﯁ | ﯂ | | | | | | | | | | | | | |
U+FBDx | | | | ﯓ | ﯔ | ﯕ | ﯖ | ﯗ | ﯘ | ﯙ | ﯚ | ﯛ | ﯜ | ﯝ | ﯞ | ﯟ |
U+FBEx | ﯠ | ﯡ | ﯢ | ﯣ | ﯤ | ﯥ | ﯦ | ﯧ | ﯨ | ﯩ | ﯪ | ﯫ | ﯬ | ﯭ | ﯮ | ﯯ |
U+FBFx | ﯰ | ﯱ | ﯲ | ﯳ | ﯴ | ﯵ | ﯶ | ﯷ | ﯸ | ﯹ | ﯺ | ﯻ | ﯼ | ﯽ | ﯾ | ﯿ |
U+FC0x | ﰀ | ﰁ | ﰂ | ﰃ | ﰄ | ﰅ | ﰆ | ﰇ | ﰈ | ﰉ | ﰊ | ﰋ | ﰌ | ﰍ | ﰎ | ﰏ |
U+FC1x | ﰐ | ﰑ | ﰒ | ﰓ | ﰔ | ﰕ | ﰖ | ﰗ | ﰘ | ﰙ | ﰚ | ﰛ | ﰜ | ﰝ | ﰞ | ﰟ |
U+FC2x | ﰠ | ﰡ | ﰢ | ﰣ | ﰤ | ﰥ | ﰦ | ﰧ | ﰨ | ﰩ | ﰪ | ﰫ | ﰬ | ﰭ | ﰮ | ﰯ |
U+FC3x | ﰰ | ﰱ | ﰲ | ﰳ | ﰴ | ﰵ | ﰶ | ﰷ | ﰸ | ﰹ | ﰺ | ﰻ | ﰼ | ﰽ | ﰾ | ﰿ |
U+FC4x | ﱀ | ﱁ | ﱂ | ﱃ | ﱄ | ﱅ | ﱆ | ﱇ | ﱈ | ﱉ | ﱊ | ﱋ | ﱌ | ﱍ | ﱎ | ﱏ |
U+FC5x | ﱐ | ﱑ | ﱒ | ﱓ | ﱔ | ﱕ | ﱖ | ﱗ | ﱘ | ﱙ | ﱚ | ﱛ | ﱜ | ﱝ | ﱞ | ﱟ |
U+FC6x | ﱠ | ﱡ | ﱢ | ﱣ | ﱤ | ﱥ | ﱦ | ﱧ | ﱨ | ﱩ | ﱪ | ﱫ | ﱬ | ﱭ | ﱮ | ﱯ |
U+FC7x | ﱰ | ﱱ | ﱲ | ﱳ | ﱴ | ﱵ | ﱶ | ﱷ | ﱸ | ﱹ | ﱺ | ﱻ | ﱼ | ﱽ | ﱾ | ﱿ |
U+FC8x | ﲀ | ﲁ | ﲂ | ﲃ | ﲄ | ﲅ | ﲆ | ﲇ | ﲈ | ﲉ | ﲊ | ﲋ | ﲌ | ﲍ | ﲎ | ﲏ |
U+FC9x | ﲐ | ﲑ | ﲒ | ﲓ | ﲔ | ﲕ | ﲖ | ﲗ | ﲘ | ﲙ | ﲚ | ﲛ | ﲜ | ﲝ | ﲞ | ﲟ |
U+FCAx | ﲠ | ﲡ | ﲢ | ﲣ | ﲤ | ﲥ | ﲦ | ﲧ | ﲨ | ﲩ | ﲪ | ﲫ | ﲬ | ﲭ | ﲮ | ﲯ |
U+FCBx | ﲰ | ﲱ | ﲲ | ﲳ | ﲴ | ﲵ | ﲶ | ﲷ | ﲸ | ﲹ | ﲺ | ﲻ | ﲼ | ﲽ | ﲾ | ﲿ |
U+FCCx | ﳀ | ﳁ | ﳂ | ﳃ | ﳄ | ﳅ | ﳆ | ﳇ | ﳈ | ﳉ | ﳊ | ﳋ | ﳌ | ﳍ | ﳎ | ﳏ |
U+FCDx | ﳐ | ﳑ | ﳒ | ﳓ | ﳔ | ﳕ | ﳖ | ﳗ | ﳘ | ﳙ | ﳚ | ﳛ | ﳜ | ﳝ | ﳞ | ﳟ |
U+FCEx | ﳠ | ﳡ | ﳢ | ﳣ | ﳤ | ﳥ | ﳦ | ﳧ | ﳨ | ﳩ | ﳪ | ﳫ | ﳬ | ﳭ | ﳮ | ﳯ |
U+FCFx | ﳰ | ﳱ | ﳲ | ﳳ | ﳴ | ﳵ | ﳶ | ﳷ | ﳸ | ﳹ | ﳺ | ﳻ | ﳼ | ﳽ | ﳾ | ﳿ |
U+FD0x | ﴀ | ﴁ | ﴂ | ﴃ | ﴄ | ﴅ | ﴆ | ﴇ | ﴈ | ﴉ | ﴊ | ﴋ | ﴌ | ﴍ | ﴎ | ﴏ |
U+FD1x | ﴐ | ﴑ | ﴒ | ﴓ | ﴔ | ﴕ | ﴖ | ﴗ | ﴘ | ﴙ | ﴚ | ﴛ | ﴜ | ﴝ | ﴞ | ﴟ |
U+FD2x | ﴠ | ﴡ | ﴢ | ﴣ | ﴤ | ﴥ | ﴦ | ﴧ | ﴨ | ﴩ | ﴪ | ﴫ | ﴬ | ﴭ | ﴮ | ﴯ |
U+FD3x | ﴰ | ﴱ | ﴲ | ﴳ | ﴴ | ﴵ | ﴶ | ﴷ | ﴸ | ﴹ | ﴺ | ﴻ | ﴼ | ﴽ | ﴾ | ﴿ |
U+FD4x | ﵀ | ﵁ | ﵂ | ﵃ | ﵄ | ﵅ | ﵆ | ﵇ | ﵈ | ﵉ | ﵊ | ﵋ | ﵌ | ﵍ | ﵎ | ﵏ |
U+FD5x | ﵐ | ﵑ | ﵒ | ﵓ | ﵔ | ﵕ | ﵖ | ﵗ | ﵘ | ﵙ | ﵚ | ﵛ | ﵜ | ﵝ | ﵞ | ﵟ |
U+FD6x | ﵠ | ﵡ | ﵢ | ﵣ | ﵤ | ﵥ | ﵦ | ﵧ | ﵨ | ﵩ | ﵪ | ﵫ | ﵬ | ﵭ | ﵮ | ﵯ |
U+FD7x | ﵰ | ﵱ | ﵲ | ﵳ | ﵴ | ﵵ | ﵶ | ﵷ | ﵸ | ﵹ | ﵺ | ﵻ | ﵼ | ﵽ | ﵾ | ﵿ |
U+FD8x | ﶀ | ﶁ | ﶂ | ﶃ | ﶄ | ﶅ | ﶆ | ﶇ | ﶈ | ﶉ | ﶊ | ﶋ | ﶌ | ﶍ | ﶎ | ﶏ |
U+FD9x | | | ﶒ | ﶓ | ﶔ | ﶕ | ﶖ | ﶗ | ﶘ | ﶙ | ﶚ | ﶛ | ﶜ | ﶝ | ﶞ | ﶟ |
U+FDAx | ﶠ | ﶡ | ﶢ | ﶣ | ﶤ | ﶥ | ﶦ | ﶧ | ﶨ | ﶩ | ﶪ | ﶫ | ﶬ | ﶭ | ﶮ | ﶯ |
U+FDBx | ﶰ | ﶱ | ﶲ | ﶳ | ﶴ | ﶵ | ﶶ | ﶷ | ﶸ | ﶹ | ﶺ | ﶻ | ﶼ | ﶽ | ﶾ | ﶿ |
U+FDCx | ﷀ | ﷁ | ﷂ | ﷃ | ﷄ | ﷅ | ﷆ | ﷇ | | | | | | | | ﷏ |
U+FDDx | ||||||||||||||||
U+FDEx | ||||||||||||||||
U+FDFx | ﷰ | ﷱ | ﷲ | ﷳ | ﷴ | ﷵ | ﷶ | ﷷ | ﷸ | ﷹ | ﷺ | ﷻ | ﷼ | ﷽ | ﷾ | ﷿ |
The following Unicode-related documents record the purpose and process of defining specific characters in the Arabic Presentation Forms-A block:
Version | Final code points | Count | L2 ID | WG2 ID | Document |
---|---|---|---|---|---|
1.1 | U+FB50..FBB1, FBD3..FD3F, FD50..FD8F, FD92..FDC7, FDF0..FDFB | 593 | (to be determined) | ||
L2/06-008R2 | Moore, Lisa (2006-02-13), "Motion 106-M3", UTC #106 Minutes, Drop U+FD3E ORNATE LEFT PARENTHESIS and U+FD3F ORNATE RIGHT PARENTHESIS from the list of characters with Bidi Mirrored property proposed in Public Review Issue 80. | ||||
L2/14-026 | Moore, Lisa (2014-02-17), "Consensus 138-C21", UTC #138 Minutes, Change the General Category and linebreak properties of U+FD3E LEFT ORNATE PARENTHESIS to gc=Pe and lb=CL; and change General Category and linebreak properties of U+FD3F RIGHT ORNATE PARENTHESIS to gc=Ps and lb=OP, in Unicode 7.0. | ||||
L2/20-289 | N5155 | Evans, Lorna Priest (2020-12-07), Request for glyph changes and annotations for Kazakh, Kyrgyz, and Uyghur [Affects U+FBD7-FBD8, U+FBDD, and U+FBE0-FBE1] | |||
L2/21-016R | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai (2021-01-14), "11a. Glyph changes and annotations for Kazakh, Kyrgyz, and Uyghur", Recommendations to UTC #166 January 2021 on Script Proposals | ||||
L2/21-009 | Moore, Lisa (2021-01-27), "B.1 — 11a", UTC #166 Minutes | ||||
3.1 | U+FDD0..FDEF | 32 | L2/00-187 | Moore, Lisa (2000-08-23), "Not a Character", UTC minutes -- Boston, August 8-11, 2000 | |
L2/00-341 | N2277 | Addition of reserved characters for internal processing uses, 2000-09-19 | |||
L2/01-050 | N2253 | Umamaheswaran, V. S. (2001-01-21), "7.20 Proposal for Reserved Positions for Processing Purposes", Minutes of the SC2/WG2 meeting in Athens, September 2000 | |||
3.2 | U+FDFC | 1 | L2/01-148R | Pournader, Roozbeh (2001-04-07), Proposal: Arabic Ligature Rial | |
L2/01-184R | Moore, Lisa (2001-06-18), "Motion 87-M6", Minutes from the UTC/L2 meeting | ||||
L2/01-354 | N2373 | Pournader, Roozbeh (2001-09-20), Proposal: Arabic Currency Sign Rial | |||
L2/02-154 | N2403 | Umamaheswaran, V. S. (2002-04-22), "7.8", Draft minutes of WG 2 meeting 41, Hotel Phoenix, Singapore, 2001-10-15/19 | |||
4.0 | U+FDFD | 1 | L2/02-005 | Hussain, Sarmad; Afzal, Muhammad (2001-12-18), Urdu Computing Standards (Charts and Exhibits) | |
L2/02-006 (pdf, doc) | N2413-1 | Zia, Khaver (2002-01-10), Towards Unicode Standard for Urdu | |||
L2/02-003 | N2413-2 | Afzal, Muhammad; Hussain, Sarmad (2001-12-28), Urdu Computing Standards: Development of Urdu Zabta Takhti (UZT) 1.01 | |||
L2/02-004 | N2413-3 | Hussain, Sarmad; Afzal, Muhammad (2001-12-28), Urdu Computing Standards: Urdu Zabta Takhti (UZT) 1.01 | |||
L2/02-163 | N2413-4 (pdf, doc) | Proposal to add Marks and Digits in Arabic Code Block (for Urdu), 2002-04-30 | |||
L2/02-011R | Kew, Jonathan (2002-01-12), Comments on L2/02-006: Towards Unicode Standard for Urdu | ||||
L2/02-197 | Freytag, Asmus (2002-05-01), Urdu Feedback from Bidi Committee | ||||
L2/02-166R2 | Moore, Lisa (2002-08-09), "Motion 91-M3", UTC #91 Minutes | ||||
L2/02-372 | N2453 (pdf, doc) | Umamaheswaran, V. S. (2002-10-30), "7.9 Urdu contribution", Unconfirmed minutes of WG 2 meeting 42 | |||
L2/02-466 | N2567 | Everson, Michael; Pournader, Roozbeh (2002-12-09), Towards resolution on the name of U+FDFD | |||
L2/02-467 | N2568 | Everson, Michael; Pournader, Roozbeh; Hussain, Sarmad; Afzal, Muhammad (2002-12-10), Consensus on the name of U+FDFD | |||
L2/04-196 | N2653 (pdf, doc) | Umamaheswaran, V. S. (2004-06-04), "a-3", Unconfirmed minutes of WG 2 meeting 44 | |||
6.0 | U+FBB2..FBC1 | 16 | L2/98-274 | Davis, Mark; Mansour, Kamal (1998-07-28), Proposed Arabic Script Additions for Minority Languages | |
L2/98-409 | Davis, Mark; Mansour, Kamal (1998-12-01), Proposal to add 25 Arabic characters to the BMP | ||||
L2/98-419 (pdf, doc) | Aliprand, Joan (1999-02-05), "Additional Arabic characters", Approved Minutes -- UTC #78 & NCITS Subgroup L2 # 175 Joint Meeting, San Jose, CA -- December 1-4, 1998 | ||||
L2/02-021 | Davis, Mark; Mansour, Kamal (2002-01-17), Proposal To Amend Arabic repertoire | ||||
L2/03-154 | Kew, Jonathan; Mansour, Kamal; Davis, Mark (2003-05-16), Proposal to encode productive Arabic-script modifier marks | ||||
L2/06-039 | N3460-A | Durrani, Attash (2006-01-29), Preliminary Proposal to add Nuqta Characters to Arabic Block | |||
L2/06-240 | Kew, Jonathan (2006-07-19), Letter to Dr. Durrani | ||||
L2/06-322 | Durrani, Attash (2006-10-04), Letter to Jonathan Kew re Nuqtas | ||||
L2/07-094 | Durrani, Attash (2007-04-03), Regarding Nuqta Characters | ||||
L2/07-174 | Durrani, Attash (2007-05-14), The Case Folding Solution for the Arabic Script | ||||
L2/08-159 | Durrani, Attash; Mansour, Kamal; McGowan, Rick (2008-04-18), Proposal to Encode 22 Characters for Arabic Pedagogical Use | ||||
L2/08-230 | Anderson, Deborah (2008-05-23), Comments on Proposal to Encode 22 Characters for Arabic Pedagogical Use | ||||
L2/08-159R | N3460R | Durrani, Attash; Mansour, Kamal; McGowan, Rick (2008-06-24), Proposal to Encode 16 Characters for Arabic Pedagogical Use | |||
L2/08-161R2 | Moore, Lisa (2008-11-05), "Motion 115-M3", UTC #115 Minutes | ||||
L2/08-412 | N3553 (pdf, doc) | Umamaheswaran, V. S. (2008-11-05), "M53.19", Unconfirmed minutes of WG 2 meeting 53 | |||
L2/08-361 | Moore, Lisa (2008-12-02), "Consensus 117-C26", UTC #117 Minutes | ||||
L2/09-011 | Pournader, Roozbeh (2009-01-13), Consistent naming and better properties for Arabic Pedagogical Symbols | ||||
L2/09-110 | N3606 | Pandey, Anshuman (2009-03-30), Proposal to Advance the Renaming of Arabic Pedagogical Symbols | |||
L2/09-234 | N3603 (pdf, doc) | Umamaheswaran, V. S. (2009-07-08), "M54.06a", Unconfirmed minutes of WG 2 meeting 54 | |||
L2/09-104 | Moore, Lisa (2009-05-20), "Consensus 119-C25", UTC #119 / L2 #216 Minutes | ||||
14.0 | U+FBC2 | 1 | L2/19-306 | N5142 | Pournader, Roozbeh; Anderson, Deborah (2019-09-29), Arabic additions for Quranic orthographies |
L2/19-343 | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai (2019-10-06), "a. Additions for Quranic orthographies; FD4C: c. Arabic honorifics", Recommendations to UTC #161 October 2019 on Script Proposals | ||||
L2/19-323 | Moore, Lisa (2019-10-01), "Consensus 161-C4", UTC #161 Minutes | ||||
L2/20-105 | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Constable, Peter; Liang, Hai (2020-04-20), "3f. Comments on L2/19-306", Recommendations to UTC #163 April 2020 on Script Proposals | ||||
U+FD40..FD4B, FDFE..FDFF | 14 | L2/14-147 | Pournader, Roozbeh (2014-07-27), Proposal to encode seventeen Arabic honorifics | ||
L2/14-170 | Anderson, Deborah; Whistler, Ken; McGowan, Rick; Pournader, Roozbeh; Iancu, Laurențiu (2014-07-28), "5. L2/14‐147", Recommendations to UTC #140 August 2014 on Script Proposals | ||||
L2/19-289R | Pournader, Roozbeh; Jibaly, Mustafa (2019-07-26), Proposal to encode fourteen Arabic honorifics | ||||
L2/19-270 | Moore, Lisa (2019-10-07), "Consensus 160-C25", UTC #160 Minutes | ||||
U+FD4C..FD4D | 2 | L2/19-319 | Pournader, Roozbeh; Jibaly, Mustafa (2019-09-29), Proposal to encode two more Arabic honorifics | ||
L2/19-323 | Moore, Lisa (2019-10-01), "Consensus 161-C3", UTC #161 Minutes | ||||
U+FD4E..FD4F | 2 | L2/20-042 | Pournader, Roozbeh; Hooshdaran, Soheil; Jibaly, Mustafa (2020-01-15), Proposal to encode yet two more Arabic honorifics | ||
L2/20-015R | Moore, Lisa (2020-05-14), "C.5.3", Draft Minutes of UTC Meeting 162 | ||||
U+FDCF | 1 | L2/20-081 | Pournader, Roozbeh; Evans, Lorna (2020-03-10), Proposal to encode an Arabic honorific used in Christian texts | ||
L2/20-105 | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Constable, Peter; Liang, Hai (2020-04-20), "3a. Arabic Honorific", Recommendations to UTC #163 April 2020 on Script Proposals | ||||
L2/20-102 | Moore, Lisa (2020-05-06), "Consensus 163-C13", UTC #163 Minutes | ||||
|
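The per-version counts in the history table can be cross-checked against the infobox with simple range arithmetic. The Python sketch below transcribes the code point ranges from the table (the Unicode 3.1 noncharacters are left out because they are not assigned characters) and accumulates a running total. The names `ADDITIONS` and `range_size` are hypothetical.

```python
# Inclusive code point ranges added per Unicode version, transcribed from
# the history table. The 3.1 noncharacters (U+FDD0..FDEF) are excluded.
ADDITIONS = {
    "1.1": [(0xFB50, 0xFBB1), (0xFBD3, 0xFD3F), (0xFD50, 0xFD8F),
            (0xFD92, 0xFDC7), (0xFDF0, 0xFDFB)],
    "3.2": [(0xFDFC, 0xFDFC)],
    "4.0": [(0xFDFD, 0xFDFD)],
    "6.0": [(0xFBB2, 0xFBC1)],
    "14.0": [(0xFBC2, 0xFBC2), (0xFD40, 0xFD4B), (0xFDFE, 0xFDFF),
             (0xFD4C, 0xFD4D), (0xFD4E, 0xFD4F), (0xFDCF, 0xFDCF)],
}


def range_size(first: int, last: int) -> int:
    """Number of code points in an inclusive range."""
    return last - first + 1


running_totals = {}
total = 0
for version, ranges in ADDITIONS.items():
    total += sum(range_size(first, last) for first, last in ranges)
    running_totals[version] = total

print(running_totals)
# {'1.1': 593, '3.2': 594, '4.0': 595, '6.0': 611, '14.0': 631}

# Reserved code points: block size minus assigned minus noncharacters.
reserved = range_size(0xFB50, 0xFDFF) - total - range_size(0xFDD0, 0xFDEF)
print(reserved)  # 25
```

The running totals agree with the version history in the infobox, and the remainder matches the 25 reserved code points listed there.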