Shavian (Unicode block)

Shavian
Range	U+10450..U+1047F (48 code points)
Plane	SMP
Scripts	Shavian
Major alphabets	Shavian English
Assigned	48 code points
Unused	0 reserved code points
Source standards	ConScript Unicode Registry
Unicode version history
4.0 (2003)	48 (+48)
Unicode documentation
	Code chart ∣ Web page
	Note: [1][2]

Last updated July 27, 2024

Shavian is a Unicode block containing characters of the Shavian alphabet (also known as the Shaw alphabet), an orthography invented to write English phonemically and funded by the will of George Bernard Shaw. The Shavian block was derived from an earlier private use encoding in the ConScript Unicode Registry, like the Deseret and Phaistos Disc encodings.
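The range listed for the block (U+10450..U+1047F, 48 code points) can be checked mechanically. The following is a minimal Python sketch, not part of any Unicode reference implementation; the names SHAVIAN_START, SHAVIAN_END, in_shavian_block and shavian_code_points are hypothetical helpers introduced only for illustration.

```python
import unicodedata

# Block boundaries as listed for the Shavian block (both ends inclusive).
SHAVIAN_START = 0x10450
SHAVIAN_END = 0x1047F


def in_shavian_block(ch: str) -> bool:
    """Return True if the single character ch lies in U+10450..U+1047F."""
    return SHAVIAN_START <= ord(ch) <= SHAVIAN_END


def shavian_code_points() -> range:
    """Return every code point of the block, in order."""
    return range(SHAVIAN_START, SHAVIAN_END + 1)


if __name__ == "__main__":
    # Count the code points and look up each character name
    # in the Unicode data bundled with the Python interpreter.
    print(len(shavian_code_points()), "code points")
    for cp in shavian_code_points():
        print(f"U+{cp:05X}", unicodedata.name(chr(cp), "<unnamed>"))
```

Because every code point in the block is assigned, a simple range comparison is enough here; blocks with reserved code points would need an additional lookup to tell assigned characters from unassigned ones.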

Shavian[1]
Official Unicode Consortium code chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+1045x	𐑐	𐑑	𐑒	𐑓	𐑔	𐑕	𐑖	𐑗	𐑘	𐑙	𐑚	𐑛	𐑜	𐑝	𐑞	𐑟
U+1046x	𐑠	𐑡	𐑢	𐑣	𐑤	𐑥	𐑦	𐑧	𐑨	𐑩	𐑪	𐑫	𐑬	𐑭	𐑮	𐑯
U+1047x	𐑰	𐑱	𐑲	𐑳	𐑴	𐑵	𐑶	𐑷	𐑸	𐑹	𐑺	𐑻	𐑼	𐑽	𐑾	𐑿
Notes
1. ^ As of Unicode version 15.1
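The grid above follows directly from the block range: three rows of sixteen consecutive code points each. Because the block lies in the SMP, each character is outside the Basic Multilingual Plane, so it takes four bytes in UTF-8 and a surrogate pair in UTF-16. The sketch below illustrates both points under those assumptions; chart_rows and utf16_surrogates are hypothetical names chosen for this example.

```python
START, END = 0x10450, 0x1047F  # Shavian block, inclusive


def chart_rows():
    """Rebuild the code chart as (row label, 16 characters) pairs."""
    rows = []
    for base in range(START, END + 1, 16):
        label = f"U+{base >> 4:X}x"  # e.g. U+1045x
        cells = [chr(base + i) for i in range(16)]
        rows.append((label, cells))
    return rows


def utf16_surrogates(cp: int):
    """Split a supplementary-plane code point into its UTF-16 surrogate pair."""
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


if __name__ == "__main__":
    for label, cells in chart_rows():
        print(label, " ".join(cells))
    first = chr(START)
    print(first.encode("utf-8").hex(" "))             # UTF-8 bytes of U+10450
    print([hex(u) for u in utf16_surrogates(START)])  # UTF-16 code units
```

Whether the letters display correctly depends on an installed font that covers the Shavian block; the code points and their encodings are the same either way.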

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Shavian block:

Version	Final code points[a]	Count	L2 ID	WG2 ID	Document
4.0	U+10450..1047F	48	L2/97-103		Jenkins, John H. (1997-05-21), Proposal to add Shavian to ISO/IEC 10646
				N1576	Proposal to add Shavian, 1997-05-21
			L2/97-288	N1603	Umamaheswaran, V. S. (1997-10-24), "8.24.2", Unconfirmed Meeting Minutes, WG 2 Meeting # 33, Heraklion, Crete, Greece, 20 June – 4 July 1997
			L2/01-256	N2362	Everson, Michael; Jenkins, John (2001-06-03), Proposal for encoding the Shavian script in the SMP of the UCS
			L2/01-285	N2362R	Everson, Michael; Jenkins, John (2001-07-14), Revised proposal for encoding the Shavian script in the SMP of the UCS
			L2/01-295R		Moore, Lisa (2001-11-06), "Motion 88-M7", Minutes from the UTC/L2 meeting #88, The UTC approves encoding the Shavian script at 10450..1047F.
↑ Proposed code points and character names may differ from the final code points and names.

References

[1] "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
[2] "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
