Unicode alias names and abbreviations

Last updated September 12, 2024

In Unicode, each character can have a unique name. A character can also have one or more alias names. An alias name can be an abbreviation, a C0 or C1 control name, a correction, an alternate name, or a figment. Like a name, an alias is unique across all names and aliases, so it identifies exactly one character.

Background

The formal, primary Unicode name is unique among all names, uses only a restricted character set and format, and is guaranteed never to change. It consists of the characters A–Z (uppercase), 0–9, " " (space), and "-" (hyphen). In addition to this name, a character can have one or more formal (normative) alias names. An alias follows the same rules as a name: it uses only A–Z, 0–9, space, and hyphen, and never characters such as a–z, %, or $. Alias names are also unique within the full name set; that is, no two entries in the combined set of names and aliases are the same. Alias names are formally described in the Unicode Standard.^[1]^[2] In this sense, an abbreviation is also considered a Unicode name.
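The character-set rule above can be checked mechanically. The following is a minimal sketch, not a full validator: it tests only the permitted characters, not the other formatting rules of the Standard, and the helper name `uses_name_charset` is hypothetical.

```python
import re

# Characters permitted in a formal name or alias, per the rule above:
# uppercase A-Z, digits 0-9, space, and hyphen.
NAME_CHARSET = re.compile(r"[A-Z0-9 -]+")


def uses_name_charset(candidate: str) -> bool:
    """Return True if candidate uses only characters allowed in a name or alias."""
    return NAME_CHARSET.fullmatch(candidate) is not None


print(uses_name_charset("NO-BREAK SPACE"))  # True
print(uses_name_charset("No-Break Space"))  # False: lowercase letters
print(uses_name_charset("<control-0008>"))  # False: a label, not a name
```

The last call illustrates why a label such as <control-0008> cannot be mistaken for a name: it contains characters that names and aliases never use.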

Reason to add an alias

There are five possible reasons to assign an alias name to a code point.^[1] A character can have multiple aliases: for example, U+0008 <control-0008> has the control alias BACKSPACE and the abbreviation alias BS.

1. Abbreviation: Commonly occurring abbreviations (or acronyms) for control codes, format characters, spaces, and variation selectors. There are 354 such aliases, including 256 aliases for variation selectors (VS1 ... VS256). For example, U+00A0 NO-BREAK SPACE has the alias NBSP. Presentation: in the code charts, the abbreviation is shown in a dashed box.
2. Control: ISO 6429 names for C0 and C1 control functions, and similar commonly occurring names, are added as aliases to the character. There are 84 such aliases. For example, U+0008 <control-0008> has the alias BACKSPACE. Presentation: Control characters do not have a primary name; they are labeled like <control-0008>. An alias such as BACKSPACE is used in the chart documentation, but never as a primary name. This prevents unintended (automated) replacement of the name by the actual, disruptive control character. For example, the alias name BEL used inline could be replaced by U+0007 <control-0007>, triggering the bell sound.
3. Correction: This is a correction for a "serious problem" in the primary character name, usually an error. There are 35 such aliases. For example, U+2118 ℘ SCRIPT CAPITAL P is actually a lowercase p, and so is given the alias name ※ WEIERSTRASS ELLIPTIC FUNCTION: "actually this has the form of a lowercase calligraphic p, despite its name, and through the alias the correct spelling is added." Presentation: A corrected name is preceded by the symbol ※ (the reference mark).
4. Alternate: A widely used alternate name for a character. There is 1 such alias. For example, U+FEFF ZERO WIDTH NO-BREAK SPACE has the alternate name BYTE ORDER MARK. Presentation: listed in the character chart descriptions.
5. Figment: Several documented labels for C1 control code points that were never actually approved in any standard (figment = feigned, in fiction). There are 3 such aliases. For example, U+0099 <control-0099> has the figment alias SINGLE GRAPHIC CHARACTER INTRODUCER. This name is an architectural concept from early drafts of ISO/IEC 10646-1, but it was never approved and standardized. Presentation: These figment abbreviations are not published in the Standard; the chart shows "XXX" for each informally, which is not a unique or identifying abbreviation.
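As an illustration of how aliases identify characters in practice, the sketch below uses Python's standard `unicodedata` module, whose `lookup()` function has accepted name aliases since Python 3.3. Which aliases resolve depends on the Unicode data bundled with the interpreter, so only long-established aliases are used here.

```python
import unicodedata

# Control characters have no primary name, so name() returns the default.
print(unicodedata.name("\u0008", None))  # None

# lookup() accepts formal aliases as well as primary names.
backspace = unicodedata.lookup("BACKSPACE")       # control alias
bom = unicodedata.lookup("BYTE ORDER MARK")       # alternate alias
print(f"U+{ord(backspace):04X}")  # U+0008
print(f"U+{ord(bom):04X}")        # U+FEFF

# name() reports the primary name, never an alias.
print(unicodedata.name(bom))  # ZERO WIDTH NO-BREAK SPACE
```

Resolution is one-way: an alias maps to exactly one code point, but asking for a character's name always yields its primary name (or nothing, for control characters).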

List of aliases

Code point	HTML decimal	Name or <label>	Alias abbreviation	Alias name	Reason	Chart	Note
U+0000	``	<control-0000>	NUL	NULL	Control	C0 Controls and Basic Latin (pdf)
U+0001	``	<control-0001>	SOH	START OF HEADING	Control	C0 Controls and Basic Latin (pdf)
U+0002	``	<control-0002>	STX	START OF TEXT	Control	C0 Controls and Basic Latin (pdf)
U+0003	``	<control-0003>	ETX	END OF TEXT	Control	C0 Controls and Basic Latin (pdf)
U+0004	``	<control-0004>	EOT	END OF TRANSMISSION	Control	C0 Controls and Basic Latin (pdf)
U+0005	``	<control-0005>	ENQ	ENQUIRY	Control	C0 Controls and Basic Latin (pdf)
U+0006	``	<control-0006>	ACK	ACKNOWLEDGE	Control	C0 Controls and Basic Latin (pdf)
U+0007	``	<control-0007>	BEL	ALERT	Control	C0 Controls and Basic Latin (pdf)
U+0008	``	<control-0008>	BS	BACKSPACE	Control	C0 Controls and Basic Latin (pdf)
U+0009	`&Tab;` ` `	<control-0009>	TAB	CHARACTER TABULATION	Control	C0 Controls and Basic Latin (pdf)
U+0009	`&Tab;` ` `	<control-0009>	HT	HORIZONTAL TABULATION	Control	C0 Controls and Basic Latin (pdf)
U+000A	` `	<control-000A>	LF	LINE FEED	Control	C0 Controls and Basic Latin (pdf)
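U+000A	` `	<control-000A>	NL	NEW LINE	Control	C0 Controls and Basic Latin (pdf)
U+000A	` `	<control-000A>	EOL	END OF LINE	Control	C0 Controls and Basic Latin (pdf)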
U+000B	``	<control-000B>		LINE TABULATION	Control	C0 Controls and Basic Latin (pdf)
U+000B	``	<control-000B>	VT	VERTICAL TABULATION	Control	C0 Controls and Basic Latin (pdf)
U+000C	``	<control-000C>	FF	FORM FEED	Control	C0 Controls and Basic Latin (pdf)
U+000D	` `	<control-000D>	CR	CARRIAGE RETURN	Control	C0 Controls and Basic Latin (pdf)
U+000E	``	<control-000E>	SO	SHIFT OUT	Control	C0 Controls and Basic Latin (pdf)
U+000E	``	<control-000E>		LOCKING-SHIFT ONE	Control	C0 Controls and Basic Latin (pdf)
U+000F	``	<control-000F>	SI	SHIFT IN	Control	C0 Controls and Basic Latin (pdf)
U+000F	``	<control-000F>		LOCKING-SHIFT ZERO	Control	C0 Controls and Basic Latin (pdf)
U+0010	``	<control-0010>	DLE	DATA LINK ESCAPE	Control	C0 Controls and Basic Latin (pdf)
U+0011	``	<control-0011>	DC1	DEVICE CONTROL ONE	Control	C0 Controls and Basic Latin (pdf)
U+0012	``	<control-0012>	DC2	DEVICE CONTROL TWO	Control	C0 Controls and Basic Latin (pdf)
U+0013	``	<control-0013>	DC3	DEVICE CONTROL THREE	Control	C0 Controls and Basic Latin (pdf)
U+0014	``	<control-0014>	DC4	DEVICE CONTROL FOUR	Control	C0 Controls and Basic Latin (pdf)
U+0015	``	<control-0015>	NAK	NEGATIVE ACKNOWLEDGE	Control	C0 Controls and Basic Latin (pdf)
U+0016	``	<control-0016>	SYN	SYNCHRONOUS IDLE	Control	C0 Controls and Basic Latin (pdf)
U+0017	``	<control-0017>	ETB	END OF TRANSMISSION BLOCK	Control	C0 Controls and Basic Latin (pdf)
U+0018	``	<control-0018>	CAN	CANCEL	Control	C0 Controls and Basic Latin (pdf)
U+0019	``	<control-0019>	EOM	END OF MEDIUM	Control	C0 Controls and Basic Latin (pdf)
U+0019	``	<control-0019>	EM		Abbreviation	C0 Controls and Basic Latin (pdf)	added in version 15.0
U+001A	``	<control-001A>	SUB	SUBSTITUTE	Control	C0 Controls and Basic Latin (pdf)
U+001B	``	<control-001B>	ESC	ESCAPE	Control	C0 Controls and Basic Latin (pdf)
U+001C	``	<control-001C>		INFORMATION SEPARATOR FOUR	Control	C0 Controls and Basic Latin (pdf)
U+001C	``	<control-001C>	FS	FILE SEPARATOR	Control	C0 Controls and Basic Latin (pdf)
U+001D	``	<control-001D>		INFORMATION SEPARATOR THREE	Control	C0 Controls and Basic Latin (pdf)
U+001D	``	<control-001D>	GS	GROUP SEPARATOR	Control	C0 Controls and Basic Latin (pdf)
U+001E	``	<control-001E>		INFORMATION SEPARATOR TWO	Control	C0 Controls and Basic Latin (pdf)
U+001E	``	<control-001E>	RS	RECORD SEPARATOR	Control	C0 Controls and Basic Latin (pdf)
U+001F	``	<control-001F>		INFORMATION SEPARATOR ONE	Control	C0 Controls and Basic Latin (pdf)
U+001F	``	<control-001F>	US	UNIT SEPARATOR	Control	C0 Controls and Basic Latin (pdf)
U+0020	` `	SPACE	SP		Abbreviation	C0 Controls and Basic Latin (pdf)
U+007F	``	<control-007F>	DEL	DELETE	Control	C0 Controls and Basic Latin (pdf)
U+0080	``	<control-0080>	PAD	PADDING CHARACTER	Figment	C1 Controls and Latin-1 Supplement (pdf)	Aliases are not widely published by Unicode; chart shows non-unique XXX
U+0081	``	<control-0081>	HOP	HIGH OCTET PRESET	Figment	C1 Controls and Latin-1 Supplement (pdf)	Aliases are not widely published by Unicode; chart shows non-unique XXX
U+0082	``	<control-0082>	BPH	BREAK PERMITTED HERE	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0083	``	<control-0083>	NBH	NO BREAK HERE	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0084	``	<control-0084>	IND	INDEX	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0085	``	<control-0085>	NEL	NEXT LINE	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0086	``	<control-0086>	SSA	START OF SELECTED AREA	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0087	``	<control-0087>	ESA	END OF SELECTED AREA	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0088	``	<control-0088>		CHARACTER TABULATION SET	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0088	``	<control-0088>	HTS	HORIZONTAL TABULATION SET	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0089	``	<control-0089>		CHARACTER TABULATION WITH JUSTIFICATION	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0089	``	<control-0089>	HTJ	HORIZONTAL TABULATION WITH JUSTIFICATION	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008A	``	<control-008A>		LINE TABULATION SET	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008A	``	<control-008A>	VTS	VERTICAL TABULATION SET	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008B	``	<control-008B>		PARTIAL LINE FORWARD	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008B	``	<control-008B>	PLD	PARTIAL LINE DOWN	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008C	``	<control-008C>		PARTIAL LINE BACKWARD	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008C	``	<control-008C>	PLU	PARTIAL LINE UP	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008D	``	<control-008D>		REVERSE LINE FEED	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008D	``	<control-008D>	RI	REVERSE INDEX	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008E	``	<control-008E>		SINGLE SHIFT TWO	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008E	``	<control-008E>	SS2	SINGLE-SHIFT-2	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008F	``	<control-008F>		SINGLE SHIFT THREE	Control	C1 Controls and Latin-1 Supplement (pdf)
U+008F	``	<control-008F>	SS3	SINGLE-SHIFT-3	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0090	``	<control-0090>	DCS	DEVICE CONTROL STRING	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0091	``	<control-0091>		PRIVATE USE ONE	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0091	``	<control-0091>	PU1	PRIVATE USE-1	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0092	``	<control-0092>		PRIVATE USE TWO	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0092	``	<control-0092>	PU2	PRIVATE USE-2	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0093	``	<control-0093>	STS	SET TRANSMIT STATE	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0094	``	<control-0094>	CCH	CANCEL CHARACTER	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0095	``	<control-0095>	MW	MESSAGE WAITING	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0096	``	<control-0096>		START OF GUARDED AREA	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0096	``	<control-0096>	SPA	START OF PROTECTED AREA	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0097	``	<control-0097>		END OF GUARDED AREA	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0097	``	<control-0097>	EPA	END OF PROTECTED AREA	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0098	``	<control-0098>	SOS	START OF STRING	Control	C1 Controls and Latin-1 Supplement (pdf)
U+0099	``	<control-0099>	SGC	SINGLE GRAPHIC CHARACTER INTRODUCER	Figment	C1 Controls and Latin-1 Supplement (pdf)	Aliases are not widely published by Unicode; chart shows non-unique XXX
U+009A	``	<control-009A>	SCI	SINGLE CHARACTER INTRODUCER	Control	C1 Controls and Latin-1 Supplement (pdf)
U+009B	``	<control-009B>	CSI	CONTROL SEQUENCE INTRODUCER	Control	C1 Controls and Latin-1 Supplement (pdf)
U+009C	``	<control-009C>	ST	STRING TERMINATOR	Control	C1 Controls and Latin-1 Supplement (pdf)
U+009D	``	<control-009D>	OSC	OPERATING SYSTEM COMMAND	Control	C1 Controls and Latin-1 Supplement (pdf)
U+009E	``	<control-009E>	PM	PRIVACY MESSAGE	Control	C1 Controls and Latin-1 Supplement (pdf)
U+009F	``	<control-009F>	APC	APPLICATION PROGRAM COMMAND	Control	C1 Controls and Latin-1 Supplement (pdf)
U+00A0	` &NonBreakingSpace;` ` `	NO-BREAK SPACE	NBSP		Abbreviation	C1 Controls and Latin-1 Supplement (pdf)
U+00AD	`` ``	SOFT HYPHEN	SHY		Abbreviation	C1 Controls and Latin-1 Supplement (pdf)
U+01A2	`Ƣ`	LATIN CAPITAL LETTER OI		LATIN CAPITAL LETTER GHA	※ Correction	Latin Extended-B (pdf)
U+01A3	`ƣ`	LATIN SMALL LETTER OI		LATIN SMALL LETTER GHA	※ Correction	Latin Extended-B (pdf)
U+034F	`͏`	COMBINING GRAPHEME JOINER	CGJ		Abbreviation	Combining Diacritical Marks (pdf)	The name of this character is misleading; it does not actually join graphemes
U+0616	`ؖ`	ARABIC SMALL HIGH LIGATURE ALEF WITH LAM WITH YEH		ARABIC SMALL HIGH LIGATURE ALEF WITH YEH BARREE	※ Correction	Arabic	added in version 15.0
U+061C	`؜`	ARABIC LETTER MARK	ALM		Abbreviation	Arabic (pdf)	See RLM
U+0709	`܉`	SYRIAC SUBLINEAR COLON SKEWED RIGHT		SYRIAC SUBLINEAR COLON SKEWED LEFT	※ Correction	Syriac (pdf)
U+0CDE	`ೞ`	KANNADA LETTER FA		KANNADA LETTER LLLA	※ Correction	Kannada (pdf)
U+0E9D	`ຝ`	LAO LETTER FO TAM		LAO LETTER FO FON	※ Correction	Lao (pdf)
U+0E9F	`ຟ`	LAO LETTER FO SUNG		LAO LETTER FO FAY	※ Correction	Lao (pdf)
U+0EA3	`ຣ`	LAO LETTER LO LING		LAO LETTER RO	※ Correction	Lao (pdf)
U+0EA5	`ລ`	LAO LETTER LO LOOT		LAO LETTER LO	※ Correction	Lao (pdf)
U+0FD0	`࿐`	TIBETAN MARK BSKA- SHOG GI MGO RGYAN		TIBETAN MARK BKA- SHOG GI MGO RGYAN	※ Correction	Tibetan (pdf)
U+11EC	`ᇬ`	HANGUL JONGSEONG IEUNG-KIYEOK		HANGUL JONGSEONG YESIEUNG-KIYEOK	※ Correction	Hangul Jamo (pdf)
U+11ED	`ᇭ`	HANGUL JONGSEONG IEUNG-SSANGKIYEOK		HANGUL JONGSEONG YESIEUNG-SSANGKIYEOK	※ Correction	Hangul Jamo (pdf)
U+11EE	`ᇮ`	HANGUL JONGSEONG SSANGIEUNG		HANGUL JONGSEONG SSANGYESIEUNG	※ Correction	Hangul Jamo (pdf)
U+11EF	`ᇯ`	HANGUL JONGSEONG IEUNG-KHIEUKH		HANGUL JONGSEONG YESIEUNG-KHIEUKH	※ Correction	Hangul Jamo (pdf)
U+180B	`᠋`	MONGOLIAN FREE VARIATION SELECTOR ONE	FVS1		Abbreviation	Mongolian (pdf)
U+180C	`᠌`	MONGOLIAN FREE VARIATION SELECTOR TWO	FVS2		Abbreviation	Mongolian (pdf)
U+180D	`᠍`	MONGOLIAN FREE VARIATION SELECTOR THREE	FVS3		Abbreviation	Mongolian (pdf)
U+180E	`᠎`	MONGOLIAN VOWEL SEPARATOR	MVS		Abbreviation	Mongolian (pdf)
U+180F	`᠏`	MONGOLIAN FREE VARIATION SELECTOR FOUR	FVS4		Abbreviation	Mongolian (pdf)
U+1BBD	`ᮽ`	SUNDANESE LETTER BHA		SUNDANESE LETTER ARCHAIC I	※ Correction	Sundanese (pdf)	added in version 15.0
U+200B	`&NegativeMediumSpace;&NegativeThickSpace;&NegativeThinSpace;&NegativeVeryThinSpace;&ZeroWidthSpace;` ``	ZERO WIDTH SPACE	ZWSP		Abbreviation	General Punctuation (pdf)
U+200C	`&zwnj;` `‌`	ZERO WIDTH NON-JOINER	ZWNJ		Abbreviation	General Punctuation (pdf)
U+200D	`&zwj;` `‍`	ZERO WIDTH JOINER	ZWJ		Abbreviation	General Punctuation (pdf)
U+200E	`&lrm;` `‎`	LEFT-TO-RIGHT MARK	LRM		Abbreviation	General Punctuation (pdf)
U+200F	`&rlm;` `‏`	RIGHT-TO-LEFT MARK	RLM		Abbreviation	General Punctuation (pdf)
U+202A	`‪`	LEFT-TO-RIGHT EMBEDDING	LRE		Abbreviation	General Punctuation (pdf)
U+202B	`‫`	RIGHT-TO-LEFT EMBEDDING	RLE		Abbreviation	General Punctuation (pdf)
U+202C	`‬`	POP DIRECTIONAL FORMATTING	PDF		Abbreviation	General Punctuation (pdf)
U+202D	`‭`	LEFT-TO-RIGHT OVERRIDE	LRO		Abbreviation	General Punctuation (pdf)
U+202E	`‮`	RIGHT-TO-LEFT OVERRIDE	RLO		Abbreviation	General Punctuation (pdf)
U+202F	` `	NARROW NO-BREAK SPACE	NNBSP		Abbreviation	General Punctuation (pdf)
U+205F	` ` ` `	MEDIUM MATHEMATICAL SPACE	MMSP		Abbreviation	General Punctuation (pdf)
U+2060	`&NoBreak;` `⁠`	WORD JOINER	WJ		Abbreviation	General Punctuation (pdf)
U+2066	`⁦`	LEFT-TO-RIGHT ISOLATE	LRI		Abbreviation	General Punctuation (pdf)
U+2067	`⁧`	RIGHT-TO-LEFT ISOLATE	RLI		Abbreviation	General Punctuation (pdf)
U+2068	`⁨`	FIRST STRONG ISOLATE	FSI		Abbreviation	General Punctuation (pdf)
U+2069	`⁩`	POP DIRECTIONAL ISOLATE	PDI		Abbreviation	General Punctuation (pdf)
U+2118	`&weierp;&wp;` `℘`	SCRIPT CAPITAL P		WEIERSTRASS ELLIPTIC FUNCTION	※ Correction	Letterlike Symbols (pdf)
U+2448	`⑈`	OCR DASH		MICR ON US SYMBOL	※ Correction	Optical Character Recognition (pdf)
U+2449	`⑉`	OCR CUSTOMER ACCOUNT NUMBER		MICR DASH SYMBOL	※ Correction	Optical Character Recognition (pdf)
U+2B7A	`⭺`	LEFTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE HORIZONTAL STROKE		LEFTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE VERTICAL STROKE	※ Correction	Miscellaneous Symbols and Arrows (pdf)
U+2B7C	`⭼`	RIGHTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE HORIZONTAL STROKE		RIGHTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE VERTICAL STROKE	※ Correction	Miscellaneous Symbols and Arrows (pdf)
U+A015	`ꀕ`	YI SYLLABLE WU		YI SYLLABLE ITERATION MARK	※ Correction	Yi Syllables (pdf)
U+AA6E	`ꩮ`	MYANMAR LETTER KHAMTI HHA		MYANMAR LETTER KHAMTI LLA	※ Correction	Myanmar Extended-A (pdf)
U+FE00 ... U+FE0F	`︀` ... `️`	VARIATION SELECTOR-1 ... VARIATION SELECTOR-16	VS1 ... VS16		Abbreviation	Variation Selectors (pdf)	16 code points
U+FE18	`︘`	PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET		PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET	※ Correction	Vertical Forms (pdf)
U+FEFF	``	ZERO WIDTH NO-BREAK SPACE	BOM	BYTE ORDER MARK	Alternate	Arabic Presentation Forms-B (pdf)
U+FEFF	``	ZERO WIDTH NO-BREAK SPACE	ZWNBSP		Abbreviation	Arabic Presentation Forms-B (pdf)
U+122D4	`𒋔`	CUNEIFORM SIGN SHIR TENU		CUNEIFORM SIGN NU11 TENU	※ Correction	Cuneiform (pdf)
U+122D5	`𒋕`	CUNEIFORM SIGN SHIR OVER SHIR BUR OVER BUR		CUNEIFORM SIGN NU11 OVER NU11 BUR OVER BUR	※ Correction	Cuneiform (pdf)
U+12327	`𒌧`	CUNEIFORM SIGN UN GUNU		CUNEIFORM SIGN KALAM	※ Correction	Cuneiform (pdf)
U+1680B	`𖠋`	BAMUM LETTER PHASE-A MAEMBGBIEE		BAMUM LETTER PHASE-A MAEMGBIEE	※ Correction	Bamum Supplement (pdf)
U+16E56	`𖹖`	MEDEFAIDRIN CAPITAL LETTER HP		MEDEFAIDRIN CAPITAL LETTER H	※ Correction	Medefaidrin (pdf)
U+16E57	`𖹗`	MEDEFAIDRIN CAPITAL LETTER NY		MEDEFAIDRIN CAPITAL LETTER NG	※ Correction	Medefaidrin (pdf)
U+16E76	`𖹶`	MEDEFAIDRIN SMALL LETTER HP		MEDEFAIDRIN SMALL LETTER H	※ Correction	Medefaidrin (pdf)
U+16E77	`𖹷`	MEDEFAIDRIN SMALL LETTER NY		MEDEFAIDRIN SMALL LETTER NG	※ Correction	Medefaidrin (pdf)
U+1B001	`𛀁`	HIRAGANA LETTER ARCHAIC YE		HENTAIGANA LETTER E-1	※ Correction	Kana Supplement (pdf)
U+1D0C5	`𝃅`	BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS		BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS	※ Correction	Byzantine Musical Symbols (pdf)
U+1E899	`𞢙`	MENDE KIKAKUI SYLLABLE M172 MBOO		MENDE KIKAKUI SYLLABLE M172 MBO	※ Correction	Mende Kikakui (pdf)
U+1E89A	`𞢚`	MENDE KIKAKUI SYLLABLE M174 MBO		MENDE KIKAKUI SYLLABLE M174 MBOO	※ Correction	Mende Kikakui (pdf)
U+E0100 ... U+E01EF	`󠄀` ... `󠇯`	VARIATION SELECTOR-17 ... VARIATION SELECTOR-256	VS17 ... VS256		Abbreviation	Variation Selectors Supplement (pdf)	240 code points

Informal alternative names

The Unicode Standard also uses and publishes alternative names that are not formal and are not listed as normative alias names. These labels may not be unique and may use characters that are not allowed in formal names. They appear in the Unicode code charts; for example, U+070F SYRIAC ABBREVIATION MARK is labeled SAM.^[3]

References

[1] "NameAliases.txt". The Unicode Consortium. 2024-04-24. Retrieved 2024-09-11.
[2] "The Unicode Standard". The Unicode Consortium.
[3] "Unicode 14.0 Character Code Charts: Syriac" (PDF).

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
