Variation Selectors Supplement

Variation Selectors Supplement
Range: U+E0100..U+E01EF (240 code points)
Plane: SSP
Scripts: Inherited
Assigned: 240 code points
Unused: 0 reserved code points
Unicode version history
4.0 (2003): 240 (+240)
Unicode documentation
Code chart ∣ Web page
Note: [1] [2]

Variation Selectors Supplement is a Unicode block containing additional variation selectors beyond those found in the Variation Selectors block.

These combining characters are named variation selector-17 (for U+E0100) through to variation selector-256 (U+E01EF), abbreviated VS17 – VS256.
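
The numbering follows directly from the code points: VS17 sits at U+E0100, and each later selector is one code point higher. A minimal Python sketch of that arithmetic follows; the function names `vs_number` and `vs_code_point` are hypothetical and chosen only for illustration.

```python
import unicodedata

VSS_FIRST = 0xE0100  # VS17, first code point of the block
VSS_LAST = 0xE01EF   # VS256, last code point of the block


def vs_number(code_point: int) -> int:
    """Return n such that the given code point is VSn (17..256)."""
    if not VSS_FIRST <= code_point <= VSS_LAST:
        raise ValueError(
            f"U+{code_point:04X} is outside the Variation Selectors Supplement"
        )
    return code_point - VSS_FIRST + 17


def vs_code_point(n: int) -> int:
    """Return the code point of VSn for n in 17..256."""
    if not 17 <= n <= 256:
        raise ValueError("the supplement only covers VS17 through VS256")
    return VSS_FIRST + (n - 17)


print(vs_number(0xE0100))              # 17
print(vs_number(0xE01EF))              # 256
print(f"U+{vs_code_point(48):04X}")    # U+E011F
print(unicodedata.name(chr(0xE0100)))  # VARIATION SELECTOR-17
```

The offset of 17 reflects that VS1 through VS16 live in the separate Variation Selectors block, so the supplement starts counting at 17.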

As of 12 December 2017, VS17 (U+E0100) to VS48 (U+E011F) are used in ideographic variation sequences in the Unicode Ideographic Variation Database (IVD). [3] [4] Sequences that use these selectors are known as Ideographic Variation Sequences (IVS). They are not included in the list of standardized variation sequences; instead, they are registered separately in the IVD.
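
As a rough illustration of how such sequences appear in text, the Python sketch below scans a string for any character immediately followed by a selector in the VS17 to VS48 range. The function name `find_ivs_candidates` is hypothetical, and the sketch makes two simplifying assumptions: it does not check that the base character is an ideograph, and it does not consult the IVD, so it only finds candidates rather than confirming registered sequences.

```python
IVS_FIRST = 0xE0100  # VS17
IVS_LAST = 0xE011F   # VS48, the last selector reported above as used in the IVD


def find_ivs_candidates(text: str) -> list[tuple[str, str]]:
    """Return (base, selector) pairs where a VS17..VS48 selector follows a character."""
    pairs = []
    # Python strings iterate by code point, so adjacent pairs are
    # simply (text[i], text[i + 1]).
    for base, selector in zip(text, text[1:]):
        if IVS_FIRST <= ord(selector) <= IVS_LAST:
            pairs.append((base, selector))
    return pairs


# Illustrative input only: U+845B followed by VS17. Whether a particular
# sequence is actually registered has to be looked up in the IVD itself.
sample = "\u845b\U000E0100 and plain \u845b"
for base, selector in find_ivs_candidates(sample):
    print(f"U+{ord(base):04X} U+{ord(selector):04X}")  # U+845B U+E0100
```

A renderer without a matching glyph is expected to fall back to the base character's default form, which is why the selector can usually be ignored safely when it is not supported.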


Variation Selectors Supplement [1]
Official Unicode Consortium code chart (PDF)
         0     1     2     3     4     5     6     7     8     9     A     B     C     D     E     F
U+E010x  VS17  VS18  VS19  VS20  VS21  VS22  VS23  VS24  VS25  VS26  VS27  VS28  VS29  VS30  VS31  VS32
U+E011x  VS33  VS34  VS35  VS36  VS37  VS38  VS39  VS40  VS41  VS42  VS43  VS44  VS45  VS46  VS47  VS48
U+E012x  VS49  VS50  VS51  VS52  VS53  VS54  VS55  VS56  VS57  VS58  VS59  VS60  VS61  VS62  VS63  VS64
U+E013x  VS65  VS66  VS67  VS68  VS69  VS70  VS71  VS72  VS73  VS74  VS75  VS76  VS77  VS78  VS79  VS80
U+E014x  VS81  VS82  VS83  VS84  VS85  VS86  VS87  VS88  VS89  VS90  VS91  VS92  VS93  VS94  VS95  VS96
U+E015x  VS97  VS98  VS99  VS100 VS101 VS102 VS103 VS104 VS105 VS106 VS107 VS108 VS109 VS110 VS111 VS112
U+E016x  VS113 VS114 VS115 VS116 VS117 VS118 VS119 VS120 VS121 VS122 VS123 VS124 VS125 VS126 VS127 VS128
U+E017x  VS129 VS130 VS131 VS132 VS133 VS134 VS135 VS136 VS137 VS138 VS139 VS140 VS141 VS142 VS143 VS144
U+E018x  VS145 VS146 VS147 VS148 VS149 VS150 VS151 VS152 VS153 VS154 VS155 VS156 VS157 VS158 VS159 VS160
U+E019x  VS161 VS162 VS163 VS164 VS165 VS166 VS167 VS168 VS169 VS170 VS171 VS172 VS173 VS174 VS175 VS176
U+E01Ax  VS177 VS178 VS179 VS180 VS181 VS182 VS183 VS184 VS185 VS186 VS187 VS188 VS189 VS190 VS191 VS192
U+E01Bx  VS193 VS194 VS195 VS196 VS197 VS198 VS199 VS200 VS201 VS202 VS203 VS204 VS205 VS206 VS207 VS208
U+E01Cx  VS209 VS210 VS211 VS212 VS213 VS214 VS215 VS216 VS217 VS218 VS219 VS220 VS221 VS222 VS223 VS224
U+E01Dx  VS225 VS226 VS227 VS228 VS229 VS230 VS231 VS232 VS233 VS234 VS235 VS236 VS237 VS238 VS239 VS240
U+E01Ex  VS241 VS242 VS243 VS244 VS245 VS246 VS247 VS248 VS249 VS250 VS251 VS252 VS253 VS254 VS255 VS256

Notes
1. As of Unicode version 15.1

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Variation Selectors Supplement block:

Related Research Articles

Han unification is an effort by the authors of Unicode and the Universal Character Set to map multiple character sets of the Han characters of the so-called CJK languages into a single set of unified characters. Han characters are shared by written Chinese (hanzi), Japanese (kanji), Korean (hanja) and Vietnamese.

Supplemental Mathematical Operators is a Unicode block containing various mathematical symbols, including N-ary operators, summations and integrals, intersections and unions, logical and relational operators, and subset/superset relations.

The Chinese, Japanese and Korean (CJK) scripts share a common background in a set of characters collectively known as CJK characters. During the process called Han unification, the common (shared) characters were identified and named CJK Unified Ideographs. As of Unicode 15.1, Unicode defines a total of 97,680 CJK unified ideographs.

Mathematical Operators is a Unicode block containing characters for mathematical, logical, and set notation.

CJK Unified Ideographs Extension-A is a Unicode block containing rare Han ideographs submitted to the Ideographic Research Group between 1992 and 1998, plus ten ideographs added in Unicode 13.0 which had previously been mistakenly unified with others.

CJK Symbols and Punctuation is a Unicode block containing symbols and punctuation used for writing the Chinese, Japanese and Korean languages. It also contains one Chinese character.

Mongolian is a Unicode block containing characters for dialects of the Mongolian, Manchu, and Sibe languages. The script is traditionally written in vertical lines from top to bottom, with lines progressing from left to right across the page, although the Unicode code charts show the characters rotated to a horizontal orientation, as this is the orientation of glyphs in a font that supports vertical layout.

A variant form is an alternate glyph for a character, encoded in Unicode through the mechanism of variation sequences: sequences in Unicode that consist of a base character followed by a variation selector character.

CJK Unified Ideographs is a Unicode block containing the most common CJK ideographs used in modern Chinese, Japanese, Korean and Vietnamese writing. When contrasted with other blocks containing CJK Unified Ideographs, it is also referred to as the Unified Repertoire and Ordering (URO).

CJK Unified Ideographs Extension B is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese submitted to the Ideographic Research Group between 1998 and 2000, plus seven gongche characters for kunqu added in Unicode 13.0, and two characters for the Macao Supplementary Character Set added in Unicode 14.0.

CJK Unified Ideographs Extension C is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese submitted to the Ideographic Research Group between 2002 and 2006, plus five "urgently needed" characters added in Unicode versions 14.0 and 15.0, some of which had previously been mistakenly unified with other characters.

CJK Unified Ideographs Extension D is a Unicode block containing uncommon CJK ideographs for Chinese, Japanese, Korean, and Vietnamese, some of which are in current use. Much smaller than most Unicode blocks for CJK unified ideographs, Extension D consists of characters which were submitted to the Ideographic Research Group as "urgently needed characters" between 2006 and 2009. Characters submitted during the same period which were needed less urgently were included in CJK Unified Ideographs Extension E instead.

CJK Compatibility Ideographs is a Unicode block created to contain mostly Han characters that were encoded in multiple locations in other established character encodings, in addition to their CJK Unified Ideographs assignments, in order to retain round-trip compatibility between Unicode and those encodings. However, it also contains 12 unified ideographs sourced from Japanese character sets from IBM.

General Punctuation is a Unicode block containing punctuation, spacing, and formatting characters for use with all scripts and writing systems. Included are the defined-width spaces, joining formats, directional formats, smart quotes, archaic and novel punctuation such as the interrobang, and invisible mathematical operators.

Enclosed Ideographic Supplement is a Unicode block containing forms of characters and words from Chinese, Japanese and Korean enclosed within or stylised as squares, brackets, or circles. It contains three such characters containing one or more kana, and many containing CJK ideographs. Many of its characters were added for compatibility with the Japanese ARIB STD-B24 standard. Six symbols from Chinese folk religion were added in Unicode version 10.

Emoticons is a Unicode block containing emoticons or emoji. Most of them are intended as representations of faces, although some of them include hand gestures or non-human characters.

Halfwidth and Fullwidth Forms is the name of a Unicode block U+FF00–FFEF, provided so that older encodings containing both halfwidth and fullwidth characters can have lossless translation to/from Unicode. It is the second-to-last block of the Basic Multilingual Plane, followed only by the short Specials block at U+FFF0–FFFF. Its block name in Unicode 1.0 was Halfwidth and Fullwidth Variants.

Variation Selectors is a Unicode block containing 16 variation selectors used to specify a glyph variant for a preceding character. They are currently used to specify standardized variation sequences for mathematical symbols, emoji symbols, 'Phags-pa letters, and CJK unified ideographs corresponding to CJK compatibility ideographs. At present only standardized variation sequences with VS1, VS2, VS3, VS15 and VS16 have been defined; VS15 and VS16 are reserved to request that a character should be displayed as text or as an emoji respectively.

CJK Unified Ideographs Extension E is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese submitted to the Ideographic Research Group between 2006 and 2013, excluding the characters submitted as "urgently needed" between 2006 and 2009, which were included in CJK Unified Ideographs Extension D.

CJK Unified Ideographs Extension F is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese, as well as more than a thousand Sawndip characters for writing the Zhuang language, which were submitted to the Ideographic Research Group between 2012 and 2015.

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. "Ideographic Variation Database". Unicode Consortium.
  4. "UTS #37, Unicode Ideographic Variation Database". Unicode Consortium.