Variant form (Unicode)

Last updated January 08, 2026

A variant form is an alternate glyph for a character, encoded in Unicode through variation sequences. A variation sequence consists of a base character followed by a variation selector character.
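The "base character followed by a variation selector" structure can be illustrated with a minimal Python sketch. The helper names `is_variation_selector` and `split_variation_sequences` are hypothetical, and the sketch assumes only the two general-purpose selector ranges (U+FE00 to U+FE0F and U+E0100 to U+E01EF), leaving out the Mongolian free variation selectors. It checks only the shape of a sequence, not whether that sequence is actually defined by Unicode.

```python
import unicodedata

# Assumed selector ranges for this sketch: VS1-VS16 and VS17-VS256.
# The Mongolian free variation selectors are deliberately left out.
VS_RANGES = ((0xFE00, 0xFE0F), (0xE0100, 0xE01EF))


def is_variation_selector(ch: str) -> bool:
    """Return True if ch falls in one of the assumed selector ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in VS_RANGES)


def split_variation_sequences(text: str) -> list[str]:
    """Group each variation selector with the character that precedes it."""
    units: list[str] = []
    for ch in text:
        follows_base = bool(units) and not is_variation_selector(units[-1][-1])
        if is_variation_selector(ch) and follows_base:
            units[-1] += ch  # attach the selector to its base character
        else:
            units.append(ch)
    return units


# U+2764 HEAVY BLACK HEART with a text-style and an emoji-style selector.
heart_text = "\u2764\uFE0E"
heart_emoji = "\u2764\uFE0F"

for unit in split_variation_sequences("a" + heart_emoji + "b"):
    print([unicodedata.name(ch, "<unnamed>") for ch in unit])
```

Both heart strings contain two code points. Whether they render differently depends on the font and the rendering engine.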

Variation selectors are not required for Arabic and Latin cursive characters, where glyphs can be substituted based on context: glyphs may be connected together depending on whether the character is the initial character in a word, the final character, a medial character or an isolated character. A renderer can resolve these substitutions from the context of the character alone, with no other authoring input involved. Authors may also use special-purpose characters such as joiners and non-joiners to force an alternate form of glyph where it would not otherwise appear. Ligatures are a similar case: glyphs may be substituted simply by turning ligatures on or off as a rich text attribute.
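As a rough illustration of joiners and non-joiners, the following Python sketch places U+200C ZERO WIDTH NON-JOINER between two Arabic letters. The helper names `describe` and `strip_join_controls` are hypothetical. The code only inspects code points; the visible effect on joining depends entirely on the rendering engine and font.

```python
import unicodedata

ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def describe(text: str) -> list[str]:
    """List each code point with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


def strip_join_controls(text: str) -> str:
    """Remove ZWJ and ZWNJ, leaving joining to context alone."""
    return text.replace(ZWNJ, "").replace(ZWJ, "")


# Two ARABIC LETTER BEH in a row: a renderer would normally join them.
joined = "\u0628\u0628"
# The same letters with ZWNJ between them: the author requests no join.
separated = "\u0628" + ZWNJ + "\u0628"

for line in describe(separated):
    print(line)
```

The author's intent is stored in the text itself as an extra format character, so stripping that character returns the text to purely contextual shaping.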

For other glyph substitutions, the author's intent cannot be determined contextually and may need to be encoded with the text. This is the case with the characters or glyphs referred to as gaiji, where different glyphs are used for the same character, either historically or in ideographs for family names. This is one of the gray areas in distinguishing between a glyph and a character: if the ideograph in a family name differs slightly from the character it derives from, is that a simple glyph variant or a character variant?

Character substitutions may also occur outside of Unicode, for example with OpenType Layout tags.^[4]

Blocks with standardized variation sequences

As of Unicode version 17.0, standardized variation sequences specifically for emoji/text presentation are defined for base characters in 20 blocks.^[1]

Other standardized variation sequences are formed with base characters in sixteen blocks.^[1]
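Standardized variation sequences are published as semicolon-separated data lines. The Python sketch below parses lines of that general shape. The two sample lines are illustrative and modeled on that layout rather than quoted from a specific file version, and the names `VariationEntry` and `parse_variation_lines` are hypothetical. The layout (sequence, then description, then an optional `#` comment) is an assumption that should be checked against the actual data file.

```python
from typing import NamedTuple


class VariationEntry(NamedTuple):
    base: int         # code point of the base character
    selector: int     # code point of the variation selector
    description: str  # free-text description of the intended appearance


# Illustrative sample lines; the layout is an assumption of this sketch.
SAMPLE = """\
# base selector; description; # character name
0030 FE00; short diagonal stroke form; # DIGIT ZERO
2205 FE00; zero with long diagonal stroke overlay form; # EMPTY SET
"""


def parse_variation_lines(text: str) -> list[VariationEntry]:
    """Parse 'BASE SELECTOR; description; # comment' style lines."""
    entries: list[VariationEntry] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue  # skip blank and comment-only lines
        fields = [field.strip() for field in line.split(";")]
        base_hex, selector_hex = fields[0].split()
        entries.append(
            VariationEntry(int(base_hex, 16), int(selector_hex, 16), fields[1])
        )
    return entries


for entry in parse_variation_lines(SAMPLE):
    print(f"U+{entry.base:04X} U+{entry.selector:04X}: {entry.description}")
```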

Blocks with ideographic variation sequences

As of 14 July 2025, ideographic variation sequences are defined for base characters in eleven blocks.^[2]^[3]
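Ideographic variation sequences pair a base ideograph with one of the supplementary variation selectors, VARIATION SELECTOR-17 through VARIATION SELECTOR-256 (U+E0100 to U+E01EF). The Python sketch below checks only that shape. The function names are hypothetical, the base character is not validated, and a sequence that passes the check is not necessarily registered in the Ideographic Variation Database.

```python
IVS_FIRST = 0xE0100  # VARIATION SELECTOR-17
IVS_LAST = 0xE01EF   # VARIATION SELECTOR-256


def looks_like_ivs(seq: str) -> bool:
    """True if seq is two code points and the second is in U+E0100..U+E01EF.

    This is a shape check only: it does not validate the base character
    or confirm that the sequence is registered.
    """
    return len(seq) == 2 and IVS_FIRST <= ord(seq[1]) <= IVS_LAST


def selector_number(seq: str) -> int:
    """Return N for VARIATION SELECTOR-N, assuming looks_like_ivs(seq)."""
    return ord(seq[1]) - IVS_FIRST + 17


# Example shape: ideograph U+845B followed by VARIATION SELECTOR-17.
candidate = "\u845B\U000E0100"
print(looks_like_ivs(candidate), selector_number(candidate))
```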

References

[1] "UCD: Standardized Variation Sequences". Unicode Consortium.
[2] "Ideographic Variation Database". Unicode Consortium.
[3] "UTS #37, Unicode Ideographic Variation Database". Unicode Consortium.
[4] "Language system tags". Microsoft. 30 September 2022.

