The zero-width space (HTML entity: &#8203; or &ZeroWidthSpace;), abbreviated ZWSP, is a non-printing character used in computerized typesetting to indicate word boundaries without displaying a visible space in the rendered text. This lets text-processing systems handle line breaks appropriately in scripts that do not use explicit spacing.
The zero-width space is Unicode character U+200B, and is located in the Unicode General Punctuation block. In HTML, it can be represented by the character entity reference &ZeroWidthSpace;.
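As an illustration, the properties stated above can be inspected with Python's standard `unicodedata` module. This is a minimal sketch; the helper name `describe` is hypothetical, and the reported category depends on the Unicode database bundled with the interpreter (recent versions classify U+200B as a format character, `Cf`).

```python
import unicodedata

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

def describe(ch: str) -> dict:
    """Return a few Unicode properties of a single character (hypothetical helper)."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "utf8": ch.encode("utf-8").hex(),
        # The General Punctuation block spans U+2000 to U+206F.
        "in_general_punctuation": 0x2000 <= ord(ch) <= 0x206F,
    }

info = describe(ZWSP)
print(info)
```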
The zero-width space marks a potential line break without hyphenation. Its semantics and HTML implementation are similar to those of the soft hyphen, except that a soft hyphen displays a hyphen character at the point where the line is broken.
The zero-width space can be used to mark word breaks in languages without visible space between words, such as Thai, Myanmar, Khmer, and Japanese. [1]
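A minimal sketch of this use, assuming the text has already been segmented into words by some other means (the function names and the two-word Thai sample are illustrative assumptions, not part of any standard API):

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

def mark_word_breaks(words):
    """Join pre-segmented words with ZWSP so a renderer may break lines between them."""
    return ZWSP.join(words)

def remove_word_breaks(text):
    """Strip ZWSP again, e.g. before searching or comparing strings."""
    return text.replace(ZWSP, "")

# Hypothetical segmentation of a short Thai phrase into two words.
words = ["ภาษา", "ไทย"]
marked = mark_word_breaks(words)

print(len("".join(words)), len(marked))  # the marked string is one code point longer
```

The marked string looks identical to the unmarked one when rendered, but it compares unequal to it, which is why ZWSP is often removed before string matching.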
In justified text, the rendering engine may add inter-character spacing, also known as letter spacing, between letters separated by a zero-width space, which it does not do around fixed-width spaces. [1]
To show the effect of the zero-width space in text, the following words have been separated with zero-width spaces:
LoremIpsumDolorSitAmetConsecteturAdipiscingElitSedDoEiusmodTemporIncididuntUtLaboreEtDoloreMagnaAliquaUtEnimAdMinimVeniamQuisNostrudExercitationUllamcoLaborisNisiUtAliquipExEaCommodoConsequatDuisAuteIrureDolorInReprehenderitInVoluptateVelitEsseCillumDoloreEuFugiatNullaPariaturExcepteurSintOccaecatCupidatatNonProidentSuntInCulpaQuiOfficiaDeseruntMollitAnimIdEstLaborum
By contrast, the following words have not been separated:
LoremIpsumDolorSitAmetConsecteturAdipiscingElitSedDoEiusmodTemporIncididuntUtLaboreEtDoloreMagnaAliquaUtEnimAdMinimVeniamQuisNostrudExercitationUllamcoLaborisNisiUtAliquipExEaCommodoConsequatDuisAuteIrureDolorInReprehenderitInVoluptateVelitEsseCillumDoloreEuFugiatNullaPariaturExcepteurSintOccaecatCupidatatNonProidentSuntInCulpaQuiOfficiaDeseruntMollitAnimIdEstLaborum
The first text is broken into lines but only at word boundaries, and resizing the browser window will re-break the text accordingly, while the second text is not broken at all.
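A sample like the first one could be produced programmatically. The sketch below assumes that word boundaries coincide with a lowercase letter followed by an uppercase letter, which holds for the sample text but not for text in general; the function name is hypothetical.

```python
import re

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

# Zero-width match at every position between a lowercase and an uppercase ASCII letter.
CAMEL_BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])")

def insert_break_opportunities(text: str) -> str:
    """Insert ZWSP at each lowercase-to-uppercase boundary."""
    return CAMEL_BOUNDARY.sub(ZWSP, text)

sample = insert_break_opportunities("LoremIpsumDolorSitAmet")
print(sample.split(ZWSP))
```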
In HTML pages, the HTML element <wbr> functions as a zero-width space. In Internet Explorer 6, the zero-width space was not supported in some fonts. [2]
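Because the <wbr> element plays the same role in HTML, one possible workaround for fonts or renderers that lack the character is to emit the element instead. The following is only a sketch under that assumption; `zwsp_to_wbr` is a hypothetical helper, not an established library function.

```python
import html

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

def zwsp_to_wbr(text: str) -> str:
    """Escape text for HTML, then replace each ZWSP with a <wbr> element."""
    # Escape first so the inserted <wbr> tags are not themselves escaped.
    return html.escape(text).replace(ZWSP, "<wbr>")

converted = zwsp_to_wbr("Lorem\u200bIpsum & <Dolor>")
print(converted)
```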
The zero-width space should not be used to prevent automatic conversion of certain character combinations into emojis, because it marks a line break opportunity. To prevent systems from converting sequences like :) into emoji like ☺ or 🙂, the zero-width non-joiner or any other (non-breaking) non-displayed character should be used.[citation needed]
ICANN rules prohibit domain names from containing non-displayed characters, including the zero-width space, and most browsers prohibit their use within domain names because they can be used to create a homograph attack, where a malicious URL is visually indistinguishable from a legitimate one. [3] [4]
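The kind of check this implies can be sketched as follows. This is an illustrative assumption, not the actual rule set applied by ICANN or by any browser: it simply flags every code point whose Unicode general category is `Cf` (format), which includes the zero-width space. The function name is hypothetical.

```python
import unicodedata
from typing import List, Tuple

def find_invisible(hostname: str) -> List[Tuple[int, str]]:
    """Return (index, code point) pairs for format characters in a hostname."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(hostname)
        if unicodedata.category(ch) == "Cf"
    ]

suspicious = find_invisible("exam\u200bple.com")  # renders like "example.com"
clean = find_invisible("example.com")
print(suspicious, clean)
```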
The zero-width space character is encoded in Unicode as U+200B ZERO WIDTH SPACE. [5]
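In HTML, it can be referenced as &#8203;, &#x200B; or &ZeroWidthSpace;. Additionally, the character entities &NegativeMediumSpace;, &NegativeThickSpace;, &NegativeThinSpace;, and &NegativeVeryThinSpace; all also refer to the zero-width space, contrary to what their names suggest. [6]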
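The numeric and named character references for U+200B can be checked with `html.unescape` from Python's standard library, which decodes HTML5 character references. This is a small verification sketch; the list name `REFERENCES` is an assumption made for the example.

```python
import html

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

REFERENCES = [
    "&#8203;",                  # decimal numeric reference
    "&#x200B;",                 # hexadecimal numeric reference
    "&ZeroWidthSpace;",
    "&NegativeMediumSpace;",
    "&NegativeThickSpace;",
    "&NegativeThinSpace;",
    "&NegativeVeryThinSpace;",
]

# Decode each reference and record the resulting string.
decoded = {ref: html.unescape(ref) for ref in REFERENCES}

for ref, ch in decoded.items():
    print(f"{ref:26} -> U+{ord(ch):04X}")
```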
In HTML 'mailto:' tags[clarify], the percent-encoded sequence %E2%80%8B (the UTF-8 bytes of U+200B) renders a zero-width space, but it may interfere with correctly copying the email link.
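The sequence %E2%80%8B is the percent-encoded form of the three UTF-8 bytes of U+200B. A minimal sketch using Python's `urllib.parse` shows the round trip:

```python
from urllib.parse import quote, unquote

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

encoded = quote(ZWSP)       # percent-encode the UTF-8 bytes E2 80 8B
decoded = unquote(encoded)  # decode back to the original character

print(encoded, decoded == ZWSP)
```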
The TeX representation is \hskip0pt; the LaTeX representation is \hspace{0pt}; [7] and the groff representation is \:. [8]