Windows Glyph List 4 (WGL4), also known as the Pan-European character set, is a character repertoire on Microsoft operating systems comprising 657 Unicode characters, two of which are private-use characters. It serves as an implementation guideline for producers of fonts for European natural languages: a font that provides glyphs for the entire set can claim WGL4 compliance and can therefore expect to be compatible with a wide range of software.
As of 2004, WGL4 characters were the only ones guaranteed to display correctly on Microsoft Windows. More recent versions of Windows display far more glyphs.
Because many fonts are designed to fulfill the WGL4 set, this set of characters is likely to work (display as other than replacement glyphs) on many computer systems. For example, all the non-private-use characters in the table below are likely to display properly, compared to the many missing characters that may be seen in other articles about Unicode.
The repertoire, defined by Microsoft, encompasses all the characters found in Windows code pages 1252 (Windows Western), 1250 (Windows Central European), 1251 (Windows Cyrillic), 1253 (Windows Greek), 1254 (Windows Turkish), and 1257 (Windows Baltic), as well as characters from DOS code page 437.
It does not cover the combining diacritics used by Vietnamese-related code page 1258, the Thai letters used in code page 874, Hebrew and Arabic letters covered by code pages 1255 and 1256, or the ideographic characters used by code pages 932, 936, 949 and 950.
It also does not cover the Romanian letters Ș, ș, Ț, and ț (U+0218–U+021B), which were added to several of Microsoft's fonts for Windows Vista (long after the WGL4 repertoire was originally defined).
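The relationship to the listed code pages can be explored with a minimal Python sketch. It assumes that Python's built-in codecs (`cp1252`, `cp437`, and so on) are close enough to Microsoft's definitions for illustration; the helper name `code_page_repertoire` is hypothetical. The resulting union is only a subset of WGL4, not the repertoire itself: Python's `cp437` codec maps bytes below 0x20 to control characters rather than to the graphic symbols of the original IBM PC, and WGL4 contains characters beyond those found in these code pages.

```python
import unicodedata

# Python codec names for the code pages named in the text (an assumption:
# they are taken to match Microsoft's definitions for this illustration).
CODE_PAGES = ["cp1252", "cp1250", "cp1251", "cp1253", "cp1254", "cp1257", "cp437"]


def code_page_repertoire(codec):
    """Return the non-control characters that bytes 0x20-0xFF decode to."""
    chars = set()
    for byte in range(0x20, 0x100):
        # Undefined byte values decode to an empty string with errors="ignore".
        text = bytes([byte]).decode(codec, errors="ignore")
        if text and unicodedata.category(text) != "Cc":
            chars.add(text)
    return chars


# Union of all characters reachable through the listed code pages.
union = set().union(*(code_page_repertoire(cp) for cp in CODE_PAGES))

print(len(union))
print("€" in union, "Ж" in union, "╬" in union)
# The Romanian comma-below letters are not in any of these code pages.
print(any(ch in union for ch in "ȘșȚț"))
```

Comparing `union` against a font's character map would show which legacy code pages the font can serve, though it would not by itself establish WGL4 compliance.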
In version 1.5 of the OpenType Specification (May 2008) four Cyrillic characters were added to the WGL4 character set: Ѐ (U+0400), Ѝ (U+040D), ѐ (U+0450) and ѝ (U+045D). [1] [2] [3]
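As an illustration, the following Python sketch lists the four added characters with their Unicode names and checks whether code page 1251 can encode them. It relies only on the standard library; the helper `in_code_page` is a hypothetical name, and Python's `cp1251` codec is assumed to match Windows code page 1251.

```python
import unicodedata

# Code points of the four Cyrillic characters named in the text.
ADDED_CYRILLIC = [0x0400, 0x040D, 0x0450, 0x045D]


def in_code_page(ch, codec="cp1251"):
    """Return True if the character can be encoded in the given code page."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for cp in ADDED_CYRILLIC:
    ch = chr(cp)
    print(f"U+{cp:04X} {ch} {unicodedata.name(ch)} in cp1251: {in_code_page(ch)}")
```

None of the four can be encoded in code page 1251, which is consistent with their absence from a repertoire originally built around the Windows code pages.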
U+ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | Block |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0020 | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / | C0 Controls and Basic Latin (identical to ASCII printable characters) |
0030 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? | |
0040 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | |
0050 | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ | |
0060 | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | |
0070 | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | | |
00A0 | NBSP | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | « | ¬ | SHY | ® | ¯ | C1 Controls and Latin-1 Supplement (identical to ISO/IEC 8859-1) |
00B0 | ° | ± | ² | ³ | ´ | µ | ¶ | · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ | |
00C0 | À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë | Ì | Í | Î | Ï | |
00D0 | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | × | Ø | Ù | Ú | Û | Ü | Ý | Þ | ß | |
00E0 | à | á | â | ã | ä | å | æ | ç | è | é | ê | ë | ì | í | î | ï | |
00F0 | ð | ñ | ò | ó | ô | õ | ö | ÷ | ø | ù | ú | û | ü | ý | þ | ÿ | |
0100 | Ā | ā | Ă | ă | Ą | ą | Ć | ć | Ĉ | ĉ | Ċ | ċ | Č | č | Ď | ď | Latin Extended-A |
0110 | Đ | đ | Ē | ē | Ĕ | ĕ | Ė | ė | Ę | ę | Ě | ě | Ĝ | ĝ | Ğ | ğ | |
0120 | Ġ | ġ | Ģ | ģ | Ĥ | ĥ | Ħ | ħ | Ĩ | ĩ | Ī | ī | Ĭ | ĭ | Į | į | |
0130 | İ | ı | Ĳ | ĳ | Ĵ | ĵ | Ķ | ķ | ĸ | Ĺ | ĺ | Ļ | ļ | Ľ | ľ | Ŀ | |
0140 | ŀ | Ł | ł | Ń | ń | Ņ | ņ | Ň | ň | ŉ | Ŋ | ŋ | Ō | ō | Ŏ | ŏ | |
0150 | Ő | ő | Œ | œ | Ŕ | ŕ | Ŗ | ŗ | Ř | ř | Ś | ś | Ŝ | ŝ | Ş | ş | |
0160 | Š | š | Ţ | ţ | Ť | ť | Ŧ | ŧ | Ũ | ũ | Ū | ū | Ŭ | ŭ | Ů | ů | |
0170 | Ű | ű | Ų | ų | Ŵ | ŵ | Ŷ | ŷ | Ÿ | Ź | ź | Ż | ż | Ž | ž | ſ | |
0190 | | | ƒ | | | | | | | | | | | | | | Latin Extended-B |
01F0 | | | | | | | | | | | Ǻ | ǻ | Ǽ | ǽ | Ǿ | ǿ | |
02C0 | | | | | | | ˆ | ˇ | | ˉ | | | | | | | Spacing Modifier Letters |
02D0 | | | | | | | | | ˘ | ˙ | ˚ | ˛ | ˜ | ˝ | | | |
0380 | | | | | ΄ | ΅ | Ά | · | Έ | Ή | Ί | | Ό | | Ύ | Ώ | Greek |
0390 | ΐ | Α | Β | Γ | Δ | Ε | Ζ | Η | Θ | Ι | Κ | Λ | Μ | Ν | Ξ | Ο | |
03A0 | Π | Ρ | | Σ | Τ | Υ | Φ | Χ | Ψ | Ω | Ϊ | Ϋ | ά | έ | ή | ί | |
03B0 | ΰ | α | β | γ | δ | ε | ζ | η | θ | ι | κ | λ | μ | ν | ξ | ο | |
03C0 | π | ρ | ς | σ | τ | υ | φ | χ | ψ | ω | ϊ | ϋ | ό | ύ | ώ | | |
0400 | Ѐ | Ё | Ђ | Ѓ | Є | Ѕ | І | Ї | Ј | Љ | Њ | Ћ | Ќ | Ѝ | Ў | Џ | Cyrillic |
0410 | А | Б | В | Г | Д | Е | Ж | З | И | Й | К | Л | М | Н | О | П | |
0420 | Р | С | Т | У | Ф | Х | Ц | Ч | Ш | Щ | Ъ | Ы | Ь | Э | Ю | Я | |
0430 | а | б | в | г | д | е | ж | з | и | й | к | л | м | н | о | п | |
0440 | р | с | т | у | ф | х | ц | ч | ш | щ | ъ | ы | ь | э | ю | я | |
0450 | ѐ | ё | ђ | ѓ | є | ѕ | і | ї | ј | љ | њ | ћ | ќ | ѝ | ў | џ | |
0490 | Ґ | ґ | | | | | | | | | | | | | | | |
1E80 | Ẁ | ẁ | Ẃ | ẃ | Ẅ | ẅ | | | | | | | | | | | Latin Extended Additional |
1EF0 | | | Ỳ | ỳ | | | | | | | | | | | | | |
2010 | | | | – | — | ― | | ‗ | ‘ | ’ | ‚ | ‛ | “ | ” | „ | | General Punctuation |
2020 | † | ‡ | • | | | | … | | | | | | | | | | |
2030 | ‰ | | ′ | ″ | | | | | | ‹ | › | | ‼ | | ‾ | | |
2040 | | | | | ⁄ | | | | | | | | | | | | |
2070 | | | | | | | | | | | | | | | | ⁿ | Super/Subscripts |
20A0 | | | | ₣ | ₤ | | | ₧ | | | | | € | | | | Currency Symbols |
2100 | | | | | | ℅ | | | | | | | | | | | Letterlike symbols |
2110 | | | | ℓ | | | № | | | | | | | | | | |
2120 | | | ™ | | | | Ω | | | | | | | | ℮ | | |
2150 | | | | | | | | | | | | ⅛ | ⅜ | ⅝ | ⅞ | | Number Forms |
2190 | ← | ↑ | → | ↓ | ↔ | ↕ | | | | | | | | | | | Arrows |
21A0 | | | | | | | | | ↨ | | | | | | | | |
2200 | | | ∂ | | | | ∆ | | | | | | | | | ∏ | Mathematical Operators |
2210 | | ∑ | − | | | ∕ | | | | ∙ | √ | | | | ∞ | ∟ | |
2220 | | | | | | | | | | ∩ | | ∫ | | | | | |
2240 | | | | | | | | | ≈ | | | | | | | | |
2260 | ≠ | ≡ | | | ≤ | ≥ | | | | | | | | | | | |
2300 | | | ⌂ | | | | | | | | | | | | | | Miscellaneous Technical |
2310 | ⌐ | | | | | | | | | | | | | | | | |
2320 | ⌠ | ⌡ | | | | | | | | | | | | | | | |
2500 | ─ | | │ | | | | | | | | | | ┌ | | | | Box-drawing characters |
2510 | ┐ | | | | └ | | | | ┘ | | | | ├ | | | | |
2520 | | | | | ┤ | | | | | | | | ┬ | | | | |
2530 | | | | | ┴ | | | | | | | | ┼ | | | | |
2550 | ═ | ║ | ╒ | ╓ | ╔ | ╕ | ╖ | ╗ | ╘ | ╙ | ╚ | ╛ | ╜ | ╝ | ╞ | ╟ | |
2560 | ╠ | ╡ | ╢ | ╣ | ╤ | ╥ | ╦ | ╧ | ╨ | ╩ | ╪ | ╫ | ╬ | | | | |
2580 | ▀ | | | | ▄ | | | | █ | | | | ▌ | | | | Block Elements |
2590 | ▐ | ░ | ▒ | ▓ | | | | | | | | | | | | | |
25A0 | ■ | □ | | | | | | | | | ▪ | ▫ | ▬ | | | | Geometric Shapes |
25B0 | | | ▲ | | | | | | | | ► | | ▼ | | | | |
25C0 | | | | | ◄ | | | | | | ◊ | ○ | | | | ● | |
25D0 | | | | | | | | | ◘ | ◙ | | | | | | | |
25E0 | | | | | | | ◦ | | | | | | | | | | |
2630 | | | | | | | | | | | ☺ | ☻ | ☼ | | | | Miscellaneous Symbols |
2640 | ♀ | | ♂ | | | | | | | | | | | | | | |
2660 | ♠ | | | ♣ | | ♥ | ♦ | | | | ♪ | ♫ | | | | | |
F000 | | fi | fl | | | | | | | | | | | | | | Private Use Area |
FB00 | | ﬁ | ﬂ | | | | | | | | | | | | | | Alphabetic Presentation Forms |
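Each row label in the table gives the code point of column 0, so the code point of any cell is the row label plus the hexadecimal column digit. A minimal Python sketch of that arithmetic follows; the function names `cell_char` and `locate` are illustrative only.

```python
def cell_char(row_label, column):
    """Return the character at a table cell, e.g. row '0190', column '2'."""
    return chr(int(row_label, 16) + int(column, 16))


def locate(ch):
    """Return the (row label, column digit) where a character would appear."""
    cp = ord(ch)
    # The row is the code point with its low four bits cleared;
    # the column is those low four bits.
    return f"{cp & ~0xF:04X}", f"{cp & 0xF:X}"


print(cell_char("0190", "2"))   # ƒ
print(locate("€"))              # ('20A0', 'C')
```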
wgl4d.htm: Added four Macedonian characters