Mazovia encoding

Kermit: MAZOVIA
Alias(es): cp667, cp790, cp991, MAZ
Language(s): Polish
Classification: Extended ASCII, OEM code page
Based on: OEM-US
Other related encoding(s): Fidonet Mazovia (MFD), Mazovia 157, FreeDOS-991

Mazovia encoding is a character set used under DOS to represent Polish text. The character set derives from code page 437, with specific positions modified to accommodate Polish letters. Notably, the Mazovia encoding maintains the block graphic characters from code page 437, distinguishing it from IBM's later official Central European code page 852, which failed to preserve all block graphics, leading to incorrect display in programs such as Norton Commander.

The Mazovia encoding was designed in 1984 by Jan Klimowicz of IMM as part of a project to develop and produce a Polish IBM PC clone codenamed "Mazovia 1016". The code page was optimized for the peripheral devices commonly used with the Mazovia 1016 computer, including a graphics card with dual switchable graphics, a keyboard with US English and Russian layouts, and printers with Polish fonts. The encoding gained widespread acceptance in Poland when the Polish National Bank (NBP) adopted it as a standard in 1986. The NBP also played a significant role in facilitating the production of compatible computers by Ipaco, which used Taiwanese components under the guidance of Zbigniew Jakubas and Krzysztof Sochacki.

Some ambiguity exists in the official code page assignment for the Mazovia encoding:

PTS-DOS and S/DOS support this encoding under code page 667 (CP667). [1] Some Polish software also called the same encoding code page 991 (CP991); [nb 1] however, the FreeDOS implementation of code page 991 does not appear to be identical to the original encoding. The DOS code page switching file NECPINW.CPI for NEC Pinwriters supports the Mazovia encoding under both code pages 667 and 991. [1] FreeDOS has since added support for the original Mazovia encoding under code page 790 (CP790). The Fujitsu DL6400 (Pro) / DL6600 (Pro) printers also support the Mazovia encoding, [2] and Star printers know it as code page 3843.

Character set

Each character is shown with its equivalent Unicode code point. [3] Only the second half of the table (128–255) is shown, all of the first half (0–127) being the same as ASCII and code page 437.

Several variants of this encoding exist:

These variants are not fully compliant with the definition of code page 667 / 790 and should therefore not be associated with these numbers.

Code page 667 / 790

    0     1     2     3     4     5     6     7     8     9     A     B     C     D     E     F
8x  Ç     ü     é     â     ä     à     ą     ç     ê     ë     è     ï     î     ć     Ä     Ą
    00C7  00FC  00E9  00E2  00E4  00E0  0105  00E7  00EA  00EB  00E8  00EF  00EE  0107  00C4  0104
9x  Ę     ę     ł     ô     ö     Ć     û     ù     Ś     Ö     Ü     ¢     Ł     ¥     ś     ƒ
    0118  0119  0142  00F4  00F6  0106  00FB  00F9  015A  00D6  00DC  00A2  0141  00A5  015B  0192
Ax  Ź     Ż     ó     Ó     ń     Ń     ź     ż     ¿     ⌐     ¬     ½     ¼     ¡     «     »
    0179  017B  00F3  00D3  0144  0143  017A  017C  00BF  2310  00AC  00BD  00BC  00A1  00AB  00BB
Bx  ░     ▒     ▓     │     ┤     ╡     ╢     ╖     ╕     ╣     ║     ╗     ╝     ╜     ╛     ┐
    2591  2592  2593  2502  2524  2561  2562  2556  2555  2563  2551  2557  255D  255C  255B  2510
Cx  └     ┴     ┬     ├     ─     ┼     ╞     ╟     ╚     ╔     ╩     ╦     ╠     ═     ╬     ╧
    2514  2534  252C  251C  2500  253C  255E  255F  255A  2554  2569  2566  2560  2550  256C  2567
Dx  ╨     ╤     ╥     ╙     ╘     ╒     ╓     ╫     ╪     ┘     ┌     █     ▄     ▌     ▐     ▀
    2568  2564  2565  2559  2558  2552  2553  256B  256A  2518  250C  2588  2584  258C  2590  2580
Ex  α     ß     Γ     π     Σ     σ     µ     τ     Φ     Θ     Ω     δ     ∞     φ     ε     ∩
    03B1  00DF  0393  03C0  03A3  03C3  00B5  03C4  03A6  0398  03A9  03B4  221E  03C6  03B5  2229
Fx  ≡     ±     ≥     ≤     ⌠     ⌡     ÷     ≈     °     ∙     ·     √     ⁿ     ²     ■     NBSP
    2261  00B1  2265  2264  2320  2321  00F7  2248  00B0  2219  00B7  221A  207F  00B2  25A0  00A0

Differences from code page 437: positions 86, 8D, 8F, 90, 91, 92, 95, 98, 9C, 9E, A0, A1, A3, A4, A5, A6 and A7.
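
The relationship to code page 437 described above can be illustrated with a minimal Python sketch. It assumes Python's built-in cp437 codec as the base table and overrides only the positions where the code page 667 / 790 table above departs from it. The names MAZOVIA_OVERRIDES, mazovia_decode and mazovia_encode are hypothetical helpers introduced for this example; they are not part of any standard library or of the software named in this article.

```python
# Sketch: derive a Mazovia (CP667/CP790) table from code page 437.
# Assumption: Python's built-in "cp437" codec serves as the base table.
# All names below are hypothetical, for illustration only.

# Positions where Mazovia differs from code page 437 (taken from the table).
MAZOVIA_OVERRIDES = {
    0x86: "ą", 0x8D: "ć", 0x8F: "Ą",
    0x90: "Ę", 0x91: "ę", 0x92: "ł", 0x95: "Ć",
    0x98: "Ś", 0x9C: "Ł", 0x9E: "ś",
    0xA0: "Ź", 0xA1: "Ż", 0xA3: "Ó", 0xA4: "ń",
    0xA5: "Ń", 0xA6: "ź", 0xA7: "ż",
}


def build_decode_table():
    """Return a 256-entry list mapping each byte value to one character."""
    table = [bytes([b]).decode("cp437") for b in range(256)]
    for position, char in MAZOVIA_OVERRIDES.items():
        table[position] = char
    return table


DECODE_TABLE = build_decode_table()
ENCODE_TABLE = {char: byte for byte, char in enumerate(DECODE_TABLE)}


def mazovia_decode(data: bytes) -> str:
    """Decode Mazovia-encoded bytes to a Unicode string."""
    return "".join(DECODE_TABLE[b] for b in data)


def mazovia_encode(text: str) -> bytes:
    """Encode a string as Mazovia; raises KeyError for unmappable characters."""
    return bytes(ENCODE_TABLE[ch] for ch in text)
```

With these helpers, mazovia_decode(b"Za\xa7\xa2\x92\x8d") yields "Zażółć", while the block graphics positions such as 0xB3 decode exactly as they would under code page 437.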

Notes

  1. The Polish text converter PLC, developed by Marcin Gryszkalis between 1997 and 1999, supports the standard Mazovia encoding under code page 991 as well as under the symbolic handle MAZ. The Fidonet Mazovia encoding is supported under the symbolic handle MFD instead.

References

  1. Paul, Matthias R. (2001) [1996]. "Specification and reference documentation for NECPINW". NECPINW.CPI - DOS code page switching driver for NEC Pinwriters (2.08 ed.). FILESPEC.TXT from NECPI208.ZIP. Archived from the original on 2017-09-10. Retrieved 2013-04-22.
  2. Fujitsu DL6400/DL6600 Dot Matrix Printer User's Manual (PDF). Fujitsu Limited. April 1994. C147-E015-01EN. Archived (PDF) from the original on 2016-06-14. Retrieved 2016-06-14.
  3. Pinwriter Familie - Pinwriter - Epromsockel - Zusätzliche Zeichensätze / Schriftarten (Printed reference manual for optional font and codepage EPROMs for NEC Pinwriters, including custom variants) (in German) (00 3/93 ed.). NEC Deutschland GmbH. 1993. (NB. Some dot matrix printers of the NEC Pinwriter series, namely the P3200/P3300 (P20/P30), P6200/P6300 (P60/P70), P9300 (P90), P7200/P7300 (P62/P72), P22Q/P32Q, P3800/P3900 (P42Q/P52Q), P1200/P1300 (P2Q/P3Q), P2000 (P2X) and P8000 (P72X), supported the installation of optional font EPROMs, where this encoding was included in ROM #8 "Polish". It could be invoked via escape sequence ESC R (n) with (n) = 21.)