The character sets used by Videotex are based, to greater or lesser extents, on ISO/IEC 2022. ITU-T Recommendation T.101 defines three Data Syntax systems, corresponding to the Videotex systems of different countries.
Data Syntax 1 is defined in Annex B of T.101:1994. It is based on the CAPTAIN system used in Japan. Its graphical sets include JIS X 0201 and JIS X 0208.
The following G-sets are available through ISO/IEC 2022-based designation escapes: [1] : AnxB.2.3
Name | G-set escape type | F byte | ISO-IR for F byte |
---|---|---|---|
Primary Character set | Single byte 94-code | 0x4A (J) | ISO-IR-14 (JIS X 0201 Roman) |
Katakana Character set | Single byte 94-code | 0x49 (I) | ISO-IR-13 (JIS X 0201 Kana) |
Mosaic I set | Single byte 94-code | 0x33 (3) | (Occupies private-use F byte; also registered as ISO-IR-137 with F byte 0x79) [2] |
Mosaic II set | Single byte 94-code | 0x63 (c) | ISO-IR-71 [3] |
Display Control set | Single byte 96-code | 0x38 (8) | (Occupies private-use F byte) |
PDI set | Single byte 96-code | 0x57 (W) | (F byte exceptionally reserved and not used in ISO-IR) [4] |
MVI set | Single byte 96-code | 0x39 (9) | (Occupies private-use F byte) |
Kanji set | Multiple byte 94n-code | 0x42 (B) | ISO-IR-87 (JIS X 0208:1983) |
Macro set | Single byte DRCS 96-code | 0x40 (@) | (Uses a DRCS escape syntax) |
DRCS I set | Single byte DRCS 94-code | 0x41 (A) | (Is a DRCS) |
DRCS II set | Multiple byte DRCS 94n-code | 0x40 (@) | (Is a DRCS) |
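As a rough illustration of how the non-DRCS rows of this table translate into bytes, the sketch below builds designation escapes using the intermediate bytes of ISO/IEC 2022 itself. The function name `designate` and its parameters are hypothetical, and the sketch assumes that Data Syntax 1 follows plain ISO/IEC 2022 for these sets; the DRCS escape syntax is not covered.

```python
ESC = 0x1B

# ISO/IEC 2022 intermediate bytes selecting the target G-set.
# 94-character sets may go to G0..G3; 96-character sets only to G1..G3.
_INTERMEDIATE_94 = {0: 0x28, 1: 0x29, 2: 0x2A, 3: 0x2B}
_INTERMEDIATE_96 = {1: 0x2D, 2: 0x2E, 3: 0x2F}


def designate(g: int, f_byte: int, *, size: int = 94, multibyte: bool = False) -> bytes:
    """Build an ISO/IEC 2022 designation escape (hypothetical helper).

    g         -- target G-set number (0 to 3)
    f_byte    -- final byte from the "F byte" column of the table
    size      -- 94 or 96, from the "G-set escape type" column
    multibyte -- True for multiple byte (94n) sets
    """
    if size == 94:
        table = _INTERMEDIATE_94
    elif size == 96:
        table = _INTERMEDIATE_96
    else:
        raise ValueError("size must be 94 or 96")
    if g not in table:
        raise ValueError(f"a {size}-set cannot be designated to G{g}")
    seq = [ESC]
    if multibyte:
        seq.append(0x24)  # '$' marks a multiple-byte set
    seq.append(table[g])
    seq.append(f_byte)
    return bytes(seq)


# Katakana set (94-code, F byte 0x49) designated to G1: ESC ) I
print(designate(1, 0x49))
# Display Control set (96-code, F byte 0x38) designated to G2: ESC . 8
print(designate(2, 0x38, size=96))
```

ISO/IEC 2022 also retains a shorter legacy form (ESC $ F, without the 0x28 byte) for designating certain multiple-byte sets to G0, which this sketch does not emit.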
The mosaic sets supply characters for use in semigraphics. The first chart below shows the Mosaic I set and the second shows the Mosaic II set.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
0x | ||||||||||||||||
1x | ||||||||||||||||
2x | ▖ | � | � | ▟ | � | � | � | � | � | � | � | 🮛 | � | � | � | |
3x | ▄ | ▗ | � | � | ▙ | � | � | � | � | � | � | � | 🮚 | � | � | � |
4x | ||||||||||||||||
5x | ||||||||||||||||
6x | 🭒 | 🭓 | 🭔 | 🭕 | 🭖 | ◥ | 🭗 | 🭘 | 🭙 | 🭚 | 🭛 | 🭜 | 🭬 | 🭭 | ||
7x | 🭝 | 🭞 | 🭟 | 🭠 | 🭡 | ◤ | 🭢 | 🭣 | 🭤 | 🭥 | 🭦 | 🭧 | 🭮 | 🭯 |
� = Not in Unicode
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
0x | ||||||||||||||||
1x | ||||||||||||||||
2x | 🬀 | 🬁 | 🬂 | 🬃 | 🬄 | 🬅 | 🬆 | 🬇 | 🬈 | 🬉 | 🬊 | 🬋 | 🬌 | 🬍 | 🬎 | |
3x | 🬏 | 🬐 | 🬑 | 🬒 | 🬓 | ▌ | 🬔 | 🬕 | 🬖 | 🬗 | 🬘 | 🬙 | 🬚 | 🬛 | 🬜 | 🬝 |
4x | 🬼 | 🬽 | 🬾 | 🬿 | 🭀 | ◣ | 🭁 | 🭂 | 🭃 | 🭄 | 🭅 | 🭆 | 🭨 | 🭩 | 🭰 | 🮕 |
5x | 🭇 | 🭈 | 🭉 | 🭊 | 🭋 | ◢ | 🭌 | 🭍 | 🭎 | 🭏 | 🭐 | 🭑 | 🭪 | 🭫 | 🭵 | █ |
6x | 🬞 | 🬟 | 🬠 | 🬡 | 🬢 | 🬣 | 🬤 | 🬥 | 🬦 | 🬧 | ▐ | 🬨 | 🬩 | 🬪 | 🬫 | 🬬 |
7x | 🬭 | 🬮 | 🬯 | 🬰 | 🬱 | 🬲 | 🬳 | 🬴 | 🬵 | 🬶 | 🬷 | 🬸 | 🬹 | 🬺 | 🬻 |
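The 2×3 block characters in columns 2, 3, 6 and 7 of the chart above follow a regular pattern that can be reproduced arithmetically. The sketch below is derived only from the chart as shown, not from T.101 itself: it assumes the first printed cell of row 2x is code 0x21, and the function names are hypothetical. It treats the low five bits of the code plus bit 0x40 as a six-bit cell pattern and maps that pattern onto the Unicode sextant range U+1FB00 to U+1FB3B, which omits the two half-block patterns.

```python
def sextant_pattern(code: int) -> int:
    """Six-bit block pattern implied by a code in columns 2, 3, 6 or 7."""
    if not (0x21 <= code <= 0x3F or 0x60 <= code <= 0x7E):
        raise ValueError("code is outside the 2x3 block columns of the chart")
    # Bits 0-4 of the code are used directly; bit 6 (0x40) becomes bit 5.
    return (code & 0x1F) | ((code & 0x40) >> 1)


def pattern_to_unicode(pattern: int) -> str:
    """Map a six-bit pattern (1 to 62) to the matching Unicode character."""
    if not 1 <= pattern <= 62:
        raise ValueError("pattern must be in 1..62")
    if pattern == 21:
        return "\u258C"  # left half block
    if pattern == 42:
        return "\u2590"  # right half block
    # The sextant block skips the two half-block patterns above.
    return chr(0x1FB00 + pattern - 1 - (pattern > 21) - (pattern > 42))


def mosaic_to_unicode(code: int) -> str:
    return pattern_to_unicode(sextant_pattern(code))


for code in (0x21, 0x35, 0x6A, 0x7E):
    print(hex(code), mosaic_to_unicode(code))
```

Columns 4 and 5 of the chart hold diagonal and other shapes that do not follow this pattern and would need an explicit lookup table.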
Data Syntax 2 is defined in Annex C of T.101:1994. It corresponds to some European Videotex systems such as CEPT T/CD 06-01. The graphical character coding of Data Syntax 2 is based on T.51.
The default G2 set of Data Syntax 2 is based on an older version of T.51. It lacks the non-breaking space, soft hyphen, not sign (¬) and broken bar (¦) present in the current version, but adds a dialytika tonos (΅; its combining form is U+0344) at the beginning of the row of diacritical marks, for combination with codes from a Greek primary set. [5] An umlaut diacritic code distinct from the diaeresis code, as included in some versions of T.61, is also sometimes included. [6]
The default G1 set is the second mosaic set, corresponding roughly to the second mosaic set of Data Syntax 1. [1] : AnxCpt1/TableC.11 The default G3 set is the third mosaic set, matching the first mosaic set of Data Syntax 1 for 0x60 through 0x6D and 0x70 through 0x7D, and otherwise differing. [1] : AnxCpt1/TableC.12 The first mosaic set matches the second except for 0x40 through 0x5E: 0x40 through 0x5A follow ASCII (supplying uppercase letters), whereas the remainder are national variant characters; the displaced full block is placed at 0x7F. [1] : AnxCpt1/TableC.10
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
0x | ||||||||||||||||
1x | ||||||||||||||||
2x | SP | 🬀 | 🬁 | 🬂 | 🬃 | 🬄 | 🬅 | 🬆 | 🬇 | 🬈 | 🬉 | 🬊 | 🬋 | 🬌 | 🬍 | 🬎 |
3x | 🬏 | 🬐 | 🬑 | 🬒 | 🬓 | ▌ | 🬔 | 🬕 | 🬖 | 🬗 | 🬘 | 🬙 | 🬚 | 🬛 | 🬜 | 🬝 |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | ← a | ½ a | → a | ↑ a | ⌗/_ b |
6x | 🬞 | 🬟 | 🬠 | 🬡 | 🬢 | 🬣 | 🬤 | 🬥 | 🬦 | 🬧 | ▐ | 🬨 | 🬩 | 🬪 | 🬫 | 🬬 |
7x | 🬭 | 🬮 | 🬯 | 🬰 | 🬱 | 🬲 | 🬳 | 🬴 | 🬵 | 🬶 | 🬷 | 🬸 | 🬹 | 🬺 | 🬻 | █ |
Data Syntax 3 is defined in Annex D of T.101:1994. The graphical character coding of Data Syntax 3 is based on T.51.
The supplementary set for Data Syntax 3 is based on an older version of T.51. It lacks the non-breaking space, soft hyphen, not sign (¬) and broken bar (¦) present in the current version, and it uses space left unallocated in that set for two additional non-spacing marks (a "vector overbar" and a solidus) and for several semigraphic characters.
See the comments in the T.51 article for caveats about the combining mark Unicode mappings shown below. Unlike Unicode combining characters, T.51 diacritic codes precede the base character.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
0x/8x | ||||||||||||||||
1x/9x | ||||||||||||||||
2x/Ax | ¡ | ¢ | £ | $ | ¥ | # | § | ¤ | ‘ | “ | « | ← | ↑ | → | ↓ | |
3x/Bx | ° | ± | ² | ³ | × | µ | ¶ | · | ÷ | ’ | ” | » | ¼ | ½ | ¾ | ¿ |
4x/Cx | ◌⃑ | ◌̀ | ◌́ | ◌̂ | ◌̃ | ◌̄ | ◌̆ | ◌̇ | ◌̈ | ◌̸ | ◌̊ | ◌̧ | ◌̲ | ◌̋ | ◌̨ | ◌̌ |
5x/Dx | ― | ¹ | ® | © | ™ | ♪ | ─ | │ | ╱ | ╲ | ◢ | ◣ | ⅛ | ⅜ | ⅝ | ⅞ |
6x/Ex | Ω | Æ | Đ/Ð | ª | Ħ | ┼ | IJ | Ŀ | Ł | Ø | Œ | º | Þ | Ŧ | Ŋ | ʼn |
7x/Fx | ĸ | æ | đ | ð | ħ | ı | ij | ŀ | ł | ø | œ | ß | þ | ŧ | ŋ |
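Because T.51 diacritic codes precede the base character while Unicode combining characters follow it, a converter has to reorder each pair. The sketch below illustrates this for a subset of the 4x/Cx row of the chart above. It assumes an 8-bit environment in which the supplementary set is reached through bytes 0xC0 to 0xCF and base letters are plain ASCII; the names `MARKS` and `decode` are hypothetical, and the mapping caveats mentioned in the text still apply.

```python
import unicodedata

# Subset of the non-spacing marks in row 4x/Cx, keyed by their 8-bit code.
MARKS = {
    0xC1: "\u0300",  # grave
    0xC2: "\u0301",  # acute
    0xC3: "\u0302",  # circumflex
    0xC4: "\u0303",  # tilde
    0xC5: "\u0304",  # macron
    0xC6: "\u0306",  # breve
    0xC7: "\u0307",  # dot above
    0xC8: "\u0308",  # diaeresis
    0xCA: "\u030A",  # ring above
    0xCB: "\u0327",  # cedilla
    0xCD: "\u030B",  # double acute
    0xCE: "\u0328",  # ogonek
    0xCF: "\u030C",  # caron
}


def decode(data: bytes) -> str:
    """Convert mark-then-base byte pairs to base-then-mark Unicode text."""
    out = []
    pending = ""  # combining mark waiting for its base character
    for byte in data:
        if byte in MARKS:
            pending += MARKS[byte]
        elif byte < 0x80:
            out.append(chr(byte) + pending)
            pending = ""
        else:
            raise ValueError(f"byte 0x{byte:02X} is not handled by this sketch")
    if pending:
        raise ValueError("diacritic code without a following base character")
    # Compose to precomposed characters where Unicode provides them.
    return unicodedata.normalize("NFC", "".join(out))


print(decode(b"caf\xC2e"))  # acute code 0xC2 precedes the letter e
```

The final NFC normalisation is a design choice of this sketch; leaving the output decomposed would be equally valid.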
C0 control codes for Videotex differ from ASCII as shown in the table below. The NUL, BEL, SO (LS1), SI (LS0) and ESC codes are also available in some or all data syntaxes, but without change in name or semantic from ASCII. [8] [9] [10]
Seq | Dec | Hex | Replaced | Syntaxes | Acronym | Name | Description |
---|---|---|---|---|---|---|---|
^H | 08 | 08 | BS | 1, [8] 2, [9] 3 [10] | APB | Active Position Backward | Moves cursor one position backward. If it is at the start of the line, moves it to the end of the line and back one line. This retains one possible semantic of the ASCII BS. |
^I | 09 | 09 | HT | 1, [8] 2, [9] 3 [10] | APF | Active Position Forward | Moves cursor one position forward. If it is at the end of the line, moves it to the start of the line and forward one line. |
^J | 10 | 0A | LF | 1, [8] 2, [9] 3 [10] | APD | Active Position Down | Moves cursor one line forward. If it is at the last line of the screen, moves it to the first line unless Data Syntax 3 scroll mode is active. This retains one possible semantic of the ASCII LF. |
^K | 11 | 0B | VT | 1, [8] 2, [9] 3 [10] | APU | Active Position Up | Moves cursor one line backward. If it is at the first line of the screen, moves it to the last line unless Data Syntax 3 scroll mode is active. |
^L | 12 | 0C | FF | 1, [8] 2, [9] 3 [10] | CS | Clear Screen | Resets entire display to spaces with default display attributes and returns the cursor to its initial position. In Data Syntax 1, also resets macros and DRCS. This retains one possible semantic of the ASCII FF. |
^M | 13 | 0D | CR | 1, [8] 2, [9] 3 [10] | APR | Active Position Return | Moves the cursor to the start of the line. In Data Syntax 3, may instead move it to the start of the active field if it is entirely within it. This retains one possible semantic of the ASCII CR. |
^Q | 17 | 11 | DC1/XON | 2 [9] | CON | Cursor On | Makes the cursor visible. |
^R | 18 | 12 | DC2 | 2 [9] | RPT | Repeat | Repeats the immediately preceding graphic character a number of times indicated by the low six bits of the following byte (from 0x40 to 0x7F). |
^T | 20 | 14 | DC4 | 1 [1] : AnxB.3.1 | KMC | Key-In-Monitor Conceal | Takes one parameter: 0x40 makes the key-in-monitor area unconcealed, 0x41 makes it concealed. |
^T | 20 | 14 | DC4 | 2 [9] | COF | Cursor Off | Makes the cursor invisible. |
^X | 24 | 18 | CAN | 1, [8] 2, [9] 3 [10] | CAN | Cancel | In Data Syntax 2, fill the rest of the current line (after the current position) with spaces (compare EL). In Data Syntax 1 and 3, immediately stop all running macros. Contrast the semantic of basic ASCII CAN. |
^Y | 25 | 19 | EM | 1, [8] 2, [9] 3 [10] | SS2 | Single Shift Two | Non-locking shift code for G2. |
^Z | 26 | 1A | SUB | 3 [10] | SDC | Service Delimitor Character | Implementation-defined but non-presentational. |
^\ | 28 | 1C | FS | 1, [8] 3 [10] | APS | Active Position Set | Followed by two bytes (from 0x40 to 0x7F; may also be from 0xA0 to 0xFF in Data Syntax 3) respectively giving a row and column address in their low six bits. Compare CUP and HVP. |
^] | 29 | 1D | GS | 1, [8] 2, [9] 3 [10] | SS3 | Single Shift Three | Non-locking shift code for G3. |
^^ | 30 | 1E | RS | 1, [8] 2, [9] 3 [10] | APH | Active Position Home | Returns cursor to the initial position. |
^_ | 31 | 1F | US | 1, [8] 3 [10] | NSR | Non-Selective Reset | Resets all display attributes (including ISO 2022 state, domain, text parameters, textures, colour mode but not macros, DRCS or programmable masks), then moves the cursor to a specified position. Followed by two bytes (from 0x40 to 0x7F; may also be from 0xA0 to 0xFF in Data Syntax 3) respectively giving a row and column address in their low six bits. Compare RIS. |
^_ | 31 | 1F | US | 2 [9] | APA | Active Position Address | Followed by two or four bytes (from 0x40 to 0x7F) giving a row and column address in their low six bits. Four bytes are used if there are more than 63 rows and columns, with the most significant six bits being first for each parameter. Compare CUP and HVP. If the following byte is not in the range of 0x40 to 0x7F, indicates a switch to another coding scheme (contrast DOCS). |
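Several of the controls in this table carry parameters in the low six bits of bytes in the range 0x40 to 0x7F. The sketch below shows one way to decode the APS address, the APA address in its two-byte and four-byte forms, and the RPT count. All function names are hypothetical; the four-byte layout (row high, row low, column high, column low) is one reading of the description above; the 0xA0 to 0xFF form permitted in Data Syntax 3 is not handled. Values are returned as raw numbers, since the table does not say whether addresses start at 0 or 1.

```python
APS = 0x1C  # Active Position Set (Data Syntax 1 and 3)


def low_six_bits(param: int) -> int:
    """Return the low six bits of a parameter byte in 0x40..0x7F."""
    if not 0x40 <= param <= 0x7F:
        raise ValueError(f"parameter byte 0x{param:02X} is out of range")
    return param & 0x3F


def parse_aps(seq: bytes) -> tuple[int, int]:
    """Decode APS followed by its row byte and column byte."""
    if len(seq) != 3 or seq[0] != APS:
        raise ValueError("expected APS followed by two parameter bytes")
    return low_six_bits(seq[1]), low_six_bits(seq[2])


def parse_apa_params(params: bytes) -> tuple[int, int]:
    """Decode the two or four parameter bytes that follow APA."""
    values = [low_six_bits(b) for b in params]
    if len(values) == 2:
        return values[0], values[1]
    if len(values) == 4:
        # Most significant six bits come first for each parameter.
        return (values[0] << 6) | values[1], (values[2] << 6) | values[3]
    raise ValueError("APA takes two or four parameter bytes")


def rpt_count(param: int) -> int:
    """Repeat count carried by the byte that follows RPT."""
    return low_six_bits(param)


print(parse_aps(bytes([APS, 0x41, 0x45])))
print(parse_apa_params(bytes([0x41, 0x40, 0x42, 0x43])))
print(rpt_count(0x45))
```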
The following specialised C1 control codes are used in Videotex. There are four registered sets, with some differences between them.
8-bit | Escape | Data Syntax 1 [11] | Data Syntax 2, "Parallel" C1 set [12] [1] : AnxC.3.3.2 | Data Syntax 2, "Serial" C1 set [13] [1] : AnxC.3.3.1 | Data Syntax 3 [14] |
---|---|---|---|---|---|
0x80 | ESC 0x40 (@) | BKF, Black Foreground. | BKF, Black Foreground. | ABK, Alpha Black. Switch to alphabetic, black foreground. | DEFM, Define Macro. Next character (from 0x20 to 0x7F) gives macro name, rest is stored as part of macro until another DEF* or an END. |
0x81 | ESC 0x41 (A) | RDF, Red Foreground. | RDF, Red Foreground. | ANR, Alpha Red. Switch to alphabetic, red foreground. | DEFP, Define P-Macro. Like DEFM, but simultaneously defines and executes the macro. |
0x82 | ESC 0x42 (B) | GRF, Green Foreground. | GRF, Green Foreground. | ANG, Alpha Green. Switch to alphabetic, green foreground. | DEFT, Define Transmit-Macro. Like DEFM but defines a macro to be transmitted, not executed. |
0x83 | ESC 0x43 (C) | YLF, Yellow Foreground. | YLF, Yellow Foreground. | ANY, Alpha Yellow. Switch to alphabetic, yellow foreground. | DEFD, Define DRCS. Defines a character in the Dynamically Redefinable Character Set. Expected to be followed by the character code defined (from 0x20 to 0x7F) unless it terminates a previous DEFD, in which case it defines the next code. Terminated by another DEF* or an END. |
0x84 | ESC 0x44 (D) | BLF, Blue Foreground. | BLF, Blue Foreground. | ANB, Alpha Blue. Switch to alphabetic, blue foreground. | DEFX, Define Texture. Defines a texture mask. Expected to be followed by the texture mask ID defined (from 0x40 to 0x44). Terminated by another DEF* or an END. |
0x85 | ESC 0x45 (E) | MGF, Magenta Foreground. | MGF, Magenta Foreground. | ANM, Alpha Magenta. Switch to alphabetic, magenta foreground. | END, End. Terminates a macro, DRCS character or texture definition. Also used in unprotected fields. |
0x86 | ESC 0x46 (F) | CNF, Cyan Foreground. | CNF, Cyan Foreground. | ANC, Alpha Cyan. Switch to alphabetic, cyan foreground. | REP, Repeat. Repeats preceding spacing graphical character a number of times specified by the following byte (from 0x40 to 0x7F). |
0x87 | ESC 0x47 (G) | WHF, White Foreground. | WHF, White Foreground. | ANW, Alpha White. Switch to alphabetic, white foreground. | REPE, Repeat to End of Line. Repeats preceding spacing graphical character until the end of the line is reached. |
0x88 | ESC 0x48 (H) | SSZ, Small Size. Characters half normal width and height. | FSH, Flashing. Characters displayed flashing between foreground and background. | FSH, Flashing. Characters displayed flashing between foreground and background. | REVV, Reverse Video. Enables reverse video mode. |
0x89 | ESC 0x49 (I) | MSZ, Medium Size. Characters normal height, half normal width. | STD, Steady. Terminates flashing. | STD, Steady. Terminates flashing. | NORV, Normal Video. Disables reverse video mode. |
0x8A | ESC 0x4A (J) | NSZ, Normal Size. Characters normal width and height. | EBX, End Box. Terminates SBX. | EBX, End Box. Terminates SBX. | SMTX, Small Text. Text size 1/80 of screen width and 5/128 of screen height. |
0x8B | ESC 0x4B (K) | SZX, Size Control. Followed by a one-byte parameter. 0x41 means double height (DBH), 0x44 means double width (DBW), 0x45 means doubled width and height (DBS). [1] : AnxB.3.2.2 | SBX, Start Box. Defines a non-alphanumeric area, with transparent background. Terminated by EBX. | SBX, Start Box. Defines a non-alphanumeric area, with transparent background. Terminated by EBX. | METX, Medium Text. Text size 1/32 of screen width and 3/64 of screen height. |
0x8C | ESC 0x4C (L) | (not used) | NSZ, Normal Size. Characters normal width and height. | NSZ, Normal Size. Characters normal width and height. | NOTX, Normal Text. Text size 1/40 of screen width and 5/128 of screen height. |
0x8D | ESC 0x4D (M) | (not used) | DBH, Double Height. Characters normal width and double normal height. Inactive on top line. | DBH, Double Height. Characters normal width and double normal height. Inactive on bottom line. | DBH, Double Height. Text size 1/40 of screen width and 5/64 of screen height. |
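0x8E | ESC 0x4E (N) | CON, Cursor On. Makes cursor visible. | DBW, Double Width. Characters normal height and double normal width. Inactive in last position of line. | DBW, Double Width. Characters normal height and double normal width. Inactive in last position of line. | BSTA, Blink Start. |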
0x8F | ESC 0x4F (O) | COF, Cursor Off. Makes cursor invisible. | DBS, Double Size. Characters double normal height and double normal width. Inactive on top line or in last position of line. | DBS, Double Size. Characters double normal height and double normal width. Inactive on bottom line or in last position of line. | DBS, Double Size. Text size 1/20 of screen width and 5/64 of screen height. |
0x90 | ESC 0x50 (P) | COL, Background or Foreground Colour. Takes a one-byte parameter. 0x48–0x4F sets a reduced intensity foreground. 0x50–0x57 sets background colour. 0x58–0x5F sets a reduced intensity background. Colour order is the same as that of the individual foreground colour controls (black, red, green, yellow, blue, magenta, cyan, white), but transparent takes the place of reduced intensity black. [1] : AnxB.3.2.1 | BKB, Black Background. | MBK, Mosaic Black. Switch to mosaic, black foreground. | PRO, Protect. Makes all character fields within the active field protected. |
0x91 | ESC 0x51 (Q) | FLC, Flashing Control. Takes one parameter: 0x40 for "normal" flashing, 0x41 through 0x47 for other flashing modes, 0x4F for steady (terminate flashing). [1] : AnxB.3.2.4 | RDB, Red Background. | MSR, Mosaic Red. Switch to mosaic, red foreground. | (EDC1, not used) |
0x92 | ESC 0x52 (R) | CDC, Conceal Display Control. Takes a one-byte parameter defining conceal display attributes, which can make text invisible until user interaction. 0x40 is used to start a concealed range (CDY), 0x4F is used to terminate it (SCD). [1] : AppB.3.2.7 | GRB, Green Background. | MSG, Mosaic Green. Switch to mosaic, green foreground. | (EDC2, not used) |
0x93 | ESC 0x53 (S) | (not used) | YLB, Yellow Background. | MSY, Mosaic Yellow. Switch to mosaic, yellow foreground. | (EDC3, not used) |
0x94 | ESC 0x54 (T) | (not used) | BLB, Blue Background. | MSB, Mosaic Blue. Switch to mosaic, blue foreground. | (EDC4, not used) |
0x95 | ESC 0x55 (U) | P-MACRO, Photo Macro. Followed by a single-byte parameter (0x40 for define, 0x41 for define and execute, 0x42 to define a transmit-macro, 0x4F to delimit the end of a macro definition). [1] : AppB.3.2.9 Second single-byte parameter (from 0x20 to 0x7F) identifies the photo macro being defined (from PM0 to PM95). | MGB, Magenta Background. | MSM, Mosaic Magenta. Switch to mosaic, magenta foreground. | WWON, Word Wrap On. |
0x96 | ESC 0x56 (V) | (not used) | CNB, Cyan Background. | MSC, Mosaic Cyan. Switch to mosaic, cyan foreground. | WWOF, Word Wrap Off. |
0x97 | ESC 0x57 (W) | (not used) | WHB, White Background. | MSW, Mosaic White. Switch to mosaic, white foreground. | SCON, Scroll On. Next-lining off the bottom of the screen moves the rest of the screen up to make space. |
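0x98 | ESC 0x58 (X) | RPC, Repeat Control. Repeats preceding spacing graphical character a number of times specified by the low six bits of the following byte (from 0x40 to 0x7F). Repeats to end of line if byte is 0x40. Compare REP from Data Syntax 3. | CDY, Conceal Display. Display characters as spaces (might be terminated by SCD). | CDY, Conceal Display. Display characters as spaces (might be terminated by SCD). | SCOF, Scroll Off. Next-lining off the bottom of the screen wraps around to the top of the screen. |
0x99 | ESC 0x59 (Y) | SPL, Stop Lining. Terminates underlining. For mosaic characters, non-underlined font corresponds to contiguous display, with the blocks within a mosaic character joined together. | SPL, Stop Lining. Terminates underlining. For mosaic characters, non-underlined font corresponds to contiguous display, with the blocks within a mosaic character joined together. | SPL, Stop Lining. Terminates underlining. For mosaic characters, non-underlined font corresponds to contiguous display, with the blocks within a mosaic character joined together. | USTA, Underline Start. Begins underlined letters, and switches to separated display for mosaics. |
0x9A | ESC 0x5A (Z) | STL, Start Lining. Begins underlined letters. For mosaics, this corresponds to separated display, with the blocks within a mosaic character shown separated. | STL, Start Lining. Begins underlined letters. For mosaics, this corresponds to separated display, with the blocks within a mosaic character shown separated. | STL, Start Lining. Begins underlined letters. For mosaics, this corresponds to separated display, with the blocks within a mosaic character shown separated. | USTO, Underline Stop. Terminates underlining, and switches to contiguous display for mosaics. |
0x9B | ESC 0x5B ([) | (not used) | CSI, Control Sequence Introducer. | CSI, Control Sequence Introducer. | FLC, Flash Cursor. User input cursor turned on, flashing. |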
0x9C | ESC 0x5C (\) | (not used) | NPO, Normal Polarity. Foreground in foreground colour, background in background colour. | BBD, Black Background. | STC, Steady Cursor. User input cursor turned on, always visible. |
0x9D | ESC 0x5D (]) | (not used) | IPO, Inverted Polarity. Foreground in background colour, background in foreground colour. | NBD, New Background. Set background colour to previous foreground colour. The current foreground colour is not affected. | COF, Cursor Off. User input cursor invisible, but still functional. |
0x9E | ESC 0x5E (^) | UNP, Unprotected. Makes following characters unprotected from user input. | TRB, Transparent Background. | HMS, Hold Mosaic. Subsequently stored control functions are displayed as the last received mosaic character. | BSTO, Blink Stop. |
0x9F | ESC 0x5F (_) | PRT, Protected. Makes following characters protected from user input. | SCD, Stop Conceal. Terminates CDY. | RMS, Release Mosaic. Terminates HMS. | UNP, Unprotect. Makes a field unprotected (open to user input). |
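The first two columns of this table are related by a fixed offset: the 8-bit code equals the byte after ESC plus 0x40. A minimal sketch of the conversion in both directions follows; the function names are hypothetical.

```python
ESC = 0x1B


def c1_to_escape(code: int) -> bytes:
    """Return the two-byte ESC form of an 8-bit C1 code (0x80..0x9F)."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 control code")
    return bytes([ESC, code - 0x40])


def escape_to_c1(seq: bytes) -> int:
    """Return the 8-bit C1 code for an ESC sequence with final byte 0x40..0x5F."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not a two-byte C1 escape sequence")
    return seq[1] + 0x40


print(c1_to_escape(0x9B))            # ESC [
print(hex(escape_to_c1(b"\x1bP")))   # row 0x90 of the table
```

Which function a given code performs still depends on the C1 set in use, as the remaining columns show.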
Bit combination 5/7 of table 3 will not be allocated in order to avoid problems with an earlier usage by CCITT.