Shift Out and Shift In characters

Shift In and Shift Out used in a Linux terminal to access a variant DEC Special Graphics set

Shift Out (SO) and Shift In (SI) are ASCII control characters 14 and 15, respectively (0x0E and 0x0F). [1] These are sometimes also called "Control-N" and "Control-O".
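
The correspondence between the code values and their Control-key names can be checked with a short Python sketch. The helper name `control_code` is hypothetical, and the sketch assumes the usual terminal convention that holding Control keeps only the low five bits of the letter's ASCII code.

```python
SO = 0x0E  # Shift Out, decimal 14
SI = 0x0F  # Shift In, decimal 15


def control_code(letter: str) -> int:
    """Return the code produced by Control+<letter> (hypothetical helper).

    Assumes the common convention that Control masks the letter's
    ASCII code down to its low five bits.
    """
    return ord(letter.upper()) & 0x1F


if __name__ == "__main__":
    print(f"Control-N -> 0x{control_code('N'):02X}")
    print(f"Control-O -> 0x{control_code('O'):02X}")
```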

The original purpose of these characters was to shift a two-colour ribbon, split lengthwise (usually into red and black), up or down to the other colour in an electromechanical typewriter or teleprinter such as the Teletype Model 38, automating a function that was operated by hand on manual typewriters. Black was the conventional default colour, so it was the colour shifted "in" or "out" in exchange for the other colour on the ribbon.

As technology advanced, the same function came to be used for switching to a different font or character set and back. The Russian character set known as KOI7-switched works this way, for instance: SO starts printing Russian letters, and SI starts printing Latin letters again. Similarly, the 7-bit version of the Japanese JIS X 0201 uses them to switch between Katakana and Roman letters. [2] [3]
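
The switching behaviour can be illustrated with a minimal Python decoder for 7-bit JIS X 0201. This is a sketch under stated assumptions rather than a reference implementation: the function name `decode_jis_x_0201_7bit` is hypothetical, the stream is assumed to start in the Roman set, Katakana bytes 0x21 to 0x5F are mapped to the Unicode halfwidth forms U+FF61 to U+FF9F, and bytes with no Katakana assignment are replaced with U+FFFD.

```python
SO, SI = 0x0E, 0x0F

# JIS Roman differs from ASCII at two positions.
ROMAN_OVERRIDES = {0x5C: "\u00a5", 0x7E: "\u203e"}  # yen sign, overline


def decode_jis_x_0201_7bit(data: bytes) -> str:
    """Decode 7-bit JIS X 0201, tracking the SO/SI shift state."""
    out = []
    katakana = False  # assumption: the stream starts in the Roman set
    for b in data:
        if b > 0x7F:
            raise ValueError(f"byte 0x{b:02X} is not valid in a 7-bit stream")
        if b == SO:
            katakana = True       # shift out: following bytes are Katakana
        elif b == SI:
            katakana = False      # shift in: back to Roman letters
        elif b <= 0x20 or b == 0x7F:
            out.append(chr(b))    # controls, space and DEL are unaffected
        elif katakana:
            # 0x21..0x5F map onto U+FF61..U+FF9F; the rest is unassigned
            out.append(chr(b + 0xFF40) if b <= 0x5F else "\ufffd")
        else:
            out.append(ROMAN_OVERRIDES.get(b, chr(b)))
    return "".join(out)


if __name__ == "__main__":
    print(decode_jis_x_0201_7bit(b"A\x0e\x31\x32\x0fB"))
```

The shift is locking: every byte after SO is read as Katakana until an SI arrives, which is why the decoder needs a state variable rather than a per-byte lookup.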

The SO and SI control characters are also used to display VT100 pseudographics. In addition, Shift In is used in the 2G variant [4] of SoftBank Mobile's encoding for emoji.
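
As an illustration of the VT100 case, the following Python sketch builds a string that draws a box with line-drawing characters. It assumes a VT100-compatible terminal that accepts `ESC ) 0` to designate the DEC Special Graphics set as G1, after which SO invokes it and SI returns to the normal set; some terminals ignore these sequences, particularly in UTF-8 mode. The function name `box` is hypothetical.

```python
ESC = "\x1b"
SO, SI = "\x0e", "\x0f"

# Designate DEC Special Graphics as the G1 set (assumes VT100 compatibility).
DESIGNATE_G1_DEC_GRAPHICS = ESC + ")0"


def box(width: int, height: int) -> str:
    """Return terminal output that draws an empty box of the given inner size.

    In DEC Special Graphics, 'l', 'k', 'm' and 'j' are the four corners,
    'q' is a horizontal line and 'x' is a vertical line.
    """
    top = "l" + "q" * width + "k"
    middle = "x" + " " * width + "x"
    bottom = "m" + "q" * width + "j"
    rows = [top] + [middle] * height + [bottom]
    # Wrap each row in SO ... SI so ordinary text is unaffected afterwards.
    return DESIGNATE_G1_DEC_GRAPHICS + "\n".join(SO + r + SI for r in rows) + "\n"


if __name__ == "__main__":
    print(box(10, 3), end="")
```

Wrapping each row separately in SO and SI keeps the terminal in its normal character set if output is interrupted between rows.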

The ISO/IEC 2022 standard (ECMA-35, JIS X 0202) standardises the general use of SO and SI for switching between pre-designated character sets invoked over the 0x20–0x7F byte range. It refers to them as Locking Shift One (LS1) and Locking Shift Zero (LS0), respectively, in an 8-bit environment, or as SO and SI in a 7-bit environment. [5] In ISO-2022-compliant code sets where the 0x0E and 0x0F characters are used for emphasis (such as an italic or red font) rather than for a change of character set, they are instead referred to as Upper Rail (UR) and Lower Rail (LR), respectively. [6]

Related Research Articles

ISO/IEC 8859-1: character encoding

ISO/IEC 8859-1:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. ISO/IEC 8859-1 encodes what it refers to as "Latin alphabet no. 1", consisting of 191 characters from the Latin script. This character-encoding scheme is used throughout the Americas, Western Europe, Oceania, and much of Africa. It is the basis for some popular 8-bit character sets and the first two blocks of characters in Unicode.

ISO/IEC 646 is a set of ISO/IEC standards, described as Information technology — ISO 7-bit coded character set for information interchange and developed in cooperation with ASCII at least since 1964. Since its first edition in 1967 it has specified a 7-bit character code from which several national standards are derived.

ISO/IEC 8859-8, Information technology — 8-bit single-byte coded graphic character sets — Part 8: Latin/Hebrew alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings. ISO/IEC 8859-8:1999 from 1999 represents its second and current revision, preceded by the first edition ISO/IEC 8859-8:1988 in 1988. It is informally referred to as Latin/Hebrew. ISO/IEC 8859-8 covers all the Hebrew letters, but no Hebrew vowel signs. IBM assigned code page 916 to it. This character set was also adopted by Israeli Standard SI1311:2002, with some extensions.

ISO/IEC 8859-10:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 10: Latin alphabet No. 6, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1992. It is informally referred to as Latin-6. It was designed to cover the Nordic languages, deemed of more use for them than ISO 8859-4.

ISO/IEC 2022, Information technology—Character code structure and extension techniques, is an ISO/IEC standard in the field of character encoding. Originating in 1971, it was most recently revised in 1994.

Shift JIS is a character encoding for the Japanese language, originally developed by a Japanese company called ASCII Corporation in conjunction with Microsoft and standardized as JIS X 0208 Appendix 1. As of October 2022, 0.2% of all web pages used Shift JIS, a decline from 1.3% in July 2014.

T.61 is an ITU-T Recommendation for a Teletex character set. T.61 predated Unicode, and was the primary character set in ASN.1 used in early versions of X.500 and X.509 for encoding strings containing characters used in Western European languages. It is also used by older versions of LDAP. While T.61 continues to be supported in modern versions of X.500 and X.509, it has been deprecated in favor of Unicode. It is also called Code page 1036, CP1036, or IBM 01036.

The C0 and C1 control code or control character sets define control codes for use in text by computer systems that use ASCII and derivatives of ASCII. The codes represent additional information about the text, such as the position of a cursor, an instruction to start a new line, or a message that the text has been received.

Japanese postal mark

〒 is the service mark of Japan Post and its successor, Japan Post Holdings, the postal operator in Japan. It has also been used as the Japanese postal code mark since postal codes were introduced in 1968. Historically, it was used by the Ministry of Communications, which operated the postal service. The mark is a stylized katakana syllable te (テ), from the word teishin. The mark was introduced on February 8, 1887.

T.51 / ISO/IEC 6937:2001, Information technology — Coded graphic character set for text communication — Latin alphabet, is a multibyte extension of ASCII, or rather of ISO/IEC 646-IRV. It was developed jointly with ITU-T for telematic services under the name T.51, and first became an ISO standard in 1983. Certain byte codes are used as lead bytes for letters with diacritics (accents). The value of the lead byte often indicates which diacritic the letter has, and the following byte then holds the ASCII value of the letter that carries the diacritic.

JIS X 0201: Japanese single-byte character encoding

JIS X 0201, a Japanese Industrial Standard developed in 1969, was the first Japanese electronic character set to become widely used. It is either a 7-bit encoding or an 8-bit encoding, although the 8-bit form is dominant for modern use. The full name of this standard is 7-bit and 8-bit coded character sets for information interchange (7ビット及び8ビットの情報交換用符号化文字集合).

JIS X 0208 is a 2-byte character set specified as a Japanese Industrial Standard, containing 6879 graphic characters suitable for writing text, place names, personal names, and so forth in the Japanese language. The official title of the current standard is 7-bit and 8-bit double byte coded KANJI sets for information interchange. It was originally established as JIS C 6226 in 1978, and has been revised in 1983, 1990, and 1997. It is also called Code page 952 by IBM. The 1978 version is also called Code page 955 by IBM.

In mathematics, the radical sign, radical symbol, root symbol, radix, or surd is a symbol for the square root or higher-order root of a number. The square root of a number x is written as √x.

Code page 895 is a 7-bit character set and is Japan's national ISO 646 variant. It is the Roman set of the JIS X 0201 Japanese Standard and is variously called Japan 7-Bit Latin, JISCII, JIS Roman, JIS C6220-1969-ro, ISO646-JP or Japanese-Roman. Its ISO-IR registration number is 14.

Microsoft Windows code page 932, also called Windows-31J amongst other names, is the Microsoft Windows code page for the Japanese language, which is an extended variant of the Shift JIS Japanese character encoding. It contains standard 7-bit ASCII codes, and Japanese characters are indicated by the high bit of the first byte being set to 1. Some code points in this page require a second byte, so characters use either 8 or 16 bits for encoding.

The ISO 2033:1983 standard defines character sets for use with Optical Character Recognition or Magnetic Ink Character Recognition systems. The Japanese standard JIS X 9010:1984 is closely related.

ISO/IEC 10367:1991 is a standard developed by ISO/IEC JTC 1/SC 2, defining graphical character sets for use in character encodings implementing levels 2 and 3 of ISO/IEC 4873.

ARIB STD B24 character set: character encoding and character set extensions used in Japanese broadcasting

Volume 1 of the Association of Radio Industries and Businesses (ARIB) STD-B24 standard for Broadcast Markup Language specifies, amongst other details, a character encoding for use in Japanese-language broadcasting. It was introduced on 1999-10-26. The latest revision is version 6.3 as of 2016-07-06.

ISO-IR-111 or KOI8-E is an 8-bit character set. It is a multinational extension of KOI-8 for Belarusian, Macedonian, Serbian, and Ukrainian. The name "ISO-IR-111" refers to its registration number in the ISO-IR registry, and denotes it as a set usable with ISO/IEC 2022.

The character sets used by Videotex are based, to greater or lesser extents, on ISO/IEC 2022. Three Data Syntax systems are defined by ITU T.101, corresponding to the Videotex systems of different countries.

References

  1. "The Linux Programmer's Manual". Retrieved 2012-11-16.
  2. Japanese Industrial Standards Committee (1975-12-01). The Japanese Katakana graphic set of characters (PDF). ITSCJ/IPSJ. ISO-IR-13. Archived from the original (PDF) on 2022-03-10.
  3. Japanese Industrial Standards Committee (1975-12-01). The Japanese Roman graphic set of characters (PDF). ITSCJ/IPSJ. ISO-IR-14. Archived from the original (PDF) on 2022-03-10.
  4. Kawasaki, Yusuke (2010). Emoji encodings and cross-mapping tables in pure Perl.
  5. ECMA (1994). "7.3: Invocation of character-set code elements". Character Code Structure and Extension Techniques (PDF) (ECMA Standard) (6th ed.). p. 14. ECMA-35.
  6. Sveriges Standardiseringskommission (1975-12-01). NATS Control set for newspaper text transmission (PDF). ITSCJ/IPSJ. ISO-IR-7. Archived from the original (PDF) on 2022-03-10.