Stanford/ITS character set

Language(s) English
Created by MIT [1]
Definitions RFC 734
Classification Extended ASCII
Extends US-ASCII
Based on SEASCII [2]

The Stanford/ITS character set is an extended ASCII character set based on SEASCII, with modifications that make it compatible with 1968 ASCII. [2]

Usage

It is used as an alternate character set of the SUPDUP protocol for terminals with the %TOSAI and %TOFCI bits set. [2] It is also recommended for TeX implementations on systems with large character sets. [1] The default plain TeX macro package sets the hexadecimal codes 0B (↑) and 01 (↓) as alternative character codes for superscripts and subscripts, respectively (the defaults being ^ and _). [3]
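As a rough illustration of the alternative superscript and subscript codes, the following Python sketch rewrites them to the default ^ and _ characters. It assumes the up-arrow is code 0x0B and the down-arrow is code 0x01; the function name is hypothetical and is not part of TeX or the plain TeX macro package.

```python
# Assumed codes: 0x0B (up-arrow) for superscripts, 0x01 (down-arrow) for subscripts.
ALT_TO_DEFAULT = bytes.maketrans(b"\x0b\x01", b"^_")


def normalize_script_codes(data: bytes) -> bytes:
    """Replace the alternative superscript/subscript codes with ^ and _."""
    return data.translate(ALT_TO_DEFAULT)
```

For example, `normalize_script_codes(b"x\x0b2")` returns `b"x^2"`.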

The Knight keyboard is an example of a keyboard capable of inputting all of the defined characters except ⋅, γ, δ, ±, ⊕, ◊ and ∫, which occupy the code positions of the ASCII control characters NUL, HT, LF, FF, CR, ESC and DEL, respectively.
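The seven overlapping positions can be written out explicitly. The sketch below pairs each standard ASCII control code value with the glyph named above; the dictionary and function names are hypothetical, and the Unicode characters are simply the ones used in this article.

```python
# Code value -> (ASCII control name, Stanford/ITS glyph at the same position).
CONTROL_COLLISIONS = {
    0x00: ("NUL", "⋅"),
    0x09: ("HT", "γ"),
    0x0A: ("LF", "δ"),
    0x0C: ("FF", "±"),
    0x0D: ("CR", "⊕"),
    0x1B: ("ESC", "◊"),
    0x7F: ("DEL", "∫"),
}


def collides_with_control(code: int) -> bool:
    """True if this code is one of the seven glyphs that share a position
    with the ASCII control characters listed above."""
    return code in CONTROL_COLLISIONS
```

A program that must tell the glyph from the control function at these positions needs context beyond the code value itself.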

Coverage

Each character is encoded as a single seven-bit code value. The set contains all 95 printable ASCII characters along with 27 mathematical symbols and 6 Greek letters, filling all 128 code positions.

Code page layout

Stanford/ITS character set
    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x  ⋅  ↓  α  β  ∧  ¬  ∈  π  λ  γ  δ  ↑  ±  ⊕  ∞  ∂
1x  ⊂  ⊃  ∩  ∪  ∀  ∃  ⊗  ↔  ←  →  ≠  ◊  ≤  ≥  ≡  ∨
2x  SP !  "  #  $  %  &  '  (  )  *  +  ,  -  .  /
3x  0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
4x  @  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
5x  P  Q  R  S  T  U  V  W  X  Y  Z  [  \  ]  ^  _
6x  `  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
7x  p  q  r  s  t  u  v  w  x  y  z  {  |  }  ~  ∫
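The layout can be turned into a small decoder. The following Python sketch maps seven-bit Stanford/ITS codes to Unicode text; the choice of Unicode code point for each glyph is an assumption, and the names `LOW_GLYPHS`, `DEL_GLYPH` and `decode_stanford_its` are hypothetical.

```python
# Glyphs for codes 0x00-0x1F in code order (Unicode choices are assumptions).
LOW_GLYPHS = "⋅↓αβ∧¬∈πλγδ↑±⊕∞∂⊂⊃∩∪∀∃⊗↔←→≠◊≤≥≡∨"
DEL_GLYPH = "∫"  # code 0x7F


def decode_stanford_its(data: bytes) -> str:
    """Decode seven-bit Stanford/ITS codes into a Unicode string."""
    out = []
    for code in data:
        if code > 0x7F:
            raise ValueError(f"not a seven-bit code: {code:#04x}")
        if code < 0x20:
            out.append(LOW_GLYPHS[code])   # symbol block replaces C0 controls
        elif code == 0x7F:
            out.append(DEL_GLYPH)          # integral sign replaces DEL
        else:
            out.append(chr(code))          # 0x20-0x7E match printable ASCII
    return "".join(out)
```

For example, `decode_stanford_its(bytes([0x02, 0x1A, 0x03]))` returns `"α≠β"`. Printable ASCII passes through unchanged, which reflects the compatibility with 1968 ASCII that the set was designed for.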

References

  1. Knuth, Donald (1986). "Appendix C: Character Codes". The TeXbook (PDF). Reading, Massachusetts: Addison-Wesley. p. 369. ISBN 0201134470.
  2. Crispin, Mark (October 1977). "Stanford/ITS character set". SUPDUP Protocol. IETF. p. 12. doi:10.17487/RFC0734. RFC 734. Retrieved February 18, 2019.
  3. Knuth, Donald (1986). "Appendix B: Basic Control Sequences". The TeXbook (PDF). Reading, Massachusetts: Addison-Wesley. p. 343. ISBN 0201134470.

Further reading