DEC Radix-50

RADIX-50, commonly called Rad-50, RAD50 or DEC Squoze, is an uppercase-only character encoding created by Digital Equipment Corporation for use on their DECsystem, PDP, and VAX computers. RADIX-50's 40-character repertoire (050 in octal) can encode six characters plus four additional bits into one 36-bit word (PDP-6, PDP-10/DECsystem-10, DECSYSTEM-20); three characters plus two additional bits into one 18-bit word (PDP-9, PDP-15); [1] or three characters into one 16-bit word (PDP-11, VAX).

The actual encoding differed between the 36-bit and 16-bit systems.
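
The word capacities quoted above follow from simple arithmetic on a 40-symbol alphabet. The following is a minimal sketch of that check; the function names `bits_needed` and `spare_bits` are illustrative and do not come from any DEC software.

```python
ALPHABET_SIZE = 40  # 050 in octal


def bits_needed(n_chars: int) -> int:
    """Bits required to hold the largest n_chars-character Radix-50 value."""
    largest = ALPHABET_SIZE ** n_chars - 1
    return largest.bit_length()


def spare_bits(word_bits: int, n_chars: int) -> int:
    """Bits left over in a word after storing n_chars packed characters."""
    return word_bits - bits_needed(n_chars)


# (word size in bits, characters per word) for the three machine families
for word_bits, n_chars in [(36, 6), (18, 3), (16, 3)]:
    print(word_bits, n_chars, bits_needed(n_chars), spare_bits(word_bits, n_chars))
```

Six characters need 32 bits, which leaves four spare bits in a 36-bit word; three characters need 16 bits, which leaves two spare bits in an 18-bit word and none in a 16-bit word.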

Etymology

The name "SQUOZE" was borrowed from the scheme used in the SHARE 709 operating system for representing object code symbols. IBM SQUOZE packed six characters of a 50-character alphabet plus two additional flag bits into one 36-bit word. [1]

36-bit systems

Radix-50 in 36-bit systems was commonly used in symbol tables for assemblers or compilers which supported six-character symbol names. This left four bits to encode properties of the symbol.

Radix-50 was not normally used in 36-bit systems for encoding ordinary character strings; file names were normally encoded as six 6-bit characters, and full ASCII strings as five 7-bit characters and one unused bit per 36-bit word.

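PDP-6, PDP-10/DECsystem-10, DECSYSTEM-20 [2]

Rows: most significant bits. Columns: least significant bits.

| MSB \ LSB | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
|-----------|-----|-----|-----|-----|-----|-----|-----|-----|
| 000 | space | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
| 001 | 7 | 8 | 9 | A | B | C | D | E |
| 010 | F | G | H | I | J | K | L | M |
| 011 | N | O | P | Q | R | S | T | U |
| 100 | V | W | X | Y | Z | . | $ | % |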
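
As a rough illustration of the 36-bit layout, the sketch below packs a six-character symbol using the character order from the table above and combines it with four property bits. Two points are assumptions rather than facts taken from the text: the four extra bits are placed in the most significant positions of the word, and the symbol must be exactly six characters long, because the padding rule for shorter symbols is not described here. The names `pack_symbol` and `unpack_symbol` are hypothetical.

```python
# Character order read row by row from the 36-bit table (codes 0..39).
PDP10_CHARS = " 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%"


def pack_symbol(name: str, flags: int = 0) -> int:
    """Pack six characters and four flag bits into one 36-bit value."""
    if len(name) != 6:
        raise ValueError("this sketch only handles six-character symbols")
    if not 0 <= flags < 16:
        raise ValueError("flags must fit in four bits")
    value = 0
    for ch in name:
        # str.index raises ValueError for characters outside the repertoire
        value = value * 40 + PDP10_CHARS.index(ch)
    # ASSUMPTION: the flag bits occupy the top four bits of the word.
    return (flags << 32) | value


def unpack_symbol(word: int) -> tuple[str, int]:
    """Reverse pack_symbol, returning (name, flags)."""
    flags = word >> 32
    value = word & 0xFFFFFFFF
    chars = []
    for _ in range(6):
        value, code = divmod(value, 40)
        chars.append(PDP10_CHARS[code])
    return "".join(reversed(chars)), flags


word = pack_symbol("SYMBOL", flags=0b0101)
print(oct(word), unpack_symbol(word))
```

The largest six-character value, 40⁶ − 1, is below 2³², so the packed characters never overlap the four flag bits in this layout.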

18-bit systems

Radix-50 (called Radix 50₈ format) was used in Digital's 18-bit PDP-9 and PDP-15 computers to store symbols in symbol tables, leaving two extra bits per word ("symbol classification bits"). [3]

16-bit systems

Some strings in DEC's 16-bit systems were encoded as 8-bit bytes, while others used Radix-50 (also called MOD40). [4] [5] In Radix-50, strings were encoded in successive words as needed, with the first character within each word located in the most significant position. For example, using the PDP-11 encoding, the string "ABCDEF", with character values 1, 2, 3, 4, 5, and 6, would be encoded as a word containing the value 1×40² + 2×40¹ + 3×40⁰ = 1,683, followed by a second word containing the value 4×40² + 5×40¹ + 6×40⁰ = 6,606. Thus, 16-bit words encoded values ranging from 0 (three spaces) to 63,999 ("999"). When fewer than three characters remained for the last word of a string, that word was padded with trailing spaces. [4]
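
The packing rule can be sketched in a few lines of Python. This is an illustrative implementation rather than DEC's code: the character order follows the PDP-11 table in this section, the combination formula is the one quoted in reference [4], and the names `rad50_encode` and `rad50_decode` are hypothetical. Code 29 is mapped to % here, although some systems left that code undefined.

```python
# Character order for the PDP-11/VAX variant (codes 0..39).
# Code 29 ("%") was undefined or reassigned on some systems.
PDP11_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def rad50_encode(text: str) -> list[int]:
    """Pack a string into 16-bit words, three characters per word."""
    padded = text + " " * (-len(text) % 3)  # pad the last word with spaces
    words = []
    for i in range(0, len(padded), 3):
        # str.index raises ValueError for characters outside the repertoire
        c1, c2, c3 = (PDP11_CHARS.index(ch) for ch in padded[i:i + 3])
        words.append((c1 * 40 + c2) * 40 + c3)
    return words


def rad50_decode(words: list[int]) -> str:
    """Unpack 16-bit words back into text, including any padding spaces."""
    chars = []
    for word in words:
        c1, rest = divmod(word, 40 * 40)
        c2, c3 = divmod(rest, 40)
        chars.extend(PDP11_CHARS[c] for c in (c1, c2, c3))
    return "".join(chars)


print(rad50_encode("ABCDEF"))  # [1683, 6606], matching the example above
print(rad50_decode([1683, 6606]))
```

The first character is multiplied by 40² so that it lands in the most significant position of the word, as the text describes.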

There were several minor variations within these encoding families. For example, the RT-11 operating system considered the character corresponding to value 011101 (binary; 35 in octal) to be undefined, [4] and some utility programs used that value to represent the * character instead.

The use of Rad-50 was the source of the filename size conventions used by Digital Equipment Corporation PDP-11 operating systems. Using Rad-50 encoding, six characters of a filename could be stored in two 16-bit words, while three more extension (file type) characters could be stored in a third 16-bit word. The period that separated the filename and its extension was implied (i.e., was not stored and always assumed to be present). Rad-50 was also commonly used in the symbol tables of the various PDP-11 programming languages.
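
The three-word filename layout described above can be sketched as follows. The helper names `pack_filename` and `unpack_filename` and the sample name `DATA.TXT` are hypothetical, and the code illustrates only the packing arithmetic, not the directory format of any particular PDP-11 operating system.

```python
PDP11_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def pack3(chars: str) -> int:
    """Pack exactly three characters into one 16-bit Radix-50 word."""
    c1, c2, c3 = (PDP11_CHARS.index(ch) for ch in chars)
    return (c1 * 40 + c2) * 40 + c3


def pack_filename(filename: str) -> list[int]:
    """Return [name word 1, name word 2, extension word]; the dot is implied."""
    name, _, ext = filename.upper().partition(".")
    if not 1 <= len(name) <= 6 or len(ext) > 3 or "." in ext:
        raise ValueError("expected up to 6 name and 3 extension characters")
    name = name.ljust(6)  # pad with trailing spaces
    ext = ext.ljust(3)
    return [pack3(name[0:3]), pack3(name[3:6]), pack3(ext)]


def unpack_filename(words: list[int]) -> str:
    """Rebuild 'NAME.EXT' from three words, re-inserting the implied dot."""
    parts = []
    for word in words:
        c1, rest = divmod(word, 40 * 40)
        c2, c3 = divmod(rest, 40)
        parts.append(PDP11_CHARS[c1] + PDP11_CHARS[c2] + PDP11_CHARS[c3])
    return (parts[0] + parts[1]).rstrip() + "." + parts[2].rstrip()


words = pack_filename("DATA.TXT")
print(words, unpack_filename(words))
```

Because the separator is never stored, the nine characters occupy six bytes, compared with nine or more bytes for the same name held as 8-bit characters.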

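PDP-11, VAX [4] [6]

Rows: most significant bits. Columns: least significant bits.

| MSB \ LSB | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
|-----------|-----|-----|-----|-----|-----|-----|-----|-----|
| 000 | space | A | B | C | D | E | F | G |
| 001 | H | I | J | K | L | M | N | O |
| 010 | P | Q | R | S | T | U | V | W |
| 011 | X | Y | Z | $ | . | % | 0 | 1 |
| 100 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |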

References

  1. Jones, Douglas W. (2018). "Lecture 7, Object Codes, Loaders and Linkers - Final steps on the road to machine code". Operating Systems, Spring 2018. Part of the CS:3620 Operating Systems Collection. The University of Iowa, Department of Computer Science. Archived from the original on 2020-06-06. Retrieved 2020-06-06.
  2. Durda IV, Frank (2004). "RADIX50 Character Code Reference". Archived from the original on 2005-03-31. Retrieved 2005-03-31.
  3. "Appendix 1". PDP-9 Utility Programs--Advanced Software System--Programmer's Reference Manual (PDF). Maynard, Massachusetts, USA: Digital Equipment Corporation. 1968. Order No. DEC-9A-GUAB-D. Archived (PDF) from the original on 2020-06-04. Retrieved 2020-06-04.
  4. "8.10 .RAD50". PAL-11R Assembler - Programmer's Manual - Program Assembly Language and Relocatable Assembler for the Disk Operating System (2nd revised printing ed.). Maynard, Massachusetts, USA: Digital Equipment Corporation. May 1971 [February 1971]. p. 8-8. DEC-11-ASDB-D. Retrieved 2020-06-18. […] PDP-11 systems programs often handle symbols in a specially coded form called RADIX 50 (this form is sometimes referred to as MOD40). This form allows 3 characters to be packed into 16 bits; therefore, any 6-character symbol can be held in two words. The single operand is of the form /CCC/ where the slash (the delimiter) can be any printable character except for = and : . The delimiters enclose the characters to be converted which may be A through Z, 0 through 9, dollar ($), dot (.) and space ( ). If there are fewer than 3 characters they are considered to be left justified and trailing spaces are assumed. […] The packing algorithm is as follows: […] A. Each character is translated into its RADIX 50 equivalent as indicated in the following table: Character - RADIX 50 Equivalent (octal): (space) - 0, A–Z - 1–32, $ - 33, . - 34, 0–9 - 36–47. Note that another character could be defined for code 35. […] B. The RADIX 50 equivalents for characters 1 through 3 (C1,C2,C3) are combined as follows: RESULT=((C1*50)+C2)*50+C3 […]
  5. PDP-11 Getting DOS on the Air (1 ed.). Maynard, Massachusetts, USA: Digital Equipment Corporation. August 1971. DEC-11-SYDC-D. Retrieved 2020-06-18.
  6. "Appendix B.3: Radix-50 Constants and Character Set". Compaq Fortran 77 Language Reference Manual. Compaq Computer Corporation. 1999. Archived from the original on 2012-10-14. Retrieved 2012-10-14.
