Signed overpunch

In computing, a signed overpunch is a coding scheme that stores the sign of a number by changing one of its digits, usually the last. It is used in character data on IBM mainframes by languages such as COBOL, PL/I, and RPG. [1] Its purpose is to save the character position that a separate sign character would otherwise occupy. [2] The code is derived from the Hollerith Punched Card Code, where both a digit and a sign can be entered in the same card column. It is called an overpunch because the digit in that column has a 12-punch or an 11-punch above it to indicate the sign. The top three rows of the card are called zone punches, [3] and so numeric character data which may contain overpunches is called zoned decimal.

In IBM terminology, the low-order four bits of a byte in storage are called the digit, and the high-order four bits are the zone. [4] The digit bits contain the numeric value 0–9. In positions without an overpunch, the zone bits contain 'F'x, forming the characters 0–9. In the position that carries the overpunch, the zone bits instead contain a hexadecimal value indicating the sign, forming a different set of characters: zones A, C, E, and F indicate positive values, and zones B and D indicate negative values.
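The zone and digit split described above can be sketched in Python. This is an illustrative sketch only: the function names `split_byte` and `decode_zoned` are hypothetical, and the sketch assumes the sign is carried in the last byte of the field.

```python
# Zone values as listed in the text: A, C, E, F positive; B, D negative.
POSITIVE_ZONES = {0xA, 0xC, 0xE, 0xF}
NEGATIVE_ZONES = {0xB, 0xD}


def split_byte(byte: int) -> tuple[int, int]:
    """Return (zone, digit): the high-order and low-order four bits."""
    return byte >> 4, byte & 0x0F


def decode_zoned(field: bytes) -> int:
    """Decode an EBCDIC zoned decimal field (hypothetical helper).

    Assumes the overpunch, if any, is in the last byte.
    """
    if not field:
        raise ValueError("empty field")
    value = 0
    for byte in field:
        _, digit = split_byte(byte)
        if digit > 9:
            raise ValueError(f"invalid digit nibble in byte {byte:#04x}")
        value = value * 10 + digit
    sign_zone, _ = split_byte(field[-1])
    if sign_zone in NEGATIVE_ZONES:
        return -value
    if sign_zone in POSITIVE_ZONES:
        return value
    raise ValueError(f"invalid sign zone {sign_zone:#x}")
```

For example, `decode_zoned(bytes.fromhex("F1F0F2D1"))` reads the digits 1, 0, 2, 1 and a final zone of D, giving -1021.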

The PACK instruction on IBM System/360 architecture machines converts the sign of a zoned decimal number when producing its packed decimal equivalent, and the corresponding UNPK instruction sets the correct overpunched sign in its zoned decimal output. [5]
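The nibble movement performed by these two instructions can be approximated as follows. This is a simplified sketch, not a model of the hardware: the real instructions take explicit operand lengths and do not validate digits, and the Python names `pack` and `unpack` are hypothetical stand-ins.

```python
def pack(zoned: bytes) -> bytes:
    """Zoned to packed: keep each digit nibble, move the last zone to the end."""
    if not zoned:
        raise ValueError("empty field")
    digits = [b & 0x0F for b in zoned]
    sign = zoned[-1] >> 4              # zone of the rightmost byte is the sign
    nibbles = digits + [sign]
    if len(nibbles) % 2:               # pad on the left to a whole byte
        nibbles.insert(0, 0)
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def unpack(packed: bytes) -> bytes:
    """Packed to zoned: F zones everywhere except the signed rightmost byte."""
    if not packed:
        raise ValueError("empty field")
    nibbles = []
    for b in packed:
        nibbles.extend((b >> 4, b & 0x0F))
    sign = nibbles.pop()               # last nibble of packed decimal is the sign
    zoned = [0xF0 | d for d in nibbles]
    zoned[-1] = (sign << 4) | nibbles[-1]
    return bytes(zoned)
```

In this sketch the zoned field 'F1F0F2D1'x packs to '01021D'x. Unpacking that result yields 'F0F1F0F2D1'x, with a leading zero because the sketch always emits every packed digit.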

Language support

PL/I

PL/I uses the PICTURE attribute to declare zoned decimal data with a signed overpunch. Each character in a numeric picture except V, which indicates the position of the assumed decimal point, represents a digit. A picture character of T, I, or R indicates a digit position which may contain an overpunch. T indicates that the position will contain {–I if positive and }–R if negative. I indicates that the position will contain {–I if positive and 0–9 if negative. R indicates that the position will contain 0–9 if positive and }–R if negative.

For example, PICTURE 'Z99R' describes a four-character numeric field. The first position may be blank or will contain a digit 0–9. The next two positions will contain digits, and the fourth position will contain 0–9 for a positive number and }–R for a negative number. [6]

Assigning the value 1021 to the above picture will store the characters "1021" in memory; assigning -1021 will store "102J".
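The behaviour of the T, I, and R picture characters in the last position can be illustrated with a short Python sketch. The function `store_signed` is hypothetical and deliberately simplified: it zero-fills the field and does not model zero suppression (the Z picture character) or any other picture feature.

```python
POSITIVE = "{ABCDEFGHI"   # overpunched 0-9 for a positive value
NEGATIVE = "}JKLMNOPQR"   # overpunched 0-9 for a negative value


def store_signed(value: int, width: int, sign_char: str = "R") -> str:
    """Return the characters stored for value, with the sign in the last position.

    sign_char is the picture character used for that position: T, I, or R.
    """
    if sign_char not in ("T", "I", "R"):
        raise ValueError("sign_char must be 'T', 'I', or 'R'")
    digits = str(abs(value)).rjust(width, "0")
    if len(digits) > width:
        raise ValueError("value does not fit in the field")
    last = int(digits[-1])
    if value < 0:
        # I leaves negative values as a plain digit; T and R overpunch them.
        final = digits[-1] if sign_char == "I" else NEGATIVE[last]
    else:
        # R leaves positive values as a plain digit; T and I overpunch them.
        final = digits[-1] if sign_char == "R" else POSITIVE[last]
    return digits[:-1] + final
```

With these assumptions, `store_signed(1021, 4)` gives "1021" and `store_signed(-1021, 4)` gives "102J", matching the example above. Using "T" instead stores the positive value as "102A".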

COBOL

COBOL uses the picture character 'S' for USAGE IS DISPLAY data without SIGN IS SEPARATE CHARACTER to indicate an overpunch. SIGN IS LEADING indicates that the overpunch is over the first character of the field. SIGN IS TRAILING locates it over the last character. SIGN IS TRAILING is the default. [7]
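The effect of the sign position can be shown with a small decoder. This is a sketch under stated assumptions: `decode_display` is a hypothetical name, the input is taken to be already translated to characters, and a plain digit in the sign position is treated as positive.

```python
POSITIVE = "{ABCDEFGHI"   # overpunched 0-9, positive
NEGATIVE = "}JKLMNOPQR"   # overpunched 0-9, negative
DIGITS = "0123456789"


def decode_display(text: str, leading: bool = False) -> int:
    """Decode an overpunched field; leading=True mimics SIGN IS LEADING."""
    if not text:
        raise ValueError("empty field")
    pos = 0 if leading else len(text) - 1
    ch = text[pos]
    if ch in NEGATIVE:
        sign, digit = -1, NEGATIVE.index(ch)
    elif ch in POSITIVE:
        sign, digit = 1, POSITIVE.index(ch)
    elif ch in DIGITS:
        sign, digit = 1, int(ch)      # assumption: no overpunch means positive
    else:
        raise ValueError(f"unexpected sign character {ch!r}")
    plain = text[:pos] + str(digit) + text[pos + 1:]
    if not all(c in DIGITS for c in plain):
        raise ValueError("non-digit character outside the sign position")
    return sign * int(plain)
```

Under this sketch the value -123 appears as "12L" with a trailing sign and as "J23" with a leading sign, and both decode to the same number.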

C/C++

The C language has no provision for zoned decimal. The IBM ILE C/C++ compiler for System i provides library functions for conversion between int or double and zoned decimal: QXXDTOZ, QXXITOZ, QXXZTOD, and QXXZTOI. [8]

EBCDIC overpunch codes

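| EBCDIC character | Digit | Sign | Card code [9] |
|---|---|---|---|
| { | 0 | + | 12-0 |
| A | 1 | + | 12-1 |
| B | 2 | + | 12-2 |
| C | 3 | + | 12-3 |
| D | 4 | + | 12-4 |
| E | 5 | + | 12-5 |
| F | 6 | + | 12-6 |
| G | 7 | + | 12-7 |
| H | 8 | + | 12-8 |
| I | 9 | + | 12-9 |
| } | 0 | - | 11-0 |
| J | 1 | - | 11-1 |
| K | 2 | - | 11-2 |
| L | 3 | - | 11-3 |
| M | 4 | - | 11-4 |
| N | 5 | - | 11-5 |
| O | 6 | - | 11-6 |
| P | 7 | - | 11-7 |
| Q | 8 | - | 11-8 |
| R | 9 | - | 11-9 |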

Examples

10} is -100
45A is 451
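The overpunch characters follow directly from combining a sign zone with a digit, which a few lines of Python can demonstrate. The sketch assumes zone C for positive and zone D for negative, and uses Python's built-in `cp037` codec as the EBCDIC code page; other EBCDIC code pages may place different characters at 'C0'x and 'D0'x. The function name `overpunch_table` is hypothetical.

```python
def overpunch_table():
    """Build (character, digit, sign, card code) rows from zone and digit."""
    rows = []
    for sign, zone, punch in (("+", 0xC, 12), ("-", 0xD, 11)):
        for digit in range(10):
            # Zone in the high-order four bits, digit in the low-order four.
            char = bytes([(zone << 4) | digit]).decode("cp037")
            rows.append((char, digit, sign, f"{punch}-{digit}"))
    return rows


if __name__ == "__main__":
    for row in overpunch_table():
        print(*row, sep="\t")
```

In the examples above, the final "}" of "10}" is byte 'D0'x (zone D, digit 0), and the final "A" of "45A" is byte 'C1'x (zone C, digit 1).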

ASCII representation

Representation of signed overpunch characters "is not standardized in ASCII, and different compilers use different overpunch codes." In some cases, "the representation is not the same as the result of converting an EBCDIC Signed field to ASCII with a translation table." [10] In other cases they are the same, to maintain source-data compatibility at the loss of the connection between the character code and the corresponding digit.

An EBCDIC negative field ending with the digit '1' will encode that digit as 'D1'x, upper-case 'J', where the digit is '1' and the zone is 'D' to indicate a negative field. ASCII upper-case 'J' is '4A'x, where the hexadecimal value bears no relationship to the numeric value. An alternative encoding uses lower-case 'q', '71'x, for this representation, where the digit is '1' and the zone is '7'. This preserves the digit and the collating sequence at the cost of having to recognize and translate fields with overpunches individually.
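The zone-7 alternative can be sketched in the same zone and digit terms. This is an illustration only: `decode_ascii_zone7` is a hypothetical name, and the sketch assumes that a plain ASCII digit (zone 3) in the last position means a positive value, which the text above does not state.

```python
DIGITS = "0123456789"


def decode_ascii_zone7(text: str) -> int:
    """Decode an ASCII field whose last character may be 'p'..'y' (negative)."""
    if not text:
        raise ValueError("empty field")
    body, last = text[:-1], ord(text[-1])
    zone, digit = last >> 4, last & 0x0F
    if digit > 9:
        raise ValueError("invalid digit in sign position")
    if zone == 0x7:
        sign = -1                      # 'p'..'y' are '70'x..'79'x
    elif zone == 0x3:
        sign = 1                       # assumption: plain digit means positive
    else:
        raise ValueError("unexpected zone in sign position")
    if not all(c in DIGITS for c in body):
        raise ValueError("non-digit character outside the sign position")
    return sign * int(body + str(digit))
```

Here "102q" decodes to -1021 because 'q' is '71'x: zone 7, digit 1.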

Examples

GnuCOBOL and Micro Focus COBOL use lower-case 'p' through 'y' to represent negative '0' through '9'. [11] [12]

PL/I compilers on ASCII systems use the same set of characters ({, J–R) as EBCDIC to represent overpunches. [13]

References

  1. IBM Corporation (June 1994). RPG/400 Reference (PDF). p. 403. Retrieved Aug 7, 2018.
  2. "Tech Talk, COBOL Tutorials, EBCDIC to ASCII Conversion of Signed Fields". Retrieved Mar 15, 2008.
  3. Van Overberghe, Jr., Albert G. (1987). Data Processing Technician Third Class. Naval Education and Training Program. pp. 3–8. Retrieved Jan 12, 2022.
  4. IBM Corporation. IBM System/360 Principles of Operation (PDF). p. 34. Retrieved Jan 12, 2022.
  5. IBM Corporation (Oct 2001). z/Architecture Principles of Operation (2nd ed.). pp. 7–112, 7–158. Retrieved Aug 7, 2018.
  6. IBM Corporation (June 1995). IBM PL/I for MVS & VM Language Reference (PDF). pp. 294–296. Retrieved Aug 2, 2018.
  7. IBM Corporation. "Enterprise COBOL for z/OS, V4.2, Language Reference". IBM Knowledge Center. Retrieved May 1, 2020.
  8. IBM Corporation. "Library Functions". IBM Knowledge Center. Retrieved May 1, 2020.
  9. IBM Corporation (1989). System/370 Extended Architecture Reference Summary. p. 41.
  10. "EBCDIC to ASCII Conversion of Signed Fields". DISC Media Conversion Specialists. Retrieved Nov 29, 2018.
  11. "GnuCOBOL Programmer's Guide". SourceForge. Retrieved Jan 12, 2022.
  12. "Micro Focus Visual COBOL 5.0 for Visual Studio 2019". Micro Focus. Retrieved Jan 12, 2022.
  13. Kednos Corporation. "Kednos PL/I for OpenVMS Systems Reference Manual". Kednos.com. Retrieved Jan 12, 2022.