Magnetic ink character recognition

Magnetic ink character recognition code, abbreviated MICR code, is a character recognition technology used mainly by the banking industry to streamline the processing and clearance of cheques and other documents. The MICR encoding, called the MICR line, is printed at the bottom of cheques and other vouchers and typically includes the document-type indicator, bank code, bank account number, cheque number, cheque amount (usually added after a cheque is presented for payment), and a control indicator. The format for the bank code and bank account number is country-specific.

The technology allows MICR readers to scan and read the information directly into a data-collection device. Unlike barcodes and similar technologies, MICR characters can be read easily by humans. MICR-encoded documents can be processed much faster and more accurately than conventional OCR-encoded documents.

Pre-Unicode standard representation

The ISO standard ISO 2033:1983, and the corresponding Japanese Industrial Standard JIS X 9010:1984 (originally JIS C 6229–1984), define character encodings for OCR-A, OCR-B and E-13B.

International spread

There are two major MICR fonts in use: E-13B and CMC-7. There is no particular international agreement on which countries use which font. [1] In practice, this does not create particular problems as cheques and other vouchers do not usually flow out of a particular jurisdiction.

The E-13B font has been adopted as an international standard in ISO 1004-1:2013, and is the standard in Australia, Canada, the United Kingdom, the United States, as well as Central America and much of Asia, besides other countries. [1]

The CMC-7 font has been adopted as an international standard in ISO 1004-2:2013, and is widely used in Europe (including France and Italy), Mexico, and South America (including Argentina, Brazil, and Chile), among other countries.

Israel is the only country that can use both fonts simultaneously, though the practice makes the system significantly less efficient. This situation is the product of the Israelis adopting CMC-7, while the Palestinians opted for E-13B. [1]

Fonts

E-13B

MICR E-13B font of 14 characters. The control characters bracketing each numeral block are (from left to right) transit, on-us, amount, and dash.

E-13B has a 14-character set, comprising the 10 decimal digits and four control symbols: transit, on-us, amount, and dash.

In the check printing and banking industries the E-13B MICR line is also commonly referred to as the TOAD line. The name comes from the four control characters: Transit, On-us, Amount, and Dash. [citation needed] Compared to CMC-7, some pairs of E-13B characters (notably 2 and 5) can produce relatively similar results when magnetically scanned; however, E-13B also performs well under optical character recognition, which serves as a fallback if magnetic reading fails. [1]

The E-13B repertoire can be represented in Unicode (see below). Prior to Unicode, it could be encoded according to ISO 2033:1983, which encodes digits in their usual ASCII locations, transit as 0x3A, on us as 0x3C, amount as 0x3B, and dash as 0x3D. [2] For EBCDIC, IBM code page 1001 encodes digits in their usual EBCDIC locations, transit as 0xDB, on us as 0xEB, amount as 0xCB, and dash as 0xFB. [3]
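The ISO 2033:1983 byte assignments described above can be sketched as a small lookup table. This is only an illustration, not a reference implementation: the function name `decode_iso_ir_98` and the sample bytes are hypothetical, and the Unicode targets assume that transit, amount, on us, and dash correspond to U+2446 through U+2449 in that order.

```python
# Byte values come from the paragraph above (ISO 2033:1983 / ISO-IR-98):
# digits stay at their ASCII positions, followed by the four E-13B symbols.
# The Unicode targets are an assumption: U+2446..U+2449 in the order
# transit, amount, on us, dash.
ISO_IR_98_TO_UNICODE = {
    0x3A: "\u2446",  # transit
    0x3B: "\u2447",  # amount
    0x3C: "\u2448",  # on us
    0x3D: "\u2449",  # dash
}
ISO_IR_98_TO_UNICODE.update({0x30 + d: str(d) for d in range(10)})


def decode_iso_ir_98(data: bytes) -> str:
    """Decode E-13B bytes to Unicode; raise ValueError on any other byte."""
    try:
        return "".join(ISO_IR_98_TO_UNICODE[b] for b in data)
    except KeyError as exc:
        bad = exc.args[0]
        raise ValueError(f"byte 0x{bad:02X} is not in the E-13B set") from None


# Hypothetical sample: a transit symbol, nine digits, a transit symbol.
sample = b":123456789:"
print(decode_iso_ir_98(sample))
```

Rejecting unknown bytes rather than passing them through keeps the sketch faithful to the 14-character repertoire.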

IBM code page 1032 extends code page 1001 by adding alternative encodings for transit at 0x5C, 0x7A and 0xC1, on us at 0x4C, 0x61 and 0xC3, amount at 0x5B, 0x5E and 0xC2 and dash at 0x60, 0x7E and 0xC4, in addition to a zero-width space at 0x5A. [4] These alternative representations were added for interoperability with Siemens and Océ printers. [5]
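The relationship between the two EBCDIC code pages can be illustrated by building code page 1032 as a superset of code page 1001. The sketch below uses only the byte positions quoted above. The table and function names are hypothetical, the Unicode targets (U+2446 to U+2449, and U+200B for the zero-width space) are assumptions made for illustration, and the digit positions 0xF0 to 0xF9 are the usual EBCDIC digit locations.

```python
# Assumed Unicode targets for the four E-13B symbols.
TRANSIT, AMOUNT, ON_US, DASH = "\u2446", "\u2447", "\u2448", "\u2449"

# Code page 1001: digits at the usual EBCDIC positions 0xF0-0xF9,
# plus the four symbol positions quoted in the text.
CP1001 = {0xF0 + d: str(d) for d in range(10)}
CP1001.update({0xDB: TRANSIT, 0xEB: ON_US, 0xCB: AMOUNT, 0xFB: DASH})

# Code page 1032: everything in 1001, plus alternative positions
# for the same four symbols and a zero-width space at 0x5A.
CP1032 = dict(CP1001)
ALTERNATES = {
    TRANSIT: (0x5C, 0x7A, 0xC1),
    ON_US: (0x4C, 0x61, 0xC3),
    AMOUNT: (0x5B, 0x5E, 0xC2),
    DASH: (0x60, 0x7E, 0xC4),
}
for symbol, positions in ALTERNATES.items():
    for position in positions:
        CP1032[position] = symbol
CP1032[0x5A] = "\u200b"  # zero-width space (Unicode target is an assumption)


def decode(data: bytes, table: dict) -> str:
    """Decode bytes with the given table; raise ValueError on unmapped bytes."""
    out = []
    for b in data:
        if b not in table:
            raise ValueError(f"byte 0x{b:02X} is not mapped")
        out.append(table[b])
    return "".join(out)


# The same transit symbol, reached through a 1001 position and a 1032 alternate.
print(decode(bytes([0xDB, 0xF1, 0xF2, 0x5C]), CP1032))
```

Because every code page 1001 position is carried over unchanged, data encoded for 1001 decodes identically under 1032, while the alternative positions are rejected by the 1001 table.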

CMC-7

MICR CMC-7 font of 41 characters. The control characters after the numerals are (from left to right) SI (internal), SII (terminator), SIII (amount), SIV (unused), and SV (routing).

CMC-7 includes 10 numeric digits, 26 capital letters, [6] [7] and 5 control characters: SI (internal), [citation needed] SII (terminator), [citation needed] SIII (amount), [citation needed] SIV (an unused character), and SV (routing). [citation needed]

CMC-7 has a barcode format, with every character having two distinct large gaps in different places, as well as distinct patterns in between, to minimize any chance for character confusion while reading magnetically; however, these bars are too close and narrow to be reliably recognised at a typical scan resolution if falling back to optical scanning. CMC-7 can also produce superficially successful, but incorrect, scans of upside-down MICR lines. [1]

Unicode does not include support for the CMC-7 control symbols. IBM code page 1033 does provide encodings for them. [8]

MICR reader

MICR characters are printed on documents in one of the two MICR fonts, using magnetizable (commonly known as magnetic) ink or toner, usually containing iron oxide. In scanning, the document is passed through a MICR reader, which performs two functions: magnetization of the ink, and detection of the characters. The characters are read by a MICR reader head, a device similar to the playback head of a tape recorder. As each character passes over the head, it produces a unique waveform that can be easily identified by the system.

MICR readers are the primary tool for cheque sorting and are used across the cheque distribution network at multiple stages. For example, a merchant will use a MICR reader to sort cheques by bank and send the sorted cheques to a clearing house for redistribution to those banks. Upon receipt, the banks perform another MICR sort to determine which customer's account is charged and to which branch the cheque should be sent on its way back to the customer. However, many banks no longer return the cheque to the customer; instead, cheques are scanned and stored digitally. Cheques are sorted according to the geographical coverage of banks within a country. [9]
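The two sorting passes described above can be pictured as two grouping steps over already-decoded MICR lines. This is a toy sketch only: the `Cheque` record, its field names, and the sample values are hypothetical, and real MICR line layouts are country-specific.

```python
from collections import defaultdict
from typing import NamedTuple


class Cheque(NamedTuple):
    # Hypothetical fields; real MICR line layouts are country-specific.
    bank_code: str
    account: str
    cheque_number: str


def group_by(cheques, field):
    """Group cheques by the named field, preserving input order in each group."""
    groups = defaultdict(list)
    for cheque in cheques:
        groups[getattr(cheque, field)].append(cheque)
    return dict(groups)


# Made-up sample data for illustration.
received = [
    Cheque("011", "555", "0001"),
    Cheque("022", "777", "0002"),
    Cheque("011", "666", "0003"),
]

# First pass (merchant or clearing house): one bundle per bank.
by_bank = group_by(received, "bank_code")

# Second pass (receiving bank): one bundle per customer account.
by_account = group_by(by_bank["011"], "account")
print(sorted(by_bank), sorted(by_account))
```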

Unicode

OCR and MICR characters have been included in the Unicode Standard since at least version 1.1 (June 1993). Since the Unicode Character Database only tracks characters starting with version 1.1, they may also have been present in Unicode 1.0 or 1.0.1. [10]

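The Unicode block that includes OCR and MICR characters is called Optical Character Recognition and covers U+2440–U+245F. Of the characters in this block, four are from the MICR E-13B font:

U+2446 ⑆ OCR BRANCH BANK IDENTIFICATION
U+2447 ⑇ OCR AMOUNT OF CHECK
U+2448 ⑈ OCR DASH
U+2449 ⑉ OCR CUSTOMER ACCOUNT NUMBER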

The names of the latter two characters were inadvertently switched when they were named in ISO/IEC 10646:1993, [12] and they have been assigned accurate names as formal aliases. [11] Per the Unicode Stability Policy, the existing names remain, allowing their use as stable identifiers. [13] Additionally, all four characters have informative (non-formal) aliases in the Unicode charts: "transit", "amount", "on us", and "dash" respectively.
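The name mismatch can be observed with Python's standard `unicodedata` module. The sketch below assumes the four E-13B symbols occupy U+2446 through U+2449 and pairs each stable Unicode name with the informative alias given above. The helper name `micr_symbols` is hypothetical.

```python
import unicodedata

# Informative aliases from the text, in code point order.
INFORMATIVE_ALIASES = ["transit", "amount", "on us", "dash"]


def micr_symbols():
    """Return (code point, character, Unicode name, alias) for U+2446-U+2449."""
    rows = []
    for offset, alias in enumerate(INFORMATIVE_ALIASES):
        code_point = 0x2446 + offset
        char = chr(code_point)
        rows.append((f"U+{code_point:04X}", char, unicodedata.name(char), alias))
    return rows


for row in micr_symbols():
    print(*row, sep="  ")
```

The third row shows the swap: the character whose stable name is OCR DASH carries the alias "on us", while the fourth, named OCR CUSTOMER ACCOUNT NUMBER, is the dash.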

Prior to Unicode, these symbols had been encoded by the ISO-IR-98 encoding defined by ISO 2033:1983, in which they were simply named SYMBOL ONE through SYMBOL FOUR. They were encoded immediately following the digits, which were encoded at their ASCII locations. [2] Although ISO 2033 also specifies encoding for OCR-A and OCR-B, its encoding for E-13B is known simply as ISO_2033-1983 by the IANA. [14]

Optical Character Recognition
Official Unicode Consortium code chart (PDF)
        0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
U+244x  ⑀  ⑁  ⑂  ⑃  ⑄  ⑅  ⑆  ⑇  ⑈  ⑉  ⑊
U+245x
Notes
1. As of Unicode version 15.1
2. Blank cells indicate non-assigned code points

History

An early demonstration of use of an E-13 MICR font on a cheque. The "transit" glyph differs from E-13B.

Before the mid-1940s, cheques were processed manually using the Sort-A-Matic or Top Tab Key method. Processing and clearing cheques was very time-consuming and was a significant cost in bank operations. As the number of cheques increased, ways were sought to automate the process. Standards were developed to ensure uniformity across financial institutions. By the mid-1950s, the Stanford Research Institute and General Electric [15] Computer Laboratory had developed the first automated system to process cheques using MICR. The same team also developed the E-13B MICR font. "E" refers to the font being the fifth considered, and "B" to the fact that it was the second version. The "13" refers to the 0.013-inch character grid.

A cheque signed by Gerald Ford, showing E-13B markings

A trial of the MICR E-13B font was shown to the American Bankers Association (ABA) in July 1956, and the ABA adopted it in 1958 as the MICR standard for negotiable documents in the United States. The ABA chose MICR because machines could read it accurately and it could be printed using existing technology. In addition, MICR remained machine-readable even after damage such as overstamping, marking, and mutilation. The first cheques using MICR were printed by the end of 1959. Although compliance with MICR standards was voluntary in the United States, the standard had been almost universally adopted there by 1963. [16] In 1963, ANSI adopted the ABA's E-13B font as the American standard for MICR printing, [17] and E-13B was also standardized as ISO 1004:1995.

Other countries set their own standards, though the MICR readers and most other equipment were US manufactured. MICR technology has been adopted in many countries, with some variations. The E-13B font was adopted as the standard in the United States, Canada, United Kingdom, Australia, and many other countries. In Australia, the system is managed by the Australian Payments Network.

A cheque signed by Enzo Ferrari in the collection of the Museo Ferrari, showing CMC-7 markings

The CMC-7 font was developed in France by Groupe Bull in 1957. It was adopted as the MICR standard in Argentina, France, Italy, and some other European countries.

In the 1960s, the MICR fonts became a symbol of modernity or futurism, leading to the creation of lookalike "computer" typefaces that imitated the appearance of the MICR fonts but, unlike real MICR fonts, had a full character set.

MICR E-13B is also used to encode information in other applications, such as sales promotions, coupons, credit cards, airline tickets, insurance premium receipts, and deposit tickets. E-13B is the version specifically developed for offset litho printing. There was a subtly different version for letterpress, [citation needed] called E-13A. There was also a rival system named 'Fred' (Figure Reading Electronic Device), which used figures that looked more conventional.

References

  1. Battle of the MICR Fonts: Which Is Better, E13B or CMC7?
  2. ISO/TC97/SC2 (1985-08-01). ISO-IR-98: A set of 14 graphic characters of the E-13B font (PDF). ITSCJ/IPSJ.
  3. "Code Page 01001" (PDF). IBM. Archived from the original (PDF) on 2015-07-08. Retrieved 2021-10-19.
  4. "Code Page 01032" (PDF). IBM. Archived from the original (PDF) on 2015-07-08. Retrieved 2021-10-19.
  5. "MICR Fonts for Infoprint 4100 Printers". IBM. 2004-06-24.
  6. "ConnectCode MICR CMC7" (PDF). ConnectCode Pte Ltd. 2021.
  7. Information processing — Magnetic ink character recognition — Part 2: Print specifications for CMC7. ISO. 2013-06-01. ISO 1004-2:2013. (Preview excerpt)
  8. "Code Page 01033" (PDF). IBM. Archived from the original (PDF) on 2015-07-08. Retrieved 2021-10-19.
  9. "Reserve Bank of India - Publications". rbi.org.in.
  10. Unicode Consortium (2019-09-08). "Derived Age". Unicode Character Database: Derived Property Data.
  11. Freytag, Asmus; McGowan, Rick; Whistler, Ken (2017-04-10). Known Anomalies in Unicode Character Names (4 ed.). Unicode Consortium. Unicode Technical Note #27.
  12. ISO/IEC JTC 1/SC 2/WG 2 (2012-01-03). "T.3. Optical Character Recognition". Unconfirmed minutes of WG 2 meeting 58 (PDF). p. 29. SC2 N4188 / WG2 N4103.
  13. "Unicode Character Encoding Stability Policies". Unicode Consortium. 2017-06-23.
  14. "Character Sets". IANA.
  15. "ARTICLES: Magnetic Ink Character Recognition" (PDF). Computers and Automation. 5 (10): 10–16, 44 (12 - Other Sessions). Oct 1956. Retrieved 2020-09-05.
  16. Mandell, Lewis (May 1977). "Diffusion of EFTS among National Banks: Notes". Journal of Money, Credit and Banking. 9 (2): 341–348. doi:10.2307/1991983. JSTOR 1991983.
  17. ANSI standard X9.27-1995 and ANSI standard ANS X9.7-1990.