OCR-B

Category Sans-serif
Classification Neo-grotesque
Designer(s) Adrian Frutiger
Date created 1968

OCR-B is a monospaced font developed in 1968 by Adrian Frutiger for Monotype, following the European Computer Manufacturers Association (ECMA) standard. It was designed to facilitate optical character recognition by electronic devices, originally for financial and banking applications. It was accepted as the world standard in 1973. [1] It follows the ISO 1073-2:1976 (E) standard, refined in 1979 ("letterpress" design, size I). It includes all ASCII symbols, as well as other symbols needed in banking. It is widely used for the human-readable digits in UPC/EAN barcodes [2] and for machine-readable passports. [3] It shares its purpose with OCR-A, but it is easier for humans to read and looks less technical than OCR-A.

History

In June 1961, the European Computer Manufacturers Association (ECMA) started standardization activities related to optical character recognition (OCR). After evaluating existing OCR designs, it decided to develop two new fonts: a stylized design with just digits, called “Class A”, and a more conventional type design with broader character coverage, called “Class B”. In February 1965, ECMA proposed a design for the “Class B” font to ISO, which adopted it as international standard ISO 1073-2 in October 1965. [4] The first revision contained three font sizes: I, II and III. The specification included a Letterpress design, intended for high-quality printing equipment, and a rounded-edge Constant Strokewidth design for impact printers with reduced typographic quality. [5] :3

In September 1969, ECMA started work to revise its published standard. To make OCR-B more widely accepted, the shapes of some characters were slightly modified. The new revision removed font size II, which had been rarely used in practice; it deleted five character shapes; and it added a new font size IV. ECMA published the second edition of OCR-B in October 1971. [4]

In March 1976, ECMA published a third revision of its ECMA-11 specification. It added the symbols § and ¥ to OCR-B, along with two types of erasure marks (█) for blacking out misprinted characters, and it changed the length of the vertical bar to match ISO 1073-2. [4]

In 1993, Turkey proposed extending ISO 1073-2 to include the Turkish letters Ğğ, İı, and Şş. [6] The request was generalized to extend OCR-B with a number of Latin and Greek letters used in European languages. [7] :27 A revision of the ISO 1073-2:1976 standard was therefore started, producing three successive draft documents. The final draft would have extended OCR-B with 40 Latin and 10 Greek letters; for six Latin letters, the draft gave new alternate shapes. [7] :26 A request to extend OCR-B with Vietnamese accents was rejected. [7] :27 Unlike previous versions of the standard, which specified glyph shapes via reference drawings, the new revision would have included the shapes in machine-readable form. [7] :26 However, industry support for testing the new font could not be secured at the time, so the revision effort was halted in 1997. [7] :IV The working group described its findings in a technical report. [7] :1

Two proposed variants for the OCR-B Euro sign

In June 1998, the European Committee for Standardization published a report on adding the Euro sign to OCR-B. [5] The report proposed both a single-stroked and a double-stroked variant of the Euro sign, leaving the decision to further testing of OCR performance. [5] :4 Testing was difficult: the theoretical design methods used when the OCR-B glyphs were originally developed could no longer be reproduced, and the technological constraints of the 1960s were no longer entirely relevant in the OCR environments of the 1990s. [8] A new test method was devised, using contemporary OCR technology. The tests found no difference in OCR performance between the two Euro variants, and the report recommended adopting the double-stroked variant because it matches the conventional glyph shape. [8] The project did not have funds to thoroughly test the glyph extensions of the 1993 proposal; initial results were inconclusive. [8]

Availability

Microsoft Office ships a version of Letterpress OCR-B produced by Monotype. It covers Windows-1252. [9] Many vendors, including Adobe, still sell their versions of OCR-A and OCR-B.
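
To make the phrase "covers Windows-1252" concrete, the following is a minimal sketch that builds the character repertoire of the Windows-1252 encoding with Python's built-in `cp1252` codec and reports which characters of a string fall outside it. The helper names `cp1252_repertoire` and `uncovered` are hypothetical, and the check concerns the encoding only; it does not inspect any actual font file, so it says nothing certain about which glyphs a given OCR-B release contains.

```python
def cp1252_repertoire():
    """Return the set of characters that Python's cp1252 codec can decode."""
    chars = set()
    for byte in range(256):
        try:
            chars.add(bytes([byte]).decode("cp1252"))
        except UnicodeDecodeError:
            # A few byte values are unassigned in Windows-1252; skip them.
            pass
    return chars


def uncovered(text, repertoire):
    """Return the characters of text that are outside the repertoire, in order."""
    return [ch for ch in text if ch not in repertoire]


repertoire = cp1252_repertoire()

# Symbols mentioned in the History section, plus the Euro sign.
print(uncovered("§ ¥ €", repertoire))

# The Turkish letters from the 1993 extension proposal.
print(uncovered("Ğğ İı Şş", repertoire))
```

Under these assumptions, the first call should report nothing missing, while the Turkish letters all fall outside Windows-1252, which suggests that a font limited to this repertoire would not include them.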

The TeX typesetting system has a public domain Constant Strokewidth OCR-B font in METAFONT definition form. It was created by Norbert Schwarz in 1995 and updated in 2010. It has a setting for square stroke ends. [10] The definition has also been translated to METATYPE1, so the rounded version is available in TrueType and OpenType too. [11]

A version of Constant Strokewidth OCR-B by Matthew Anderson has extended character coverage. It is available under CC-BY 4.0. [12]

References

  1. Frutiger, Adrian. Type. Sign. Symbol. ABC Verlag, Zurich, 1980. p. 50
  2. "GS1 Human Readable Interpretation (HRI) Implementation Guideline" (PDF). GS1 AISBL. 2018. p. 13. Retrieved 2018-09-27.
  3. Doc 9303: Machine Readable Travel Documents, Part 3: Specifications Common to all MRTDs (PDF) (Eighth ed.). International Civil Aviation Organization. 2015. p. 25. ISBN 978-92-9249-792-7. Retrieved 2016-03-03.
  4. "Standard ECMA-11 for the Alphanumeric Character Set OCR-B for Optical Recognition" (PDF). European Computer Manufacturers Association. March 1976. Section “Brief History”.
  5. "Draft Report on the Euro Glyph in OCR-B" (PDF). June 28, 1998.
  6. Karl Ivar Larsson (August 8, 2000). "Notes on transfer of responsibility for OCR-B standards".
  7. "Proposal for Type 3 Technical Report, TR 15907, Information technology — Revision of OCR-B standard (ISO 1073/II-1976)" (PDF). September 28, 1998.
  8. Karsson, Kent Ivar (June 28, 1998), Report to TC304 on OCR-B situation, Unicode Technical Committee, Unicode Consortium, UTC Document L2/01-259
  9. "OCRB font family - Typography". 30 March 2022.
  10. "CTAN: /Tex-archive/Fonts/Ocr-b".
  11. "OCR a and OCR B".
  12. "OCR-B". wehtt.am. Archived from the original on 28 March 2019. Retrieved 11 January 2022.