PDF417

Sample of a PDF417 symbol

PDF417 is a stacked linear barcode format used in a variety of applications such as transport, identification cards, and inventory management. "PDF" stands for Portable Data File. The "417" signifies that each pattern in the code consists of 4 bars and 4 spaces in a pattern that is 17 units (modules) long. The PDF417 symbology was invented by Dr. Ynjiun P. Wang at Symbol Technologies in 1991. [1] It is defined in ISO/IEC 15438.

Applications

PDF417 is used in many applications by both commercial and government organizations. PDF417 is one of the formats (along with Data Matrix) that can be used to print postage accepted by the United States Postal Service. PDF417 is also used by the airline industry's Bar Coded Boarding Pass (BCBP) standard as the 2D bar code symbology for paper boarding passes. PDF417 is the standard selected by the Department of Homeland Security as the machine-readable zone technology for REAL ID compliant driver's licenses and state-issued identification cards. PDF417 barcodes are also included on visas and border crossing cards issued by the State of Israel.

Features

In addition to features typical of two dimensional bar codes, PDF417's capabilities include:

The introduction of the ISO/IEC document states: [2]

Manufacturers of bar code equipment and users of bar code technology require publicly available standard symbology specifications to which they can refer when developing equipment and application standards. It is the intent and understanding of ISO/IEC that the symbology presented in this International Standard is entirely in the public domain and free of all user restrictions, licences and fees.

Format

The PDF417 bar code (also called a symbol) consists of 3 to 90 rows, each of which is like a small linear bar code. Each row has:

All rows are the same width; each row has the same number of codewords.

Codewords

PDF417 uses a base 929 encoding. Each codeword represents a number from 0 to 928.

The codewords are represented by patterns of dark (bar) and light (space) regions. Each of these patterns contains four bars and four spaces (where the 4 in the name comes from). The total width is 17 times the width of the narrowest allowed vertical bar (the X dimension); this is where the 17 in the name comes from. Each pattern starts with a bar and ends with a space.
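The structure described above can be checked mechanically. The following is a minimal sketch, not taken from the standard: the function names are hypothetical, the width sequence is only illustrative, and the limit of 1 to 6 modules per bar or space is an assumption about the specification that the text above does not state.

```python
def is_valid_codeword_pattern(widths):
    """Check the '4 bars, 4 spaces, 17 modules' structure of one pattern.

    widths lists the element widths in X units, alternating
    bar, space, bar, space, ... and starting with a bar.
    The 1..6 bound per element is an assumption, not from the article.
    """
    return (
        len(widths) == 8                      # four bars plus four spaces
        and all(1 <= w <= 6 for w in widths)  # assumed element-width limit
        and sum(widths) == 17                 # total width of 17 modules
    )


def render_modules(widths):
    """Expand element widths into a string of modules: '1' = dark, '0' = light."""
    return "".join(
        ("1" if i % 2 == 0 else "0") * w      # even positions are bars
        for i, w in enumerate(widths)
    )


example = [3, 1, 1, 1, 1, 1, 3, 6]            # illustrative widths only
print(is_valid_codeword_pattern(example))
print(render_modules(example))
```

Because the sequence starts with a bar and ends with a space, the rendered string always begins with a dark module and ends with a light one.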

The row height must be at least 3 times the minimum width: Y ≥ 3X. [2] :5.8.2

There are three distinct barspace patterns used to represent each codeword. These patterns are organized into three groups known as clusters. The clusters are labeled 0, 3, and 6. No barspace pattern is used in more than one cluster. The rows of the symbol cycle through the three clusters, so row 1 uses patterns from cluster 0, row 2 uses cluster 3, row 3 uses cluster 6, and row 4 again uses cluster 0.
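The row-to-cluster cycle can be written as a one-line mapping. This is a small sketch assuming rows are numbered from 1 as in the paragraph above; the function name is hypothetical.

```python
def cluster_for_row(row_number):
    """Return the cluster (0, 3, or 6) used by a 1-based row number."""
    if row_number < 1:
        raise ValueError("rows are numbered from 1")
    return ((row_number - 1) % 3) * 3


for row in range(1, 8):
    print(row, cluster_for_row(row))
```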

The cluster that a pattern belongs to can be determined by an equation: [2] :5.3.1

K = (b1 − b2 + b3 − b4 + 9) mod 9

Where K is the cluster number and bi is the width of the i-th black bar in the symbol character (in X units).

Alternatively, [2] :76–78

K = (E1 − E2 + E5 − E6 + 9) mod 9

Where Ei is the i-th edge-to-next-same-edge distance. Odd indices are the leading edge of a bar to the leading edge of the next bar; even indices are for the trailing edges.
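Both ways of computing the cluster can be compared in a short sketch. It assumes the bar-width form K = (b1 − b2 + b3 − b4 + 9) mod 9 and the edge form K = (E1 − E2 + E5 − E6 + 9) mod 9; the function names and the two width sequences are illustrative and are not claimed to be entries from the standard's codeword tables.

```python
def cluster_from_bars(widths):
    """Cluster number from the four bar widths.

    widths = [b1, s1, b2, s2, b3, s3, b4, s4] in X units.
    """
    b1, b2, b3, b4 = widths[0], widths[2], widths[4], widths[6]
    return (b1 - b2 + b3 - b4 + 9) % 9


def cluster_from_edges(widths):
    """Cluster number from edge-to-similar-edge distances E1..E6.

    E_i spans two adjacent elements, so E_i = widths[i-1] + widths[i]
    with 0-based list indexing (E1 = b1 + s1, E2 = s1 + b2, ...).
    """
    e = [widths[i] + widths[i + 1] for i in range(6)]  # e[0] is E1
    return (e[0] - e[1] + e[4] - e[5] + 9) % 9


for pattern in ([3, 1, 1, 1, 1, 1, 3, 6], [5, 2, 1, 2, 1, 2, 2, 2]):
    print(pattern, cluster_from_bars(pattern), cluster_from_edges(pattern))
```

The two forms agree because E1 − E2 = b1 − b2 and E5 − E6 = b3 − b4: the shared space widths cancel. Edge-to-similar-edge distances tend to be less sensitive to uniform ink spread than individual bar widths, which is one reason a decoder might prefer the second form.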

One purpose of the three clusters is to determine which row (mod 3) the codeword is in. The clusters allow portions of the symbol to be read using a single scan line that may be skewed from the horizontal. [2] :5.11.1 For instance, the scan might start on row 6 at the start of the row but end on row 10. At the beginning of the scan, the scanner sees the constant start pattern, and then it sees symbols in cluster 6. When the skewed scan straddles rows 6 and 7, then the scanner sees noise. When the scan is on row 7, the scanner sees symbols in cluster 0. Consequently, the scanner knows the direction of the skew. By the time the scanner reaches the right, it is on row 10, so it sees cluster 0 patterns. The scanner will also see a constant stop pattern.

Encoding

Of the 929 available code words, 900 are used for data, and 29 for special functions, such as shifting between major modes. The three major modes encode different types of data in different ways, and can be mixed as necessary within a single bar code:

Error correction

When the PDF417 symbol is created, from 2 to 512 error detection and correction codewords are added. PDF417 uses Reed–Solomon error correction. When the symbol is scanned, the maximum number of corrections that can be made is equal to the number of codewords added, but the standard recommends that two codewords be held back to ensure reliability of the corrected information.
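The range of 2 to 512 codewords corresponds to a power-of-two scheme. The sketch below assumes error correction levels 0 through 8 with 2^(level + 1) correction codewords per level; that level numbering is not stated in the text above and the function names are hypothetical. The second function simply applies the recommendation of holding two codewords back.

```python
def ec_codeword_count(level):
    """Number of error correction codewords for an assumed level 0..8."""
    if not 0 <= level <= 8:
        raise ValueError("level must be between 0 and 8")
    return 2 ** (level + 1)


def recommended_correction_budget(level):
    """Corrections left after holding two codewords back, as the text recommends."""
    return ec_codeword_count(level) - 2


for level in range(9):
    print(level, ec_codeword_count(level), recommended_correction_budget(level))
```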

Comparison with other symbologies

PDF417 is a stacked barcode that can be read with a simple linear scan swept over the symbol. [3] Those linear scans need the left and right columns with the start and stop code words. Additionally, the scan needs to know which row it is scanning, so each row of the symbol must also encode its row number. Furthermore, the reader's line scan will not stay on a single row; it will typically start on one row, then cross over to a neighbor, and possibly continue across successive rows. To minimize the effect of these crossings, the PDF417 modules are tall and narrow: the height is typically three times the width. Also, each code word must indicate which row it belongs to so that crossovers, when they occur, can be detected. The code words are also designed to be delta-decodable, so some code words are redundant. Each PDF data code word represents about 10 bits of information (log2(900) ≈ 9.8), but the printed code word (character) is 17 modules wide. Including a height of 3 modules, a PDF417 code word takes 51 square modules to represent 10 bits. That area does not count other overhead such as the start, stop, row, format, and ECC information.

Other 2D codes, such as DataMatrix and QR, are decoded with image sensors instead of uncoordinated linear scans. Those codes still need recognition and alignment patterns, but they do not need to be as prominent. An 8 bit code word will take 8 square modules (ignoring recognition, alignment, format, and ECC information).
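The arithmetic behind this comparison can be worked through directly. This is a rough sketch using only the figures quoted above (17 by 3 modules per PDF417 code word, 900 data values, 8 square modules per 8-bit matrix code word); it ignores all overhead on both sides, so it is not a prediction of real symbol sizes.

```python
import math

# PDF417: one data code word carries log2(900) bits in a 17 x 3 module area.
pdf417_bits = math.log2(900)
pdf417_area = 17 * 3
pdf417_modules_per_bit = pdf417_area / pdf417_bits

# Matrix codes: an 8-bit code word occupies 8 square modules.
matrix_modules_per_bit = 8 / 8

print(f"PDF417 bits per code word: {pdf417_bits:.2f}")
print(f"PDF417 square modules per bit: {pdf417_modules_per_bit:.2f}")
print(f"Matrix square modules per bit: {matrix_modules_per_bit:.2f}")
print(f"Raw area ratio: {pdf417_modules_per_bit / matrix_modules_per_bit:.2f}")
```

The raw ratio comes out at a little over five square modules per bit for PDF417 against one for a matrix code, before any finder patterns, row indicators, or error correction are counted.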

In practice, a PDF417 symbol takes about four times the area of a DataMatrix or QR Code. [4]

References

  1. US 5243655, Wang, Ynjiun P., "System for Encoding and Decoding Data in Machine Readable Graphic Form", published 1993-09-07. PDF417 patent.
  2. ISO/IEC (2006), Information technology – Automatic identification and data capture techniques – PDF417 bar code symbology specification (PDF) (second ed.), ISO/IEC 15438:2006(E)
  3. For example, the Symbol Technologies LS-4000 series.
  4. Using Barcodes in Documents – Best Practices (PDF), Tampa, FL: Accusoft, 2007, archived from the original (PDF) on May 24, 2012, retrieved May 9, 2012