Damm algorithm

Last updated

In error detection, the Damm algorithm is a check digit algorithm that detects all single-digit errors and all adjacent transposition errors. It was presented by H. Michael Damm in 2004. [1]

Contents

Strengths and weaknesses

Strengths

The Damm algorithm is similar to the Verhoeff algorithm. It too will detect all occurrences of the two most frequently appearing types of transcription errors, namely altering a single digit or transposing two adjacent digits (including the transposition of the trailing check digit and the preceding digit). [1] [2] The Damm algorithm has the benefit that it does not have the dedicatedly constructed permutations and its position-specific powers of the Verhoeff scheme. A table of inverses can also be dispensed with when all main diagonal entries of the operation table are zero.

The Damm algorithm generates only 10 possible values, avoiding the need for a non-digit character (such as the X in the 10-digit ISBN check digit scheme).

Prepending leading zeros does not affect the check digit (a weakness for variable-length codes). [1]

There are totally anti-symmetric quasigroups that detect all phonetic errors associated with the English language (13 ↔ 30, 14 ↔ 40, ..., 19 ↔ 90). The table used in the illustrating example is based on an instance of such kind.

Weaknesses

For all checksum algorithms, including the Damm algorithm, prepending leading zeroes does not affect the check digit, [1] so 1, 01, 001, etc. produce the same check digit. Consequently variable-length codes should not be verified together.

Design

Its essential part is a quasigroup of order 10 (i.e. having a 10 × 10 Latin square as the body of its operation table) with the special feature of being weakly totally anti-symmetric. [3] [4] [lower-roman 1] [lower-roman 2] [lower-roman 3] Damm revealed several methods to create totally anti-symmetric quasigroups of order 10 and gave some examples in his doctoral dissertation. [3] [lower-roman 1] With this, Damm also disproved an old conjecture that totally anti-symmetric quasigroups of order 10 do not exist. [5]

A quasigroup (Q, ∗) is called totally anti-symmetric if for all c, x, yQ, the following implications hold: [4]

  1. (cx) ∗ y = (cy) ∗ xx = y
  2. xy = yxx = y,

and it is called weak totally anti-symmetric if only the first implication holds. Damm proved that the existence of a totally anti-symmetric quasigroup of order n is equivalent to the existence of a weak totally anti-symmetric quasigroup of order n. For the Damm algorithm with the check equation (...((0 ∗ xm) ∗ xm−1) ∗ ...) ∗ x0 = 0, a weak totally anti-symmetric quasigroup with the property xx = 0 is needed. Such a quasigroup can be constructed from any totally anti-symmetric quasigroup by rearranging the columns in such a way that all zeros lay on the diagonal. And, on the other hand, from any weak totally anti-symmetric quasigroup a totally anti-symmetric quasigroup can be constructed by rearranging the columns in such a way that the first row is in natural order. [3]

Algorithm

The validity of a digit sequence containing a check digit is defined over a quasigroup. A quasigroup table ready for use can be taken from Damm's dissertation (pages 98, 106, 111). [3] It is useful if each main diagonal entry is 0, [1] because it simplifies the check digit calculation.

Validating a number against the included check digit

  1. Set up an interim digit and initialize it to 0.
  2. Process the number digit by digit: Use the number's digit as column index and the interim digit as row index, take the table entry and replace the interim digit with it.
  3. The number is valid if and only if the resulting interim digit has the value of 0. [1]

Calculating the check digit

Prerequisite: The main diagonal entries of the table are 0.

  1. Set up an interim digit and initialize it to 0.
  2. Process the number digit by digit: Use the number's digit as column index and the interim digit as row index, take the table entry and replace the interim digit with it.
  3. The resulting interim digit gives the check digit and will be appended as trailing digit to the number. [1]

Example

The following operation table will be used. [1] It may be obtained from the totally anti-symmetric quasigroup xy in Damm's doctoral dissertation page 111 [3] by rearranging the rows and changing the entries with the permutation φ = (1 2 9 5 4 8 6 7 3) and defining xy = φ1(φ(x) ∗ y).

0123456789
00317598642
17092154863
24206871359
31750983426
46123045978
53674209581
65869720134
78945362017
89438617205
92581436790

Suppose we choose the number (digit sequence) 572.

Calculating the check digit

digit to be processed → column index572
old interim digit → row index097
table entry → new interim digit974

The resulting interim digit is 4. This is the calculated check digit. We append it to the number and obtain 5724.

Validating a number against the included check digit

digit to be processed → column index5724
old interim digit → row index0974
table entry → new interim digit9740

The resulting interim digit is 0, hence the number is valid.

Graphical illustration

This is the above example showing the detail of the algorithm generating the check digit (dashed blue arrow) and verifying the number 572 with the check digit.

Check digit TA quasigroup dhmd111rr illustration eg5724.svg

Related Research Articles

<span class="mw-page-title-main">Checksum</span> Data used to detect errors in other data

A checksum is a small-sized block of data derived from another block of digital data for the purpose of detecting errors that may have been introduced during its transmission or storage. By themselves, checksums are often used to verify data integrity but are not relied upon to verify data authenticity.

<span class="mw-page-title-main">Error detection and correction</span> Techniques that enable reliable delivery of digital data over unreliable communication channels

In information theory and coding theory with applications in computer science and telecommunication, error detection and correction (EDAC) or error control are techniques that enable reliable delivery of digital data over unreliable communication channels. Many communication channels are subject to channel noise, and thus errors may be introduced during transmission from the source to a receiver. Error detection techniques allow detecting such errors, while error correction enables reconstruction of the original data in many cases.

<span class="mw-page-title-main">ISBN</span> Unique numeric book identifier since 1970

The International Standard Book Number (ISBN) is a numeric commercial book identifier that is intended to be unique. Publishers purchase or receive ISBNs from an affiliate of the International ISBN Agency.

<span class="mw-page-title-main">International Bank Account Number</span> Alphanumeric code that uniquely identifies a bank account in any participating country

The International Bank Account Number (IBAN) is an internationally agreed upon system of identifying bank accounts across national borders to facilitate the communication and processing of cross border transactions with a reduced risk of transcription errors. An IBAN uniquely identifies the account of a customer at a financial institution. It was originally adopted by the European Committee for Banking Standards (ECBS) and since 1997 as the international standard ISO 13616 under the International Organization for Standardization (ISO). The current version is ISO 13616:2020, which indicates the Society for Worldwide Interbank Financial Telecommunication (SWIFT) as the formal registrar. Initially developed to facilitate payments within the European Union, it has been implemented by most European countries and numerous countries in other parts of the world, mainly in the Middle East and the Caribbean. As of May 2020, 77 countries were using the IBAN numbering system.

In mathematics, especially in abstract algebra, a quasigroup is an algebraic structure resembling a group in the sense that "division" is always possible. Quasigroups differ from groups mainly in that they need not be associative and need not have an identity element.

A cyclic redundancy check (CRC) is an error-detecting code commonly used in digital networks and storage devices to detect accidental changes to digital data. Blocks of data entering these systems get a short check value attached, based on the remainder of a polynomial division of their contents. On retrieval, the calculation is repeated and, in the event the check values do not match, corrective action can be taken against data corruption. CRCs can be used for error correction.

<span class="mw-page-title-main">Permutation</span> Mathematical version of an order change

In mathematics, a permutation of a set is, loosely speaking, an arrangement of its members into a sequence or linear order, or if the set is already ordered, a rearrangement of its elements. The word "permutation" also refers to the act or process of changing the linear order of an ordered set.

<span class="mw-page-title-main">Universal Product Code</span> Barcode symbology used for tracking trade items in stores

The Universal Product Code is a barcode symbology that is widely used worldwide for tracking trade items in stores.

<span class="mw-page-title-main">Transpose</span> Matrix operation which flips a matrix over its diagonal

In linear algebra, the transpose of a matrix is an operator which flips a matrix over its diagonal; that is, it switches the row and column indices of the matrix A by producing another matrix, often denoted by AT.

In mathematics, particularly in linear algebra, a skew-symmetricmatrix is a square matrix whose transpose equals its negative. That is, it satisfies the condition

A check digit is a form of redundancy check used for error detection on identification numbers, such as bank account numbers, which are used in an application where they will at least sometimes be input manually. It is analogous to a binary parity bit used to check for errors in computer-generated data. It consists of one or more digits computed by an algorithm from the other digits in the sequence input.

The Luhn algorithm or Luhn formula, also known as the "modulus 10" or "mod 10" algorithm, named after its creator, IBM scientist Hans Peter Luhn, is a simple check digit formula used to validate a variety of identification numbers, such as credit card numbers, IMEI numbers, National Provider Identifier numbers in the United States, Canadian social insurance numbers, Israeli ID numbers, South African ID numbers, Swedish national identification numbers, Swedish Corporate Identity Numbers (OrgNr), Greek Social Security Numbers (ΑΜΚΑ), SIM card numbers, European patent application number and survey codes appearing on McDonald's, Taco Bell, and Tractor Supply Co. receipts. It is described in U.S. Patent No. 2,950,048, granted on August 23, 1960.

<span class="mw-page-title-main">Code 128</span> Barcode format

Code 128 is a high-density linear barcode symbology defined in ISO/IEC 15417:2007. It is used for alphanumeric or numeric-only barcodes. It can encode all 128 characters of ASCII and, by use of an extension symbol (FNC4), the Latin-1 characters defined in ISO/IEC 8859-1.. It generally results in more compact barcodes compared to other methods like Code 39, especially when the texts contain mostly digits. Code 128 was developed by the Computer Identics Corporation in 1981.

In computer science, data validation is the process of ensuring data has undergone data cleansing to confirm they have data quality, that is, that they are both correct and useful. It uses routines, often called "validation rules", "validation constraints", or "check routines", that check for correctness, meaningfulness, and security of data that are input to the system. The rules may be implemented through the automated facilities of a data dictionary, or by the inclusion of explicit application program validation logic of the computer and its application.

<span class="mw-page-title-main">Mathematics of Sudoku</span> Mathematical investigation of Sudoku

Mathematics can be used to study Sudoku puzzles to answer questions such as "How many filled Sudoku grids are there?", "What is the minimal number of clues in a valid puzzle?" and "In what ways can Sudoku grids be symmetric?" through the use of combinatorics and group theory.

<span class="mw-page-title-main">International Article Number</span> Standard barcode system used in global trade

The International Article Number is a standard describing a barcode symbology and numbering system used in global trade to identify a specific retail product type, in a specific packaging configuration, from a specific manufacturer. The standard has been subsumed in the Global Trade Item Number standard from the GS1 organization; the same numbers can be referred to as GTINs and can be encoded in other barcode symbologies defined by GS1. EAN barcodes are used worldwide for lookup at retail point of sale, but can also be used as numbers for other purposes such as wholesale ordering or accounting. These barcodes only represent the digits 0–9, unlike some other barcode symbologies which can represent additional characters.

The Verhoeff algorithm is a checksum for error detection first published by Dutch mathematician Jacobus Verhoeff in 1969. It was the first decimal check digit algorithm which detects all single-digit errors, and all transposition errors involving two adjacent digits, which was at the time thought impossible with such a code.

The national identity number is a unique eleven-digit unique identifier issued to residents and citizens of Norway. The issuing of the identity number corresponds to being registered in the national population register. The ID number is used throughout government administration and a number of private sector services, in particular banking and insurance. All official IDs, including passports, national ID cards, driving licenses and bank cards contain the National Identity Number.

References

  1. 1 2 3 4 5 6 7 8 Fenwick, Peter (2014). "Checksums and Error Control". In Fenwick, Peter (ed.). Introduction to Computer Data Representation. Bentham Science Publishers. pp.  191–218. doi:10.2174/9781608058822114010013. ISBN   978-1-60805-883-9.
  2. For the types of common errors and their frequencies, see Salomon, David (2005). Coding for Data and Computer Communications. Springer Science+Business Media, Inc. p. 36. ISBN   978-0387-21245-6.
  3. 1 2 3 4 5 Damm, H. Michael (2004). Total anti-symmetrische Quasigruppen (PDF) (Dr. rer. nat.) (in German). Philipps-Universität Marburg. urn:nbn:de:hebis:04-z2004-05162.
  4. 1 2 Damm, H. Michael (2007). "Totally anti-symmetric quasigroups for all orders n ≠ 2, 6". Discrete Mathematics. 307 (6): 715–729. doi: 10.1016/j.disc.2006.05.033 . ISSN   0012-365X.
  5. Damm, H. Michael (2003). "On the Existence of Totally Anti-Symmetric Quasigroups of Order 4k + 2". Computing. 70 (4): 349–357. doi:10.1007/s00607-003-0017-3. ISSN   0010-485X. S2CID   31659430.
  1. 1 2 Beliavscaia, Galina; Izbaş, Vladimir; Şcerbacov, Victor (2003). "Check character systems over quasigroups and loops" (PDF). Quasigroups and Related Systems. 10 (1): 1–28. ISSN   1561-2848. See page 23.
  2. Chen Jiannan (2009). "The NP-completeness of Completing Partial anti-symmetric Latin squares" (PDF). Proceedings of 2009 International Workshop on Information Security and Application (IWISA 2009). Academy Publisher. pp.  322–324. ISBN   978-952-5726-06-0. See page 324.
  3. Mileva, A.; Dimitrova, V. (2009). "Quasigroups constructed from complete mappings of a group (Z2n,⊕)" (PDF). Contributions, Sec. Math. Tech. Sci., MANU/MASA. XXX (1–2): 75–93. ISSN   0351-3246. See page 78.