Autokey cipher

Last updated
A tabula recta for use with an autokey cipher Vigenere square shading.svg
A tabula recta for use with an autokey cipher

An autokey cipher (also known as the autoclave cipher) is a cipher that incorporates the message (the plaintext) into the key. The key is generated from the message in some automated fashion, sometimes by selecting certain letters from the text or, more commonly, by adding a short primer key to the front of the message.

Contents

There are two forms of autokey cipher: key-autokey and text-autokey ciphers. A key-autokey cipher uses previous members of the keystream to determine the next element in the keystream. A text-autokey uses the previous message text to determine the next element in the keystream.

History

This cipher was invented in 1586 by Blaise de Vigenère with a reciprocal table of ten alphabets. Vigenère's version used an agreed-upon letter of the alphabet as a primer, making the key by writing down that letter and then the rest of the message. [1]

More popular autokeys use a tabula recta, a square with 26 copies of the alphabet, the first line starting with 'A', the next line starting with 'B' etc. Instead of a single letter, a short agreed-upon keyword is used, and the key is generated by writing down the primer and then the rest of the message, as in Vigenère's version. To encrypt a plaintext, the row with the first letter of the message and the column with the first letter of the key are located. The letter in which the row and the column cross is the ciphertext letter.

Method

The autokey cipher, as used by members of the American Cryptogram Association, starts with a relatively-short keyword, the primer, and appends the message to it. For example, if the keyword is QUEENLY and the message is attack at dawn, then the key would be QUEENLYATTACKATDAWN. [2]

Plaintext:  attackatdawn Key:        QUEENLYATTACKATDAWN Ciphertext: QNXEPVYTWTWP

The ciphertext message would thus be "QNXEPVYTWTWP".

To decrypt the message, the recipient would start by writing down the agreed-upon keyword.

QNXEPVYTWTWP QUEENLY

The first letter of the key, Q, would then be taken, and that row would be found in a tabula recta. That column for the first letter of the ciphertext would be looked across, also Q in this case, and the letter to the top would be retrieved, A. Now, that letter would be added to the end of the key:

QNXEPVYTWTWP QUEENLYAa

Then, since the next letter in the key is U and the next letter in the ciphertext is N, the U row is looked across to find the N to retrieve T:

QNXEPVYTWTWP QUEENLYATat

That continues until the entire key is reconstructed, when the primer can be removed from the start.

With Vigenère's autokey cipher, a single mistake in encryption renders the rest of the message unintelligible. [3]

Cryptanalysis

Autokey ciphers are somewhat more secure than polyalphabetic ciphers that use fixed keys since the key does not repeat within a single message. Therefore, methods like the Kasiski examination or index of coincidence analysis will not work on the ciphertext, unlike for similar ciphers that use a single repeated key. [3]

A crucial weakness of the system, however, is that the plaintext is part of the key. That means that the key will likely contain common words at various points. The key can be attacked by using a dictionary of common words, bigrams, trigrams etc. and by attempting the decryption of the message by moving that word through the key until potentially-readable text appears.

Consider an example message meet at the fountain encrypted with the primer keyword KILT: [4] To start, the autokey would be constructed by placing the primer at the front of the message:

plaintext:  meetatthefountain primer:     KILT autokey:    KILTMEETATTHEFOUN

The message is then encrypted by using the key and the substitution alphabets, here a tabula recta:

plaintext:  meetatthefountain key:        KILTMEETATTHEFOUN ciphertext: WMPMMXXAEYHBRYOCA

The attacker receives only the ciphertext and can attack the text by selecting a word that is likely to appear in the plaintext. In this example, the attacker selects the word the as a potential part of the original message and then attempts to decode it by placing THE at every possible location in the key:

ciphertext: WMP MMX XAE YHB RYO CA  key:        THE THE THE THE THE .. plaintext:  dfl tft eta fax yrk ..  ciphertext: W MPM MXX AEY HBR YOC A key:        . THE THE THE THE THE . plaintext:  . tii tqt hxu oun fhy .  ciphertext: WM PMM XXA EYH BRY OCA key:        .. THE THE THE THE THE plaintext:  .. wfi eqw lrd iku vvw

In each case, the resulting plaintext appears almost random because the key is not aligned for most of the ciphertext. However, examining the results can suggest locations of the key being properly aligned. In those cases, the resulting decrypted text is potentially part of a word. In this example, it is highly unlikely that dfl is the start of the original plaintext and so it is highly unlikely either that the first three letters of the key are THE. Examining the results, a number of fragments that are possibly words can be seen and others can be eliminated. Then, the plaintext fragments can be sorted in their order of likelihood:

unlikely ←——————————————————→ promising eqw dfl tqt ... ... ... ... eta oun fax

A correct plaintext fragment is also going to appear in the key, shifted right by the length of the keyword. Similarly, the guessed key fragment (THE) also appears in the plaintext shifted left. Thus, by guessing keyword lengths (probably between 3 and 12), more plaintext and key can be revealed.

Trying that with oun, possibly after wasting some time with the others, results in the following:

shift by 4: ciphertext: WMPMMXXAEYHBRYOCA key:        ......ETA.THE.OUN plaintext:  ......the.oun.ain
shift by 5: ciphertext: WMPMMXXAEYHBRYOCA key:        .....EQW..THE..OU plaintext:  .....the..oun..og
shift by 6: ciphertext: WMPMMXXAEYHBRYOCA key:        ....TQT...THE...O plaintext:  ....the...oun...m

A shift of 4 can be seen to look good (both of the others have unlikely Qs) and so the revealed ETA can be shifted back by 4 into the plaintext:

ciphertext: WMPMMXXAEYHBRYOCA key:        ..LTM.ETA.THE.OUN plaintext:  ..eta.the.oun.ain

A lot can be worked with now. The keyword is probably 4 characters long (..LT), and some of the message is visible:

m.eta.the.oun.ain

Because the plaintext guesses have an effect on the key 4 characters to the left, feedback on correct and incorrect guesses is given. The gaps can quickly be filled in:

meetatthefountain

The ease of cryptanalysis is caused by the feedback from the relationship between plaintext and key. A three-character guess reveals six more characters (three on each side), which then reveal further characters, creating a cascade effect. That allows incorrect guesses to be ruled out quickly.

See also

Notes

  1. "Vigenère Cipher". Crypto Corner. Retrieved 2018-08-13.
  2. "Autokey Calculator". Asecuritysite.com. Archived from the original on 2013-12-02. Retrieved 2012-12-26.
  3. 1 2 Hoffstein, Jeffrey; Pipher, Jill; Silverman, Joseph (2014). An Introduction to Mathematical Cryptography. Springer. p. 288. ISBN   9781493917112.
  4. "Autokey Calculator". Asecuritysite.com. Archived from the original on 2013-12-03. Retrieved 2012-12-26.

Related Research Articles

In cryptography, a substitution cipher is a method of encrypting in which units of plaintext are replaced with the ciphertext, in a defined manner, with the help of a key; the "units" may be single letters, pairs of letters, triplets of letters, mixtures of the above, and so forth. The receiver deciphers the text by performing the inverse substitution process to extract the original message.

<span class="mw-page-title-main">Transposition cipher</span> Method of encryption

In cryptography, a transposition cipher is a method of encryption which scrambles the positions of characters (transposition) without changing the characters themselves. Transposition ciphers reorder units of plaintext according to a regular system to produce a ciphertext which is a permutation of the plaintext. They differ from substitution ciphers, which do not change the position of units of plaintext but instead change the units themselves. Despite the difference between transposition and substitution operations, they are often combined, as in historical ciphers like the ADFGVX cipher or complex high-quality encryption methods like the modern Advanced Encryption Standard (AES).

<span class="mw-page-title-main">Caesar cipher</span> Simple and widely known encryption technique

In cryptography, a Caesar cipher, also known as Caesar's cipher, the shift cipher, Caesar's code, or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a left shift of 3, D would be replaced by A, E would become B, and so on. The method is named after Julius Caesar, who used it in his private correspondence.

<span class="mw-page-title-main">Stream cipher</span> Type of symmetric key cipher

A stream cipher is a symmetric key cipher where plaintext digits are combined with a pseudorandom cipher digit stream (keystream). In a stream cipher, each plaintext digit is encrypted one at a time with the corresponding digit of the keystream, to give a digit of the ciphertext stream. Since encryption of each digit is dependent on the current state of the cipher, it is also known as state cipher. In practice, a digit is typically a bit and the combining operation is an exclusive-or (XOR).

<span class="mw-page-title-main">Vigenère cipher</span> Simple type of polyalphabetic encryption system

The Vigenère cipher is a method of encrypting alphabetic text where each letter of the plaintext is encoded with a different Caesar cipher, whose increment is determined by the corresponding letter of another text, the key.

A polyalphabetic cipher is a substitution, using multiple substitution alphabets. The Vigenère cipher is probably the best-known example of a polyalphabetic cipher, though it is a simplified special case. The Enigma machine is more complex but is still fundamentally a polyalphabetic substitution cipher.

<span class="mw-page-title-main">Tabula recta</span> Fundamental tool in cryptography

In cryptography, the tabula recta is a square table of alphabets, each row of which is made by shifting the previous one to the left. The term was invented by the German author and monk Johannes Trithemius in 1508, and used in his Trithemius cipher.

In cryptography, coincidence counting is the technique of putting two texts side-by-side and counting the number of times that identical letters appear in the same position in both texts. This count, either as a ratio of the total or normalized by dividing by the expected count for a random source model, is known as the index of coincidence, or IC for short.

<span class="mw-page-title-main">Frequency analysis</span> Study of the frequency of letters or groups of letters in a ciphertext

In cryptanalysis, frequency analysis is the study of the frequency of letters or groups of letters in a ciphertext. The method is used as an aid to breaking classical ciphers.

<span class="mw-page-title-main">Playfair cipher</span> Early block substitution cipher

The Playfair cipher or Playfair square or Wheatstone–Playfair cipher is a manual symmetric encryption technique and was the first literal digram substitution cipher. The scheme was invented in 1854 by Charles Wheatstone, but bears the name of Lord Playfair for promoting its use.

<span class="mw-page-title-main">Ciphertext</span> Encrypted information

In cryptography, ciphertext or cyphertext is the result of encryption performed on plaintext using an algorithm, called a cipher. Ciphertext is also known as encrypted or encoded information because it contains a form of the original plaintext that is unreadable by a human or computer without the proper cipher to decrypt it. This process prevents the loss of sensitive information via hacking. Decryption, the inverse of encryption, is the process of turning ciphertext into readable plaintext. Ciphertext is not to be confused with codetext because the latter is a result of a code, not a cipher.

In classical cryptography, the running key cipher is a type of polyalphabetic substitution cipher in which a text, typically from a book, is used to provide a very long keystream. The earliest description of such a cipher was given in 1892 by French mathematician Arthur Joseph Hermann. Usually, the book to be used would be agreed ahead of time, while the passage to be used would be chosen randomly for each message and secretly indicated somewhere in the message.

In cryptography, the ADFGVX cipher was a manually applied field cipher used by the Imperial German Army during World War I. It was used to transmit messages secretly using wireless telegraphy. ADFGVX was in fact an extension of an earlier cipher called ADFGX which was first used on 1 March 1918 on the German Western Front. ADFGVX was applied from 1 June 1918 on both the Western Front and Eastern Front.

In cryptography, a classical cipher is a type of cipher that was used historically but for the most part, has fallen into disuse. In contrast to modern cryptographic algorithms, most classical ciphers can be practically computed and solved by hand. However, they are also usually very simple to break with modern technology. The term includes the simple systems used since Greek and Roman times, the elaborate Renaissance ciphers, World War II cryptography such as the Enigma machine and beyond.

In cryptanalysis, Kasiski examination is a method of attacking polyalphabetic substitution ciphers, such as the Vigenère cipher. It was first published by Friedrich Kasiski in 1863, but seems to have been independently discovered by Charles Babbage as early as 1846.

In cryptography, the simple XOR cipher is a type of additive cipher, an encryption algorithm that operates according to the principles:

The four-square cipher is a manual symmetric encryption technique. It was invented by the French cryptographer Felix Delastelle.

The Two-square cipher, also called double Playfair, is a manual symmetric encryption technique. It was developed to ease the cumbersome nature of the large encryption/decryption matrix used in the four-square cipher while still being slightly stronger than the single-square Playfair cipher.

<span class="mw-page-title-main">Alberti cipher</span> Polyalphabetic substitution encryption and decryption system

The Alberti Cipher, created in 1467 by Italian architect Leon Battista Alberti, was one of the first polyalphabetic ciphers. In the opening pages of his treatise De componendis cifris he explained how his conversation with the papal secretary Leonardo Dati about a recently developed movable type printing press led to the development of his cipher wheel.

<span class="mw-page-title-main">Giovan Battista Bellaso</span>

Giovan Battista Bellaso was an Italian cryptologist.

References