XOR cipher

Last updated

In cryptography, the simple XOR cipher is a type of additive cipher , [1] an encryption algorithm that operates according to the principles:

Contents

A 0 = A,
A A = 0,
A B = B A,
(A B) C = A (B C),
(B A) A = B 0 = B,

For example where denotes the exclusive disjunction (XOR) operation. [2] This operation is sometimes called modulus 2 addition (or subtraction, which is identical). [3] With this logic, a string of text can be encrypted by applying the bitwise XOR operator to every character using a given key. To decrypt the output, merely reapplying the XOR function with the key will remove the cipher.

Example

The string "Wiki" (01010111 01101001 01101011 01101001 in 8-bit ASCII) can be encrypted with the repeating key 11110011 as follows:

01010111 01101001 01101011 01101001
11110011 11110011 11110011 11110011
=10100100 10011010 10011000 10011010

And conversely, for decryption:

10100100 10011010 10011000 10011010
11110011 11110011 11110011 11110011
=01010111 01101001 01101011 01101001

Use and security

The XOR operator is extremely common as a component in more complex ciphers. By itself, using a constant repeating key, a simple XOR cipher can trivially be broken using frequency analysis. If the content of any message can be guessed or otherwise known then the key can be revealed. Its primary merit is that it is simple to implement, and that the XOR operation is computationally inexpensive. A simple repeating XOR (i.e. using the same key for xor operation on the whole data) cipher is therefore sometimes used for hiding information in cases where no particular security is required. The XOR cipher is often used in computer malware to make reverse engineering more difficult.

If the key is random and is at least as long as the message, the XOR cipher is much more secure than when there is key repetition within a message. [4] When the keystream is generated by a pseudo-random number generator, the result is a stream cipher. With a key that is truly random, the result is a one-time pad, which is unbreakable in theory.

The XOR operator in any of these ciphers is vulnerable to a known-plaintext attack, since plaintextciphertext = key. It is also trivial to flip arbitrary bits in the decrypted plaintext by manipulating the ciphertext. This is called malleability.

Usefulness in cryptography

The primary reason XOR is so useful in cryptography is because it is "perfectly balanced"; for a given plaintext input 0 or 1, the ciphertext result is equally likely to be either 0 or 1 for a truly random key bit. [5]

The table below shows all four possible pairs of plaintext and key bits. It is clear that if nothing is known about the key or plaintext, nothing can be determined from the ciphertext alone. [5]

XOR Cipher Trace Table
PlaintextKeyCiphertext
000
011
101
110

Other logical operations such and AND or OR do not have such a mapping (for example, AND would produce three 0's and one 1, so knowing that a given ciphertext bit is a 0 implies that there is a 2/3 chance that the original plaintext bit was a 0, as opposed to the ideal 1/2 chance in the case of XOR) [lower-alpha 1]

Example implementation

Example using the Python programming language. [lower-alpha 2]

fromosimporturandomdefgenkey(length:int)->bytes:"""Generate key."""returnurandom(length)defxor_strings(s,t)->bytes:"""Concate xor two strings together."""ifisinstance(s,str):# Text strings contain single charactersreturn"".join(chr(ord(a)^b)fora,binzip(s,t)).encode('utf8')else:# Bytes objects contain integer values in the range 0-255returnbytes([a^bfora,binzip(s,t)])message='This is a secret message'print('Message:',message)key=genkey(len(message))print('Key:',key)cipherText=xor_strings(message.encode('utf8'),key)print('cipherText:',cipherText)print('decrypted:',xor_strings(cipherText,key).decode('utf8'))# Verifyifxor_strings(cipherText,key).decode('utf8')==message:print('Unit test passed')else:print('Unit test failed')

A shorter example using the R programming language, based on a puzzle posted on Instagram by GCHQ.

secret_key<-c(0xc6,0xb5,0xca,0x01)|>as.raw()secret_message<-"I <3 Wikipedia"|>charToRaw()|>xor(secret_key)|>base64enc::base64encode()secret_message_bytes<-secret_message|>base64enc::base64decode()xor(secret_message_bytes,secret_key)|>rawToChar()

See also

Related Research Articles

In cryptography, a block cipher is a deterministic algorithm that operates on fixed-length groups of bits, called blocks. Block ciphers are the elementary building blocks of many cryptographic protocols. They are ubiquitous in the storage and exchange of data, where such data is secured and authenticated via encryption.

<span class="mw-page-title-main">Cryptanalysis</span> Study of analyzing information systems in order to discover their hidden aspects

Cryptanalysis refers to the process of analyzing information systems in order to understand hidden aspects of the systems. Cryptanalysis is used to breach cryptographic security systems and gain access to the contents of encrypted messages, even if the cryptographic key is unknown.

<span class="mw-page-title-main">One-time pad</span> Encryption technique

In cryptography, the one-time pad (OTP) is an encryption technique that cannot be cracked, but requires the use of a single-use pre-shared key that is larger than or equal to the size of the message being sent. In this technique, a plaintext is paired with a random secret key. Then, each bit or character of the plaintext is encrypted by combining it with the corresponding bit or character from the pad using modular addition.

In cryptography, a substitution cipher is a method of encrypting in which units of plaintext are replaced with the ciphertext, in a defined manner, with the help of a key; the "units" may be single letters, pairs of letters, triplets of letters, mixtures of the above, and so forth. The receiver deciphers the text by performing the inverse substitution process to extract the original message.

<span class="mw-page-title-main">Caesar cipher</span> Simple and widely known encryption technique

In cryptography, a Caesar cipher, also known as Caesar's cipher, the shift cipher, Caesar's code, or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a left shift of 3, D would be replaced by A, E would become B, and so on. The method is named after Julius Caesar, who used it in his private correspondence.

<span class="mw-page-title-main">Stream cipher</span> Type of symmetric key cipher

A stream cipher is a symmetric key cipher where plaintext digits are combined with a pseudorandom cipher digit stream (keystream). In a stream cipher, each plaintext digit is encrypted one at a time with the corresponding digit of the keystream, to give a digit of the ciphertext stream. Since encryption of each digit is dependent on the current state of the cipher, it is also known as state cipher. In practice, a digit is typically a bit and the combining operation is an exclusive-or (XOR).

<span class="mw-page-title-main">Symmetric-key algorithm</span> Algorithm

Symmetric-key algorithms are algorithms for cryptography that use the same cryptographic keys for both the encryption of plaintext and the decryption of ciphertext. The keys may be identical, or there may be a simple transformation to go between the two keys. The keys, in practice, represent a shared secret between two or more parties that can be used to maintain a private information link. The requirement that both parties have access to the secret key is one of the main drawbacks of symmetric-key encryption, in comparison to public-key encryption. However, symmetric-key encryption algorithms are usually better for bulk encryption. With exception of the one-time pad they have a smaller key size, which means less storage space and faster transmission. Due to this, asymmetric-key encryption is often used to exchange the secret key for symmetric-key encryption.

In cryptography, linear cryptanalysis is a general form of cryptanalysis based on finding affine approximations to the action of a cipher. Attacks have been developed for block ciphers and stream ciphers. Linear cryptanalysis is one of the two most widely used attacks on block ciphers; the other being differential cryptanalysis.

<span class="mw-page-title-main">Vigenère cipher</span> Simple type of polyalphabetic encryption system

The Vigenère cipher is a method of encrypting alphabetic text where each letter of the plaintext is encoded with a different Caesar cipher, whose increment is determined by the corresponding letter of another text, the key.

In cryptography, an initialization vector (IV) or starting variable is an input to a cryptographic primitive being used to provide the initial state. The IV is typically required to be random or pseudorandom, but sometimes an IV only needs to be unpredictable or unique. Randomization is crucial for some encryption schemes to achieve semantic security, a property whereby repeated usage of the scheme under the same key does not allow an attacker to infer relationships between segments of the encrypted message. For block ciphers, the use of an IV is described by the modes of operation.

<span class="mw-page-title-main">Tabula recta</span> Fundamental tool in cryptography

In cryptography, the tabula recta is a square table of alphabets, each row of which is made by shifting the previous one to the left. The term was invented by the German author and monk Johannes Trithemius in 1508, and used in his Trithemius cipher.

In cryptography, a block cipher mode of operation is an algorithm that uses a block cipher to provide information security such as confidentiality or authenticity. A block cipher by itself is only suitable for the secure cryptographic transformation of one fixed-length group of bits called a block. A mode of operation describes how to repeatedly apply a cipher's single-block operation to securely transform amounts of data larger than a block.

<span class="mw-page-title-main">Ciphertext</span> Encrypted information

In cryptography, ciphertext or cyphertext is the result of encryption performed on plaintext using an algorithm, called a cipher. Ciphertext is also known as encrypted or encoded information because it contains a form of the original plaintext that is unreadable by a human or computer without the proper cipher to decrypt it. This process prevents the loss of sensitive information via hacking. Decryption, the inverse of encryption, is the process of turning ciphertext into readable plaintext. Ciphertext is not to be confused with codetext because the latter is a result of a code, not a cipher.

In classical cryptography, the running key cipher is a type of polyalphabetic substitution cipher in which a text, typically from a book, is used to provide a very long keystream. Usually, the book to be used would be agreed ahead of time, while the passage to be used would be chosen randomly for each message and secretly indicated somewhere in the message.

Gilbert Sandford Vernam was a Worcester Polytechnic Institute 1914 graduate and AT&T Bell Labs engineer who, in 1917, invented an additive polyalphabetic stream cipher and later co-invented an automated one-time pad cipher. Vernam proposed a teleprinter cipher in which a previously prepared key, kept on paper tape, is combined character by character with the plaintext message to produce the ciphertext. To decipher the ciphertext, the same key would be again combined character by character, producing the plaintext. Vernam later worked for the Postal Telegraph Company, and became an employee of Western Union when that company acquired Postal in 1943. His later work was largely with automatic switching systems for telegraph networks.

Disk encryption is a special case of data at rest protection when the storage medium is a sector-addressable device. This article presents cryptographic aspects of the problem. For an overview, see disk encryption. For discussion of different software packages and hardware devices devoted to this problem, see disk encryption software and disk encryption hardware.

<span class="mw-page-title-main">One-way compression function</span> Cryptographic primitive

In cryptography, a one-way compression function is a function that transforms two fixed-length inputs into a fixed-length output. The transformation is "one-way", meaning that it is difficult given a particular output to compute inputs which compress to that output. One-way compression functions are not related to conventional data compression algorithms, which instead can be inverted exactly or approximately to the original data.

In cryptography, Galois/Counter Mode (GCM) is a mode of operation for symmetric-key cryptographic block ciphers which is widely adopted for its performance. GCM throughput rates for state-of-the-art, high-speed communication channels can be achieved with inexpensive hardware resources.

Turingery or Turing's method was a manual codebreaking method devised in July 1942 by the mathematician and cryptanalyst Alan Turing at the British Government Code and Cypher School at Bletchley Park during World War II. It was for use in cryptanalysis of the Lorenz cipher produced by the SZ40 and SZ42 teleprinter rotor stream cipher machines, one of the Germans' Geheimschreiber machines. The British codenamed non-Morse traffic "Fish", and that from this machine "Tunny".

In cryptography, a padding oracle attack is an attack which uses the padding validation of a cryptographic message to decrypt the ciphertext. In cryptography, variable-length plaintext messages often have to be padded (expanded) to be compatible with the underlying cryptographic primitive. The attack relies on having a "padding oracle" who freely responds to queries about whether a message is correctly padded or not. Padding oracle attacks are mostly associated with CBC mode decryption used within block ciphers. Padding modes for asymmetric algorithms such as OAEP may also be vulnerable to padding oracle attacks.

References

Notes

  1. There are 3 ways of getting a (ciphertext) output bit of 0 from an AND operation:
    Plaintext=0, key=0;
    Plaintext=0, key=1;
    Plaintext=1, key=0.
    Therefore, if we know that the ciphertext bit is a 0, there is a 2/3 chance that the plaintext bit was also a 0 for a truly random key. For XOR, there are exactly 2 ways, so the chance is 1/2 (i.e. equally likely, so we cannot learn anything from this information)
  2. This was inspired by Richter 2012

Citations

  1. Tutte 1998 , p. 3
  2. Lewin 2012, pp. 14–19.
  3. Churchhouse 2002 , p. 11
  4. Churchhouse 2002 , p. 68
  5. 1 2 Paar & Pelzl 2009, pp. 32–34.

Sources

  • Budiman, MA; Tarigan, JT; Winata, AS (2020). "Arduino UNO and Android Based Digital Lock Using Combination of Vigenere Cipher and XOR Cipher". Journal of Physics: Conference Series. IOP Publishing. 1566 (1): 012074. Bibcode:2020JPhCS1566a2074B. doi: 10.1088/1742-6596/1566/1/012074 . ISSN   1742-6588.
  • Churchhouse, Robert (2002), Codes and Ciphers: Julius Caesar, the Enigma and the Internet , Cambridge: Cambridge University Press, ISBN   978-0-521-00890-7
  • Garg, Satish Kumar (2017). "Cryptography Using Xor Cipher". Research Journal of Science and Technology. A and V Publications. 9 (1): 25. doi:10.5958/2349-2988.2017.00004.3. ISSN   0975-4393.
  • Gödel, Kurt (December 1931). "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I". Monatshefte für Mathematik und Physik (in German). 38–38 (1): 173–198. doi:10.1007/BF01700692. ISSN   0026-9255. S2CID   197663120.
  • Lewin, Michael (June 2012). "All About XOR". Overload. 2 ((109): 14–19. Retrieved 29 August 2021.
  • Paar, Christof; Pelzl, Jan (2009). Understanding cryptography : a textbook for students and practitioners. Springer. ISBN   978-3-642-04101-3. OCLC   567365751.
  • Richter, Wolfgang (August 3, 2012), "Unbreakable Cryptography in 5 Minutes", Crossroads: The ACM Magazine for Students, Association for Computing Machinery
  • Tutte, W. T. (19 June 1998), Fish and I (PDF), retrieved 11 January 2020 Transcript of a lecture given by Prof. Tutte at the University of Waterloo