Database encryption

Database encryption can generally be defined as a process that uses an algorithm to transform data stored in a database into "cipher text" that is incomprehensible without first being decrypted. [1] Its purpose is therefore to protect the data stored in a database from being accessed by individuals with potentially "malicious" intentions. [2] Encrypting a database also reduces the incentive to attack it, because an attacker who obtains only "meaningless" encrypted data must take extra steps to recover the original data. [3] Multiple techniques and technologies are available for database encryption; the most important are detailed in this article.

Transparent/External database encryption

Transparent data encryption (often abbreviated as TDE) is used to encrypt an entire database, [2] which therefore involves encrypting "data at rest". [4] Data at rest can generally be defined as "inactive" data that is not currently being edited or pushed across a network. [5] As an example, a text file stored on a computer is "at rest" until it is opened and edited. Data at rest is stored on physical storage media such as tapes or hard disk drives. [6] Storing large amounts of sensitive data on physical storage media naturally raises concerns about security and theft. TDE ensures that the data on physical storage media cannot be read by malicious individuals who may intend to steal it. [7] Data that cannot be read is worthless, which reduces the incentive for theft. Perhaps the most important strength attributed to TDE is its transparency. Because TDE encrypts all data, no applications need to be altered for TDE to run correctly. [8] TDE encrypts the entirety of the database as well as backups of the database. The transparent element of TDE comes from the fact that it encrypts at "the page level", which essentially means that data is encrypted when stored and decrypted when it is called into the system's memory. [9] The contents of the database are encrypted using a symmetric key that is often referred to as a "database encryption key". [2]

Column-level encryption

To explain column-level encryption it is important to outline basic database structure. A typical relational database is divided into tables, which are divided into columns that each hold rows of data. [10] Whilst TDE usually encrypts an entire database, column-level encryption allows individual columns within a database to be encrypted. [11] The granularity of column-level encryption gives rise to specific strengths and weaknesses when compared to encrypting an entire database. Firstly, the ability to encrypt individual columns makes column-level encryption significantly more flexible than systems that encrypt an entire database, such as TDE. Secondly, it is possible to use an entirely unique and separate encryption key for each column within a database. This increases the difficulty of generating rainbow tables, which implies that the data stored within each column is less likely to be lost or leaked. The main disadvantage associated with column-level database encryption is a loss of speed. Encrypting separate columns with different unique keys in the same database can cause database performance to decrease, and also decreases the speed at which the contents of the database can be indexed or searched. [12]

Field-level encryption

Experimental work is being done on providing database operations (such as searching or arithmetical operations) on encrypted fields without the need to decrypt them. [13] Strong encryption is required to be randomized: a different result must be generated each time the same data is encrypted. This is known as probabilistic encryption. Field-level encryption is weaker than randomized encryption because equal inputs produce equal outputs, but that same property allows users to test for equality without decrypting the data. [14]
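The equality test described above can be sketched in a few lines of Python. This is a minimal illustration, not the scheme used by any particular product: it uses a keyed hash (HMAC-SHA256) as a stand-in for a deterministic scheme, so unlike real encryption the token cannot be decrypted. The names `FIELD_KEY`, `equality_token`, `stored` and `find_rows` are hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-field secret key; a real system would obtain it from key management.
FIELD_KEY = secrets.token_bytes(32)

def equality_token(value: str, key: bytes = FIELD_KEY) -> str:
    """Return a deterministic keyed token: equal inputs always give equal tokens."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Tokens are stored instead of the plaintext field values.
stored = {
    "row1": equality_token("alice@example.com"),
    "row2": equality_token("bob@example.com"),
}

def find_rows(search_value: str) -> list:
    """Find matching rows by comparing tokens, never the plaintext."""
    wanted = equality_token(search_value)
    return [row for row, token in stored.items()
            if hmac.compare_digest(token, wanted)]
```

Because the token is deterministic, anyone who can read the stored tokens can also see which rows hold equal values, which is the weakness relative to randomized encryption noted above.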

Filesystem-level encryption

Encrypting File System (EFS)

Traditional database encryption techniques normally encrypt and decrypt the contents of a database. Databases are managed by "Database Management Systems" (DBMS) that run on top of an existing operating system (OS). [15] This raises a potential security concern, as an encrypted database may be running on an accessible and potentially vulnerable operating system. EFS can encrypt data that is not part of a database system, which implies that the scope of encryption for EFS is much wider than that of a system such as TDE, which is only capable of encrypting database files. [citation needed] Whilst EFS does widen the scope of encryption, it also decreases database performance and can cause administration issues, as system administrators require operating system access to use EFS. Because of these performance issues, EFS is not typically used in database applications that require frequent database input and output. To offset the performance issues it is often recommended that EFS be used in environments with few users. [16]

Full disk encryption

BitLocker does not have the performance concerns associated with EFS. [16]

Symmetric and asymmetric database encryption

A visual demonstration of symmetric encryption

Symmetric database encryption

Symmetric encryption in the context of database encryption involves a single secret key (often called a private key) being applied to data that is stored in and retrieved from a database. This key alters the data in a way that makes it unreadable without first being decrypted. [17] Data is encrypted when saved and decrypted when opened, provided that the user knows the key. Thus, if the data is to be shared through a database, the receiving individual must have a copy of the secret key used by the sender in order to decrypt and view the data. [18] A clear disadvantage of symmetric encryption is that sensitive data can be leaked if the key reaches individuals who should not have access to the data. [17] However, given that only one key is involved in the encryption process, it can generally be said that speed is an advantage of symmetric encryption. [19]
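The defining property, that the same key both encrypts and decrypts, can be illustrated with a short Python sketch. It uses a one-time pad (XOR with a random key as long as the data) only because that fits in a few lines of standard-library code; it is an assumption made for illustration, and production databases typically rely on a block cipher such as AES instead. The function names are hypothetical.

```python
import secrets

def generate_key(length: int) -> bytes:
    """Create a random secret key of the requested length."""
    return secrets.token_bytes(length)

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt by XOR-ing data with the key (the operation is its own inverse)."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"4111 1111 1111 1111"
key = generate_key(len(plaintext))        # the shared secret
ciphertext = xor_bytes(plaintext, key)    # what would be stored in the database
recovered = xor_bytes(ciphertext, key)    # only possible with the same key
```

Anyone holding `key` can recover `plaintext`, which is why leaking the key leaks the data.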

Asymmetric database encryption

Asymmetric encryption expands on symmetric encryption by incorporating two different types of keys into the encryption method: private and public keys. [20] A public key can be accessed by anyone and is unique to one user, whereas a private key is a secret key that is unique to and only known by one user. [21] In most scenarios the public key is the encryption key whereas the private key is the decryption key. As an example, if individual A would like to send a message to individual B using asymmetric encryption, A would encrypt the message using B's public key and then send the encrypted version. B would then be able to decrypt the message using B's own private key. Individual C would not be able to decrypt A's message, as C's private key is not the same as B's private key. [22] Asymmetric encryption is often described as more secure than symmetric database encryption, given that private keys do not need to be shared because two separate keys handle the encryption and decryption processes. [23] For performance reasons, asymmetric encryption is used in key management rather than to encrypt the data itself, which is usually done with symmetric encryption.
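The exchange between individuals A and B can be sketched with textbook RSA in Python. This is a toy under stated assumptions: the primes are tiny, there is no padding, and the message is a small integer, so it is insecure and only shows that the public key encrypts while the private key decrypts. All names and values are illustrative.

```python
# Toy RSA key pair for individual B, built from tiny primes (insecure, illustration only).
p, q = 61, 53
n = p * q                      # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent: modular inverse of e (Python 3.8+)

public_key_b = (e, n)          # published for anyone to use
private_key_b = (d, n)         # known only to B

def encrypt(message: int, public_key: tuple) -> int:
    """Encrypt an integer message (must be smaller than n) with a public key."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)

def decrypt(ciphertext: int, private_key: tuple) -> int:
    """Decrypt a ciphertext with the matching private key."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)

message = 65                                  # A's message, encoded as a number
ciphertext = encrypt(message, public_key_b)   # A uses B's public key
recovered = decrypt(ciphertext, private_key_b)  # B uses B's private key
```

In practice a scheme like this would protect a short symmetric key rather than the data itself, in line with the performance point above.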

Key management

The section on symmetric and asymmetric database encryption introduced the concept of public and private keys with basic examples in which users exchange keys. Exchanging keys becomes logistically impractical when many different individuals need to communicate with each other. In database encryption the system handles the storage and exchange of keys. This process is called key management. If encryption keys are not managed and stored properly, highly sensitive data may be leaked. Additionally, if a key management system deletes or loses a key, the information that was encrypted with that key is essentially rendered "lost" as well. The complexity of key management logistics also needs to be taken into consideration. As the number of applications that a firm uses increases, the number of keys that need to be stored and managed increases as well. Thus it is necessary to establish a way in which keys from all applications can be managed through a single channel, which is known as enterprise key management. [24] Enterprise key management solutions are sold by a great number of suppliers in the technology industry. These systems essentially provide a centralised key management solution that allows administrators to manage all keys in a system through one hub. [25] Enterprise key management solutions therefore have the potential to lessen the risks associated with key management in the context of database encryption, as well as to reduce the logistical troubles that arise when many individuals attempt to manually share keys. [24]

Hashing

Hashing is used in database systems as a method to protect sensitive data such as passwords; however, it is also used to improve the efficiency of database referencing. [26] Inputted data is manipulated by a hashing algorithm, which converts it into a string of fixed length that can then be stored in a database. Hashing systems have two crucially important characteristics. Firstly, hashes are "unique and repeatable". As an example, running the word "cat" through the same hashing algorithm multiple times will always yield the same hash, yet it is extremely difficult to find another word that will return the same hash that "cat" does. [27] Secondly, hashing algorithms are not reversible. To relate this back to the example above, it would be nearly impossible to convert the output of the hashing algorithm back to the original input, which was "cat". [28] In the context of database encryption, hashing is often used in password systems. When a user first creates their password it is run through a hashing algorithm and saved as a hash. When the user logs back into the website, the password that they enter is run through the hashing algorithm and is then compared to the stored hash. [29] Because two different inputs are extremely unlikely to produce the same hash, matching hashes are taken to mean that the user entered the correct password. One example of a popular hash function is SHA (Secure Hash Algorithm) 256. [30]
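The "repeatable" and fixed-length properties can be checked with Python's standard `hashlib` module. This is a minimal sketch using SHA-256; the helper name `sha256_hex` and the sample words are illustrative.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

first = sha256_hex("cat")
second = sha256_hex("cat")     # same input, same algorithm
other = sha256_hex("cot")      # a one-letter change gives an unrelated hash

print(first == second)         # repeatable
print(first == other)          # different inputs, different hashes
print(len(first), len(sha256_hex("a much longer input string")))  # fixed length
```

Note that a plain, fast hash like this is shown only to illustrate the two properties; the following sections explain why password storage adds salting.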

Salting

One issue that arises when using hashing for password management in the context of database encryption is that a malicious user could potentially use an input-to-hash lookup table, known as a rainbow table, [31] for the specific hashing algorithm that the system uses. This would effectively allow the individual to recover the input behind a hash and thus gain access to stored passwords. [32] A solution for this issue is to "salt" the hash. Salting is the process of hashing more than just the password in a database. The more information that is added to a string that is to be hashed, the more difficult it becomes to collate rainbow tables. As an example, a system may combine a user's email and password into a single hash. This increase in the complexity of a hash means that it is far more difficult, and thus less likely, for rainbow tables to be generated. This implies that the threat of sensitive data loss is reduced by salting hashes. [33]
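A salted password hash might look like the following Python sketch. It departs from the email example above in two labeled ways: the salt is a random per-user value, which is the more common practice, and the hash is PBKDF2-HMAC-SHA256 from the standard library rather than a single SHA-256 pass. The function names and the iteration count are assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for the deployment

def hash_password(password: str, salt: bytes = None) -> tuple:
    """Hash a password with a salt; generate a fresh random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest        # both values are stored in the database

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

salt_a, digest_a = hash_password("hunter2")
salt_b, digest_b = hash_password("hunter2")   # same password, different salt
```

Two users with the same password end up with different stored digests, so a single precomputed table cannot cover both.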

Pepper

Some systems incorporate a "pepper" in addition to salts in their hashing systems. Pepper systems are controversial; however, it is still necessary to explain their use. [31] A pepper is a value that is added to a hashed password that has been salted. [34] This pepper is often unique to one website or service, and the same pepper is usually added to all passwords saved in a database. [35] In theory, the inclusion of peppers in password hashing systems has the potential to decrease the risk of rainbow (input-to-hash) tables, given the system-level specificity of peppers; however, the real-world benefits of pepper implementation are highly disputed. [34]
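One possible arrangement of a pepper is sketched below in Python. It is an assumption rather than a standard: the pepper is mixed in with HMAC-SHA256 before the salted hash is computed, and other orderings exist. `PEPPER` is a hypothetical site-wide secret that would be kept outside the database, and `hash_with_pepper` is an illustrative name.

```python
import hashlib
import hmac
import os

# Hypothetical site-wide secret; assumed to live in configuration, not in the database.
PEPPER = b"hypothetical-site-wide-secret"

def hash_with_pepper(password: str, salt: bytes, pepper: bytes = PEPPER) -> bytes:
    """Mix the pepper in with a keyed hash, then apply the per-user salt."""
    peppered = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, 100_000)

salt = os.urandom(16)                          # per-user, stored with the hash
stored = hash_with_pepper("correct horse", salt)
```

Unlike the salt, the same pepper is used for every password, so an attacker who copies only the database lacks one input needed to rebuild the hashes.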

Application-level encryption

In application-level encryption, the process of encrypting data is completed by the application that has been used to generate or modify the data that is to be encrypted. Essentially this means that data is encrypted before it is written to the database. This unique approach to encryption allows for the encryption process to be tailored to each user based on the information (such as entitlements or roles) that the application knows about its users. [35]

According to Eugene Pilyankevich, "Application-level encryption is becoming a good practice for systems with increased security requirements, with a general drift toward perimeter-less and more exposed cloud systems". [36]

Advantages of application-level encryption

One of the most important advantages of application-level encryption is that it has the potential to simplify the encryption process used by a company. If an application encrypts the data that it writes to or modifies in a database, then a secondary encryption tool does not need to be integrated into the system. The second main advantage relates to the overarching theme of theft. Given that data is encrypted before it is written to the server, a hacker would need access to the database contents as well as to the applications that were used to encrypt and decrypt those contents in order to decrypt sensitive data. [37]

Disadvantages of application-level encryption

The first important disadvantage of application-level encryption is that applications used by a firm will need to be modified to encrypt data themselves. This has the potential to consume a significant amount of time and other resources. Given the nature of opportunity cost, firms may not believe that application-level encryption is worth the investment. In addition, application-level encryption may have a limiting effect on database performance. If all data in a database is encrypted by a multitude of different applications, then it becomes impossible to index or search data in the database. As a basic example, it would be impossible to construct a glossary in a single language for a book that was written in 30 languages. Lastly, the complexity of key management increases, as multiple different applications need to have the authority and access to encrypt data and write it to the database. [37]

Risks of database encryption

When discussing database encryption it is imperative to be aware of the risks that are involved in the process. The first set of risks is related to key management. If private keys are not managed in an "isolated system", system administrators with malicious intentions may be able to decrypt sensitive data using keys that they have access to. The fundamental principle of keys also gives rise to a potentially devastating risk: if keys are lost then the encrypted data is essentially lost as well, as decryption without keys is almost impossible. [38]

References

  1. "What is Database Encryption and Decryption? - Definition from Techopedia". Techopedia.com. Retrieved November 4, 2015.
  2. "Transparent Data Encryption with Azure SQL Database". msdn.microsoft.com. Retrieved November 4, 2015.
  3. "SQL SERVER - Introduction to SQL Server Encryption and Symmetric Key Encryption Tutorial with Script". Journey to SQL Authority with Pinal Dave. April 28, 2009. Retrieved October 25, 2015.
  4. "Transparent Data Encryption (TDE)". msdn.microsoft.com. Retrieved October 25, 2015.
  5. "What is data at rest? - Definition from WhatIs.com". SearchStorage. Retrieved October 25, 2015.
  6. "Encryption techniques and products for hardware-based data storage security". ComputerWeekly. Retrieved October 31, 2015.
  7. "Storage Encryption Solutions". www.thales-esecurity.com. Archived from the original on February 24, 2017. Retrieved October 25, 2015.
  8. "Transparent Data Encryption (TDE) in SQL Server — DatabaseJournal.com". www.databasejournal.com. May 19, 2014. Retrieved November 2, 2015.
  9. "Using Transparent Data Encryption". sqlmag.com. Archived from the original on October 14, 2017. Retrieved November 2, 2015.
  10. "A Tutorial on Database Concepts, SQL using MySQL". www.atlasindia.com. Retrieved November 4, 2015.
  11. "SQL Server Encryption Options". sqlmag.com. Archived from the original on October 27, 2017. Retrieved November 2, 2015.
  12. "Differences Between Whole Database and Column Encryption". www.netlib.com. Retrieved November 2, 2015.
  13. "Optimized and Controlled Provisioning of Encrypted Outsourced Data" (PDF). www.fkerschbaum.org. Archived from the original (PDF) on March 26, 2017. Retrieved April 13, 2016.
  14. Suciu, Dan (2012). "Technical Perspective: SQL on an Encrypted Database". Communications of the ACM. doi:10.1145/2330667.2330690. S2CID 33705485.
  15. Spooner, David L.; Gudes, E. (May 1, 1984). "A Unifying Approach to the Design of a Secure Database Operating System". IEEE Transactions on Software Engineering. SE-10 (3): 310–319. doi:10.1109/TSE.1984.5010240. ISSN 0098-5589. S2CID 15407701.
  16. "Database Encryption in SQL Server 2008 Enterprise Edition". technet.microsoft.com. September 4, 2009. Retrieved November 3, 2015.
  17. "Description of Symmetric and Asymmetric Encryption". support.microsoft.com. Retrieved October 25, 2015.
  18. "How Encryption Works". HowStuffWorks. April 6, 2001. Retrieved October 25, 2015.
  19. "Asymmetric vs. Symmetric – Hacking with PHP - Practical PHP". www.hackingwithphp.com. Retrieved November 3, 2015.
  20. "How Encryption Works". HowStuffWorks. April 6, 2001. Retrieved November 1, 2015.
  21. Young, Dr. Bill. "Foundations of Computer Security Lecture 44: Symmetric vs. Asymmetric Encryption" (PDF). University of Texas at Austin. Archived from the original (PDF) on March 5, 2016. Retrieved November 1, 2015.
  22. "What is asymmetric cryptography and how do I use it?". Two Factor Authenticity. Retrieved November 1, 2015.
  23. "Advantages and Disadvantages of Asymmetric and Symmetric Cryptosystems" (PDF). University of Babylon. Retrieved November 3, 2015.
  24. "Encryption key management is vital to securing enterprise data storage". ComputerWeekly. Retrieved November 2, 2015.
  25. "What is Enterprise Key Management?". web.townsendsecurity.com. Retrieved November 2, 2015.
  26. "What is hashing? - Definition from WhatIs.com". SearchSQLServer. Retrieved November 1, 2015.
  27. "How data encryption software creates one way hash files using the sha1 hashing algorithm". www.metamorphosite.com. November 12, 2007. Retrieved November 1, 2015.
  28. "Understanding Encryption – Symmetric, Asymmetric, & Hashing". Atomic Spin. November 20, 2014. Retrieved November 1, 2015.
  29. "PHP: Password Hashing - Manual". php.net. Retrieved November 1, 2015.
  30. "JavaScript Implementation of SHA-256 Cryptographic Hash Algorithm | Movable Type Scripts". www.movable-type.co.uk. Retrieved November 3, 2015.
  31. "Salt and pepper - How to encrypt database passwords". blog.kablamo.org. Retrieved November 1, 2015.
  32. "PHP: Password Hashing - Manual". php.net. Retrieved November 1, 2015.
  33. "Why You Should Always Salt Your Hashes - Web Development in Brighton - Added Bytes". Added Bytes. Retrieved November 1, 2015.
  34. "ircmaxell's blog: Properly Salting Passwords, The Case Against Pepper". blog.ircmaxell.com. April 17, 2012. Retrieved November 2, 2015.
  35. "Application Encryption from Thales e-Security". www.thales-esecurity.com. Archived from the original on February 24, 2017. Retrieved October 25, 2015.
  36. Pilyankevich, Eugene (December 18, 2020). "Application Level Encryption for Software Architects". InfoQ.
  37. Baccam, Tanya (April 2010). "Transparent Data Encryption: New Technologies and Best Practices for Database Encryption". Sans.org. SANS Institute. Archived from the original on April 12, 2018. Retrieved October 25, 2015.
  38. "Database Encryption: Challenges, Risks, and Solutions". www.thales-esecurity.com. Archived from the original on February 24, 2017. Retrieved October 25, 2015.