Searchable symmetric encryption

Last updated
Keyword search using an SSE scheme Searchable Symmetric Encryption (SSE) scheme.png
Keyword search using an SSE scheme

Searchable symmetric encryption (SSE) is a form of encryption that allows one to efficiently search over a collection of encrypted documents or files without the ability to decrypt them. [1] [2] [3] SSE can be used to outsource files to an untrusted cloud storage server without ever revealing the files in the clear but while preserving the server's ability to search over them.

Contents

Description

A searchable symmetric encryption scheme is a symmetric-key encryption scheme that encrypts a collection of documents , where each document is viewed as a set of keywords from a keyword space . Given the encryption key and a keyword , one can generate a search token with which the encrypted data collection can be searched for . The result of the search is the subset of encrypted documents that contain the keyword .

Static SSE

A static SSE scheme consists of three algorithms that work as follows:

A static SSE scheme is used by a client and an untrusted server as follows. The client encrypts its data collection using the algorithm which returns a secret key and an encrypted document collection . The client keeps secret and sends and to the untrusted server. To search for a keyword , the client runs the algorithm on and to generate a search token which it sends to the server. The server runs Search with , , and and returns the resulting encrypted documents back to the client.

Dynamic SSE

A dynamic SSE scheme supports, in addition to search, the insertion and deletion of documents. A dynamic SSE scheme consists of seven algorithms where , and are as in the static case and the remaining algorithms work as follows:

To add a new document the client runs on and to generate an insert token which it sends to the server. The server runs with and and stores the updated encrypted document collection. To delete a document with identifier , the client runs the algorithm with and to generate a delete token which it sends to the server. The server runs with and and stores the updated encrypted document collection.

An SSE scheme that does not support and is called semi-dynamic.

History of Searchable Symmetric Encryption

The problem of searching on encrypted data was considered by Song, Wagner and Perrig [1] though previous work on Oblivious RAM by Goldreich and Ostrovsky [4] could be used in theory to address the problem. This work [1] proposed an SSE scheme with a search algorithm that runs in time , where . Goh [5] and Chang and Mitzenmacher [6] gave new SSE constructions with search algorithms that run in time , where is the number of documents. Curtmola, Garay, Kamara and Ostrovsky [2] later proposed two static constructions with search time, where is the number of documents that contain , which is optimal. This work also proposed a semi-dynamic construction with search time, where is the number of updates. An optimal dynamic SSE construction was later proposed by Kamara, Papamanthou and Roeder. [7]

Goh [5] and Chang and Mitzenmacher [6] proposed security definitions for SSE. These were strengthened and extended by Curtmola, Garay, Kamara and Ostrovsky [2] who proposed the notion of adaptive security for SSE. This work also was the first to observe leakage in SSE and to formally capture it as part of the security definition. Leakage was further formalized and generalized by Chase and Kamara. [8] Islam, Kuzu and Kantarcioglu described the first leakage attack. [9]

All the previously mentioned constructions support single keyword search. Cash, Jarecki, Jutla, Krawczyk, Rosu and Steiner [10] proposed an SSE scheme that supports conjunctive search in sub-linear time in . The construction can also be extended to support disjunctive and Boolean searches that can be expressed in searchable normal form (SNF) in sub-linear time. At the same time, Pappas, Krell, Vo, Kolesnikov, Malkin, Choi, George, Keromytis and Bellovin [11] described a construction that supports conjunctive and all disjunctive and Boolean searches in sub-linear time.

Security

SSE schemes are designed to guarantee that the untrusted server cannot learn any partial information about the documents or the search queries beyond some well-defined and reasonable leakage. The leakage of a scheme is formally described using a leakage profile which itself can consists of several leakage patterns. SSE constructions attempt to minimize leakage while achieving the best possible search efficiency.

SSE security can be analyzed in several adversarial models but the most common are:

Security in the Persistent Model

In the persistent model, there are SSE schemes that achieve a wide variety of leakage profiles. The most common leakage profile for static schemes that achieve single keyword search in optimal time is which reveals the number of documents in the collection, the size of each document in the collection, if and when a query was repeated and which encrypted documents match the search query. [2] [13] It is known, however, how to construct schemes that leak considerably less at an additional cost in search time and storage. [14] [15]

When considering dynamic SSE schemes, the state-of-the-art constructions with optimal time search have leakage profiles that guarantee forward privacy [16] which means that inserts cannot be correlated with past search queries.

Security in the Snapshot Model

In the snapshot model, efficient dynamic SSE schemes with no leakage beyond the number of documents and the size of the collection can be constructed. [12] When using an SSE construction that is secure in the snapshot model one has to carefully consider how the scheme will be deployed because some systems might cache previous search queries. [17]

Cryptanalysis

A leakage profile only describes the leakage of an SSE scheme but it says nothing about whether that leakage can be exploited or not. Cryptanalysis is therefore used to better understand the real-world security of a leakage profile. There is a wide variety of attacks working in different adversarial models, based on a variety of assumptions and attacking different leakage profiles. [18] [19]

See also

Related Research Articles

<span class="mw-page-title-main">Enthalpy</span> Measure of energy in a thermodynamic system

In thermodynamics, enthalpy, is the sum of a thermodynamic system's internal energy and the product of its pressure and volume. It is a state function used in many measurements in chemical, biological, and physical systems at a constant external pressure, which is conveniently provided by the large ambient atmosphere. The pressure–volume term expresses the work that was done against constant external pressure to establish the system's physical dimensions from to some final volume , i.e. to make room for it by displacing its surroundings. The pressure-volume term is very small for solids and liquids at common conditions, and fairly small for gases. Therefore, enthalpy is a stand-in for energy in chemical systems; bond, lattice, solvation, and other chemical "energies" are actually enthalpy differences. As a state function, enthalpy depends only on the final configuration of internal energy, pressure, and volume, not on the path taken to achieve it.

<span class="mw-page-title-main">Poynting vector</span> Measure of directional electromagnetic energy flux

In physics, the Poynting vector represents the directional energy flux or power flow of an electromagnetic field. The SI unit of the Poynting vector is the watt per square metre (W/m2); kg/s3 in base SI units. It is named after its discoverer John Henry Poynting who first derived it in 1884. Nikolay Umov is also credited with formulating the concept. Oliver Heaviside also discovered it independently in the more general form that recognises the freedom of adding the curl of an arbitrary vector field to the definition. The Poynting vector is used throughout electromagnetics in conjunction with Poynting's theorem, the continuity equation expressing conservation of electromagnetic energy, to calculate the power flow in electromagnetic fields.

In linear algebra, the Cholesky decomposition or Cholesky factorization is a decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose, which is useful for efficient numerical solutions, e.g., Monte Carlo simulations. It was discovered by André-Louis Cholesky for real matrices, and posthumously published in 1924. When it is applicable, the Cholesky decomposition is roughly twice as efficient as the LU decomposition for solving systems of linear equations.

<span class="mw-page-title-main">Inductance</span> Property of electrical conductors

Inductance is the tendency of an electrical conductor to oppose a change in the electric current flowing through it. The electric current produces a magnetic field around the conductor. The magnetic field strength depends on the magnitude of the electric current, and follows any changes in the magnitude of the current. From Faraday's law of induction, any change in magnetic field through a circuit induces an electromotive force (EMF) (voltage) in the conductors, a process known as electromagnetic induction. This induced voltage created by the changing current has the effect of opposing the change in current. This is stated by Lenz's law, and the voltage is called back EMF.

In the mathematical field of differential geometry, a metric tensor is an additional structure on a manifold M that allows defining distances and angles, just as the inner product on a Euclidean space allows defining distances and angles there. More precisely, a metric tensor at a point p of M is a bilinear form defined on the tangent space at p, and a metric tensor on M consists of a metric tensor at each point p of M that varies smoothly with p.

<span class="mw-page-title-main">Georgi–Glashow model</span> Grand Unified Theory proposed in 1974

In particle physics, the Georgi–Glashow model is a particular Grand Unified Theory (GUT) proposed by Howard Georgi and Sheldon Glashow in 1974. In this model, the Standard Model gauge groups SU(3) × SU(2) × U(1) are combined into a single simple gauge group SU(5). The unified group SU(5) is then thought to be spontaneously broken into the Standard Model subgroup below a very high energy scale called the grand unification scale.

In mathematics, the Grassmannian is a differentiable manifold that parameterizes the set of all -dimensional linear subspaces of an -dimensional vector space over a field . For example, the Grassmannian is the space of lines through the origin in , so it is the same as the projective space of one dimension lower than . When is a real or complex vector space, Grassmannians are compact smooth manifolds, of dimension . In general they have the structure of a nonsingular projective algebraic variety.

<span class="mw-page-title-main">Projection (linear algebra)</span> Idempotent linear transformation from a vector space to itself

In linear algebra and functional analysis, a projection is a linear transformation from a vector space to itself such that . That is, whenever is applied twice to any vector, it gives the same result as if it were applied once. It leaves its image unchanged. This definition of "projection" formalizes and generalizes the idea of graphical projection. One can also consider the effect of a projection on a geometrical object by examining the effect of the projection on points in the object.

System F is a typed lambda calculus that introduces, to simply typed lambda calculus, a mechanism of universal quantification over types. System F formalizes parametric polymorphism in programming languages, thus forming a theoretical basis for languages such as Haskell and ML. It was discovered independently by logician Jean-Yves Girard (1972) and computer scientist John C. Reynolds.

<span class="mw-page-title-main">Change of basis</span> Coordinate change in linear algebra

In mathematics, an ordered basis of a vector space of finite dimension n allows representing uniquely any element of the vector space by a coordinate vector, which is a sequence of n scalars called coordinates. If two different bases are considered, the coordinate vector that represents a vector v on one basis is, in general, different from the coordinate vector that represents v on the other basis. A change of basis consists of converting every assertion expressed in terms of coordinates relative to one basis into an assertion expressed in terms of coordinates relative to the other basis.

In cryptography, Optimal Asymmetric Encryption Padding (OAEP) is a padding scheme often used together with RSA encryption. OAEP was introduced by Bellare and Rogaway, and subsequently standardized in PKCS#1 v2 and RFC 2437.

In information retrieval, tf–idf, short for term frequency–inverse document frequency, is a measure of importance of a word to a document in a collection or corpus, adjusted for the fact that some words appear more frequently in general. It was often used as a weighting factor in searches of information retrieval, text mining, and user modeling. A survey conducted in 2015 showed that 83% of text-based recommender systems in digital libraries used tf–idf.

In mathematical logic, Heyting arithmetic is an axiomatization of arithmetic in accordance with the philosophy of intuitionism. It is named after Arend Heyting, who first proposed it.

Axiomatic constructive set theory is an approach to mathematical constructivism following the program of axiomatic set theory. The same first-order language with "" and "" of classical set theory is usually used, so this is not to be confused with a constructive types approach. On the other hand, some constructive theories are indeed motivated by their interpretability in type theories.

The Hamiltonian is a function used to solve a problem of optimal control for a dynamical system. It can be understood as an instantaneous increment of the Lagrangian expression of the problem that is to be optimized over a certain time period. Inspired by—but distinct from—the Hamiltonian of classical mechanics, the Hamiltonian of optimal control theory was developed by Lev Pontryagin as part of his maximum principle. Pontryagin proved that a necessary condition for solving the optimal control problem is that the control should be chosen so as to optimize the Hamiltonian.

Vector space model or term vector model is an algebraic model for representing text documents as vectors such that the distance between vectors represents the relevance between the documents. It is used in information filtering, information retrieval, indexing and relevancy rankings. Its first use was in the SMART Information Retrieval System.

In cryptography, learning with errors (LWE) is a mathematical problem that is widely used to create secure encryption algorithms. It is based on the idea of representing secret information as a set of equations with errors. In other words, LWE is a way to hide the value of a secret by introducing noise to it. In more technical terms, it refers to the computational problem of inferring a linear -ary function over a finite ring from given samples some of which may be erroneous. The LWE problem is conjectured to be hard to solve, and thus to be useful in cryptography.

In discrete mathematics, ideal lattices are a special class of lattices and a generalization of cyclic lattices. Ideal lattices naturally occur in many parts of number theory, but also in other areas. In particular, they have a significant place in cryptography. Micciancio defined a generalization of cyclic lattices as ideal lattices. They can be used in cryptosystems to decrease by a square root the number of parameters necessary to describe a lattice, making them more efficient. Ideal lattices are a new concept, but similar lattice classes have been used for a long time. For example, cyclic lattices, a special case of ideal lattices, are used in NTRUEncrypt and NTRUSign.

Seny Kamara is a Senegalese-French-American computer scientist best known for his work on cryptography. He has delivered multiple congressional testimonies about the potential harms and opportunities with technology. He leads or co-leads numerous centers and activities focused on cryptography and social good. His work has been covered extensively in high-profile media, including Wired and Forbes.

Structured encryption (STE) is a form of encryption that encrypts a data structure so that it can be privately queried. Structured encryption can be used as a building block to design end-to-end encrypted databases, efficient searchable symmetric encryption (SSE) and other algorithms that can be efficiently executed on encrypted data.

References

  1. 1 2 3 Dawn Xiaoding Song; Wagner, D.; Perrig, A. (2000). "Practical techniques for searches on encrypted data". Proceeding 2000 IEEE Symposium on Security and Privacy. S&P 2000. IEEE Comput. Soc. pp. 44–55. doi:10.1109/secpri.2000.848445. ISBN   0-7695-0665-8. S2CID   2829840.
  2. 1 2 3 4 5 Curtmola, Reza; Garay, Juan; Kamara, Seny; Ostrovsky, Rafail (2006-10-30). "Searchable symmetric encryption". Proceedings of the 13th ACM conference on Computer and communications security. CCS '06. Alexandria, Virginia, USA: Association for Computing Machinery. pp. 79–88. doi:10.1145/1180405.1180417. ISBN   978-1-59593-518-2. S2CID   961719.
  3. Amorim, Ivone; Costa, Ivan (2023-07-01). "Leveraging Searchable Encryption through Homomorphic Encryption: A Comprehensive Analysis". Mathematics. 11 (13): 2948. doi: 10.3390/math11132948 . ISSN   2227-7390.
  4. Goldreich, Oded; Ostrovsky, Rafail (May 1996). "Software protection and simulation on oblivious RAMs". Journal of the ACM. 43 (3): 431–473. doi:10.1145/233551.233553. hdl: 1721.1/103684 . ISSN   0004-5411. S2CID   7502114.
  5. 1 2 Goh, Eu-Jin (2003). "Secure Indexes".
  6. 1 2 Chang, Yan-Cheng; Mitzenmacher, Michael (2005). "Privacy Preserving Keyword Searches on Remote Encrypted Data". In Ioannidis, John; Keromytis, Angelos; Yung, Moti (eds.). Applied Cryptography and Network Security. Lecture Notes in Computer Science. Vol. 3531. Berlin, Heidelberg: Springer. pp. 442–455. doi: 10.1007/11496137_30 . ISBN   978-3-540-31542-1.
  7. Kamara, Seny; Papamanthou, Charalampos; Roeder, Tom (2012-10-16). "Dynamic searchable symmetric encryption". Proceedings of the 2012 ACM conference on Computer and communications security. CCS '12. New York, NY, USA: Association for Computing Machinery. pp. 965–976. doi:10.1145/2382196.2382298. ISBN   978-1-4503-1651-4. S2CID   243046.
  8. Chase, Melissa; Kamara, Seny (2010). "Structured Encryption and Controlled Disclosure". In Abe, Masayuki (ed.). Advances in Cryptology - ASIACRYPT 2010. Lecture Notes in Computer Science. Vol. 6477. Berlin, Heidelberg: Springer. pp. 577–594. doi: 10.1007/978-3-642-17373-8_33 . ISBN   978-3-642-17373-8.
  9. Islam, Mohammad; Kuzu, Mehmet; Kantarcioglu, Murat. "Access Pattern disclosure on Searchable Encryption:Ramification, Attack and Mitigation" (PDF). Network and Distributed System Security (NDSS) Symposium.
  10. Cash, David; Jarecki, Stanislaw; Jutla, Charanjit; Krawczyk, Hugo; Roşu, Marcel-Cătălin; Steiner, Michael (2013). "Highly-Scalable Searchable Symmetric Encryption with Support for Boolean Queries". In Canetti, Ran; Garay, Juan A. (eds.). Advances in Cryptology – CRYPTO 2013. Lecture Notes in Computer Science. Vol. 8042. Berlin, Heidelberg: Springer. pp. 353–373. doi: 10.1007/978-3-642-40041-4_20 . ISBN   978-3-642-40041-4.
  11. Pappas, Vasilis; Krell, Fernando; Vo, Binh; Kolesnikov, Vladimir; Malkin, Tal; Choi, Seung Geol; George, Wesley; Keromytis, Angelos; Bellovin, Steve (May 2014). "Blind Seer: A Scalable Private DBMS". 2014 IEEE Symposium on Security and Privacy. IEEE. pp. 359–374. doi: 10.1109/sp.2014.30 . ISBN   978-1-4799-4686-0. S2CID   9165575.
  12. 1 2 Amjad, Ghous; Kamara, Seny; Moataz, Tarik (2019-01-01). "Breach-Resistant Structured Encryption". Proceedings on Privacy Enhancing Technologies. 2019 (1): 245–265. doi: 10.2478/popets-2019-0014 . S2CID   4047057.
  13. "Dynamic Searchable Encryption in Very-Large Databases: Data Structures and Implementation – NDSS Symposium" . Retrieved 2022-02-22.
  14. Kamara, Seny; Moataz, Tarik; Ohrimenko, Olya (2018). "Structured Encryption and Leakage Suppression". In Shacham, Hovav; Boldyreva, Alexandra (eds.). Advances in Cryptology – CRYPTO 2018. Lecture Notes in Computer Science. Vol. 10991. Cham: Springer International Publishing. pp. 339–370. doi:10.1007/978-3-319-96884-1_12. ISBN   978-3-319-96884-1. S2CID   51603585.
  15. "Revisiting Leakage Abuse Attacks – NDSS Symposium" . Retrieved 2022-02-22.
  16. "Practical Dynamic Searchable Encryption with Small Leakage – NDSS Symposium" . Retrieved 2022-02-22.
  17. Grubbs, Paul; Ristenpart, Thomas; Shmatikov, Vitaly (2017-05-07). "Why Your Encrypted Database is Not Secure". Proceedings of the 16th Workshop on Hot Topics in Operating Systems. HotOS '17. New York, NY, USA: Association for Computing Machinery. pp. 162–168. doi:10.1145/3102980.3103007. ISBN   978-1-4503-5068-6. S2CID   10111288.
  18. Yao, Jing; Zheng, Yifeng; Guo, Yu; Wang, Cong (2020-10-06). "SoK: A Systematic Study of Attacks in Efficient Encrypted Cloud Data Search". Proceedings of the 8th International Workshop on Security in Blockchain and Cloud Computing. SBC '20. New York, NY, USA: Association for Computing Machinery. pp. 14–20. doi:10.1145/3384942.3406869. ISBN   978-1-4503-7609-9. S2CID   222179683.
  19. Kamara, Seny; Kati, Abdelkarim; Moataz, Tarik; Schneider, Thomas; Treiber, Amos; Yonli, Michael (2021). "Cryptanalysis of Encrypted Search with LEAKER - A framework for LEakage AttacK Evaluation on Real-world data".{{cite journal}}: Cite journal requires |journal= (help)