LSH is a cryptographic hash function designed in 2014 by South Korea to provide integrity in general-purpose software environments such as PCs and smart devices. [1] LSH is one of the cryptographic algorithms approved by the Korean Cryptographic Module Validation Program (KCMVP). And it is the national standard of South Korea (KS X 3262).
The overall structure of the hash function LSH is shown in the following figure.
The hash function LSH has the wide-pipe Merkle-Damgård structure with one-zeros padding. The message hashing process of LSH consists of the following three stages.
One-zeros padding of Generation of message blocks , where from the padded bit string fortodo end for return |
The specifications of the hash function LSH are as follows.
Algorithm | Digest size in bits () | Number of step functions () | Chaining variable size in bits | Message block size in bits | Word size in bits () |
---|---|---|---|---|---|
LSH-256-224 | 224 | 26 | 512 | 1024 | 32 |
LSH-256-256 | 256 | ||||
LSH-512-224 | 224 | 28 | 1024 | 2048 | 64 |
LSH-512-256 | 256 | ||||
LSH-512-384 | 384 | ||||
LSH-512-512 | 512 |
Let be a given bit string message. The given is padded by one-zeros, i.e., the bit ‘1’ is appended to the end of , and the bit ‘0’s are appended until a bit length of a padded message is , where and is the smallest integer not less than .
Let be the one-zeros-padded -bit string of . Then is considered as a -byte array , where for all . The -byte array converts into a -word array as follows.
From the word array , we define the 32-word array message blocks as follows.
The 16-word array chaining variable is initialized to the initialization vector .
The initialization vector is as follows. In the following tables, all values are expressed in hexadecimal form.
068608D3 | 62D8F7A7 | D76652AB | 4C600A43 | BDC40AA8 | 1ECA0B68 | DA1A89BE | 3147D354 |
707EB4F9 | F65B3862 | 6B0B2ABE | 56B8EC0A | CF237286 | EE0D1727 | 33636595 | 8BB8D05F |
46A10F1F | FDDCE486 | B41443A8 | 198E6B9D | 3304388D | B0F5A3C7 | B36061C4 | 7ADBD553 |
105D5378 | 2F74DE54 | 5C2F2D95 | F2553FBE | 8051357A | 138668C8 | 47AA4484 | E01AFB41 |
0C401E9FE8813A55 | 4A5F446268FD3D35 | FF13E452334F612A | F8227661037E354A |
A5F223723C9CA29D | 95D965A11AED3979 | 01E23835B9AB02CC | 52D49CBAD5B30616 |
9E5C2027773F4ED3 | 66A5C8801925B701 | 22BBC85B4C6779D9 | C13171A42C559C23 |
31E2B67D25BE3813 | D522C4DEED8E4D83 | A79F5509B43FBAFE | E00D2CD88B4B6C6A |
6DC57C33DF989423 | D8EA7F6E8342C199 | 76DF8356F8603AC4 | 40F1B44DE838223A |
39FFE7CFC31484CD | 39C4326CC5281548 | 8A2FF85A346045D8 | FF202AA46DBDD61E |
CF785B3CD5FCDB8B | 1F0323B64A8150BF | FF75D972F29EA355 | 2E567F30BF1CA9E1 |
B596875BF8FF6DBA | FCCA39B089EF4615 | ECFF4017D020B4B6 | 7E77384C772ED802 |
53156A66292808F6 | B2C4F362B204C2BC | B84B7213BFA05C4E | 976CEB7C1B299F73 |
DF0CC63C0570AE97 | DA4441BAA486CE3F | 6559F5D9B5F2ACC2 | 22DACF19B4B52A16 |
BBCDACEFDE80953A | C9891A2879725B3E | 7C9FE6330237E440 | A30BA550553F7431 |
BB08043FB34E3E30 | A0DEC48D54618EAD | 150317267464BC57 | 32D1501FDE63DC93 |
ADD50F3C7F07094E | E3F3CEE8F9418A4F | B527ECDE5B3D0AE9 | 2EF6DEC68076F501 |
8CB994CAE5ACA216 | FBB9EAE4BBA48CC7 | 650A526174725FEA | 1F9A61A73F8D8085 |
B6607378173B539B | 1BC99853B0C0B9ED | DF727FC19B182D47 | DBEF360CF893A457 |
4981F5E570147E80 | D00C4490CA7D3E30 | 5D73940C0E4AE1EC | 894085E2EDB2D819 |
In this stage, the 32-word array message blocks , which are generated from a message in the initialization stage, are compressed by iteration of compression functions. The compression function has two inputs; the -th 16-word chaining variable and the -th 32-word message block . And it returns the -th 16-word chaining variable . Here and subsequently, denotes the set of all -word arrays for .
The following four functions are used in a compression function:
The overall structure of the compression function is shown in the following figure.
In a compression function, the message expansion function generates 16-word array sub-messages from given . Let be a temporary 16-word array set to the -th chaining variable . The -th step function having two inputs and updates , i.e., . All step functions are proceeded in order . Then one more operation by is proceeded, and the -th chaining variable is set to . The process of a compression function in detail is as follows.
fortodo end for return |
Here the -th step function is as follows.
The following figure shows the -th step function of a compression function.
Let be the -th 32-word array message block. The message expansion function generates 16-word array sub-messages from a message block . The first two sub-messages and are defined as follows.
The next sub-messages are generated as follows.
Here is the permutation over defined as follows.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
3 | 2 | 0 | 1 | 7 | 4 | 5 | 6 | 11 | 10 | 8 | 9 | 15 | 12 | 13 | 14 |
For two 16-word arrays and , the message addition function is defined as follows.
The -th mix function updates the 16-word array by mixing every two-word pair; and for . For , the mix function proceeds as follows.
Here is a two-word mix function. Let and be words. The two-word mix function is defined as follows.
;; ; ;; ;; return, ; |
The two-word mix function is shown in the following figure.
The bit rotation amounts , , used in are shown in the following table.
32 | even | 29 | 1 | 0 | 8 | 16 | 24 | 24 | 16 | 8 | 0 |
odd | 5 | 17 | |||||||||
64 | even | 23 | 59 | 0 | 16 | 32 | 48 | 8 | 24 | 40 | 56 |
odd | 7 | 3 |
The -th 8-word array constant used in for is defined as follows. The initial 8-word array constant is defined in the following table. For , the -th constant is generated by for .
917caf90 | 97884283c938982a | |
6c1b10a2 | ba1fca93533e2355 | |
6f352943 | c519a2e87aeb1c03 | |
cf778243 | 9a0fc95462af17b1 | |
2ceb7472 | fc3dda8ab019a82b | |
29e96ff2 | 02825d079a895407 | |
8a9ba428 | 79f2d0a7ee06a6f7 | |
2eeb2642 | d76d15eed9fdf5fe |
Let be a 16-word array. The word-permutation function is defined as follows.
Here is the permutation over defined by the following table.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
6 | 4 | 5 | 7 | 12 | 15 | 14 | 13 | 2 | 0 | 1 | 3 | 8 | 11 | 10 | 9 |
The finalization function returns -bit hash value from the final chaining variable . When is an 8-word variable and is a -byte variable, the finalization function performs the following procedure.
Here, denotes , the sub-bit string of a word for . And denotes , the sub-bit string of a -bit string for .
LSH is secure against known attacks on hash functions up to now. LSH is collision-resistant for and preimage-resistant and second-preimage-resistant for in the ideal cipher model, where is a number of queries for LSH structure. [1] LSH-256 is secure against all the existing hash function attacks when the number of steps is 13 or more, while LSH-512 is secure if the number of steps is 14 or more. Note that the steps which work as security margin are 50% of the compression function. [1]
LSH outperforms SHA-2/3 on various software platforms. The following table shows the speed performance of 1MB message hashing of LSH on several platforms.
Platform | P1 [lower-alpha 1] | P2 [lower-alpha 2] | P3 [lower-alpha 3] | P4 [lower-alpha 4] | P5 [lower-alpha 5] | P6 [lower-alpha 6] | P7 [lower-alpha 7] | P8 [lower-alpha 8] |
---|---|---|---|---|---|---|---|---|
LSH-256- | 3.60 | 3.86 | 5.26 | 3.89 | 11.17 | 15.03 | 15.28 | 14.84 |
LSH-512- | 2.39 | 5.04 | 7.76 | 5.52 | 8.94 | 18.76 | 19.00 | 18.10 |
The following table is the comparison at the platform based on Haswell, LSH is measured on Intel Core i7-4770k @ 3.5 GHz quad core platform, and others are measured on Intel Core i5-4570S @ 2.9 GHz quad core platform.
Algorithm | Message size in bytes | |||||
---|---|---|---|---|---|---|
long | 4,096 | 1,536 | 576 | 64 | 8 | |
LSH-256-256 | 3.60 | 3.71 | 3.90 | 4.08 | 8.19 | 65.37 |
Skein-512-256 | 5.01 | 5.58 | 5.86 | 6.49 | 13.12 | 104.50 |
Blake-256 | 6.61 | 7.63 | 7.87 | 9.05 | 16.58 | 72.50 |
Grøstl-256 | 9.48 | 10.68 | 12.18 | 13.71 | 37.94 | 227.50 |
Keccak-256 | 10.56 | 10.52 | 9.90 | 11.99 | 23.38 | 187.50 |
SHA-256 | 10.82 | 11.91 | 12.26 | 13.51 | 24.88 | 106.62 |
JH-256 | 14.70 | 15.50 | 15.94 | 17.06 | 31.94 | 257.00 |
LSH-512-512 | 2.39 | 2.54 | 2.79 | 3.31 | 10.81 | 85.62 |
Skein-512-512 | 4.67 | 5.51 | 5.80 | 6.44 | 13.59 | 108.25 |
Blake-512 | 4.96 | 6.17 | 6.82 | 7.38 | 14.81 | 116.50 |
SHA-512 | 7.65 | 8.24 | 8.69 | 9.03 | 17.22 | 138.25 |
Grøstl-512 | 12.78 | 15.44 | 17.30 | 17.99 | 51.72 | 417.38 |
JH-512 | 14.25 | 15.66 | 16.14 | 17.34 | 32.69 | 261.00 |
Keccak-512 | 16.36 | 17.86 | 18.46 | 20.35 | 21.56 | 171.88 |
The following table is measured on Samsung Exynos 5250 ARM Cortex-A15 @ 1.7 GHz dual core platform.
Algorithm | Message size in bytes | |||||
---|---|---|---|---|---|---|
long | 4,096 | 1,536 | 576 | 64 | 8 | |
LSH-256-256 | 11.17 | 11.53 | 12.16 | 12.63 | 22.42 | 192.68 |
Skein-512-256 | 15.64 | 16.72 | 18.33 | 22.68 | 75.75 | 609.25 |
Blake-256 | 17.94 | 19.11 | 20.88 | 25.44 | 83.94 | 542.38 |
SHA-256 | 19.91 | 21.14 | 23.03 | 28.13 | 90.89 | 578.50 |
JH-256 | 34.66 | 36.06 | 38.10 | 43.51 | 113.92 | 924.12 |
Keccak-256 | 36.03 | 38.01 | 40.54 | 48.13 | 125.00 | 1000.62 |
Grøstl-256 | 40.70 | 42.76 | 46.03 | 54.94 | 167.52 | 1020.62 |
LSH-512-512 | 8.94 | 9.56 | 10.55 | 12.28 | 38.82 | 307.98 |
Blake-512 | 13.46 | 14.82 | 16.88 | 20.98 | 77.53 | 623.62 |
Skein-512-512 | 15.61 | 16.73 | 18.35 | 22.56 | 75.59 | 612.88 |
JH-512 | 34.88 | 36.26 | 38.36 | 44.01 | 116.41 | 939.38 |
SHA-512 | 44.13 | 46.41 | 49.97 | 54.55 | 135.59 | 1088.38 |
Keccak-512 | 63.31 | 64.59 | 67.85 | 77.21 | 121.28 | 968.00 |
Grøstl-512 | 131.35 | 138.49 | 150.15 | 166.54 | 446.53 | 3518.00 |
Test vectors for LSH for each digest length are as follows. All values are expressed in hexadecimal form.
LSH-256-224("abc") = F7 C5 3B A4 03 4E 70 8E 74 FB A4 2E 55 99 7C A5 12 6B B7 62 36 88 F8 53 42 F7 37 32
LSH-256-256("abc") = 5F BF 36 5D AE A5 44 6A 70 53 C5 2B 57 40 4D 77 A0 7A 5F 48 A1 F7 C1 96 3A 08 98 BA 1B 71 47 41
LSH-512-224("abc") = D1 68 32 34 51 3E C5 69 83 94 57 1E AD 12 8A 8C D5 37 3E 97 66 1B A2 0D CF 89 E4 89
LSH-512-256("abc") = CD 89 23 10 53 26 02 33 2B 61 3F 1E C1 1A 69 62 FC A6 1E A0 9E CF FC D4 BC F7 58 58 D8 02 ED EC
LSH-512-384("abc") = 5F 34 4E FA A0 E4 3C CD 2E 5E 19 4D 60 39 79 4B 4F B4 31 F1 0F B4 B6 5F D4 5E 9D A4 EC DE 0F 27 B6 6E 8D BD FA 47 25 2E 0D 0B 74 1B FD 91 F9 FE
LSH-512-512("abc") = A3 D9 3C FE 60 DC 1A AC DD 3B D4 BE F0 A6 98 53 81 A3 96 C7 D4 9D 9F D1 77 79 56 97 C3 53 52 08 B5 C5 72 24 BE F2 10 84 D4 20 83 E9 5A 4B D8 EB 33 E8 69 81 2B 65 03 1C 42 88 19 A1 E7 CE 59 6D
LSH is free for any use public or private, commercial or non-commercial. The source code for distribution of LSH implemented in C, Java, and Python can be downloaded from KISA's cryptography use activation webpage. [2]
LSH is one of the cryptographic algorithms approved by the Korean Cryptographic Module Validation Program (KCMVP). [3]
LSH is included in the following standard.
In mathematics, the Lp spaces are function spaces defined using a natural generalization of the p-norm for finite-dimensional vector spaces. They are sometimes called Lebesgue spaces, named after Henri Lebesgue, although according to the Bourbaki group they were first introduced by Frigyes Riesz.
The Mersenne Twister is a general-purpose pseudorandom number generator (PRNG) developed in 1997 by Makoto Matsumoto and Takuji Nishimura. Its name derives from the choice of a Mersenne prime as its period length.
Distributions, also known as Schwartz distributions or generalized functions, are objects that generalize the classical notion of functions in mathematical analysis. Distributions make it possible to differentiate functions whose derivatives do not exist in the classical sense. In particular, any locally integrable function has a distributional derivative.
In the calculus of variations and classical mechanics, the Euler–Lagrange equations are a system of second-order ordinary differential equations whose solutions are stationary points of the given action functional. The equations were discovered in the 1750s by Swiss mathematician Leonhard Euler and Italian mathematician Joseph-Louis Lagrange.
Vapnik–Chervonenkis theory was developed during 1960–1990 by Vladimir Vapnik and Alexey Chervonenkis. The theory is a form of computational learning theory, which attempts to explain the learning process from a statistical point of view.
In mathematics, the Kronecker product, sometimes denoted by ⊗, is an operation on two matrices of arbitrary size resulting in a block matrix. It is a specialization of the tensor product from vectors to matrices and gives the matrix of the tensor product linear map with respect to a standard choice of basis. The Kronecker product is to be distinguished from the usual matrix multiplication, which is an entirely different operation. The Kronecker product is also sometimes called matrix direct product.
In abstract algebra and multilinear algebra, a multilinear form on a vector space over a field is a map
A Dynkin system, named after Eugene Dynkin, is a collection of subsets of another universal set satisfying a set of axioms weaker than those of 𝜎-algebra. Dynkin systems are sometimes referred to as 𝜆-systems or d-system. These set families have applications in measure theory and probability.
In mathematics, Doob's martingale inequality, also known as Kolmogorov’s submartingale inequality is a result in the study of stochastic processes. It gives a bound on the probability that a submartingale exceeds any given value over a given interval of time. As the name suggests, the result is usually given in the case that the process is a martingale, but the result is also valid for submartingales.
Linear Programming Boosting (LPBoost) is a supervised classifier from the boosting family of classifiers. LPBoost maximizes a margin between training samples of different classes and hence also belongs to the class of margin-maximizing supervised classification algorithms. Consider a classification function
In computer science, locality-sensitive hashing (LSH) is a fuzzy hashing technique that hashes similar input items into the same "buckets" with high probability. Since similar items end up in the same buckets, this technique can be used for data clustering and nearest neighbor search. It differs from conventional hashing techniques in that hash collisions are maximized, not minimized. Alternatively, the technique can be seen as a way to reduce the dimensionality of high-dimensional data; high-dimensional input items can be reduced to low-dimensional versions while preserving relative distances between items.
In set theory, a mathematical discipline, the Jensen hierarchy or J-hierarchy is a modification of Gödel's constructible hierarchy, L, that circumvents certain technical difficulties that exist in the constructible hierarchy. The J-Hierarchy figures prominently in fine structure theory, a field pioneered by Ronald Jensen, for whom the Jensen hierarchy is named. Rudimentary functions describe a method for iterating through the Jensen hierarchy.
The uncertainty theory invented by Baoding Liu is a branch of mathematics based on normality, monotonicity, self-duality, countable subadditivity, and product measure axioms.
ACE is the collection of units, implementing both a public key encryption scheme and a digital signature scheme. Corresponding names for these schemes — «ACE Encrypt» and «ACE Sign». Schemes are based on Cramer-Shoup public key encryption scheme and Cramer-Shoup signature scheme. Introduced variants of these schemes are intended to achieve a good balance between performance and security of the whole encryption system.
Input-to-state stability (ISS) is a stability notion widely used to study stability of nonlinear control systems with external inputs. Roughly speaking, a control system is ISS if it is globally asymptotically stable in the absence of external inputs and if its trajectories are bounded by a function of the size of the input for all sufficiently large times. The importance of ISS is due to the fact that the concept has bridged the gap between input–output and state-space methods, widely used within the control systems community.
In mathematics, a smooth maximum of an indexed family x1, ..., xn of numbers is a smooth approximation to the maximum function meaning a parametric family of functions such that for every α, the function is smooth, and the family converges to the maximum function as . The concept of smooth minimum is similarly defined. In many cases, a single family approximates both: maximum as the parameter goes to positive infinity, minimum as the parameter goes to negative infinity; in symbols, as and as . The term can also be used loosely for a specific smooth function that behaves similarly to a maximum, without necessarily being part of a parametrized family.
Dynamic epistemic logic (DEL) is a logical framework dealing with knowledge and information change. Typically, DEL focuses on situations involving multiple agents and studies how their knowledge changes when events occur. These events can change factual properties of the actual world : for example a red card is painted in blue. They can also bring about changes of knowledge without changing factual properties of the world : for example a card is revealed publicly to be red. Originally, DEL focused on epistemic events. We only present in this entry some of the basic ideas of the original DEL framework; more details about DEL in general can be found in the references.
A central problem in algorithmic graph theory is the shortest path problem. One of the generalizations of the shortest path problem is known as the single-source-shortest-paths (SSSP) problem, which consists of finding the shortest paths from a source vertex to all other vertices in the graph. There are classical sequential algorithms which solve this problem, such as Dijkstra's algorithm. In this article, however, we present two parallel algorithms solving this problem.
The finite promise games are a collection of mathematical games developed by American mathematician Harvey Friedman in 2009 which are used to develop a family of fast-growing functions , and . The greedy clique sequence is a graph theory concept, also developed by Friedman in 2010, which are used to develop fast-growing functions , and .
The method of (hypergraph) containers is a powerful tool that can help characterize the typical structure and/or answer extremal questions about families of discrete objects with a prescribed set of local constraints. Such questions arise naturally in extremal graph theory, additive combinatorics, discrete geometry, coding theory, and Ramsey theory; they include some of the most classical problems in the associated fields.