Misplaced Pages

One-way compression function

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

In cryptography , a one-way compression function is a function that transforms two fixed-length inputs into a fixed-length output. The transformation is "one-way" , meaning that it is difficult given a particular output to compute inputs which compress to that output. One-way compression functions are not related to conventional data compression algorithms, which instead can be inverted exactly (lossless compression) or approximately (lossy compression) to the original data.

#868131

57-592: One-way compression functions are for instance used in the Merkle–Damgård construction inside cryptographic hash functions . One-way compression functions are often built from block ciphers . Some methods to turn any normal block cipher into a one-way compression function are Davies–Meyer , Matyas–Meyer–Oseas , Miyaguchi–Preneel (single-block-length compression functions) and MDC-2/Meyer–Schilling , MDC-4 , Hirose (double-block-length compression functions). These methods are described in detail further down. ( MDC-2

114-615: A 2 k {\displaystyle 2^{k}} -message-block message in time 3 × 2 n / 2 + 1 + 2 n − k + 1 {\displaystyle 3\times 2^{n/2+1}+2^{n-k+1}} . If the construction does not allow easy creation of fixed points (like Matyas–Meyer–Oseas or Miyaguchi–Preneel) then this attack can be done in k × 2 n / 2 + 1 + 2 n − k + 1 {\displaystyle k\times 2^{n/2+1}+2^{n-k+1}} time. In both cases

171-508: A 2 k {\displaystyle 2^{k}} -message-block message in time k × 2 n / 2 + 1 + 2 n − k + 1 {\displaystyle k\times 2^{n/2+1}+2^{n-k+1}} . The complexity is above 2 n / 2 {\displaystyle 2^{n/2}} but below 2 n {\displaystyle 2^{n}} when messages are long, and that when messages get shorter

228-508: A 2 k {\displaystyle 2^{k}} -message-block message in time k × 2 n / 2 + 1 + 2 n − k + 1 {\displaystyle k\times 2^{n/2+1}+2^{n-k+1}} . The complexity is above 2 n / 2 {\displaystyle 2^{n/2}} but below 2 n {\displaystyle 2^{n}} when messages are long, and that when messages get shorter

285-801: A 2 k {\displaystyle 2^{k}} -message-block message in time k × 2 n / 2 + 1 + 2 n − k + 1 {\displaystyle k\times 2^{n/2+1}+2^{n-k+1}} . The complexity of this attack reaches a minimum of 2 3 n / 4 + 2 {\displaystyle 2^{3n/4+2}} for long messages when k = 2 n / 4 {\displaystyle k=2^{n/4}} and approaches 2 n {\displaystyle 2^{n}} when messages are short. One-way compression functions are often built from block ciphers. Block ciphers take (like one-way compression functions) two fixed size inputs (the key and

342-479: A 128-bit block cipher can be turned into a 256-bit hash function. These methods are then used inside the Merkle–Damgård construction to build the actual hash function. These methods are described in detail further down. Using a block cipher to build the one-way compression function for a hash function is usually somewhat slower than using a specially designed one-way compression function in the hash function. This

399-435: A block cipher into a one-way compression function some extra operations have to be added. Some methods to turn any normal block cipher into a one-way compression function are Davies–Meyer, Matyas–Meyer–Oseas, Miyaguchi–Preneel (single-block-length compression functions) and MDC-2, MDC-4, Hirose (double-block-length compressions functions). Single-block-length compression functions output the same number of bits as processed by

456-403: A collision-resistant hash function from a collision-resistant compression function. The hash function PARSHA-256 was designed using the parallel algorithm and the compression function of SHA-256. As mentioned in the introduction, the padding scheme used in the Merkle–Damgård construction must be chosen carefully to ensure the security of the scheme. Mihir Bellare gives sufficient conditions for

513-508: A one-way compression function. The compression function can either be specially designed for hashing or be built from a block cipher. The last block processed should also be length padded , which is crucial to the security of this construction. When length padding (also called MD-strengthening) is applied, attacks cannot find collisions faster than the birthday paradox ( 2 n / 2 {\displaystyle 2^{n/2}} , n {\displaystyle n} being

570-449: A padding scheme to possess to ensure that the MD construction is secure: it suffices that the scheme be "MD-compliant" (the original length-padding scheme used by Merkle is an example of MD-compliant padding). The conditions are: Here, | X | denotes the length of X . With these conditions in place, a collision in the MD hash function exists exactly when there is a collision in

627-404: A set containing all appropriate block ciphers. In this model an attacker may freely encrypt and decrypt any blocks, but does not have access to an implementation of the block cipher. The encryption and decryption function are represented by oracles that receive a pair of either a plaintext and a key or a ciphertext and a key. The oracles then respond with a randomly chosen plaintext or ciphertext, if

SECTION 10

#1732787831869

684-601: A single fixed length output of the same size as one of the inputs. This can also be seen as that the compression function transforms one large fixed-length input into a shorter, fixed-length output. For instance, input A might be 128 bits, input B 128 bits and they are compressed together to a single output of 128 bits. This is equivalent to having a single 256-bit input compressed to a single output of 128 bits. Some compression functions do not compress by half, but instead by some other factor. For example, input A might be 256 bits, and input B 128 bits, which are compressed to

741-441: A single output of 128 bits. That is, a total of 384 input bits are compressed together to 128 output bits. The mixing is done in such a way that full avalanche effect is achieved. That is, every output bit depends on every input bit. A one-way function is a function that is easy to compute but hard to invert. A one-way compression function (also called hash function) should have the following properties: Ideally one would like

798-403: A smaller output hash size or to guarantee a better mixing and avalanche effect on the bits in the hash sum. The finalisation function is often built by using the compression function. (Note that in some documents a different terminology is used: the act of length padding is called "finalisation". ) The popularity of this construction is due to the fact, proven by Merkle and Damgård , that if

855-399: A wide-pipe hash function can be made approximately twice as fast if the wide-pipe state can be divided in half in the following manner: one half is input to the succeeding compression function while the other half is combined with the output of that compression function. The main idea of the hash construction is to forward half of the previous chaining value forward to XOR it to the output of

912-636: Is a property that random functions certainly do not have. So far, no practical attack has been based on this property, but one should be aware of this "feature". The fixed-points can be used in a second preimage attack (given a message m 1 {\displaystyle m_{1}} , attacker finds another message m 2 {\displaystyle m_{2}} to satisfy hash ⁡ ( m 1 ) = hash ⁡ ( m 2 ) {\displaystyle \operatorname {hash} (m_{1})=\operatorname {hash} (m_{2})} ) of Kelsey and Schneier for

969-469: Is also the name of a hash function patented by IBM .) Another method is 2BOW (or NBOW in general), which is a "high-rate multi-block-length hash function based on block ciphers" and typically achieves (asymptotic) rates between 1 and 2 independent of the hash size (only with small constant overhead). This method has not yet seen any serious security analysis, so should be handled with care. A compression function mixes two fixed length inputs and produces

1026-417: Is because all known secure constructions do the key scheduling for each block of the message. Black, Cochran and Shrimpton have shown that it is impossible to construct a one-way compression function that makes only one call to a block cipher with a fixed key. In practice reasonable speeds are achieved provided the key scheduling of the selected block cipher is not a too heavy operation. But, in some cases it

1083-500: Is built from a compression function applying this block cipher (0 else). The probability that the algorithm returns 1 is dependent on the number of queries which determine the security level. The Davies–Meyer single-block-length compression function feeds each block of the message ( m i {\displaystyle m_{i}} ) as the key to a block cipher. It feeds the previous hash value ( H i − 1 {\displaystyle H_{i-1}} ) as

1140-424: Is easier because a single implementation of a block cipher can be used for both a block cipher and a hash function. It can also save code space in very tiny embedded systems like for instance smart cards or nodes in cars or other machines. Therefore, the hash-rate or rate gives a glimpse of the efficiency of a hash function based on a certain compression function. The rate of an iterated hash function outlines

1197-617: Is encrypted under the key m i {\displaystyle m_{i}} , thus making this method an extension of Davies–Meyer instead. A second preimage attack (given a message m 1 {\displaystyle m_{1}} an attacker finds another message m 2 {\displaystyle m_{2}} to satisfy hash ⁡ ( m 1 ) = hash ⁡ ( m 2 ) {\displaystyle \operatorname {hash} (m_{1})=\operatorname {hash} (m_{2})} ) can be done according to Kelsey and Schneier for

SECTION 20

#1732787831869

1254-399: Is fed as the key to the block cipher. In the first round when there is no previous hash value it uses a constant pre-specified initial value ( H 0 {\displaystyle H_{0}} ). If the block cipher has different block and key sizes the hash value ( H i − 1 {\displaystyle H_{i-1}} ) will have the wrong size for use as

1311-748: Is larger than the block length n {\displaystyle n} , and produces a hash of size 2 n {\displaystyle 2n} . For example, any of the AES candidates with a 192- or 256-bit key (and 128-bit block). Each round accepts a portion of the message m i {\displaystyle m_{i}} that is k − n {\displaystyle k-n} bits long, and uses it to update two n {\displaystyle n} -bit state values G {\displaystyle G} and H {\displaystyle H} . Merkle%E2%80%93Damg%C3%A5rd construction In cryptography ,

1368-450: Is then XORed (⊕) with the same message block ( m i {\displaystyle m_{i}} ) and then also XORed with the previous hash value ( H i − 1 {\displaystyle H_{i-1}} ) to produce the next hash value ( H i {\displaystyle H_{i}} ). The previous hash value ( H i − 1 {\displaystyle H_{i-1}} )

1425-430: Is very similar to the Merkle–Damgård construction but has a larger internal state size, meaning that the bit-length that is internally used is larger than the output bit-length. If a hash of n bits is desired, then the compression function f takes 2 n bits of chaining value and m bits of the message and compresses this to an output of 2 n bits. Therefore, in a final step, a second compression function compresses

1482-527: The Merkle–Damgård construction or Merkle–Damgård hash function is a method of building collision-resistant cryptographic hash functions from collision-resistant one-way compression functions . This construction was used in the design of many popular hash algorithms such as MD5 , SHA-1 , and SHA-2 . The Merkle–Damgård construction was described in Ralph Merkle 's Ph.D. thesis in 1979. Ralph Merkle and Ivan Damgård independently proved that

1539-431: The plaintext ) and return one single output (the ciphertext ) which is the same size as the input plaintext. However, modern block ciphers are only partially one-way. That is, given a plaintext and a ciphertext it is infeasible to find a key that encrypts the plaintext to the ciphertext. But, given a ciphertext and a key a matching plaintext can be found simply by using the block cipher's decryption function. Thus, to turn

1596-401: The "infeasibility" in preimage-resistance and second preimage-resistance to mean a work of about 2 n {\displaystyle 2^{n}} where n {\displaystyle n} is the number of bits in the hash function's output. However, particularly for second preimage-resistance this is a difficult problem. A common use of one-way compression functions is in

1653-611: The Davies–Meyer construction is that even if the underlying block cipher is totally secure, it is possible to compute fixed points for the construction: for any m {\displaystyle m} , one can find a value of h {\displaystyle h} such that E m ( h ) ⊕ h = h {\displaystyle E_{m}(h)\oplus h=h} : one just has to set h = E m − 1 ( 0 ) {\displaystyle h=E_{m}^{-1}(0)} . This

1710-404: The Merkle–Damgård construction inside cryptographic hash functions. Most widely used hash functions, including MD5 , SHA-1 (which is deprecated) and SHA-2 use this construction. A hash function must be able to process an arbitrary-length message into a fixed-length output. This can be achieved by breaking the input up into a series of equal-sized blocks, and operating on them in sequence using

1767-402: The block cipher has different block and key sizes the hash value ( H i − 1 {\displaystyle H_{i-1}} ) will have the wrong size for use as the key. The cipher might also have other special requirements on the key. Then the hash value is first fed through the function g {\displaystyle g} to be converted/padded to fit as key for

One-way compression function - Misplaced Pages Continue

1824-718: The block size in bits) if the used function f {\displaystyle f} is collision-resistant. Hence, the Merkle–Damgård hash construction reduces the problem of finding a proper hash function to finding a proper compression function. A second preimage attack (given a message m 1 {\displaystyle m_{1}} an attacker finds another message m 2 {\displaystyle m_{2}} to satisfy hash ⁡ ( m 1 ) = hash ⁡ ( m 2 ) {\displaystyle \operatorname {hash} (m_{1})=\operatorname {hash} (m_{2})} can be done according to Kelsey and Schneier for

1881-405: The block size of the compression function is 8 bytes (64 bits). We get two blocks (the padding octets shown with the lightblue background color): This implies that other messages having the same content but ending with additional zeros at the end will result in the same hash value. In the above example, another almost-identical message (0x48617368496e7075 74 00 ) will generate the same hash value as

1938-570: The cipher. In mathematical notation Matyas–Meyer–Oseas can be described as: The scheme has the rate: A second preimage attack (given a message m 1 {\displaystyle m_{1}} an attacker finds another message m 2 {\displaystyle m_{2}} to satisfy hash ⁡ ( m 1 ) = hash ⁡ ( m 2 ) {\displaystyle \operatorname {hash} (m_{1})=\operatorname {hash} (m_{2})} ) can be done according to Kelsey and Schneier for

1995-480: The complexity is above 2 n / 2 {\displaystyle 2^{n/2}} but below 2 n {\displaystyle 2^{n}} when messages are long and that when messages get shorter the complexity of the attack approaches 2 n {\displaystyle 2^{n}} . The security of the Davies–Meyer construction in the Ideal Cipher Model

2052-422: The complexity of the attack approaches 2 n {\displaystyle 2^{n}} . The Hirose double-block-length one-way compression function consists of a block cipher plus a permutation p {\displaystyle p} . It was proposed by Shoichi Hirose in 2006 and is based on a work by Mridul Nandi . It uses a block cipher whose key length k {\displaystyle k}

2109-451: The complexity of the attack approaches 2 n {\displaystyle 2^{n}} . The Miyaguchi–Preneel single-block-length one-way compression function is an extended variant of Matyas–Meyer–Oseas. It was independently proposed by Shoji Miyaguchi and Bart Preneel . It feeds each block of the message ( m i {\displaystyle m_{i}} ) as the plaintext to be encrypted. The output ciphertext

2166-424: The compression function. In so doing the construction takes in longer message blocks every iteration than the original wide pipe. Using the same function f as before, it takes n -bit chaining values and n + m bits of the message. However, the price to pay is the extra memory used in the construction for feed-forward. The MD construction is inherently sequential. There is a parallel algorithm which constructs

2223-409: The diagram, the one-way compression function is denoted by f , and transforms two fixed length inputs to an output of the same size as one of the inputs. The algorithm starts with an initial value, the initialization vector (IV). The IV is a fixed value (algorithm- or implementation-specific). For each message block, the compression (or compacting) function f takes the result so far, combines it with

2280-409: The following conditions are met: The constructions presented below: Davies–Meyer, Matyas–Meyer–Oseas, Miyaguchi–Preneel and Hirose have been shown to be secure under the black-box analysis. The goal is to show that any attack that can be found is at most as efficient as the birthday attack under certain assumptions. The black-box model assumes that a block cipher is used that is randomly chosen from

2337-585: The key. The cipher might also have other special requirements on the key. Then the hash value is first fed through the function g {\displaystyle g} to be converted/padded to fit as key for the cipher. In mathematical notation Miyaguchi–Preneel can be described as: The scheme has the rate: The roles of m i {\displaystyle m_{i}} and H i − 1 {\displaystyle H_{i-1}} may be switched, so that H i − 1 {\displaystyle H_{i-1}}

One-way compression function - Misplaced Pages Continue

2394-456: The last internal hash value ( 2 n bits) to the final hash value ( n bits). This can be done as simply as discarding half of the last 2 n -bit output. SHA-512/224 and SHA-512/256 take this form since they are derived from a variant of SHA-512. SHA-384 and SHA-224 are similarly derived from SHA-512 and SHA-256, respectively, but the width of their pipe is much less than 2 n . It has been demonstrated by Mridul Nandi and Souradyuti Paul that

2451-443: The message block, and produces an intermediate result. The last block is padded with zeros as needed and bits representing the length of the entire message are appended. (See below for a detailed length-padding example.) To harden the hash further, the last result is then sometimes fed through a finalisation function . The finalisation function can have several purposes such as compressing a bigger internal state (the last result) into

2508-409: The message length value (see SHA-1 pseudocode ). Further improvement can be made by inserting the length value in the last block if there is enough space. Doing so avoids having an extra block for the message length. If we assume the length value is encoded on 5 bytes (40 bits), the message becomes: Storing the message length out-of-band in metadata, or otherwise embedded at the start of the message,

2565-407: The next hash value ( H i {\displaystyle H_{i}} ). The previous hash value ( H i − 1 {\displaystyle H_{i-1}} ) is fed as the key to the block cipher. In the first round when there is no previous hash value it uses a constant pre-specified initial value ( H 0 {\displaystyle H_{0}} ). If

2622-440: The one-way compression function f is collision resistant , then so is the hash function constructed using it. Unfortunately, this construction also has several undesirable properties: Due to several structural weaknesses of Merkle–Damgård construction, especially the length extension problem and multicollision attacks, Stefan Lucks proposed the use of the wide-pipe hash instead of Merkle–Damgård construction. The wide-pipe hash

2679-426: The original message "HashInput" above. In other words, any message having extra zeros at the end makes it indistinguishable from the one without them. To prevent this situation, the first bit of the first padding octet is changed to "1" (0x80), yielding: However, most common implementations use a fixed bit-size (generally 64 or 128 bits in modern algorithms) at a fixed position at the end of the last block for inserting

2736-401: The pair was asked for the first time. They both share a table for these triplets, a pair from the query and corresponding response, and return the record, if a query was received for the second time. For the proof there is a collision finding algorithm that makes randomly chosen queries to the oracles. The algorithm returns 1, if two responses result in a collision involving the hash function that

2793-537: The plaintext to be encrypted. The output ciphertext is then also XORed (⊕) with the previous hash value ( H i − 1 {\displaystyle H_{i-1}} ) to produce the next hash value ( H i {\displaystyle H_{i}} ). In the first round when there is no previous hash value it uses a constant pre-specified initial value ( H 0 {\displaystyle H_{0}} ). In mathematical notation Davies–Meyer can be described as: The scheme has

2850-467: The rate (k is the keysize): If the block cipher uses for instance 256-bit keys then each message block ( m i {\displaystyle m_{i}} ) is a 256-bit chunk of the message. If the same block cipher uses a block size of 128 bits then the input and output hash values in each round is 128 bits. Variations of this method replace XOR with any other group operation, such as addition on 32-bit unsigned integers. A notable property of

2907-474: The ratio between the number of block cipher operations and the output. More precisely, the rate represents the ratio between the number of processed bits of input m {\displaystyle m} , the output bit-length n {\displaystyle n} of the block cipher, and the necessary block cipher operations s {\displaystyle s} to produce these n {\displaystyle n} output bits. Generally,

SECTION 50

#1732787831869

2964-413: The result into blocks of fixed size, and processes them one at a time with the compression function, each time combining a block of the input with the output of the previous round. In order to make the construction secure, Merkle and Damgård proposed that messages be padded with a padding that encodes the length of the original message. This is called length padding or Merkle–Damgård strengthening . In

3021-464: The structure is sound: that is, if an appropriate padding scheme is used and the compression function is collision-resistant , then the hash function will also be collision-resistant. The Merkle–Damgård hash function first applies an MD-compliant padding function to create an input whose size is a multiple of a fixed number (e.g. 512 or 1024) — this is because compression functions cannot handle inputs of arbitrary size. The hash function then breaks

3078-414: The underlying block cipher. Consequently, double-block-length compression functions output twice the number of bits. If a block cipher has a block size of say 128 bits single-block-length methods create a hash function that has the block size of 128 bits and produces a hash of 128 bits. Double-block-length methods make hashes with double the hash size compared to the block size of the block cipher used. So

3135-470: The underlying compression function. Therefore, the Merkle–Damgård construction is provably secure when the underlying compression function is secure. To be able to feed the message to the compression function, the last block must be padded with constant data (generally with zeroes) to a full block. For example, suppose the message to be hashed is "HashInput" (9 octet string, 0x48617368496e707574 in ASCII ) and

3192-457: The usage of fewer block cipher operations results in a better overall performance of the entire hash function, but it also leads to a smaller hash-value which could be undesirable. The rate is expressed by the formula: R h = | m i | s ⋅ n {\displaystyle R_{h}={\frac {\left|m_{i}\right|}{s\cdot n}}} The hash function can only be considered secure if at least

3249-452: Was first proven by R. Winternitz. The Matyas–Meyer–Oseas single-block-length one-way compression function can be considered the dual (the opposite) of Davies–Meyer. It feeds each block of the message ( m i {\displaystyle m_{i}} ) as the plaintext to be encrypted. The output ciphertext is then also XORed (⊕) with the same message block ( m i {\displaystyle m_{i}} ) to produce

#868131