A cryptographic hash function ( CHF ) is a hash algorithm (a map of an arbitrary binary string to a binary string with a fixed size of n {\displaystyle n} bits) that has special properties desirable for a cryptographic application:
58-585: Snefru is a cryptographic hash function invented by Ralph Merkle in 1990 while working at Xerox PARC . The function supports 128-bit and 256-bit output. It was named after the Egyptian Pharaoh Sneferu , continuing the tradition of the Khufu and Khafre block ciphers . The original design of Snefru was shown to be insecure by Eli Biham and Adi Shamir who were able to use differential cryptanalysis to find hash collisions. The design
116-406: A malicious adversary cannot replace or modify the input data without changing its digest. Thus, if two strings have the same digest, one can be very confident that they are identical. Second pre-image resistance prevents an attacker from crafting a document with the same hash as a document the attacker cannot control. Collision resistance prevents an attacker from creating two distinct documents with
174-447: A CRC was used for message integrity in the WEP encryption standard, but an attack was readily discovered, which exploited the linearity of the checksum. In cryptographic practice, "difficult" generally means "almost certainly beyond the reach of any adversary who must be prevented from breaking the system for as long as the security of the system is deemed important". The meaning of the term
232-661: A block cipher. A hash function built with the Merkle–Damgård construction is as resistant to collisions as is its compression function; any collision for the full hash function can be traced back to a collision in the compression function. The last block processed should also be unambiguously length padded ; this is crucial to the security of this construction. This construction is called the Merkle–Damgård construction . Most common classical hash functions, including SHA-1 and MD5 , take this form. A straightforward application of
290-501: A conventional mode of operation, without the same security guarantees; for example, SHACAL , BEAR and LION . Pseudorandom number generators (PRNGs) can be built using hash functions. This is done by combining a (secret) random seed with a counter and hashing it. Some hash functions, such as Skein , Keccak , and RadioGatún , output an arbitrarily long stream and can be used as a stream cipher , and stream ciphers can also be built from fixed-length digest hash functions. Often this
348-432: A cryptographic hash is as follows: Alice poses a tough math problem to Bob and claims that she has solved it. Bob would like to try it himself, but would yet like to be sure that Alice is not bluffing. Therefore, Alice writes down her solution, computes its hash, and tells Bob the hash value (whilst keeping the solution secret). Then, when Bob comes up with the solution himself a few days later, Alice can prove that she had
406-553: A cryptographic hash to be calculated over the message. This allows the signature calculation to be performed on the relatively small, statically sized hash digest. The message is considered authentic if the signature verification succeeds given the signature and recalculated hash digest over the message. So the message integrity property of the cryptographic hash is used to create secure and efficient digital signature schemes. Password verification commonly relies on cryptographic hashes. Storing all user passwords as cleartext can result in
464-525: A cryptographic hash to increase the time (and in some cases computer memory) required to perform brute-force attacks on stored password hash digests. For details, see § Attacks on hashed passwords . A password hash also requires the use of a large random, non-secret salt value that can be stored with the password hash. The salt is hashed with the password, altering the password hash mapping for each password, thereby making it infeasible for an adversary to store tables of precomputed hash values to which
522-454: A deliberate attack. For example, a denial-of-service attack on hash tables is possible if the collisions are easy to find, as in the case of linear cyclic redundancy check (CRC) functions. Most cryptographic hash functions are designed to take a string of any length as input and produce a fixed-length hash value. A cryptographic hash function must be able to withstand all known types of cryptanalytic attack . In theoretical cryptography,
580-477: A hash by trying all possible messages in the set. Because cryptographic hash functions are typically designed to be computed quickly, special key derivation functions that require greater computing resources have been developed that make such brute-force attacks more difficult. In some theoretical analyses "difficult" has a specific mathematical meaning, such as "not solvable in asymptotic polynomial time ". Such interpretations of difficulty are important in
638-801: A larger internal state size – which range from tweaks of the Merkle–Damgård construction to new constructions such as the sponge construction and HAIFA construction . None of the entrants in the NIST hash function competition use a classical Merkle–Damgård construction. Meanwhile, truncating the output of a longer hash, such as used in SHA-512/256, also defeats many of these attacks. Hash functions can be used to build other cryptographic primitives . For these other primitives to be cryptographically secure, care must be taken to build them correctly. Message authentication codes (MACs) (also called keyed hash functions) are often built from hash functions. HMAC
SECTION 10
#1732781043767696-454: A massive security breach if the password file is compromised. One way to reduce this danger is to only store the hash digest of each password. To authenticate a user, the password presented by the user is hashed and compared with the stored hash. A password reset method is required when password hashing is performed; original passwords cannot be recalculated from the stored hash value. However, use of standard cryptographic hash functions, such as
754-516: A number of zero bits. The average work that the sender needs to perform in order to find a valid message is exponential in the number of zero bits required in the hash value, while the recipient can verify the validity of the message by executing a single hash function. For instance, in Hashcash, a sender is asked to generate a header whose 160-bit SHA-1 hash value has the first 20 bits as zeros. The sender will, on average, have to try 2 times to find
812-405: A particular kind, cryptographic hash functions lend themselves well to this application too. However, compared with standard hash functions, cryptographic hash functions tend to be much more expensive computationally. For this reason, they tend to be used in contexts where it is necessary for users to protect themselves against the possibility of forgery (the creation of data with the same digest as
870-439: A specified type to a specific user at a location could be implemented to handle requests of the given format: The server would perform the request given (to deliver ten waffles of type eggo to the given location for user "1") only if the signature is valid for the user. The signature used here is a MAC , signed with a key not known to the attacker. It is possible for an attacker to modify the request in this example by switching
928-442: A valid header. A message digest can also serve as a means of reliably identifying a file; several source code management systems, including Git , Mercurial and Monotone , use the sha1sum of various types of content (file content, directory trees, ancestry information, etc.) to uniquely identify them. Hashes are used to identify files on peer-to-peer filesharing networks. For example, in an ed2k link , an MD4 -variant hash
986-908: Is a stub . You can help Misplaced Pages by expanding it . Cryptographic hash function Cryptographic hash functions have many information-security applications, notably in digital signatures , message authentication codes (MACs), and other forms of authentication . They can also be used as ordinary hash functions , to index data in hash tables , for fingerprinting , to detect duplicate data or uniquely identify files, and as checksums to detect accidental data corruption. Indeed, in information-security contexts, cryptographic hash values are sometimes called ( digital ) fingerprints , checksums , or just hash values , even though all these terms stand for more general functions with rather different properties and purposes. Non-cryptographic hash functions are used in hash tables and to detect accidental errors; their constructions frequently provide no resistance to
1044-530: Is based on a sponge construction, which can also be used to build other cryptographic primitives such as a stream cipher. SHA-3 provides the same output sizes as SHA-2: 224, 256, 384, and 512 bits. Length extension attack In cryptography and computer security , a length extension attack is a type of attack where an attacker can use Hash ( message 1 ) and the length of message 1 to calculate Hash ( message 1 ‖ message 2 ) for an attacker-controlled message 2 , without needing to know
1102-591: Is based on a substantially modified version of the Advanced Encryption Standard (AES). Whirlpool produces a hash digest of 512 bits (64 bytes). SHA-2 (Secure Hash Algorithm 2) is a set of cryptographic hash functions designed by the United States National Security Agency (NSA), first published in 2001. They are built using the Merkle–Damgård structure, from a one-way compression function itself built using
1160-422: Is combined with the file size, providing sufficient information for locating file sources, downloading the file, and verifying its contents. Magnet links are another example. Such file hashes are often the top hash of a hash list or a hash tree , which allows for additional benefits. One of the main applications of a hash function is to allow the fast look-up of data in a hash table . Being hash functions of
1218-508: Is done by first building a cryptographically secure pseudorandom number generator and then using its stream of random bytes as keystream . SEAL is a stream cipher that uses SHA-1 to generate internal tables, which are then used in a keystream generator more or less unrelated to the hash algorithm. SEAL is not guaranteed to be as strong (or weak) as SHA-1. Similarly, the key expansion of the HC-128 and HC-256 stream ciphers makes heavy use of
SECTION 20
#17327810437671276-416: Is given by the extension to the "SHA" name, so SHA-224 has an output size of 224 bits (28 bytes); SHA-256, 32 bytes; SHA-384, 48 bytes; and SHA-512, 64 bytes. SHA-3 (Secure Hash Algorithm 3) was released by NIST on August 5, 2015. SHA-3 is a subset of the broader cryptographic primitive family Keccak. The Keccak algorithm is the work of Guido Bertoni, Joan Daemen, Michael Peeters, and Gilles Van Assche. Keccak
1334-484: Is insufficient for many practical uses. In addition to collision resistance, it should be impossible for an adversary to find two messages with substantially similar digests; or to infer any useful information about the data, given only its digest. In particular, a hash function should behave as much as possible like a random function (often called a random oracle in proofs of security) while still being deterministic and efficiently computable. This rules out functions like
1392-442: Is similar to content-addressable memory . CAS systems work by passing the content of the file through a cryptographic hash function to generate a unique key, the "content address". The file system 's directory stores these addresses and a pointer to the physical storage of the content. Because an attempt to store the same file will generate the same key, CAS systems ensure that the files within them are unique, and because changing
1450-430: Is such a MAC. Just as block ciphers can be used to build hash functions, hash functions can be used to build block ciphers. Luby-Rackoff constructions using hash functions can be provably secure if the underlying hash function is secure. Also, many hash functions (including SHA-1 and SHA-2 ) are built by using a special-purpose block cipher in a Davies–Meyer or other construction. That cipher can also be used in
1508-478: Is therefore somewhat dependent on the application since the effort that a malicious agent may put into the task is usually proportional to their expected gain. However, since the needed effort usually multiplies with the digest length, even a thousand-fold advantage in processing power can be neutralized by adding a dozen bits to the latter. For messages selected from a limited set of messages, for example passwords or other short messages, it can be feasible to invert
1566-475: The Merkle–Damgård construction are susceptible to this kind of attack. Truncated versions of SHA-2, including SHA-384 and SHA-512/256 are not susceptible, nor is the SHA-3 algorithm. HMAC also uses a different construction and so is not vulnerable to length extension attacks. Lastly, just performing Hash ( message ‖ secret ) is enough to not be affected. The vulnerable hashing functions work by taking
1624-570: The SHA-256 hash function. Concatenating outputs from multiple hash functions provide collision resistance as good as the strongest of the algorithms included in the concatenated result. For example, older versions of Transport Layer Security (TLS) and Secure Sockets Layer (SSL) used concatenated MD5 and SHA-1 sums. This ensures that a method to find collisions in one of the hash functions does not defeat data protected by both hash functions. For Merkle–Damgård construction hash functions,
1682-483: The SWIFFT function, which can be rigorously proven to be collision-resistant assuming that certain problems on ideal lattices are computationally difficult, but, as a linear function, does not satisfy these additional properties. Checksum algorithms, such as CRC32 and other cyclic redundancy checks , are designed to meet much weaker requirements and are generally unsuitable as cryptographic hash functions. For example,
1740-521: The shattered attack and the hash function should be considered broken. SHA-1 produces a hash digest of 160 bits (20 bytes). Documents may refer to SHA-1 as just "SHA", even though this may conflict with the other Secure Hash Algorithms such as SHA-0, SHA-2, and SHA-3. RIPEMD (RACE Integrity Primitives Evaluation Message Digest) is a family of cryptographic hash functions developed in Leuven, Belgium, by Hans Dobbertin, Antoon Bosselaers, and Bart Preneel at
1798-655: The COSIC research group at the Katholieke Universiteit Leuven, and first published in 1996. RIPEMD was based upon the design principles used in MD4 and is similar in performance to the more popular SHA-1. RIPEMD-160 has, however, not been broken. As the name implies, RIPEMD-160 produces a hash digest of 160 bits (20 bytes). Whirlpool is a cryptographic hash function designed by Vincent Rijmen and Paulo S. L. M. Barreto, who first described it in 2000. Whirlpool
Snefru - Misplaced Pages Continue
1856-504: The Davies–Meyer structure from a (classified) specialized block cipher. SHA-2 basically consists of two hash algorithms: SHA-256 and SHA-512. SHA-224 is a variant of SHA-256 with different starting values and truncated output. SHA-384 and the lesser-known SHA-512/224 and SHA-512/256 are all variants of SHA-512. SHA-512 is more secure than SHA-256 and is commonly faster than SHA-256 on 64-bit machines such as AMD64 . The output size in bits
1914-428: The Merkle–Damgård construction, where the size of hash output is equal to the internal state size (between each compression step), results in a narrow-pipe hash design. This design causes many inherent flaws, including length-extension , multicollisions, long message attacks, generate-and-paste attacks, and also cannot be parallelized. As a result, modern hash functions are built on wide-pipe constructions that have
1972-488: The SHA series, is no longer considered safe for password storage. These algorithms are designed to be computed quickly, so if the hashed values are compromised, it is possible to try guessed passwords at high rates. Common graphics processing units can try billions of possible passwords each second. Password hash functions that perform key stretching – such as PBKDF2 , scrypt or Argon2 – commonly use repeated invocations of
2030-525: The algorithm was published in 1993 under the title Secure Hash Standard, FIPS PUB 180, by U.S. government standards agency NIST (National Institute of Standards and Technology). It was withdrawn by the NSA shortly after publication and was superseded by the revised version, published in 1995 in FIPS ; PUB 180-1 and commonly designated SHA-1. Collisions against the full SHA-1 algorithm can be produced using
2088-417: The attacker would need to know the key the message was signed with, and generate a new signature by generating a new MAC. However, with a length extension attack, it is possible to feed the hash (the signature given above) into the state of the hashing function, and continue where the original request had left off, so long as the length of the original request is known. In this request, the original key's length
2146-409: The concatenated function is as collision-resistant as its strongest component, but not more collision-resistant. Antoine Joux observed that 2-collisions lead to n -collisions: if it is feasible for an attacker to find two messages with the same MD5 hash, then they can find as many additional messages with that same MD5 hash as they desire, with no greater difficulty. Among those n messages with
2204-408: The content of message 1 . This is problematic when the hash is used as a message authentication code with construction Hash ( secret ‖ message ), and message and the length of secret is known, because an attacker can include extra information at the end of the message and produce a valid hash without knowing the secret. Algorithms like MD5 , SHA-1 and most of SHA-2 that are based on
2262-442: The expected data) by potentially malicious participants. Content-addressable storage (CAS), also referred to as content-addressed storage or fixed-content storage, is a way to store information so it can be retrieved based on its content, not its name or location. It has been used for high-speed storage and retrieval of fixed content, such as documents stored for compliance with government regulations . Content-addressable storage
2320-455: The file will result in a new key, CAS systems provide assurance that the file is unchanged. There are several methods to use a block cipher to build a cryptographic hash function, specifically a one-way compression function . The methods resemble the block cipher modes of operation usually used for encryption. Many well-known hash functions, including MD4 , MD5 , SHA-1 and SHA-2 , are built from block-cipher-like components designed for
2378-461: The hashes are posted on a trusted site – usually the originating site – authenticated by HTTPS . Using a cryptographic hash and a chain of trust detects malicious changes to the file. Non-cryptographic error-detecting codes such as cyclic redundancy checks only prevent against non-malicious alterations of the file, since an intentional spoof can readily be crafted to have the colliding code value. Almost all digital signature schemes require
Snefru - Misplaced Pages Continue
2436-452: The input message, and using it to transform an internal state. After all of the input has been processed, the hash digest is generated by outputting the internal state of the function. It is possible to reconstruct the internal state from the hash digest, which can then be used to process the new data. In this way, one may extend the message and compute the hash that is a valid signature for the new message. A server for delivering waffles of
2494-410: The internal states of their message and the original will line up. Thus, the attacker constructs a slightly different message using these padding rules: This message includes all of the padding that was appended to the original message inside of the hash function before their payload (in this case, a 0x80 followed by a number of 0x00s and a message length, 0x228 = 552 = (14+55)*8, which is the length of
2552-475: The key changes each block; and related-key attacks make it potentially less secure for use in a hash function than for encryption. A hash function must be able to process an arbitrary-length message into a fixed-length output. This can be achieved by breaking the input up into a series of equally sized blocks, and operating on them in sequence using a one-way compression function . The compression function can either be specially designed for hashing or be built from
2610-422: The key plus the original message, appended at the end). The attacker knows that the state behind the hashed key/message pair for the original message is identical to that of new message up to the final "&." The attacker also knows the hash digest at this point, which means he knows the internal state of the hashing function at that point. It is then trivial to initialize a hashing algorithm at that point, input
2668-400: The message) calculated before, and after, transmission can determine whether any changes have been made to the message or file . MD5 , SHA-1 , or SHA-2 hash digests are sometimes published on websites or forums to allow verification of integrity for downloaded files, including files retrieved using file sharing such as mirroring . This practice establishes a chain of trust as long as
2726-416: The password hash digest can be compared or to test a large number of purloined hash values in parallel. A proof-of-work system (or protocol, or function) is an economic measure to deter denial-of-service attacks and other service abuses such as spam on a network by requiring some work from the service requester, usually meaning processing time by a computer. A key feature of these schemes is their asymmetry:
2784-976: The purpose, with feedback to ensure that the resulting function is not invertible. SHA-3 finalists included functions with block-cipher-like components (e.g., Skein , BLAKE ) though the function finally selected, Keccak , was built on a cryptographic sponge instead. A standard block cipher such as AES can be used in place of these custom block ciphers; that might be useful when an embedded system needs to implement both encryption and hashing with minimal code size or hardware area. However, that approach can have costs in efficiency and security. The ciphers in hash functions are built for hashing: they use large keys and blocks, can efficiently change keys every block, and have been designed and vetted for resistance to related-key attacks . General-purpose ciphers tend to have different design goals. In particular, AES has key and block sizes that make it nontrivial to use to generate long hash values; AES encryption becomes less efficient when
2842-460: The requested waffle from " eggo " to " liege ." This can be done by taking advantage of a flexibility in the message format if duplicate content in the query string gives preference to the latter value. This flexibility does not indicate an exploit in the message format, because the message format was never designed to be cryptographically secure in the first place, without the signature algorithm to help it. In order to sign this new message, typically
2900-479: The same MD5 hash, there is likely to be a collision in SHA-1. The additional work needed to find the SHA-1 collision (beyond the exponential birthday search) requires only polynomial time . There are many cryptographic hash algorithms; this section lists a few algorithms that are referenced relatively often. A more extensive list can be found on the page containing a comparison of cryptographic hash functions . MD5
2958-527: The same hash. A function meeting these criteria may still have undesirable properties. Currently, popular cryptographic hash functions are vulnerable to length-extension attacks : given hash( m ) and len( m ) but not m , by choosing a suitable m ′ an attacker can calculate hash( m ∥ m ′ ) , where ∥ denotes concatenation . This property can be used to break naive authentication schemes based on hash functions. The HMAC construction works around these problems. In practice, collision resistance
SECTION 50
#17327810437673016-465: The security level of a cryptographic hash function has been defined using the following properties: Collision resistance implies second pre-image resistance but does not imply pre-image resistance. The weaker assumption is always preferred in theoretical cryptography, but in practice, a hash-function that is only second pre-image resistant is considered insecure and is therefore not recommended for real applications. Informally, these properties mean that
3074-461: The solution earlier by revealing it and having Bob hash it and check that it matches the hash value given to him before. (This is an example of a simple commitment scheme ; in actual practice, Alice and Bob will often be computer programs, and the secret would be something less easily spoofed than a claimed puzzle solution.) An important application of secure hashes is the verification of message integrity . Comparing message digests (hash digests over
3132-414: The study of provably secure cryptographic hash functions but do not usually have a strong connection to practical security. For example, an exponential-time algorithm can sometimes still be fast enough to make a feasible attack. Conversely, a polynomial-time algorithm (e.g., one that requires n steps for n -digit keys) may be too slow for any practical use. An illustration of the potential use of
3190-508: The work must be moderately hard (but feasible) on the requester side but easy to check for the service provider. One popular system – used in Bitcoin mining and Hashcash – uses partial hash inversions to prove that work was done, to unlock a mining reward in Bitcoin, and as a good-will token to send an e-mail in Hashcash. The sender is required to find a message whose hash value begins with
3248-502: Was 14 bytes, which could be determined by trying forged requests with various assumed lengths, and checking which length results in a request that the server accepts as valid. The message as fed into the hashing function is often padded , as many algorithms can only work on input messages whose lengths are a multiple of some given size. The content of this padding is always specified by the hash function used. The attacker must include all of these padding bits in their forged message before
3306-454: Was designed by Ronald Rivest in 1991 to replace an earlier hash function, MD4, and was specified in 1992 as RFC 1321. Collisions against MD5 can be calculated within seconds, which makes the algorithm unsuitable for most use cases where a cryptographic hash is required. MD5 produces a digest of 128 bits (16 bytes). SHA-1 was developed as part of the U.S. Government's Capstone project. The original specification – now commonly called SHA-0 – of
3364-443: Was then modified by increasing the number of iterations of the main pass of the algorithm from two to eight. Although differential cryptanalysis can break the revised version with less complexity than brute force search (a certificational weakness), the attack requires 2 88.5 {\displaystyle 2^{88.5}} operations and is thus not currently feasible in practice. This cryptography-related article
#766233