The news of NIST and their SHA-3 algorithm competition and a recent lunch and learn at Denim Group reminded me of the Cryptographic lectures I gave at UTSA. One of the hardest concepts my students had grasping was secure cryptographic hash functions, partially because of the number theory, but also in differentiating between the three properties of a secure hash function: collision resistance, preimage resistance, and second preimage resistance.
Preimage resistance: Given a hash value h, it should be hard to find any message m such that h = hash(k, m).
This is the most usual property developers think of when they think of a cryptographic hash function. Unlike an encryption, there should be no “dehashing” function. A good preimage resistant function should be “hard” to invert. An example of a hash function that is not preimage resistant is h = hash(k, m) = m mod 2k. For cryptographic hash functions, “hard” is, for a hash function with a range of size 2k, it should take the adversary 2k/2 or 2k-1 attempts before he or she finds the solution. Since this is very easy to invert, for any value h, a message of m can be found (basically, every message of the form h + x * 2k, where x is an integer.
Collision resistance: Given two messages m1 and m2, it should be “hard” to find a hash such that hash(k, m1) = hash(k, m2), where k is the hash key.
What this says is that given complete control over picking any messages you want, it should be “hard” to find two of them such that have the same hash value the same hash. An example of a hash function that is not collision resistant is hash(k, m) = 4, since all hashes result in 4, making it 100% likely that two messages will have the same hash. This is the hardest standard for a hash function to attain; collision resistant hash functions are in fact a superset of second preimage resistant ones (if you’ve found a second preimage of a hash function, you have necessarily found a collision).
Second preimage resistance: Given a message m1, it should be hard to find a different message m2 such that hash(k, m1) = hash(k, m2).
This is the toughest of the three for students in cryptography to get their head around. On the surface this seems to be an easier version of collision resistance, because it seems that we have extra information to use. We only need one more message rather than two. In actuality, this is the birthday paradox in action—second preimage resistance is actually a much stronger property than collision resistance.
Of course, it’s important to reiterate that hashes are “digests”, and not “encryption” algorithms. Encryption is a two-way operation. That is, given a message m, an encryption algorithm enc and its corresponding decryption algorithm dec, dec(enc(m)) = m. Hash functions are different—there is (ideally) no way to “decrypt” a hashed message, per preimage resistance. Because of this, it is common to say that hash functions are one-way operations.