Cryptographic Hash Codes
A hash code is an odd sort of code that doesn't work quite like the other codes described here, but it's very important. A hash code is designed to take any message of any length and return a fixed-length code. For example, the SHA256 hash code (which is displayed in the encoder on this page) takes a message of any length and turns it into a 64-character code. Go ahead and try it. Type in 'John', and see it get expanded to 64 characters. Now type in the entire Declaration of Independence, and watch it get squished down to 64 characters!
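If you'd like to try the same experiment away from this page, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex and the repeated sample sentence (standing in for a long document) are just illustrative choices, not anything this site uses.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA256 hash of a text message as a hexadecimal string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# A very short message...
short_hash = sha256_hex("John")

# ...and a much longer one (a sample sentence repeated 200 times).
long_hash = sha256_hex("When in the Course of human events, " * 200)

# Both hashes come out exactly 64 hexadecimal characters long.
print(len(short_hash), short_hash)
print(len(long_hash), long_hash)
```

Whatever you feed in, the printed length is 64.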
"But wait a minute," you say. "If we are limited to a fixed length response (the hash), you have a finite number of possible hashes, which means that not every message can have its own unique hash!"
You're absolutely right. Since a SHA256 hash is 64 characters, and each character is one of 16 possible values (see our Hex page for more about hexadecimal!), there are a total of 16^64, or about 1.16 x 10^77, possible hashes. That may seem like a lot, but that means that if you have 2 x 10^77 messages to encode, some of them must have the same hash. That's called a "collision."
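The count of possible hashes is just 16 multiplied by itself 64 times, which comes to about 1.16 x 10^77. Here is a quick check of that arithmetic in Python (the variable names are only for illustration):

```python
# 64 hexadecimal characters, each with 16 possible values.
hex_characters = 64
values_per_character = 16

total_hashes = values_per_character ** hex_characters

# The same number written in terms of bits: SHA256 produces 256 bits.
assert total_hashes == 2 ** 256

print(total_hashes)           # the exact count, a 78-digit number
print(f"{total_hashes:.3e}")  # 1.158e+77
```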
This should lead you to realize something interesting: it's not possible to start with the hash, and work backwards to get the original message, because more than one possible message can lead to the same hash!
So what good is a hash code, if you can't decode it? Well, see, that's kind of the point. Here's an example of how we might use a hash code:
When you created a user account on this site, you gave me a password. Now, I could store your password in my database, but if I did, and someone gained unauthorized access to the database, everyone's passwords would be compromised. Not a big deal, unless you used that same password for your credit card, or some other account.
So I don't store your password. Instead, I run your password through a cryptographic hash code and store the result in the database.
When you log into the site, you send your password to the site. I run your password through the hash function, and check to make sure that the hash of your password matches the hash in my database. If it matches, I log you in.
And your password is never stored, which means no one can access it.
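The sign-up and log-in steps described above might look something like the following sketch. This is not this site's actual code: the user_hashes dictionary stands in for a real database, and the function names are made up for the example. It follows the simplified scheme in this explanation (one plain SHA256 hash per password); real sites usually add further safeguards on top.

```python
import hashlib
import hmac

# Hypothetical stand-in for the database: username -> stored password hash.
user_hashes: dict[str, str] = {}


def hash_password(password: str) -> str:
    """Run a password through SHA256 and return the 64-character hash."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def create_account(username: str, password: str) -> None:
    """Store only the hash of the password, never the password itself."""
    user_hashes[username] = hash_password(password)


def log_in(username: str, password: str) -> bool:
    """Hash the submitted password and compare it with the stored hash."""
    stored = user_hashes.get(username)
    if stored is None:
        return False
    # compare_digest compares the two strings in constant time.
    return hmac.compare_digest(stored, hash_password(password))


create_account("alice", "correct horse")
print(log_in("alice", "correct horse"))  # the right password is accepted
print(log_in("alice", "wrong guess"))    # a wrong password is rejected
print(user_hashes)                       # only the hash is kept
```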
Are there problems with hash codes? Sure. If you've been following this explanation, you might be thinking, "But wait a minute! More than one password could have the same hash! So someone could enter something else with the same hash, and get logged in!"
Yep, that's right. But with 1.16 x 10^77 possible hashes, the odds of someone creating another message with your hash are extremely low.
There are four main goals of a good cryptographic hash function:
- It shouldn't require too much computation to calculate the hash of a message (otherwise, websites and other venues would get way too bogged down dealing with hashes!)
- If you have a particular hash, you shouldn't be able to invent a message that has that hash
- If you change even a tiny bit of a message, the new message should have a different hash.
- Even though there are collisions, it should not be easy to find them!
Hash functions that don't live up to those four goals eventually get broken, and are no longer any good!
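The third goal in the list above is easy to see for yourself. The sketch below changes a single character of a message and counts how many of the 64 hash characters end up different. The sample messages and the helper name sha256_hex are arbitrary choices for illustration.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA256 hash of a text message as a hexadecimal string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# Two messages that differ only in the final character.
original = sha256_hex("Hello, world!")
changed = sha256_hex("Hello, world?")

# Count the positions where the two hashes disagree.
differing = sum(a != b for a, b in zip(original, changed))

print(original)
print(changed)
print(f"{differing} of {len(original)} hex characters differ")
```

With a good hash function, you should see most of the characters change, not just one or two.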