Have you ever wondered how your passwords are kept secure on servers? A large part of the answer is hashing, which we’ll introduce in this article. Let’s find out how it works and how it’s used.
A hashing algorithm takes an input, such as a string, and treats its bytes as numbers. It then mathematically maps them to a fixed-size number, which is usually displayed as a string of hexadecimal digits. If that is all these algorithms do, why can’t you choose an arbitrary function, like x^2, for hashing?
That’s because a cryptographic hash function must be a one-way function (i.e., it is easy to compute, but infeasible to reverse). In the example mentioned above, you can take the square root of the output to recover the input. A hash function also has an output of fixed, finite size. Since there are infinitely many possible strings, infinitely many strings get mapped to the same value in any hash function. Thus, information is lost in a hash function, and the hash alone cannot tell you which of those strings was the original. For short inputs such as passwords, though, what actually protects them is that no practical method is known for finding a matching input other than guessing candidates and hashing each one.
Many hashing algorithms exist, each of which works differently and returns hashes of different sizes. For example, the MD5 algorithm returns a hash of 128 bits, while the SHA-256 algorithm returns a hash of 256 bits.
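To make the output sizes concrete, here is a minimal sketch using Python’s standard hashlib module. The helper names digest_bits and hex_digest are our own, chosen for illustration. It reports the digest length of an algorithm and shows that changing one character of the input gives an unrelated-looking hash.

```python
import hashlib


def digest_bits(algorithm: str, message: bytes) -> int:
    """Return the length of the hash in bits for the named algorithm."""
    return hashlib.new(algorithm, message).digest_size * 8


def hex_digest(algorithm: str, message: bytes) -> str:
    """Return the hash of the message as a hexadecimal string."""
    return hashlib.new(algorithm, message).hexdigest()


if __name__ == "__main__":
    for name in ("md5", "sha256"):
        print(name, digest_bits(name, b"hello"), "bits")
    # Two inputs that differ in a single character hash to very different values.
    print(hex_digest("sha256", b"hello"))
    print(hex_digest("sha256", b"hellp"))
```

The output length is the same no matter how long the input is, which is why each hexadecimal string printed for SHA-256 always has 64 characters.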
How Is Hashing Used?
Hashing is used in many places where data integrity has to be ensured. For instance, how do you quickly compare the contents of two files or two large directories? Comparing them file by file and byte by byte is time-consuming, especially when the two copies sit on different machines. Instead, you save the hash of the original directory and hash the other directory using the same function. If the hashes are identical, you can be confident that the two directories are the same. Otherwise, the two directories must be different, which likely points to data corruption or malware if they are meant to be the same.
Moreover, these kinds of algorithms are used for password storage. Passwords should never be stored or transmitted in plaintext. If they are, they can easily be stolen by hackers or exposed in data breaches, compromising the security of the users. Thus, servers store only the hash of each password, so that no one can read the actual password from the stored value alone. When you log in, the server hashes what you typed and compares the result with the stored hash.
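The directory comparison described above could be sketched as follows. This is one possible approach and not a standard tool: the function names file_sha256 and directory_sha256 are hypothetical, SHA-256 is our choice of algorithm, and the way file names and file hashes are combined into a single directory hash is an assumption made for this example.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=65536):
    """Hash a file in chunks so that large files need not fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def directory_sha256(root):
    """Combine the relative path and hash of every file into one hash.

    Files are visited in sorted order so that two directories with the
    same contents always produce the same result.
    """
    root = Path(root)
    hasher = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        hasher.update(path.relative_to(root).as_posix().encode("utf-8"))
        hasher.update(b"\0")
        hasher.update(file_sha256(path).encode("ascii"))
        hasher.update(b"\n")
    return hasher.hexdigest()
```

You would run directory_sha256 once on the original directory, save the result, and later compare it with the value computed for the copy. Note that this sketch ignores empty subdirectories and file permissions.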
The Weaknesses of Hashing
No matter how robust hashing seems, it still has its shortcomings, and many types of cyberattacks exploit them. However, it has remained a primary method of keeping personal data safe because there are measures that can be implemented to counter these.
First of all, there are attacks known as rainbow table attacks. They rely on large precomputed tables that map the hashes of common and short passwords, computed with a given hash function, back to the original strings, so that an attacker who obtains a hash can simply look up the password. Fortunately, there are two ways to counter this. From the user’s perspective, using longer and more complex passwords makes the search space infeasibly large to precompute. From the server’s perspective, a random string known as a “salt” can be generated for each user when the password is set, combined with the password before hashing, and stored alongside the hash. Even a tiny change in the input makes the hash completely different, so salting essentially eliminates rainbow table attacks: the salted hash will almost certainly not be in the table, even though the password itself might be.
Even without rainbow tables, attackers can still try to guess your password by brute force, hashing one candidate after another, even when a strong algorithm protects it. To maximize their chances of success, attackers usually try short and simple passwords first, rather than long and complicated ones. Choosing a long, unpredictable password therefore makes your account much harder to break into.
Finally, hackers can also exploit hash collisions, where two different inputs produce the same hash, to make two different files appear identical, thus allowing tampering to go undetected. While the probability of a collision is never zero, even with the most robust hashing algorithms, longer hashes tend to be more collision-resistant than shorter ones because the number of possible values grows exponentially with the number of bits in the hash. Older algorithms such as MD5 also have known practical collision attacks, so they should not be relied on for this purpose.
In this article, we’ve explained how hashing works, how it can be used to protect data, what its shortcomings are, and how users can protect their passwords. If we’ve missed anything substantial, please let us know in the comments below. If you want to learn more about hashing, please visit the websites in the references below.
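Here is a minimal sketch of salting, assuming the server uses PBKDF2 with SHA-256 from Python’s standard library. The function names hash_password and verify_password, the 16-byte salt length, and the iteration count are illustrative choices of ours, not values taken from the references.

```python
import hashlib
import hmac
import os

# Illustrative value only; real systems tune this to their hardware.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

The server would store both the salt and the digest for each user. Because every user gets a different salt, two users with the same password end up with different stored hashes, and a precomputed table built without the salt no longer matches.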
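To see why length and complexity matter against brute force, this rough sketch counts how many candidates an attacker would have to try in the worst case. It assumes every character is chosen independently from a fixed alphabet, which overestimates the strength of passwords that people actually pick. The function names are our own.

```python
import math


def candidate_count(alphabet_size, length):
    """Number of distinct passwords of exactly this length."""
    return alphabet_size ** length


def search_space_bits(alphabet_size, length):
    """The same quantity expressed in bits (log base 2)."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    # Six lowercase letters versus twelve characters of printable ASCII.
    print(candidate_count(26, 6), round(search_space_bits(26, 6), 1))
    print(candidate_count(94, 12), round(search_space_bits(94, 12), 1))
```

Each extra character multiplies the number of candidates by the alphabet size, so doubling the length squares the amount of work.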
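The effect of hash length on accidental collisions can be estimated with the well-known birthday approximation. The sketch below assumes the hash outputs behave like uniformly random values and says nothing about deliberate attacks that exploit weaknesses in a particular algorithm. The function name collision_probability is hypothetical.

```python
import math


def collision_probability(num_inputs, hash_bits):
    """Approximate chance that at least two of num_inputs random inputs collide.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2 * 2^bits)).
    """
    exponent = num_inputs * (num_inputs - 1) / (2 * 2 ** hash_bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for bits in (128, 256):
        print(bits, collision_probability(2 ** 64, bits))
```

With the same number of inputs, every extra bit of hash length roughly halves the estimated probability, which is what is meant by the number of possible values growing exponentially.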
- Chris Odogwu. (2022, March 4). What Is Hashing and How Does It Work? Retrieved August 20, 2022, from https://www.makeuseof.com/what-is-hashing/
- (n.d.). MD5 Class. Retrieved August 20, 2022, from https://docs.microsoft.com/en-us/dotnet/api/system.security.cryptography.md5
- (n.d.). SHA256 Class. Retrieved August 20, 2022, from https://docs.microsoft.com/en-us/dotnet/api/system.security.cryptography.sha256
- (n.d.). What is a rainbow table attack and how does it work? Retrieved August 20, 2022, from https://www.futurelearn.com/info/courses/hands-on-password-attacks-and-security/0/steps/202820
- Dan Arias. (2021, February 25). Adding Salt to Hashing: A Better Way to Store Passwords. Retrieved August 20, 2022, from https://auth0.com/blog/adding-salt-to-hashing-a-better-way-to-store-passwords/