Hashing
What is Hashing
Very simplified: Hashing is a mathematical operation used to take an Input and turn it into a different Output.
More specifically: the hashing function will take an input of variable size and produce an output of a fixed size. The mathematical process should not be reversible, the same input should always yield the same output, a small change to the input should cause a large change in the output, and it should be practically impossible to find two different inputs that generate the same output. (Such collisions have to exist, because there are more possible inputs than outputs, but a good hash function makes them infeasible to find.)
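Here is a small sketch of those properties using Python's standard `hashlib` module. SHA-256 is just my pick for the demo, and the helper name `sha256_hex` and the sample inputs are made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Fixed-size output: 256 bits = 64 hex characters, whatever the input size.
short_hash = sha256_hex(b"hi")
long_hash = sha256_hex(b"hi" * 100_000)
print(len(short_hash), len(long_hash))

# Deterministic: the same input always yields the same output.
print(sha256_hex(b"hello") == sha256_hex(b"hello"))

# Small change, large effect: flip one character and count how many
# of the 256 output bits end up different.
a = hashlib.sha256(b"hello").digest()
b = hashlib.sha256(b"hellp").digest()
differing_bits = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
print(f"{differing_bits} of 256 bits differ")
```

With a well-behaved hash function you would expect roughly half of the bits to differ in that last check, even though the inputs differ by a single character.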
Why we Hash
There are a few reasons, but let’s focus on two: Files and Secure Communications.
- Files:
If you download a file from a website, you want to make sure the file on your computer is the same file that was put onto the website - you don’t want a file that was modified by a malicious actor, and you don’t want a file that was corrupted during download. The original site publishes the hash the author computed when they uploaded the file. You then compute the hash of the file on your computer and check that the two match. Here I’ve written up how to do this with PowerShell.
- Secure Communications:
As part of a TLS communication channel, the messages sent are encrypted and also carry a keyed hash - this acts as a signature - so that they can be validated on arrival. This is effectively the same thing as validating the file in the example above, but built into the protocol to happen automatically.
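Both cases boil down to recomputing a hash and comparing it to the one you were given. Below is a rough Python sketch of each. The function names are mine, SHA-256 is an assumed choice of algorithm, and the keyed-hash part uses HMAC as one common way to authenticate a message; it is meant to illustrate the idea, not to reproduce what TLS does internally.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so a large download need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_matches(path: str, published_hex: str) -> bool:
    """Compare the local file's hash with the hash published by the author."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)


def tag_message(key: bytes, message: bytes) -> bytes:
    """Keyed hash (HMAC-SHA-256): only someone holding the key can produce it."""
    return hmac.new(key, message, hashlib.sha256).digest()


def message_is_valid(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag on the receiving side and compare in constant time."""
    return hmac.compare_digest(tag_message(key, message), tag)
```

For a download you would call something like `file_matches("installer.exe", "<hash copied from the site>")`, where both the file name and the hash are placeholders. The difference between the two halves is the key: a plain hash can be recomputed by anyone, including an attacker who swaps both the file and the published hash, while a keyed hash can only be produced by someone who holds the shared secret.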
Some Hash Functions
There have been a lot of Hash Functions over the years - many have been discarded. The most commonly used today are MD5, SHA-1, and SHA-2 (a family that includes SHA-256 and SHA-512). MD5 is considered cryptographically broken, but it’s probably fine for checking whether that EXE you downloaded is what the author said it was. SHA-1 is also considered insecure and isn’t recommended for Secure Communications, but is still fine for checking files you grab off the internet. SHA-2 is the best option for Secure Communications, but it actually isn’t the newest. SHA-3 has been available for a while - NIST published it as a standard back in 2015…
I’m not sure why it hasn’t been widely adopted yet. I see that OpenSSL has it listed in its man page, but looking around more, it isn’t mentioned much anywhere else…
So, until SHA-3 gains wider traction, use SHA-2 where you can.
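If you want to poke at these yourself, Python's standard `hashlib` module exposes all four families mentioned here. This is only a quick sketch: the sample input is arbitrary, I picked SHA-256 and SHA3-256 as representatives of SHA-2 and SHA-3, and MD5 may be disabled on some locked-down (for example FIPS-mode) Python builds.

```python
import hashlib

data = b"The quick brown fox"

# Map each algorithm name to the size of its digest in bits.
digest_bits = {}

for name in ("md5", "sha1", "sha256", "sha3_256"):
    digest = hashlib.new(name, data).hexdigest()
    digest_bits[name] = len(digest) * 4  # each hex character encodes 4 bits
    print(f"{name:>8}  {digest_bits[name]:>3} bits  {digest}")
```

The longer digests of SHA-2 and SHA-3 are part of why they hold up better: there are far more possible outputs, so stumbling onto two inputs with the same hash is that much harder.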