This week I spent a little bit of time working on Hash Length Extensions. HLE is a technique that allows an attacker to abuse poorly constructed authentication schemes. For example, a MAC scheme such as H(secret | message)
would be vulnerable to HLE.
While I mostly understood how this attack worked, I wanted to try performing the exploit myself so that it is better ingrained in my mind. Let’s suppose a scenario that would make sense to use a hash length extension attack. Imagine that you captured an HTTP request that had this in the POST body:
to=12345&amt=103.54¬e=Thanks!
and the following header
cookie=__Host-Mac=ef80cdcd3a4ffbd090b55bdd4874d92b
Hash functions like MD5 and SHA1 produce digests after taking some input. If the inputs are over a certain size, they are split up into “blocks”. Each block is then processed separately, in order. If you’re familiar with CBC (Cipher Block Chaining) mode, then you’re aware that in CBC each blocks’ result acts as an input into the next block. One thing to note is that these digest functions will add padding to a message before the message is processed. The padding is predictable and therefore doesn’t add to the security of the construction.
We can view this process like a math equation. Let’s assume our message is made up of two parts: a secret and the message body, which we’ll call message
for simplicity’s sake. On the surface, this MAC in our previous example was constructed in the following way: md5(secret | message)
.
However, we know that md5
will automatically add padding. As long as we know the original size of secret | message
then we can reliably predict the padding (and even if we didn’t, we can quickly iterate through all of the possible outcomes, especially if the service acts as an oracle). In our case, for the sake of time, I’ll cheat and inform you that the size of secret
is 8 ascii characters long.
So after md5
adds padding, we’ll actually see the following inside the digest function: md5(secret | message | padding)
. This produces the “MAC” we found in the cookie. However, since we know that the size of secret | message
is very small (less than 64 bytes) and we know that ef80cdcd3a4ffbd090b55bdd4874d92b
is the result of md5 on the first block of input, we have all the information we need to calculate a second block.
Imagine for a moment that we wanted to add some text to the original message (in our case, we’ll add the message "&to=98765&amt=1000.00"
to simulate the attacker forcing the user to send him money). We’ll refer to this additional message as message_2
. If the attacker leaves the padding in and appends their message to the end we get md5(secret | message | padding | message_2)
. In this case, however, the md5 digest function will see the internally because we have enough information to be larger than 1 block but less than 2 blocks, so we add some additional padding.
We know the result of the first block but we want to figure out what the result of the second block is, which is the result of md5'ing the whole combined message together. This value would also be a valid MAC because the secret
is still in the front (the red color). Thankfully, when calculating the second block, the first block’s hash is the input into the second block. So with m_2 | p'
and the first block’s hash of ef8...92b
we can calculate the second, and final, block’s digest.
Here’s a gist with code that will do this for us:
Closing Remarks
I look forward to adding a lot more crypto entries to this blog. Though, in the coming months, I’ll probably start making my way back on over to doing more penetration testing stuff since my PWK start date is getting closer and closer…
See ya next week*!
* maybe