2012-04-06

We don't understand hashes

At least I don't, nor do I understand math. It only this week dawned to me, that we consistently choose wrong hash for password hashing.

When I started using Linux DES was the standard way to hash your passwd, then it was MD5, now at least Ubuntu is using SHA. And I can bet that in 2 years time SHA-3 (will be selected this year) will be used widely for protecting passwords.

But what were design goals for MD5 and SHA? Design goals are obviously avoidance of collisions and more importantly algorithmic cheapness in terms of computational requirements and ability to implement it cheaply and easily in hardware requiring no branching. So MD5 and SHA are _by design_ simple to brute-force in hardware, even the new SHA-3. You don't want your 'git commit' or 'sha3sum /dev/cdrom' to take days, you want it to be very very fast and very very unlikely to represent any other data.

It should be quite obvious that those requirements are orthogonal to the requirements of password hash. Avoidance of collisions is not critical to password hash and absolutely opposite requirement of computational expensiveness exists for password hash, it needs to be very expensive and it needs to be poorly implementable in cheap hardware.

This only dawned to me, when co-worker was cracking password of one DWDM system and as it was DES we naively assumed it'll be cracked in seconds, but it turns out in CUDA systems DES is hundreds of times slower to crack than MD5 (it might have been that this was apples to oranges unix DES compared to MD5 instead of unix MD5, but doesn't change the fact that hash like bcrypt makes much more sense for passwords than hashes like MD5 or SHA). It was really illuminating moment for me, obviously this is good for general applications of hash, MD5 was designed to be fast (and so will SHA-3), obviously fast to calculate also means fast to brute-force. Only problem here is, that we're not understanding our application when choosing hash for protecting password, MD5 never was designed to be good for that application. Bcrypt is, I'm not familiar with it, I don't really know how good it is, but at least it's designed to be computationally very expensive and as machines get faster you can make it slower and slower without changing the implementation, you'll just give it different parameter.

1 comment:

  1. Password hashing with sha256/512 in Linux (glibc) iterates hash function 5000 times by default, and iteration count can be raised up to 999999999.

    I bet that'll slow down hacking quite a bit :)

    ReplyDelete