5

I created a hash which is encrypted like this: $What_i_made=BCrypt(MD5(Plain Text Password)) and I wonder if it can be cracked. Currently, I thought of two ways:

  1. Brute force $What_i_made to get the MD5 hash, then do a dictionary attack on the MD5 hash. However, this will take ages, as bcrypt is so slow and an MD5 is 32 characters long.
  2. $result=Bcrypt(MD5(random combination)) and compare $result to $What_i_made until they match. This will be much faster, but I am not really sure how to do this. I tried John and Hashcat, but I am not really sure how you can do this with them, so I am turning to the community for help. Thanks. :)

BTW, any other tools that work will also do, and I would prefer a method which allows for trying every single combination instead of dictionary attacks.

3 Answers

7

As a password cracker, I encourage all of my targets to use this technique. ;)

It seems like a good idea, but it turns out that against real-world attacks, wrapping an unsalted hash with bcrypt is demonstrably weaker than simply using bcrypt.

This is because attackers can do this:

  1. Acquire existing MD5 passwords - even ones that haven't been cracked yet
  2. Run these MD5s as a wordlist against your bcrypt(md5($pass)) corpus, to identify bcrypts with known MD5s
  3. Crack the MD5s outside of bcrypt at much higher speed

In other words, in many cases you can simply crack the inner hash first. And for a fast hash like MD5, that means that for any password that can be cracked first, bcrypt's resistance to brute-force attack is dramatically weakened.

(I can't take credit for the technique, known as "password shucking", but it's very effective - especially when users reuse passwords across multiple sites, and the attacker has access to leaked password data.)
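The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: PBKDF2 from the standard library stands in for bcrypt so that the sketch runs without third-party packages, and the users, salts, passwords and function names are all hypothetical.

```python
import hashlib
import hmac


def md5_hex(password: str) -> str:
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def slow_hash(secret: str, salt: bytes) -> bytes:
    # ASSUMPTION: PBKDF2-HMAC-SHA256 stands in for bcrypt so the sketch runs
    # on the standard library alone. It is not bcrypt.
    return hashlib.pbkdf2_hmac("sha256", secret.encode("ascii"), salt, 10_000)


def shuck(stored, leaked_md5s):
    """Step 2: find stored hashes whose inner MD5 appears in leaked_md5s."""
    found = {}
    for user, (salt, digest) in stored.items():
        for candidate in leaked_md5s:
            if hmac.compare_digest(slow_hash(candidate, salt), digest):
                found[user] = candidate
                break
    return found


def crack_md5(target_md5, wordlist):
    """Step 3: attack the bare MD5, with no slow hash in the loop."""
    for word in wordlist:
        if md5_hex(word) == target_md5:
            return word
    return None


# Hypothetical site data: user -> (salt, slow_hash(md5(password)))
salt_a, salt_b = b"salt-for-alice-1", b"salt-for-bob-002"
stored = {
    "alice": (salt_a, slow_hash(md5_hex("hunter2"), salt_a)),
    "bob": (salt_b, slow_hash(md5_hex("a-unique-passphrase"), salt_b)),
}
# Step 1: MD5s taken from other leaks; their plaintexts need not be known yet.
leaked_md5s = [md5_hex("letmein"), md5_hex("hunter2")]

shucked = shuck(stored, leaked_md5s)
recovered = {u: crack_md5(h, ["123456", "hunter2"]) for u, h in shucked.items()}
print(recovered)  # {'alice': 'hunter2'}
```

Only `shuck` pays the slow-hash cost, and it does so once per leaked MD5 rather than once per password guess. Everything in `crack_md5` runs at plain MD5 speed.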

Here's a more specific, single-user scenario:

  • User jo@example.net has an account on Site B that uses bcrypt(md5($pass))
  • Site B is compromised, and its password-hash list is leaked online
  • Attacker acquires the Site B leak
  • Attacker does a fast run against the Site B dump and determines from testing that the site is using bcrypt(md5($pass))
  • Attacker first checks to see if any other known leaks in their collection contain jo@example.net
  • If so, and any of those other leaks use MD5, Attacker simply tries jo@example.net's other MD5s to see if Jo was reusing her password, and it's the one "inside" the bcrypt
  • If Jo's MD5 is inside that bcrypt, Attacker can now attack that MD5 at massive speeds until they find a crack. Attacker now knows jo@example.net's original password on Site B

Now, imagine that Attacker wants to attack all 100,000 bcrypt hashes on Site B ... but Attacker also has access to thousands of other leaks:

  • Attacker writes a script to check all MD5 leaks for email addresses that match Site B
  • Attacker first tries the user-specific MD5s against each specific user's bcrypt from Site B. (This is a "correlation attack"). Attacker quickly cracks about 20% (which also removes their salts from the overall attack, increasing attack speed for the remaining hashes)
  • Attacker next tries other known MD5s from common passwords, and similarly removes those bcrypts from the attack
  • Attacker then tries other unknown MD5s from those leaks. If they find one, they can then attack those MD5s as direct MD5s (without involving bcrypt at all)
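The correlation step in the list above is essentially a join on email address. A rough sketch, assuming each leak is a list of (email, md5) pairs; the addresses, hashes and the `correlate` helper are made up for illustration:

```python
from collections import defaultdict


def correlate(site_b_users, md5_leaks):
    """Map each Site B email to the MD5s leaked elsewhere for the same email."""
    by_email = defaultdict(set)
    for leak in md5_leaks:
        for email, md5_hash in leak:
            by_email[email.lower()].add(md5_hash)
    return {
        user: sorted(by_email[user.lower()])
        for user in site_b_users
        if user.lower() in by_email
    }


# Hypothetical inputs.
site_b_users = ["jo@example.net", "sam@example.net", "kim@example.net"]
md5_leaks = [
    [("jo@example.net", "5f4dcc3b5aa765d61d8327deb882cf99"),
     ("pat@example.org", "0d107d09f5bbe40cade3de5c71e9e9b7")],
    [("JO@example.net", "e10adc3949ba59abbe56e057f20f883e"),
     ("kim@example.net", "5f4dcc3b5aa765d61d8327deb882cf99")],
]

targeted = correlate(site_b_users, md5_leaks)

# bcrypt evaluations needed when only user-specific MD5s are tried first...
targeted_attempts = sum(len(hashes) for hashes in targeted.values())
# ...versus trying every known MD5 against every Site B hash.
all_md5s = {h for leak in md5_leaks for _, h in leak}
untargeted_attempts = len(site_b_users) * len(all_md5s)

print(targeted_attempts, untargeted_attempts)  # 3 9
```

Because every bcrypt evaluation is expensive, the targeted pass is the cheap one to run first. The gap between the two counts grows with the number of users and leaks.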

And yes, the attack can also be done directly - either by MD5'ing the candidate passwords yourself, or using a tool that natively supports bcrypt(md5($pass)), such as MDXfind:

$ echo "password" | tee bcrypt-md5.dict 
password

$ echo -n 'password' | md5sum | awk '{print $1}'
5f4dcc3b5aa765d61d8327deb882cf99

$ htpasswd -bnBC 10 "" 5f4dcc3b5aa765d61d8327deb882cf99 | tr -d ':\n' | tee bcrypt-md5.hash; echo
$2y$10$JUbSRB0GQv.yKorqYdBaqeVYLtbZ/sRXdbPWt6u/6R3tqbaWTlQyW

$ mdxfind -h '^BCRYPTMD5$' -f bcrypt-md5.hash bcrypt-md5.dict
Working on hash types: BCRYPTMD5
Took 0.00 seconds to read hashes
Searching through 0 unique hashes from bcrypt-md5.hash
Searching through 1 unique BCRYPT hashes
Maximum hash chain depth is 0
Minimum hash length is 512 characters
Using 4 cores
BCRYPTMD5 $2y$10$JUbSRB0GQv.yKorqYdBaqeVYLtbZ/sRXdbPWt6u/6R3tqbaWTlQyW:password

Done - 1 threads caught
1 lines processed in 0 seconds
1.00 lines per second
0.07 seconds hashing, 2 total hash calculations
0.00M hashes per second (approx)
1 total files
1 BCRYPTMD5x01 hashes found
1 Total hashes found

Unfortunately (for the attacker ;) ), it looks like John the Ripper "jumbo" edition doesn't support this algorithm using its dynamic syntax:

$ john --format=dynamic='bcrypt(md5($pass))' --test
Error: dynamic hash must start with md4/md5/sha1 and NOT a *_raw version. This expression one does not

But for a focused attacker, it's much more efficient to simply dig out those MD5s from your hashes, and then attack those MD5s at speeds of billions of candidates per second on GPU.

If you want to do something like this - for example, to work around bcrypt's 72-character maximum - use a per-plain salt, a site-wide pepper, or true encryption in the MD5 step.
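As one possible reading of the "site-wide pepper" suggestion, the pre-hash can be keyed so that MD5s from other leaks no longer match what sits inside the bcrypt. The sketch below is an assumption on my part, not something the answer spells out: it swaps the MD5 step for HMAC-SHA256, and both the `peppered_prehash` name and the pepper value are hypothetical. The result would then be passed to bcrypt as the password.

```python
import base64
import hashlib
import hmac


def peppered_prehash(password: str, pepper: bytes) -> bytes:
    """Keyed pre-hash whose output is meant to be fed to bcrypt.

    ASSUMPTION: HMAC-SHA256 replaces the plain MD5 step here.
    """
    digest = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    # Base64 keeps the value free of NUL bytes and at 44 ASCII bytes,
    # which is under bcrypt's 72-byte input limit.
    return base64.b64encode(digest)


# Hypothetical pepper; in practice it is a secret stored outside the database.
PEPPER = b"example-site-wide-secret"

short = peppered_prehash("password", PEPPER)
long_ = peppered_prehash("x" * 200, PEPPER)
print(len(short), len(long_))  # 44 44
```

Without the pepper, an attacker cannot compute the value inside the bcrypt from a leaked MD5 or from a candidate password, so the shucking shortcut no longer applies.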

6

This composite hash has no benefit over plain bcrypt. It could be marginally weaker due to MD5 collisions, but I don't think one could actually exploit it to make this attack faster. Anyway, there's no reason to pre-hash passwords with MD5. Now, to the actual answer.


Your approach #2 won't work because bcrypt uses integrated random salts. Hashing the same input twice will produce two different hashes because different salts will be generated.

The result of bcrypt is actually a data structure containing the actual hash and the salt. To verify that a password is correct, you have to extract the original salt from the original hash structure and use it to hash the password being verified. If the resulting hash matches the original, the password is valid. This feature is usually provided by bcrypt implementations.
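To make that structure concrete, here is a small sketch that splits a bcrypt string into its fields, using the example hash that appears elsewhere on this page. It assumes the common modular-crypt layout of `$variant$cost$` followed by a 22-character salt and a 31-character digest; `parse_bcrypt` and `BcryptParts` are illustrative names, not part of any library.

```python
from typing import NamedTuple


class BcryptParts(NamedTuple):
    variant: str  # e.g. "2y"
    cost: int     # work factor (log2 of the number of rounds)
    salt: str     # 22 characters in bcrypt's own base64 alphabet
    digest: str   # 31 characters in bcrypt's own base64 alphabet


def parse_bcrypt(encoded: str) -> BcryptParts:
    # "$2y$10$<salt><digest>" splits into ["", "2y", "10", "<salt><digest>"].
    _, variant, cost, rest = encoded.split("$")
    if len(rest) != 53:
        raise ValueError("expected a 22-char salt followed by a 31-char digest")
    return BcryptParts(variant, int(cost), rest[:22], rest[22:])


parts = parse_bcrypt("$2y$10$JUbSRB0GQv.yKorqYdBaqeVYLtbZ/sRXdbPWt6u/6R3tqbaWTlQyW")
print(parts.cost, parts.salt)  # 10 JUbSRB0GQv.yKorqYdBaqe
```

A verifier re-runs bcrypt with the extracted variant, cost and salt on the candidate input and compares the resulting digest. That is why a cracking tool must test each candidate once per stored hash.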


If you're going to perform a dictionary attack, simply pre-hash all entries in your dictionary with MD5 and then run a bcrypt dictionary attack with the dictionary of MD5s. Reversing MD5s for successfully cracked passwords will be very easy and I'm leaving it for you to figure out.
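A minimal sketch of that pre-hashing step, assuming the wordlist is already in memory as a list of strings (`md5_wordlist` is a hypothetical helper name). Keeping the MD5-to-plaintext mapping makes the final reversal a dictionary lookup:

```python
import hashlib


def md5_wordlist(words):
    """Return {md5_hex: plaintext} for every word in the dictionary."""
    return {hashlib.md5(w.encode("utf-8")).hexdigest(): w for w in words}


lookup = md5_wordlist(["password", "letmein"])

# The keys are the candidates to feed to the bcrypt attack...
candidates = list(lookup)
# ...and a cracked candidate maps straight back to the original word.
print(lookup["5f4dcc3b5aa765d61d8327deb882cf99"])  # password
```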

A basic brute-force attack where all combinations are tested is impractical against bcrypt, and the MD5 step is irrelevant to that.

gronostaj
  • 58,482
0

Just use hashcat with mode 25600. Since the algorithm is based on bcrypt - which is slow - you can only use a small dictionary, like the most commonly used passwords.

You can ignore the password shucking answer, since it's wrong. The basic assumptions for

wrapping an unsalted hash with bcrypt is demonstrably weaker than simply using bcrypt

are incorrect. Let me explain why.

The idea: you crack bcrypt(md5($password)) with a dictionary of MD5s. The plaintext behind each MD5 may or may not be known. The conclusion: the attack is faster, so bcrypt(md5($password)) is worse than bcrypt($password).

The issues with this claim:

  1. The speed advantage of not doing the MD5 is practically non-existent. Let's check the RTX 4090: https://www.onlinehashcrack.com/tools-benchmark-hashcat-nvidia-rtx-4090.php. Compare the hashcat modes for plain bcrypt (3200) and bcryptmd5 (25600). The difference is too small to measure reliably. bcrypt(md5($password)) even shows up as faster in this and many other benchmarks, although in reality it is marginally slower.

    More details. Bcrypt at work factor 05: ~4 microseconds (4.16e-6 s) per hash. Work factor 05 is used for benchmarking only. Nowadays 10 is commonly used, making bcrypt 32x slower: ~133 microseconds (1.33e-4 s). The 4090 needs ~7 picoseconds (6.66e-12 s) per MD5 (~150G/s). That results in a whopping theoretical maximum advantage of 0.000005% (6.66e-12 / 1.33e-4 x 100%).

    The only real speed advantage can be found with extremely fast algorithms, where 7 picoseconds is a significant amount of time. If you use such an algorithm (e.g. md5(md5($password))), you're screwed anyway.

  2. Using a dictionary of MD5s is nonsense or at best sub-optimal, even if there is a speed advantage. Three reasons:

    1. The set of all hashed passwords is in most cases larger than the set of MD5-hashed passwords only. So if you crack e.g. all MD5s and SHA1s of a targeted user, you've got at least the same and in many cases a better dictionary.

    2. If you can NOT crack the MD5, the whole exercise of cracking bcrypt(md5($password)) was a waste of time. So not a theoretical 0.000005% gain, but a 100% loss.

    3. Throwing a randomly large dictionary against bcrypts will not work since bcrypt is slow. You want to use a small and optimized dictionary to have any chance of cracking the password.
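The arithmetic in point 1 can be reproduced with a few lines of Python. The input figures are the ones quoted in this answer (not independently measured), and the variable names are only illustrative:

```python
# Figures quoted in the answer.
bcrypt_cost5_seconds = 4.16e-6   # one bcrypt at work factor 5
md5_per_second = 150e9           # ~150G MD5/s on the RTX 4090

# Each extra unit of work factor doubles the cost: 2**(10 - 5) = 32x.
bcrypt_cost10_seconds = bcrypt_cost5_seconds * 2 ** (10 - 5)
md5_seconds = 1 / md5_per_second

# Share of one bcrypt(md5($pass)) evaluation that the MD5 step accounts for.
overhead_percent = md5_seconds / bcrypt_cost10_seconds * 100

print(f"bcrypt at cost 10: {bcrypt_cost10_seconds:.3e} s")
print(f"one MD5:           {md5_seconds:.3e} s")
print(f"MD5 overhead:      {overhead_percent:.6f} %")
```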

JSS
  • 1