Naked Security

XKCD forums breached

How did the Correct Horse Battery get Stapled?

September 03, 2019

The forum for the techie-darling comic strip XKCD was still offline on Monday afternoon after Troy Hunt’s breach site, Have I Been Pwned, reported on Sunday that 562,000 of the forum’s accounts had been breached sometime in August.

New breach: XKCD had 562k accounts breached last month. The phpBB forum exposed email and IP addresses, usernames and passwords stored in MD5 phpBB3 format. 58% of addresses were already in @haveibeenpwned https://t.co/LGaAnj1hUA
— Have I Been Pwned (@haveibeenpwned) September 1, 2019

A breach notice on the echochamber.me/xkcd forums echoed Hunt’s message: portions of the forums’ phpBB user table showed up in a cache of leaked data, it said.

XKCD forums said that the breached passwords that showed up in Have I Been Pwned were salted and hashed, making them harder to crack than if they were simply hashed. A salt is a random string of bytes, different for every password, that is mixed in with the password when it is scrambled for storage in the password database.

Hashing a password means that the original password doesn’t need to be stored where a crook who stole it could simply re-use it directly. Salting ensures that even if two users choose the same password, each user ends up with a different hash, so crooks can’t simply make a giant ‘dictionary’ of hashes that would let them look up the most common passwords in one go. The forums’ notice read:

We’ve been alerted that portions of the PHPBB user table from our forums showed up in a leaked data collection. The data includes usernames, email addresses, salted, hashed passwords, and in some cases an IP address from the time of registration.
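To illustrate the salting described above, here is a minimal Python sketch. It is not the scheme the forum used: the record layout, the function names and the single SHA-256 call are all assumptions chosen for brevity, and one fast hash on its own is not a recommended way to store real passwords. It only shows that two users with the same password end up with different stored hashes.

```python
import hashlib
import os


def salted_hash(password: str, salt: bytes) -> str:
    """Hash the salt and the password together (illustrative only)."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def new_record(username: str, password: str) -> dict:
    """Build a hypothetical database row: a fresh random salt per user."""
    salt = os.urandom(16)
    return {"username": username, "salt": salt, "hash": salted_hash(password, salt)}


def verify(record: dict, attempt: str) -> bool:
    """Recompute the hash with the stored salt and compare."""
    return salted_hash(attempt, record["salt"]) == record["hash"]


alice = new_record("alice", "correct horse battery staple")
bob = new_record("bob", "correct horse battery staple")

# Same password, different salts, so the stored hashes differ.
print(alice["hash"] == bob["hash"])
print(verify(alice, "correct horse battery staple"))
```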

Flaw in phpBB/no flaw in phpBB??

An earlier version of the breach notification that was up on Sunday suggested that the leak may have been enabled by an attacker scanning for a vulnerability in phpBB:

It is likely that it was gathered up in some automated scan taking advantage of a vulnerability in the forum software.

…but given that the breach notification was amended at some point to ditch the possibility of this flaw in phpBB, such a flaw has presumably been ruled out.

According to Hunt, 58% of the addresses were already in his trove of breached accounts.

Has the Correct Horse Battery been Stapled?

It’s impossible not to note the irony of XKCD being targeted, and of there being even a hint that the security of its own password storage might come into play.

As it is, the comic’s musings/teachings on password entropy are a constant touchstone in conversations about how to pick a proper password: the correct horse battery staple strip about password strength is a classic.

But regardless of how the passwords got breached, we can turn to another XKCD strip – this one about password reuse – for the “What to do?” answer. We can also get it from the XKCD forums’ notification.

Namely, if you’re an echochamber.me/xkcd forums user, you should immediately change your password for any other accounts on which you used the same or a similar password.

Using the same passwords on multiple sites leaves you a sitting duck.

Here’s how to pick a proper one, and by that we mean one that’s both strong and unique for each site:

(No video? Watch on YouTube. No audio? Click on the [CC] icon for subtitles.)

And if a website gives you the option to turn on two-factor authentication (2FA or MFA), do that too. Here’s an informative podcast that tells you all about 2FA, if you’d like to learn more:

(Audio player above not working? Download MP3 or listen on Soundcloud.)

14 Comments

Surely there is a relevant xkcd for this.

If there isn’t I suspect there will be :-)

I really don’t see MD5 as an issue. At the end of the day the way the passwords will be cracked is by running through a list of 1M+ known passwords and derivatives, salting and hashing them and looking for a match, on a per-password basis, same as any other salted scheme. Okay, if someone ultra-famous is in there they might try brute-forcing one with a handy bot-net or three…
The problem with MD5 is in signing stuff, where if you put a ton of work in you might be able to get two messages/files/certificates to hash the same. Not an issue here.
You could say there is a minor issue that MD5 has some fast implementations, so you can crack a little more quickly, but if your password is in a list of known passwords you’re screwed either way.
Salting is the main key, as it avoids the use of Rainbow Tables and so probably limits the bad guys to known passwords and maybe brute-forcing really short ones if they have a lot of spare bots.

If you create an account on a site before you harvest the passwords, don’t you learn the salt by finding your own username/password? Then the salt is useless, unless the salt is rotated like a key should be.

A salt is a random string of bytes chosen for you when you create your password and combined with the password when calculating the hash. The site then stores {username,salt,hash} instead of {username,password}. The salt is different for every user (well it’s *supposed* to be!), so discovering your own salt after a breach gives you nothing – it doesn’t help you crack anyone else’s password faster.

The purpose of the salt is [a] to ensure that if two people pick the same password, the database doesn’t give that away and [b] to stop a crook calculating a so-called ‘rainbow table’ in advance that maps hashes back to passwords. You need a different lookup table *for every possible salt*, which would take way too long to compute and way too much space to store.
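Point [b] can be sketched in a few lines of Python. This uses a plain precomputed dictionary rather than a true rainbow table (which stores hash chains to save space), and the tiny password list and the single MD5 call are assumptions made purely for illustration. The lookup built in advance cracks an unsalted hash at once but is useless against a salted one, which forces the attacker to redo the work for each salt.

```python
import hashlib
import os

# A stand-in for a list of known passwords (hypothetical, tiny on purpose).
COMMON = ["123456", "password", "qwerty", "letmein"]


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


# Built once, in advance, and reusable against any unsalted database.
lookup = {md5_hex(p.encode()): p for p in COMMON}

unsalted_stolen = md5_hex(b"letmein")
print(lookup.get(unsalted_stolen))   # letmein

# With a per-user salt the precomputed table no longer matches.
salt = os.urandom(6)
salted_stolen = md5_hex(salt + b"letmein")
print(lookup.get(salted_stolen))     # None

# The attacker has to rehash the whole list using this user's salt.
cracked = next(
    (p for p in COMMON if md5_hex(salt + p.encode()) == salted_stolen), None
)
print(cracked)                       # letmein
```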

Thank you both Paul and John. I had done a little thing at DefCon and I thought the salt was too easy to figure out. They must have been working on the premise that the victim in that instance only used one salt for everyone.

This may be wrong, so don’t treat this as a reference, but it seems that a phpBB3 MD5 password database entry is like this:

$H$xyyyyyyyyzzzz.....zzzz

Where $H$ is literally ‘$H$’, x is a one-byte base64 character that says to do 2^decode(x) loops of MD5 in the stretching part, yyyyyyyy is an 8-character base64 encoding of a 6-byte random value that is used as the salt for this password, and zzzz....zzzz is a 22-character base64 encoding of the final 16-byte salted-hashed-and-stretched MD5 value.

Although yyyyyyyy is generated by base64-encoding 6 random bytes, the 8-character encoded string is used directly as the salt, without decoding it first.

Oh, the base64 encoding is non-standard. It uses the substitution alphabet ./0-9A-Za-z instead of the usual A-Za-z0-9+/.
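As a rough Python sketch of the layout described above (with the same caveat that the description may be wrong), the fields can be pulled apart by position and the loop count read from the substitution alphabet. The function name, the returned dictionary and the example entry are made up for illustration; decoding the 22-character hash field is left out because the exact bit ordering of the encoding is not described here.

```python
# The non-standard base64 alphabet described above: ./0-9A-Za-z
ITOA64 = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def parse_entry(entry: str) -> dict:
    """Split a '$H$' entry into loop count, salt and encoded hash."""
    if len(entry) != 34 or not entry.startswith("$H$"):
        raise ValueError("not a $H$ entry of the expected length")
    log2_count = ITOA64.index(entry[3])      # position of x in the alphabet
    return {
        "count": 2 ** log2_count,            # number of stretching loops
        "salt": entry[4:12],                 # used as-is, not decoded
        "encoded_hash": entry[12:],          # 22 characters
    }


# A made-up entry, not a real hash: x = '9', salt = 'abcdefgh'.
example = "$H$9" + "abcdefgh" + "z" * 22
fields = parse_entry(example)
print(fields["count"])   # 2048
print(fields["salt"])    # abcdefgh
```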

To check a password against the x, yyyyyyyy and zzzz....zzzz values extracted from a user’s database entry, you do something like this:

count = 2**decode64('x')
salt = 'yyyyyyyy'

hash = md5(salt+password)
for i = 1,count do
    hash = md5(hash+password)
end

if hash == decode64('zzzz....zzzz') then
    print('correct')
end
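The pseudocode above can be turned into runnable Python along these lines. This is a sketch of the loop as described in this comment, not a verified reimplementation of phpBB: the function names and the example password are hypothetical, and the expected value is passed in as raw bytes so that the non-standard base64 decoding step can be skipped.

```python
import hashlib
import hmac


def stretch_md5(password: str, salt: str, count: int) -> bytes:
    """One salted MD5, then `count` further MD5 rounds mixing in the password."""
    pw = password.encode("utf-8")
    digest = hashlib.md5(salt.encode("ascii") + pw).digest()
    for _ in range(count):
        digest = hashlib.md5(digest + pw).digest()
    return digest


def check(password: str, salt: str, count: int, expected: bytes) -> bool:
    """Compare the recomputed digest with the stored one in constant time."""
    return hmac.compare_digest(stretch_md5(password, salt, count), expected)


# Hypothetical stored value for the password 'hunter2' with 2**11 loops.
stored = stretch_md5("hunter2", "abcdefgh", 2 ** 11)
print(check("hunter2", "abcdefgh", 2 ** 11, stored))   # True
print(check("hunter3", "abcdefgh", 2 ** 11, stored))   # False
```

With a count of 2048 this makes 2049 MD5 calls per guess: one for the initial salted hash plus one per loop.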

I can’t see people building Rainbow Tables for 2^48 salt combinations of, say, 555,278,657 known passwords (the length of the Pwned Passwords list as I type; note that I really meant a billion in my first post, not a million, but the + gets me out of jail, I like to think), so I’m sure even a 6-byte salt (6 bytes of entropy, hopefully) should achieve the desired purpose of compelling brute forcing (2^48 being a bit over 2.8*10^14). Obviously it will take longer to crack passwords with a more complex encoding method, although not vastly longer (if hashing one password has to appear almost instant, 555M isn’t dreadful). Plus, hey, that’s what bots are for: just sit back and wait for the results to roll in, or buy some GPUs from Bitcoin miners who have moved on to ASICs (or pay them a percentage).

P.S. MD5 on a GPU seems to be about 2.5x faster than SHA256, so the number of rounds is presumably the key figure.

P.P.S. A 25-GPU cluster was trying 180 billion MD5 passwords a second in 2012 (only 71,000 guesses against bcrypt and 364,000 guesses against SHA512crypt, BUT in 2012). An RTX 2080 Ti is about 6.6x faster at cracking hashes than the Radeon HD 7970s they used.
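The figures quoted in this comment can be combined into a back-of-the-envelope estimate. The Python sketch below makes two assumptions that are not established here: that each guess against a stretched hash costs 2049 MD5 calls (one initial hash plus 2048 loops), and that the 2012 cluster would sustain its raw MD5 rate on that workload. Treat the results as orders of magnitude only.

```python
SALT_BITS = 48                      # 6-byte salt
KNOWN_PASSWORDS = 555_278_657       # Pwned Passwords list size quoted above
MD5_PER_SECOND = 180_000_000_000    # 2012 25-GPU cluster, raw MD5 rate
MD5_PER_GUESS = 2049                # assumed: 1 initial hash + 2048 loops

# Entries needed to precompute every known password under every salt.
table_entries = 2 ** SALT_BITS * KNOWN_PASSWORDS

# Guessing rate once stretching is taken into account.
guesses_per_second = MD5_PER_SECOND / MD5_PER_GUESS

# Time to run the whole known-password list against one account's salt.
seconds_per_account = KNOWN_PASSWORDS / guesses_per_second

print(f"salt values:           {2 ** SALT_BITS:.2e}")
print(f"precomputed entries:   {table_entries:.2e}")
print(f"guesses per second:    {guesses_per_second:.2e}")
print(f"seconds per account:   {seconds_per_account:.1f}")
```

On these assumptions, precomputation is hopeless, but running the full list against a single account takes only seconds.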

Simply put, even with bcrypt, if you have a really poor password choice you might get your password cracked, because password crackers don’t just go from AA…AA to ZZ…ZZ – they try the most likely ones first.

With 2049 MD5s a cracker will be able to go a lot faster than with passwords processed using, say, the PBKDF2 settings I recommended elsewhere in these comments, and the stronger your password, the lower down the ‘try these first’ list it will be, and the longer you have to change it before the crooks crack it, so…

…pick a proper password!

We covered the 2012 GPU cracker you mentioned here:
https://nakedsecurity.sophos.com/2012/12/17/windows-passwords-dead-in-six-hours-paper-from-oslo-password-hacking-conference/

I see my links didn’t make it, no problem.
One issue is if you want to check your passwords aren’t in a list of known ones you either need to type them in at somewhere like:
https://haveibeenpwned.com/Passwords
and trust them, plus everything between you and them, a lot, or download the password list, which is huge (plus changes fairly often) and I found very hard to get (the Cloudflare links always die about 10% in for me, P2P worked eventually) and then have an editor that is happy with a 23GB text file (which I do, but that might not be common).
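For the download route, an editor isn't strictly needed: a short script can stream the file line by line. The Python sketch below assumes the downloaded list is a text file with one uppercase SHA-1 hex digest per line followed by a colon and a count; check the format of the file you actually have. The function name and the path in the usage comment are hypothetical.

```python
import hashlib


def is_pwned(password: str, list_path: str) -> bool:
    """Stream the list and look for the password's SHA-1 digest."""
    target = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    with open(list_path, "r", encoding="ascii") as lines:
        for line in lines:
            # Assumed line format: 'HEXDIGEST:count'
            if line.split(":", 1)[0].strip().upper() == target:
                return True
    return False


# Usage (hypothetical path to the downloaded list):
# print(is_pwned("Pa55word!!", "pwned-passwords-sha1.txt"))
```

A linear scan of a file that size is slow, but the password never leaves your machine and nothing has to hold the whole list in memory.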

Has Sophos ever considered doing an offline password tester for people? It could do incremental updates, so the download would only be large the first time. It could give a generic rating for the password (how long to brute-force now and in three years, say) as well as the list test.

I don’t think we’ve considered a password tester of that sort… the 23GB initial download (and remember HiBP has only been going 6 years so that’s 4GB a year just using incremental updates!) is enough to put me off.

My personal opinion is that password strength meters are mostly pointless, and that’s the best you can say about them. If you use any half-decent password manager and let it pick a random password of, say, 18 characters then you simply don’t need a strength meter to tell you that your password is good enough. Or if you do this…

$ dd if=/dev/urandom bs=18 count=1 | base64
1+0 records in
1+0 records out
18 bytes transferred in 0.000047 secs (383236 bytes/sec)
uodEHkttdHPySUv9m7NMJe23

…then you are golden.
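If dd isn't to hand, much the same thing can be sketched in Python with the standard secrets module, which draws from the operating system's random source. The function name is made up for illustration.

```python
import base64
import secrets


def random_password(num_bytes: int = 18) -> str:
    """Base64-encode `num_bytes` of OS-supplied randomness."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


password = random_password()
print(password)        # different on every run
print(len(password))   # 24
```

Eighteen bytes encode to exactly 24 base64 characters with no padding, which is why the dd example above prints a 24-character string.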

What gets my goat is when a website with a so-called ‘password strength meter’ does this:

You enter: Pa55word!! <--STRONG, allowed

But also does this:

$ dd if=/dev/urandom bs=18 count=1 | hexdump -e '18/1 "%02x" "\n"'
1+0 records in
1+0 records out
18 bytes transferred in 0.000025 secs (719024 bytes/sec)
c45b2cd9df18c9b2560cda43befeb9f1e818

You enter: c45b2cd9df18c9b2560cda43befeb9f1e818 <--NOT ALLOWED, needs capitals and punctuation
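The behaviour complained about here is what you get from a checker that only counts character classes. The Python sketch below is a guess at that kind of rule, not any particular site's code; the function name and the exact rules are assumptions.

```python
import string


def naive_meter_allows(password: str) -> bool:
    """Composition rules only: length 8+, lower, upper, digit, punctuation."""
    return (
        len(password) >= 8
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )


weak = "Pa55word!!"
random_hex = "c45b2cd9df18c9b2560cda43befeb9f1e818"

print(naive_meter_allows(weak))         # True
print(naive_meter_allows(random_hex))   # False

# 36 random hex digits carry 4 bits each: 18 bytes of randomness.
print(len(random_hex) * 4)              # 144
```

The rule accepts a guessable variation on 'password' and rejects 144 bits of randomness, because it measures which characters appear rather than how the password was chosen.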

That steams me too (very complex passwords with no # or whatever getting bounced). Also sites that have changed their password rules to demand punctuation/caps/etc. and won’t let you log on because your (correct and complex) password doesn’t include all of that, but just bounce you rather than telling you to do a password reset. I had one recently.

Although I’m not a big fan of password managers as you REALLY have to trust them not to have an issue (or be iffy) plus they usually aren’t available for all the devices I’d like to use. Swings and roundabouts I guess.

In case it was not obvious from Paul Ducklin’s answer, the salt is stored in the database, and is unique for each user. So if you steal the database, you have access to the user name, the salt, and the salted and hashed password.

AFAIK, the ‘phpBB3 with MD5’ algorithm uses a 6-byte salt and a variable (e.g. 2048) number of re-applications of MD5 to do salting, hashing and stretching. You can find a bunch of faults with this, and we would urge you to use something that is a bit more standard and complex, e.g. the PBKDF2 algorithm, using at least 16 bytes of salt, HMAC-SHA256 as the core hash, and 80,000 loops or more. (HMAC-SHA256 does at least two SHA256 calculations at every step so the actual number of times the core hash gets called is twice the loop count.)

https://nakedsecurity.sophos.com/2013/11/20/serious-security-how-to-store-your-users-passwords-safely/
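As a minimal sketch of the PBKDF2 settings suggested above (16 bytes of salt, HMAC-SHA256, 80,000 loops), Python's standard hashlib.pbkdf2_hmac can be used as follows. The function names and the way the salt and digest are returned are assumptions for illustration; a real system would also store the iteration count alongside each record so it can be raised later.

```python
import hashlib
import hmac
import os

ITERATIONS = 80_000   # loop count suggested above
SALT_BYTES = 16       # salt size suggested above


def hash_password(password: str):
    """Return (salt, digest) for storage next to the username."""
    salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("Pa55word!!", salt, digest))                    # False
```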

Nevertheless, I agree with you that MD5 is a bit of a red herring here, given that its known flaws are not an obvious lack of randomness in its output but that it’s easy to provoke collisions where two different inputs have the same hash. As you say, hash collisions are not the issue here.

I have therefore modified the article accordingly – thanks for the comment!

Comments are closed.

Eddie

Paul Ducklin

Dr_Jon

Mahhn

Paul Ducklin

Mahhn

Paul Ducklin

Dr_Jon

Paul Ducklin

Dr_Jon

Paul Ducklin

Dr_Jon

John

Paul Ducklin