We all know that you’re not supposed to save raw passwords to disk these days.
The reason is obvious: disk storage is generally supposed to be both permanent and shared.
Once you’ve written something to disk unencrypted, there’s always a chance that someone else might be able to get it back later, especially if they know it’s there and it’s worth looking for.
At worst, they could shut down the computer your program is running on, remove the disk (or desolder the chips that make up a solid-state storage device) and try to extract the data elsewhere at their leisure.
As we like to say at Naked Security: "Dance like no one's watching. Encrypt like everyone is."
Of course, blunders happen – even companies that pride themselves on being leaders in secure coding practices have recently admitted to saving plaintext passwords by mistake.
Facebook let plaintext passwords escape into logfiles for about seven years before noticing the error; rivals Google made a similar mistake in a sysadmin toolkit for an astonishing 14 years, admitting in May 2019 that “we made an error when implementing this functionality back in 2005.”
But there are well-known ways of storing passwords and cryptographic keys securely, including:
- Salting-hashing-and-stretching passwords if all you need to do is verify later that someone typed in the right password.
- Keeping passwords and cryptographic keys in an encrypted file that is only ever decrypted temporarily into memory when a password is needed.
- Generating keys in a tamperproof hardware device where they can be used to perform encryption and decryption but not extracted.
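The first of those options, salting-hashing-and-stretching, can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a production password store; the function names are our own, and the iteration count is merely a plausible example:

```python
import hashlib
import hmac
import os

def hash_password(password: bytes, iterations: int = 200_000) -> tuple[bytes, bytes]:
    """Salt, hash and stretch a password; store only (salt, digest), never the password."""
    salt = os.urandom(16)  # random per-user salt, so identical passwords hash differently
    digest = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
    return salt, digest

def verify_password(password: bytes, salt: bytes, digest: bytes,
                    iterations: int = 200_000) -> bool:
    """Recompute the stretched hash and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password(b"correct horse battery staple")
```

The stretching (thousands of repeated hash rounds) is what slows down offline guessing attacks if the hash database ever leaks.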
What about RAM?
So far, so good.
But what to do about data that’s sitting around in memory?
How do you stop someone snooping on your software while it’s running, during the temporary period that you need access to a password or an encryption key?
For example, if you want to verify a password against your salted-and-stretched database of password hashes, you need the actual password in memory briefly while you calculate its hash.
You can wipe the password from memory as soon as the calculation is finished, leaving you with a one-way hash in memory to compare against the list of hashes on disk.
The hashes can’t be reversed, so anyone who can figure out which database entry you matched against still doesn’t know anything about the original password that was supplied.
But there’s still what automotive safety gurus call TE2D, or time exposed to danger, while the password itself is being hashed.
And what if you need to keep the password or the encryption key around for some time, such as when you need to access a private key repeatedly for calculating digital signatures on a whole sequence of network requests?
Obviously, the operating system and your computer hardware can help here, by enforcing what’s called privilege separation between your application and all the other programs running at the same time.
In theory, a sysadmin with root powers might be able to snoop on what you’re doing, but your process space should be opaque to other users and processes at the same privilege level as you.
But there’s a more general concern about memory snooping these days, thanks to a series of bugs and weaknesses in most modern multi-core, multi-threaded CPUs.
Modern processors aim to boost performance by carrying out lots of different machine instructions at the same time – sometimes deferring their formal security checks until after they’ve done the work, and then relying on cancelling internally any memory accesses that weren’t supposed to happen.
Sometimes, however, even internal memory accesses that end up rejected may leave behind tell-tale signs in the chip that can be detected externally later.
And modern memory chips have such high capacities that the silicon components making up each storage location are packed together tightly enough that they may interfere with each other in ways that let you guess at better-than-even odds what data they contain.
Flaws with dramatic names such as F**CKWIT, Spectre, Meltdown and Rambleed [PDF] are programming tricks that could allow a determined adversary to use unprivileged code – software that can’t directly access data in your memory space – to make inferences about the secrets you’re keeping in RAM.
Fortunately, attacks like Rambleed aren’t perfect or definitive: they typically require lots of repetitive memory accesses, and only allow attackers to make informed guesses about what bits are stored where.
Nevertheless, the authors of the Rambleed paper were able to extract OpenSSH private keys from memory, without root privileges, provided that the keys were around in RAM for long enough.
As the researchers somewhat mysteriously summarised their results:
[We] successfully read the bits of an RSA-2048 key at a rate of 0.3 bits per second, with 82% accuracy.
While this is a worrying result, in practice the attackers needed to correctly recover 4200 bits of private key data (an RSA-2048 key isn’t simply 2048 bits of random data).
This took more than 30 hours of continuous probing by a program running under carefully controlled circumstances on the same computer as the OpenSSH software.
Rambleed blunted
The OpenSSH team has now added code to make RAM-sniffing attacks against private keys very much harder, with a pair of aptly named functions:
sshkey_shield_private()
sshkey_unshield_private()
If you’re interested in learning more about storing your long-term secrets in memory more safely, the OpenSSH code is pretty easy to follow, assuming you’re familiar with C (check the file sshkey.c), and well worth a look.
Simply put, the new key-shielding code in OpenSSH aims to keep the actual data of your private keys out of memory except during the brief moments it’s actually needed.
What to do?
Imagine you have a 256-bit Elliptic Curve keypair, which takes about 128 bytes (1024 bits) of RAM, that you want to cache in memory for repeated use.
But you also want to keep it scrambled so it isn’t directly accessible, even if some other process manages to peek through your security curtains.
A good start would be to store the private key encrypted using AES-256 with a randomly generated symmetric key, and then save the symmetric key somewhere else in memory.
An AES-256 key typically needs another 384 bits of RAM (32 bytes of key and 16 bytes of initialisation vector).
A Rambleeding cybercrook’s job would therefore become much harder – they’d need to squeeze out the contents of two different memory locations, and they wouldn’t be able to use any statistical techniques to correct for bit errors in the extracted data, as the Rambleed researchers did in their attacks against unencrypted RSA keys.
A single-bit error in the random AES key would turn the “decrypted” EC keypair into useless garbage, and a single-bit error in the encrypted EC keypair data would make it similarly undecryptable, and thus similarly useless for key recovery.
But the OpenSSH authors made things exponentially more difficult for an attacker by adding an extra step to the scrambling process, like this:
- Generate a random data string of 131,072 bits (16Kbytes). This is called the prekey.
- Calculate the 64-byte SHA512 hash of the prekey.
- Use the first 32 bytes of the hash as a key, and the next 16 bytes as an initialisation vector.
- Shield the private key by encrypting it with AES-256 in Counter Mode.
In pseudocode:
shield:
   const BLOBSIZE = 16*1024;
   string prekey  = random.bytes(BLOBSIZE);
   string hash    = digest.sha512(prekey);
   string key     = string.cut(hash, 1,32);
   string ivec    = string.cut(hash,33,48);
   wipemem(hash);
   string shield  = encrypt.aes_ctr(private,key,ivec);
   wipemem(key,ivec,private);
Instead of keeping the raw value of private in memory, you store the pair {prekey,shield}, from which you can temporarily extract private when needed:
unshield:
   string hash = digest.sha512(prekey);
   string key  = string.cut(hash, 1,32);
   string ivec = string.cut(hash,33,48);
   wipemem(hash);
   string private = decrypt.aes_ctr(shield,key,ivec);
   wipemem(key,ivec);
   -- use private briefly, e.g. for signing
   -- or encrypting a transient crypto key
   wipemem(private);
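To see the shield/unshield scheme in action, here is a runnable Python sketch of the same idea. Python’s standard library has no AES, so this version derives a counter-mode keystream from SHA-512 as a stand-in for AES-256-CTR; it is an illustration of the technique, not OpenSSH’s actual implementation, and all the function names are our own:

```python
import hashlib
import os

BLOBSIZE = 16 * 1024  # 16 Kbyte prekey, as in OpenSSH

def keystream(key: bytes, ivec: bytes, length: int) -> bytes:
    # Counter-mode keystream built from SHA-512 (a stand-in for AES-256-CTR).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha512(key + ivec + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def shield(private: bytes) -> tuple[bytes, bytes]:
    prekey = os.urandom(BLOBSIZE)          # large random prekey
    h = hashlib.sha512(prekey).digest()    # 64-byte hash of the prekey
    key, ivec = h[:32], h[32:48]           # 32-byte key, 16-byte IV
    shielded = bytes(p ^ k for p, k in
                     zip(private, keystream(key, ivec, len(private))))
    return prekey, shielded                # keep only this pair in memory

def unshield(prekey: bytes, shielded: bytes) -> bytes:
    h = hashlib.sha512(prekey).digest()
    key, ivec = h[:32], h[32:48]
    return bytes(c ^ k for c, k in
                 zip(shielded, keystream(key, ivec, len(shielded))))

private = os.urandom(128)                  # pretend 128-byte EC keypair
prekey, shielded = shield(private)
recovered = unshield(prekey, shielded)     # equals private
```

Note how a single flipped bit anywhere in the 16 Kbyte prekey completely changes its SHA-512 hash, so the derived keystream, and therefore the “unshielded” key, comes out as garbage.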
With this approach, you greatly reduce the time that private is exposed to danger; you use the values hash, key and ivec only very briefly; and you force an attacker to bleed out not only the encrypted shield data representing the private key, but also the entire BLOBSIZE×8 bits’ worth of prekey.
As the maintainers of OpenSSH put it:
Attackers must recover the entire prekey with high accuracy before they can attempt to decrypt the shielded private key, but the current generation of attacks have bit error rates that, when applied cumulatively to the entire prekey, make this unlikely.
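A back-of-envelope calculation shows why, using the figures from the Rambleed paper quoted above (0.3 bits per second, 82% accuracy) and assuming, simplistically, that bit errors are independent:

```python
# Rough cost of bleeding out the whole 16 Kbyte prekey, using the
# extraction rate and accuracy reported in the Rambleed paper.
BLOBSIZE_BITS = 16 * 1024 * 8          # 131,072 bits of prekey

seconds = BLOBSIZE_BITS / 0.3          # time just to read every bit once
days = seconds / 86_400                # about 5 days of continuous probing

# Probability of getting *every* bit right, assuming independent 82% accuracy
# per bit: the result underflows to 0.0 -- effectively impossible.
p_all_correct = 0.82 ** BLOBSIZE_BITS
```

Even ignoring the accuracy problem, the attacker needs days of uninterrupted access just to sample each prekey bit once, compared with the 30-plus hours the researchers needed for the much smaller RSA key.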
In the meantime, if attacks improve – perhaps as memory density increases and thus Rambleed attacks that rely on nearby memory cells interfering with each other become more common – you can easily adapt your defensive code.
For example, you could:
- Increase the value of BLOBSIZE above.
- Regularly re-shield keys by decrypting and re-encrypting them using a brand new prekey.
Increasing the size of the prekey means the crooks have to attack for longer; re-shielding keys regularly means the crooks have less time to finish each attempted attack.
In the longer term, the OpenSSH team are being upbeat:
Hopefully we can remove this in a few years time when computer architecture has become less unsafe.
That’s not quite as good as “more safe”, but being “less unsafe” would be better than we are now.