Naked Security

Signal app’s address book security could upset governments

The app's novel approach turns Digital Rights Management on its head

Signal, arguably the world’s most respected secure messaging app, plans to use the DRM (Digital Rights Management) secure enclave built into Intel’s Skylake chips as a way of hiding away how people are connected.

It sounds esoteric, but it fixes an important privacy weakness that has dogged end-to-end encrypted messaging: users want to know who else they know that uses the same service. This requires that apps check who else among a person’s contacts uses it by consulting a central “social graph” of how people are connected.

This is a privacy compromise because, while the service’s own encryption stops it from reading your messages (or handing them to intelligence agencies that later ask for access to this data), it can end up knowing a lot about who you know.

Right now Signal counteracts this by turning every number in a user’s address book into a truncated SHA256 hash, which is transmitted and checked against a central database of hashes. It connects users where it finds matches.
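To make the mechanism concrete, here is a minimal sketch of hash-based contact discovery, assuming illustrative phone numbers and a 10-hex-character truncation (the exact truncation length is an assumption, not Signal’s published parameter):

```python
import hashlib

def truncated_hash(number: str, length: int = 10) -> str:
    # Hash the phone number with SHA-256 and keep only a prefix.
    # The 10-character truncation here is purely for illustration.
    return hashlib.sha256(number.encode()).hexdigest()[:length]

# Server-side directory of hashes for registered users (made-up numbers).
directory = {truncated_hash(n) for n in ["14155550001", "14155550002"]}

# The client hashes its own address book and checks for matches.
address_book = ["14155550001", "14155559999"]
matches = [n for n in address_book if truncated_hash(n) in directory]
print(matches)  # only the numbers whose hashes appear in the directory
```

The server only ever sees hashes in transit, which is exactly where the privacy claim in the next paragraphs comes from.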

The privacy benefit of this design is that anyone intercepting the traffic or hacking the directory will see hashes rather than real telephone numbers.

The only way for a hacker with stolen hashes to figure out what telephone numbers they’ve got is to guess. Guess a number, run it through the hashing algorithm and see if it matches one that you’ve stolen. If it doesn’t match anything, guess another number, and another, and another… and so on until you find a match.

There is a problem with this scheme, though. To quote Signal’s developers, Open Whisper Systems, because the “pre-image space” for 10-digit numbers is small, “inverting these hashes is basically a straightforward dictionary attack”. In other words, it’s feasible for a computer to make guesses quickly and cheaply enough to compromise the security of the hashes.
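A toy demonstration of why the small pre-image space matters, using a deliberately tiny 7-digit space so it runs in moments (a real attacker would iterate the full 10-digit space, which is only 10 billion guesses):

```python
import hashlib

def truncated_hash(number: str) -> str:
    return hashlib.sha256(number.encode()).hexdigest()[:10]

# A "stolen" hash; the attacker doesn't know the number behind it.
stolen = truncated_hash("0031337")

# Dictionary attack: hash every candidate number until one matches.
recovered = next(
    n for n in (str(i).zfill(7) for i in range(10**7))
    if truncated_hash(n) == stolen
)
print(recovered)  # "0031337"
```

Hashing is fast by design, which is precisely why it offers so little protection when the space of possible inputs is this small.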

Signal doesn’t keep any record of the lookups it’s performed and allows you to satisfy yourself that it doesn’t by giving you access to its source code:

…if you trust the Signal service to be running the published server source code, then the Signal service has no durable knowledge of a user’s social graph if it is hacked or subpoenaed.

But who’s to say that it’s the published server source code that’s actually running on Signal’s server rather than some version of it that’s been modified by a hacker or the demands of an intelligence agency?

…someone who hacks the Signal service could potentially modify the code so that it logs user contact discovery requests, or (although unlikely given present law) some government agency could show up and require us to change the service so that it logs contact discovery requests.

Open Whisper Systems’ founder Moxie Marlinspike thinks the Software Guard Extensions (SGX) built into Intel chips as a secure enclave for Digital Rights Management (DRM) offer a way out of the problem, and has integrated them into a new open-source Signal beta.

This is similar to ARM’s TrustZone technology that forms the basis of Samsung’s Knox security system, but SGX was designed with DRM-oriented features such as “remote attestation”.

Remote attestation is normally used by content providers to verify that you and I are running the software we are permitted to, software that will respect DRM restrictions, rather than something that can pirate the content it’s playing.

In Signal’s case this arrangement is inverted. The enclave is on its server rather than on your device and remote attestation allows you, the client, to attest that the server is running a squeaky clean copy of Signal’s software.
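The essence of that inversion can be sketched as a toy model. This is a drastic simplification: real SGX attestation produces a CPU-signed “quote” verified through Intel’s attestation infrastructure, whereas here a plain hash stands in for the enclave measurement:

```python
import hashlib

# The published, auditable source everyone can inspect.
published_source = b"signal-contact-discovery v1.0"
EXPECTED_MEASUREMENT = hashlib.sha256(published_source).hexdigest()

def server_quote(running_code: bytes) -> str:
    # The enclave reports a hash ("measurement") of the code it loaded.
    return hashlib.sha256(running_code).hexdigest()

def client_attests(quote: str) -> bool:
    # The client refuses to talk unless the server is provably running
    # the code the client expects.
    return quote == EXPECTED_MEASUREMENT

print(client_attests(server_quote(published_source)))   # True
print(client_attests(server_quote(b"tampered build")))  # False
```

The point is the direction of trust: instead of a content provider checking your device, your device checks the server.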

Furthermore, because the verified copy of Signal’s software is running in an enclave, neither it nor the messages that pass between you and the enclave can be interfered with by other software on the server.

A practical hurdle to this is SGX’s 128MB RAM limit, which sounds like a lot of protected memory for a microprocessor but is nowhere near enough to hold a database that might contain billions of hashes.

Not to mention:

Even with encrypted RAM, the server OS can learn enough from observing memory access patterns … to determine the plaintext values of the contacts the client transmitted!

Open Whisper Systems’ solution is to perform “a full linear scan across the entire data set of all registered users for every client contact submitted,” which is to say the enclave touches every hash in the database so that anyone in control of the OS can’t detect a pattern.
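A minimal sketch of the idea, assuming nothing about Signal’s actual implementation: the lookup reads every record whether or not it matches, so the memory-access trace looks identical for every query.

```python
# The scan touches every entry, every time, and folds the comparison
# result in without branching on the data, so an observer watching
# which memory is accessed learns nothing about the query.
def oblivious_lookup(directory: list, query: str) -> bool:
    found = False
    for entry in directory:        # every entry is read unconditionally
        match = (entry == query)
        found = found or match
    return found

directory = ["hash_a", "hash_b", "hash_c"]
print(oblivious_lookup(directory, "hash_b"))  # True
print(oblivious_lookup(directory, "hash_z"))  # False
```

A naive lookup would stop at the first match (or jump straight to a bucket), and that difference in behaviour is exactly the side channel the quote above warns about.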

For any sizable user base, this would be incredibly slow if it had to be done for every user, almost every time they connect to the service (messaging apps perform regular checks in case new users appear).

To avoid this turning into a computer science lecture, we’ll sum up Marlinspike’s proposed solution by saying that it is based around disordering the way hashes are stored within the hash table to make it harder to carry out surveillance on them.
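To give a flavour of what “disordering” could mean, here is a deliberately simplified sketch (an assumption for illustration only, not Signal’s actual construction): records are placed according to a secret random permutation, so a record’s physical position reveals nothing about its key to anyone watching memory addresses.

```python
import random

hashes = ["h0", "h1", "h2", "h3"]

# A secret permutation, known only inside the enclave.
slots = list(range(len(hashes)))
random.shuffle(slots)

# Store each hash at its shuffled slot; outsiders see only the
# scrambled layout, not which key lives where.
table = [None] * len(hashes)
position_of = {}
for h, s in zip(hashes, slots):
    table[s] = h
    position_of[h] = s

print(table[position_of["h2"]])  # "h2"
```

Real oblivious data structures are considerably more involved, but the goal is the same: break the link between where data sits in memory and what it is.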

Does any of this matter beyond this one app?

Undoubtedly. Signal’s user base is small, but where Signal goes, other secure messaging apps have a habit of following, including WhatsApp and Facebook Messenger with their billion or more users. Since adopting Signal’s underlying platform in 2016, both appear to be implementing its innovations over time.

We don’t know whether this will include using server-side SGX enclaves, but if it does, it could provoke a response from governments already questioning the use of encrypted messaging.

App companies want to preserve user privacy for complex reasons we’ve written about before, including a desire not to turn into large-scale surveillance platforms for global governments in ways that might hurt their popularity.

But the bottom line is clear: losing access to address book metadata will not go down well with the powers that be.