The final swipe in the great whack-a-mole game of web encryption may finally have been swung.
It hasn’t struck home yet but the backswing looks good and the aim is true. Behind the bat is Cloudflare – who else? – and its target is an obscure but widely used technology called SNI (Server Name Indication).
SNI is a bit of unencrypted data that contains the name of the website you’re visiting.
It’s sent by the browser when you view websites securely and, ironically, this unencrypted tidbit of data has played a crucial part in making encryption the rule rather than the exception on the web.
SNI is a trade-off – it opens a small privacy hole in exchange for closing a much bigger one – and as such it has always been a job half done. Everyone knew that sooner or later it would need fixing but nobody was quite sure how to do it.
After considerable head scratching, Cloudflare is about to test its answer to the SNI conundrum, Encrypted SNI (ESNI), with its partner Mozilla.
So let’s look at what SNI does, why fixing it was hard and what Cloudflare is up to.
SNI’s Catch-22
When you visit a website using an encrypted HTTPS connection, your computer asks the server hosting the website for its digital certificate. It compares the name in the certificate with the name of the site it wants to talk to, to ensure they’re one and the same. If they match, an encrypted tunnel is created between the two computers and HTTP messages can be sent and received over that tunnel.
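As a rough illustration of the name check described above, here is a minimal Python sketch using the standard library's `ssl` module. The helper name `fetch_certificate` is made up for this example, and it is only defined here, not called, because calling it needs a live network connection. The default context is what enforces the comparison between the certificate and the site name.

```python
import socket
import ssl

def fetch_certificate(hostname: str, port: int = 443) -> dict:
    """Hypothetical helper: connect over TLS and return the server's certificate.

    The connection fails with ssl.SSLCertVerificationError if the name in
    the certificate does not match `hostname`.
    """
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as raw_sock:
        # server_hostname is the name the certificate is checked against.
        with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
            return tls_sock.getpeercert()

# The default client context insists on a valid certificate
# whose name matches the site being visited.
context = ssl.create_default_context()
print(context.check_hostname)
print(context.verify_mode == ssl.CERT_REQUIRED)
```

Calling `fetch_certificate("example.org")` on a machine with internet access should return the certificate details as a dictionary.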
However, there’s a Catch-22 if, as happens frequently, you want to visit a website on a server that’s hosting lots of different websites with different certificates.
Your computer has to tell the server which website it wants a certificate for, so it can create an encrypted tunnel, but it tells the server which website it wants using HTTP, and it can’t send HTTP messages until it’s created an encrypted tunnel…
Enter SNI to break the deadlock.
SNI allows a web browser to send the name of the website it wants to connect to up front, before the encrypted tunnel is formed, so the server knows which certificate to send.
Because it’s sent in plain text, before the encrypted tunnel is created, your SNI data can be read by anyone who can intercept your browsing traffic, such as an ISP or rogue Wi-Fi access point, revealing the websites you’re visiting.
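To see why the name is readable, it can help to look at the bytes involved. The sketch below builds the `server_name` extension roughly as it is laid out in the TLS specification (RFC 6066), outside of any real handshake. The function name `build_sni_extension` is an assumption for illustration, not part of any library.

```python
import struct

def build_sni_extension(hostname: str) -> bytes:
    """Illustrative only: encode a server_name extension for a ClientHello."""
    name = hostname.encode("ascii")
    # One ServerName entry: name_type (0 = host_name), 2-byte length, the name.
    entry = struct.pack("!BH", 0, len(name)) + name
    # ServerNameList: 2-byte length followed by the entries.
    name_list = struct.pack("!H", len(entry)) + entry
    # Extension header: type (0 = server_name), 2-byte length, then the list.
    return struct.pack("!HH", 0, len(name_list)) + name_list

ext = build_sni_extension("example.org")
print(ext.hex())
# The hostname sits in the extension as plain ASCII, with no encryption applied.
print(b"example.org" in ext)
```

Anyone who captures these bytes on the wire can read the hostname directly, which is the leak ESNI sets out to close.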
Only a few short years ago the web was largely unencrypted and the entire contents of your HTTP messages could be pored over and modified by interlopers. In that context, the SNI leak was a small price to pay for the huge privacy and security boost it enabled by making HTTPS on shared hosts easier.
Now that encrypted HTTP is the norm, attention is switching to SNI and DNS – two plain text protocols that can be intercepted to see what websites you’re visiting.
Efforts to fix DNS have attracted far more attention than efforts to fix SNI (probably because DNS is more widely used and easier to fix), and multiple solutions have emerged, such as DoH (DNS over HTTPS), DNS over TLS, DNSCrypt and the interesting and esoteric Oblivious DNS.
You can see how HTTPS, DNS encryption like DoH and ESNI combine to hide different parts of your browsing data in the table below:
Protocols in use | Message | URL | Hostname via HTTP | Hostname via DNS | Hostname via SNI |
---|---|---|---|---|---|
DNS + HTTP | ✘ | ✘ | ✘ | ✘ |   |
DNS + HTTPS + SNI | ✓ | ✓ | ✓ | ✘ | ✘ |
DoH + HTTPS + SNI | ✓ | ✓ | ✓ | ✓ | ✘ |
DoH + HTTPS + ESNI | ✓ | ✓ | ✓ | ✓ | ✓ |
How ESNI works
The process of negotiating an encrypted tunnel between your browser and a web server is called the “handshake”. It involves sending plaintext SNI data to tell the server which certificate you want, checking that you’re talking to that server, agreeing which ciphers to use and exchanging encryption keys.
Fixing SNI is hard because the SNI data you send needs to be decrypted by the server before it can perform the handshake that tells it how to decrypt your messages. Or, as Cloudflare itself put it in its detailed explanation on the subject:
If the chicken must come before the egg, where do you put the chicken?
The answer is, you put it in DNS (Domain Name System), the global internet address book used to associate human-readable names like example.org with IP addresses like 93.184.216.34.
Computers find and talk to one another using IP addresses, so the first thing your computer does when you tell it to visit a website is quickly look up the IP address using DNS.
Cloudflare wants website owners to create a pair of cryptographic keys – one public, one private – and publish the public key in a DNS entry (where web browsers can pick it up before visiting a website), alongside their IP address.
The public key can be used by anyone to derive a symmetric encryption key that only the owner of the secret, private key (the website owner) can unlock.
Since only the client, and the server it’s connecting to, can derive the encryption key, the encrypted SNI cannot be decrypted and accessed by third parties.
When it wants to connect to a website using ESNI your browser will generate its own, ephemeral, public and private key pair that will be used once and discarded, to prevent replay attacks.
While this may seem overly complicated, this ensures that the encryption key is cryptographically tied to the specific TLS session it was generated for, and cannot be reused across multiple connections.
The browser uses its ephemeral private key and the server’s public key to derive an encryption key, encrypts the SNI data and sends it to the website along with the public portion of its ephemeral key.
The server then uses its private key and the browser’s public key to derive the encryption key that can decrypt the SNI data, and the handshake for the encrypted HTTP session can begin.
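The exchange described above can be sketched in a few lines of Python. This is a toy, not the ESNI protocol: it assumes classic Diffie-Hellman over a small prime (`P` and `G` are picked purely for illustration) and a simple XOR keystream in place of a real cipher. A real deployment would rely on a vetted key exchange and an authenticated cipher, and the function names here are hypothetical.

```python
import hashlib
import secrets

# Toy parameters, far too weak for real use.
P = 2**127 - 1  # a Mersenne prime
G = 3

def keypair() -> tuple[int, int]:
    """Return a (private, public) Diffie-Hellman key pair."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)

def derive_key(own_private: int, peer_public: int) -> bytes:
    """Both sides reach the same secret from their own private key and the peer's public key."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()

def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR the data with a keystream derived from the key."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

# Server: long-lived key pair; the public half would be published in DNS.
server_private, server_public = keypair()

# Browser: ephemeral key pair, used for one connection and then discarded.
client_private, client_public = keypair()
client_key = derive_key(client_private, server_public)
encrypted_sni = xor_stream(client_key, b"example.org")

# Server: derives the same key from the browser's public key and decrypts.
server_key = derive_key(server_private, client_public)
decrypted_sni = xor_stream(server_key, encrypted_sni)
print(decrypted_sni)
```

An eavesdropper sees `client_public` and `encrypted_sni` but holds neither private key, so in a properly sized scheme it cannot derive the key that recovers the hostname.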
Although it’s Cloudflare’s baby, the company is trying to turn ESNI into an open standard via the IETF (Internet Engineering Task Force). The feature is already available for anyone using Cloudflare name servers but no browsers support it yet, although it’s expected to be a feature of Firefox’s bleeding edge nightly builds imminently.
Anonymous
Not sure this is a “fix”. The word “patch” or “bodge” might be better – what we have now is Yet Another Cryptographic Key System done using yet more DNS records and creating one more PKI.
SPF, DKIM, DMARC, and now ESNI… I suppose it could have been worse – could have used the blockchain, so perhaps DNS is the lesser of two evils.
Rich Baldry
It doesn’t really create a complex PKI because the keys advertised by the servers in DNS do not need to have any persistence. They are not used for authentication of the server, or for establishing trust in the way that SSL keys are. They are only used in a DH exchange to one-time encrypt the SNI data. The keys can be changed as often as the server owner likes – I’ve seen suggestions that 3-4 hours is a reasonable life cycle that trades off the risk of an old key being used, with the risk of a server’s private key being captured and used to decrypt requests after the fact.
Jeff
Wouldn’t it be easier to have server randomly pick a cert to start handshake and encrypt SNI request data? Then switch to requested cert after. Just trying to prevent clear text of SNI data right?
Brent Pantera
This still creates a problem. DNS records of domain names are public information. All an ISP needs to do when looking at the traffic is match up the public key your IP just requested with the DNS name of a site. Really easy to do when targeting specific sites. The method is technically safer, however the same hole still exists.
RichardD
But you’re not sending the public key; you’re sending the site name encrypted with the public key.
Obviously, if you weren’t using DoH, they could intercept the DNS request to get the public key. But in that case, they wouldn’t need to – they’d already know what site you were visiting.
Paul Ducklin
I think the OP was making the point that by using DNS to get the site-specific cryptographic material that you need to do encrypted SNI, you are not significantly better off than if you did a DNS request to look up the IP number of the site. In other words, there’s still a hint of where you want to go that’s more specific than just “an IP number that could be one of many different websites”.
RichardD
Well yes: if you’re using DNS, rather than the encrypted DoH protocol, then you don’t really get much benefit. :)
Of course, DoH really only moves the point of trust from your ISP to the company providing the DoH service.
Mark Stockley
ISPs can see DNS requests that terminate with them, on their DNS resolvers, and DNS data that passes through them, terminating at 3rd party DNS resolvers.
If you use a 3rd party resolver and DoH or DNS over TLS then the ISP (and any other interloper) is blind. Whoever you trust to act as your DNS resolver gets to see your DNS lookups, unless you use something like Oblivious DNS.
inckka
Just a quick question: when we request a direct Google search like `https://www.google.com/search?q=my+ip`, before the server starts an encrypted channel with the browser, the browser or NAT looks up through the ISP, NS, DNS, right? In that case, the URL should be recorded/logged in several different places other than the intended target server, I assume. The question is, while reaching the HTTPS server, is the URL sent along with the search params or other data? Or is it only the domain name?
Mark Stockley
Just the domain name.
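A small Python sketch of that split, using the URL from the question. The variable names are made up for illustration; the point is only which part of the URL is exposed through DNS and SNI and which part travels inside the encrypted tunnel.

```python
from urllib.parse import urlsplit

url = "https://www.google.com/search?q=my+ip"
parts = urlsplit(url)

# Exposed through the DNS lookup and plaintext SNI.
visible_to_network = parts.hostname

# Sent as part of the HTTP request, inside the encrypted tunnel.
encrypted_in_tunnel = parts.path + "?" + parts.query

print(visible_to_network)
print(encrypted_in_tunnel)
```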
Anonymous
Great article. Kind of reminds me of DKIM, but for encryption.
Paul Ducklin
Ah, DKIM, one of the many DNS-stretching technologies that was going to bring an end to spam :-) (One problem with DKIM is that anything from one of the major webmail services will, ipso facto, pass the test.)
davebac
My one domain has been running for 20ish years now, and receives thousands of spam messages a day. Of those, a tiny fraction pass DKIM or SPF. My assumption is the reason you don’t see this with gmail, outlook, etc. when you’re using those services is that they’re blocking these messages, and don’t provide meaningful statistics about what they’ve blocked to normal end users (may be different for the commercial offerings, but I didn’t see any meaningful reporting on O365).
The bulk of the spam claiming to be from my domain is not sent by my servers – the exception is a user account that was compromised and caught within a couple hours – and I’m getting reports from Google, via DMARC, daily about spoofed messages from my domain from IP addresses completely unrelated to my system. DKIM is doing its job there, as part of DMARC.
The real solution there is to replace SMTP – but nobody wants to do that, because of how difficult in scale it would be. It’d be worse than IPv6’s decades-long adoption.
For ESNI, I haven’t looked at the protocol extension – but I could see using a client-side key as a mitigation to having to send a public key. If I generate a client-side private key, and matching public key, I can use that in the negotiation of the session key – the server has a published public key, and so it can encrypt the session key using my public key and its private key, and I can verify its private key against DoH so the server is never revealed. However, I’d be able to track individual clients – so you’d have to rotate that client key regularly.
anon
Hurray for open standards!
Anon
Should the 3rd paragraph below “SNI’s Catch-22” be using HTTPs instead of HTTP?
This “Your computer has to tell the server which website it wants a certificate for, so it can create an encrypted tunnel, but it tells the server which website it wants using HTTP, and it can’t send HTTP messages until it’s created an encrypted tunnel…” doesn’t make sense currently. I think you want it to read “Your computer has to tell the server which website it wants a certificate for, so it can create an encrypted tunnel, but it tells the server which website it wants using HTTPS, and it can’t send HTTPS messages until it’s created an encrypted tunnel…”
Mark Stockley
I understand where your question is coming from but in this case I think HTTP is used correctly.
The language your web browser speaks to web servers is HTTP. When it speaks HTTP to a web server over an encrypted TLS “tunnel” the protocol is referred to as HTTPS but the language is still HTTP, the messages are still HTTP requests and responses, and those messages still consist of HTTP headers and message bodies.