Update. We’ve added information from various vendors that was published after this article first appeared.
See our additions at the end of this piece (latest at 2018-01-05T00:30Z).
Note. If you’re after information and advice specifically about Sophos products,
please see our Knowledgebase Article 128053.
In the near future – in all likelihood, later this month – at least Windows and Linux will get security updates that change the way those operating systems manage memory on Intel processors.
There’s a lot of interest, excitement even, about these changes: they work at a very low level and are likely to affect performance.
The slowdown will depend on many factors, but one report suggests that database servers running on affected hardware might suffer a performance hit around 20%.
“Affected hardware” seems to include most Intel CPUs released in recent years; AMD processors have different internals and are affected, but not quite as broadly.
So, what’s going on here?
On Linux, the forthcoming patches are known colloquially as KPTI, short for Kernel Page Table Isolation, though they have jokingly been referred to along the way as both KAISER and F**CKWIT.
The latter is short for Forcefully Unmap Complete Kernel With Interrupt Trampolines; the former for Kernel Address Isolation to have Side-channels Efficiently Removed.
Here’s an explanation.
Inside most modern operating systems, you’ll find a privileged core, known as the kernel, that manages everything else: it starts and stops user programs; it enforces security settings; it manages memory so that one program can’t clobber another; it controls access to the underlying hardware such as USB drives and network cards; it rules and regulates the roost.
Everything else – what we glibly called “user programs” above – runs in what’s called userland, where programs can interact with each other, but only by agreement.
If one program could casually read (or, worse still, modify) any other program’s data, or interfere with its operation, that would be a serious security problem; it would be even worse if a userland program could get access to the kernel’s data, because that would interfere with the security and integrity of the entire computer.
One job of the kernel, therefore, is to keep userland and the kernel carefully apart, so that userland programs can’t take over from the kernel itself and subvert security, for example by launching malware, stealing data, snooping on network traffic and messing with the hardware.
The CPU itself provides hardware support for this sort of separation: the x86 and x64 processors provide what are known as privilege levels, implemented and enforced by the chip itself, that can be used to segregate the kernel from the user programs it launches.
Intel calls these privilege levels rings, of which there are four; most operating systems use two of them: Ring 0 (most privileged) for the kernel, and Ring 3 (least privileged) for userland.
Loosely speaking, processes in Ring 0 can take control over processes and resources in higher-numbered rings, but not the other way around.
In theory, then, the processor itself blocks Ring 3 programs from reading Ring 0 memory, thus proactively preventing userland programs from peeking into the kernel’s address space, which could leak critical details about the system itself, about other programs, or about other people’s data.
In technical terms, a sequence of machine code instructions like this, running in userland, should be blocked at step 1:
    1. mov rax, [kernelmemory]   ; this will get blocked - the memory is protected
    2. mov rbx, [usermemory]     ; this is allowed - the memory is "yours"
Likewise, swapping the instructions, this sequence would be blocked at step 2:
    1. mov rbx, [usermemory]     ; this is allowed - the memory is "yours"
    2. mov rax, [kernelmemory]   ; this will get blocked - the memory is protected
Now, modern Intel and AMD CPUs support what is called speculative execution, whereby the processor figures out what the next few instructions are supposed to do, breaks them into smaller sub-instructions, and processes them in a possibly different order to how they appear in the program.
This is done to increase throughput, so a slow operation that doesn’t affect any intermediate results can be started earlier in the pipeline, with other work being done in what would otherwise be “dead time” waiting for the slow instruction to finish if it ran at the end of the list.
Above, for example, the two instructions are computationally independent, so it doesn’t really matter what order they run in, even though swapping them round changes the moment at which the processor intervenes to block the offending instruction (the one that tries to load memory from the kernel).
Does order matter?
Back in July 2017, a German security researcher did some digging to see if order does, in fact, matter.
He wondered what would happen if the processor calculated some internal results as part of an illegal instruction X, used those internal results in handling legal instruction Y, and only then flagged X as disallowed.
Even if both X and Y were cancelled as a result, would there be a trace of the internal result left over from the speculative execution of the illegal instruction X?
If so, could you figure out something from that left-over trace?
The example that the researcher started with looked like this:
    1. mov rax, [K]       ; K is a kernel address that is banned
    2. and rax, 1
    3. mov rbx, [U+rax]   ; U is a user address that is allowed
Don’t worry if you don’t speak assembler – what this code does is:
- Load the A register from kernel memory.
- Change A to 0 if it was even or 1 if it was odd (this keeps the thought experiment simple).
- Load register B from memory location U+0 or U+1, depending on A.
In theory, speculative execution means that the CPU could finish working internally on instruction 3 before finishing instruction 1, even though the whole sequence of instructions would ultimately be invalidated and blocked because of the privilege violation in 1.
Perhaps, however, the side-effects of instruction 3 could be figured out from elsewhere in the CPU?
After all, the processor’s behaviour would have been slightly different depending on whether the speculatively-executed instruction 3 referenced memory location U or U+1.
For example, this difference might, just might, show up in the CPU’s memory cache – a list of recently-referenced memory addresses plus their values that is maintained inside the CPU itself for performance reasons.
In other words, the cache might act as a “telltale”, known as a side channel, that could leak secret information from inside the CPU – in this case, whether the privileged value of memory location K was odd or even.
(Looking up memory in CPU cache is some 40 times faster than fetching it from the actual memory chips, so enabling this sort of “short-circuit” for commonly-used values can make a huge difference to performance.)
The long and the short of it is that the researcher couldn’t measure the difference between “A is even” and “A is odd” (or, alternatively, “did the CPU peek at U or at U+1”) in this case…
…but the thought experiment worked out in the end.
The researcher found other similar code constructions that allow you to leech information about kernel memory using address calculation tricks of this sort.
In other words, Intel CPUs suffer from a hardware-level side channel that could leak privileged memory to unprivileged programs.
The rest is history
And the rest is history.
Patches are coming soon, at least for Linux and Windows, to deliver KAISER: Kernel Address Isolation to have Side-channels Efficiently Removed, or KPTI, to give it its politically correct name.
Now you have an idea where the name KAISER came from: the patch keeps kernel and userland memory more carefully apart so that side-effects from speculative execution tricks can no longer be measured.
This security fix is especially relevant for multi-user computers, such as servers running several virtual machines, where individual users or guest operating systems could use this trick to “reach out” to other parts of the system, such as the host operating system, or other guests on the same physical server.
However, because CPU caching is there to boost performance, anything that reduces the effectiveness of caching is likely to reduce performance, and that is the way of the world.
Sometimes, the price of security progress is a modicum of inconvenience, in much the same way that 2FA is more hassle than a plain login, and HTTPS is computationally more expensive than vanilla HTTP.
In eight words, get ready to take one for the team.
What next?
A lot of the detail behind these patches is currently [2018-01-03T16:30Z] hidden behind a veil of secrecy.
This secrecy seems to be down to non-disclosure clauses imposed by various vendors involved in preparing the fixes, an understandable precaution given the level of general interest in new ways to pull off data leakage and privilege escalation exploits.
We expect this secrecy to be lifted as patches are officially published.
However, you can get and try the Linux patches for yourself right now, if you wish. (They aren’t finalised yet, so we can’t recommend using them except for testing.)
So far as we know at the moment, the risk of this flaw seems comparatively modest on dedicated servers such as appliances, and on personal devices such as laptops: to exploit it would require an attacker to run code on your computer in the first place, so you’d already be compromised.
On shared computers such as multiuser build servers or hosting services that run several different customers’ virtual machines on the same physical hardware, the risks are much greater: the host kernel is there to keep different users apart, not merely to keep different programs run by one user apart.
So, a flaw such as this might help an untrustworthy user to snoop on other users who are logged in at the same time, or to influence other virtual machines hosted on the same server.
This flaw has existed for years and has been publicly documented for months at least, so there is no need to panic; nevertheless, we recommend that you keep your eyes out for patches for the operating systems you use, probably in the course of January 2018, and that you apply them as soon as you can.
UPDATES
[2018-01-04T01:00Z]
Google’s Project Zero bug hunting team has now published a detailed description of the behind-the-scenes research that’s been going on for the past few months. It’s both technical and jargon-heavy, but the main takeaways are:
- In theory, various Intel, AMD and ARM processors have features related to speculative execution and caching that can be exploited as described above.
- AMD chips have so far only been exploited when using Linux with a non-default kernel feature enabled.
- Intel chips have been exploited so that an unprivileged, logged-in user can read out kernel data slowly but steadily.
- Intel chips have been exploited so that a root user in a guest virtual machine can read out host kernel data slowly but steadily.
(“Slowly” means that an attacker could suck out on the order of 1000 bytes per second, or approximately 100MBytes per day.)
Even if you assume that an attacker didn’t know where to focus his attempts, but could do no better than to grab live kernel data at random, you can consider this issue to be a bit like Heartbleed, where an attacker would often end up with garbage but might occasionally get lucky and grab hold of secret data such as passwords and private decryption keys.
Unlike Heartbleed, the attacker already needs a footprint on a vulnerable server, for example as a logged-in user with a command shell open, or as the owner of a virtual machine (VM) running on a hosting server. (In both cases the user ought to be constrained entirely to his own account or to his own VM.)
Intel has published a brief official comment entitled Intel responds to security research findings. There isn’t much in this statement, so don’t get too excited; the salient points are:
- “Intel believes these exploits do not have the potential to corrupt, modify or delete data.” Indeed, the attacks and exploits reported so far can suck data out of the kernel, but not put any data back into kernel space.
- “Recent reports that these exploits are caused by a ‘bug’ or a ‘flaw’ and are unique to Intel products are incorrect.” We used the word ‘flaw’ in our headline, and we’ll stick with it. In our opinion, an ideal implementation of speculative execution would ensure that there were no detectable side-effects left behind after a speculative execution was found to have violated security.
- “Contrary to some reports, any performance impacts are workload-dependent, and, for the average computer user, should not be significant.” You will have to interpret that for yourself.
[2018-01-04T17:40Z]
AMD has issued a statement headlined An update on AMD processor security. Like the Intel statement, it doesn’t say an awful lot, but it does confirm that AMD CPUs are not entirely immune to these attacks.
There are three CVE vulnerability numbers attached to the various F**CKWIT exploits: CVE-2017-5753, CVE-2017-5715 and CVE-2017-5754.
AMD claims that it is vulnerable to -5753, immune to -5754, and that although it is in theory at risk from -5715, “differences in AMD architecture mean there is a near zero risk of exploitation of this variant.”
[2018-01-05T00:30Z]
Firefox just pushed out a browser update, version 57.0.4, to mitigate these attacks.
This update makes it much harder for JavaScript running in the browser to measure short time intervals accurately – timing memory access speeds is necessary in these attacks so you can figure out which memory addresses ended up cached and which ones didn’t.
A memory address that is currently cached must have been accessed recently, a trick that helps you figure out what happened when an instruction was speculatively executed, even if it got cancelled in the end.