Popular virtualisation platform Xen has just announced a worrying bug.
Fortunately, it’s already patched.
And, as far as we can tell, numerous major server hosting companies that rely on Xen were given an early warning, and installed the fix before its public announcement.
Here’s what went wrong, and why it matters.
Understanding VMs
Virtualisation is where you take one physical computer and make it pretend to be one or more pseudo-computers, known as virtual machines (VMs).
Inside those VMs, known as guests, you can then install a range of different operating systems and applications.
The physical computer, known as the host, acts as the overall manager of the virtual guests.
There are many good reasons for virtualisation:
- For testing your software on multiple operating systems at the same time.
- For trying out new system configurations more conveniently than dedicating an entire server to each test.
- For investigating suspicious programs in a way that lets you efficiently watch what they do at a very low level.
- For sharing a huge and expensive server amongst multiple services or customers.
- For keeping spare capacity to run up an unknown number of extra server instances during busy times.
Each guest VM thinks it has a real computer all to itself, and, thanks to the virtualisation layer, can’t mess with any of the other guests.
That’s especially important in a hosting environment, where one physical server may provide virtual server instances to several different customers at the same time.
For all you know, your competitor’s web portal might be running in a VM on the very same physical hardware as yours, if you both chose the same service provider.
And it’s equally important that no guest can “reach out” and fiddle with the real hardware on which the VM is running.
If a guest could make unregulated changes in another guest, or in the host itself (which controls all the other guests), that could cause a security crisis.
Hypervision
Software that lets you set up virtual machines in this way is usually called a hypervisor, which is really just a fancy name for a virtual machine monitor or a virtual machine manager.
→ You may hear the term hypervisor reserved for virtualisation software that is very “thin,” serving as both the host operating system and the virtual machine monitor at the same time, such as Xen. The host computer is therefore dedicated entirely to virtualisation. But the word hypervisor is also used to include products like VirtualBox, a virtual machine monitor that runs alongside other user applications on a general-purpose operating system such as Windows or OS X.
Generally speaking, the security of hypervisors (examples include Xen, VirtualBox, VMware and KVM) has been pretty good in recent years.
For the most part, the popular hypervisors have correctly kept VMs away from each other, as hypervisory doctrine dictates.
But it has not all been plain sailing, and various bugs allowing data to leak between VMs, or for VMs to “escape” into the host, have been found and fixed along the way.
Such as XSA-123.
As easy as 123
The “123” bug is in a part of Xen called the emulator.
Unlike a hypervisor, an emulator is a computer simulator that doesn’t rely on the underlying host computer hardware at all.
A hypervisor shares out the real host hardware amongst multiple virtual “soft computer” guests, while keeping the guests separate from each other.
But the host CPU directly runs the machine code from each guest, so that an Intel-based host hypervisor can only support Intel-based VM guests.
An emulator is a true software computer simulation that pretends to be a specific sort of computer, even if the emulator itself is running on different hardware.
→ Google’s Android development kit, for example, includes an emulator that simulates an ARM-based phone or tablet on your Intel-based laptop. The MAME project simulates now-defunct arcade game hardware on a modern computer so you can play classic arcade games on your PC, using the original machine code. (Because it can.)
Emulators are therefore much more flexible, and in theory ought to be safer, than hypervisors, because guest programs never actually get to choose which physical instructions run on the host server’s CPU.
One downside of emulation, though, is that, all else being equal, it’s slower – often much slower – than virtualisation, because everything is simulated in software.
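To make the idea of a "software computer" concrete, here is a minimal sketch of an emulator for a made-up machine with just three instructions. The instruction names (`load`, `add`, `halt`) and the register names are invented for illustration; they are not taken from Xen or from any real CPU. Notice that the guest program is only ever data to the emulator, and that every guest instruction costs many host operations, which is where the slowdown comes from.

```python
# Toy emulator for a hypothetical three-instruction machine.
# Nothing here corresponds to a real CPU or to Xen's emulator.

def run(program, max_steps=1000):
    """Interpret a list of instruction tuples and return the final registers."""
    regs = {"a": 0, "b": 0}
    pc = 0          # program counter: index of the next instruction
    steps = 0
    while pc < len(program):
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        op, *args = program[pc]
        if op == "load":        # load <register> <constant>
            regs[args[0]] = args[1]
        elif op == "add":       # add <destination> <source>
            regs[args[0]] += regs[args[1]]
        elif op == "halt":
            break
        else:
            # The emulator, not the guest, decides what is a valid instruction.
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
        steps += 1
    return regs


program = [("load", "a", 2), ("load", "b", 40), ("add", "a", "b"), ("halt",)]
print(run(program))  # {'a': 42, 'b': 40}
```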
For all the speed advantages of virtualisation, however, it can’t entirely replace emulation.
The tricky stuff
A secure hypervisor needs an emulator, as well as a virtualiser, to deal with the “tricky stuff.”
There are some CPU instructions that guests simply can’t be allowed to run by themselves, such as instructions that deal with low-level memory management.
If a guest could use the real CPU to manage its own memory directly, it would be able to manage the memory of the entire physical computer at the same time, so it could fiddle with other guests, or the host itself.
This is much like the way that different programs in a regular operating system are prevented from messing with memory allocation directly, to stop them reading each other’s data and violating security.
So, hypervisors carefully use emulation for some instructions in the guest virtual machines they are managing, to ensure an extra layer of control that can keep even maliciously-minded guest VMs apart.
Unfortunately, that means a hypervisor generally has all of its own complexity, plus the complexity and potential bugs of an emulator.
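The split between "run it directly" and "emulate it carefully" can be sketched roughly as follows. This is a simplified model, not how Xen is actually written: the instruction names (`add`, `set_page_table`), the `Guest` class and the page numbers are all hypothetical, and the "direct" path merely stands in for the real CPU running guest code at full speed.

```python
# Simplified model of a hypervisor deciding which guest instructions to emulate.
# All names here are hypothetical; this is not Xen's code or API.

PRIVILEGED = {"set_page_table"}   # instructions a guest may not run by itself


class Guest:
    def __init__(self, name, allowed_pages):
        self.name = name
        self.allowed_pages = set(allowed_pages)  # physical pages this guest owns
        self.page_table = None
        self.acc = 0


def run_directly(guest, op, arg):
    """Stand-in for the real CPU executing harmless guest code."""
    if op == "add":
        guest.acc += arg
    else:
        raise ValueError(f"unknown instruction: {op}")


def emulate(guest, op, arg):
    """The hypervisor performs the operation itself, after checking it."""
    if op == "set_page_table":
        forbidden = sorted(set(arg) - guest.allowed_pages)
        if forbidden:
            raise PermissionError(f"{guest.name} may not map pages {forbidden}")
        guest.page_table = list(arg)


def execute(guest, op, arg):
    if op in PRIVILEGED:
        emulate(guest, op, arg)
        return "emulated"
    run_directly(guest, op, arg)
    return "direct"


guest = Guest("web-vm", allowed_pages=[4, 5, 6])
print(execute(guest, "add", 3))                   # direct
print(execute(guest, "set_page_table", [4, 5]))   # emulated
try:
    execute(guest, "set_page_table", [1, 4])      # page 1 belongs to someone else
except PermissionError as err:
    print("refused:", err)
```

Every check in `emulate` is code that has to be right, which is exactly why the emulated path adds to the hypervisor's attack surface.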
In XSA-123, the bug sounds minor, and ought to have been inconsequential.
Greatly simplified, an attacker could potentially break out of a guest by feeding the emulator meaningless machine instructions.
The same instructions would cause no trouble at all if the hypervisor let them run on the real CPU, but could trick the Xen emulator into letting the attacker fiddle with memory inside the hypervisor, i.e. on the host itself.
That sort of vulnerability is known as an arbitrary memory overwrite: a bug that lets you write data of your choice to a memory address of your choice.
That is always bad, frequently dangerous, and often results in an exploitable Remote Code Execution (RCE) hole.
RCE, remember, is where you trick a part of the system that isn’t supposed to take instructions from you into doing what you want, based on data sent in from outside.
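Here is a toy illustration of how an emulator can end up with an arbitrary memory overwrite. It is emphatically not the actual XSA-123 flaw, which lives in Xen's x86 instruction emulator; it simply assumes a hypothetical "store" handler that trusts an offset supplied by the guest. The memory layout and function names are invented for the example.

```python
# Toy model of an arbitrary memory overwrite in an emulated "store" instruction.
# Hypothetical layout: the first 16 bytes belong to the hypervisor,
# the next 16 bytes belong to the guest.

HYPERVISOR_SIZE = 16
GUEST_SIZE = 16


def new_memory():
    return bytearray(HYPERVISOR_SIZE + GUEST_SIZE)


def buggy_store(memory, offset, value):
    # BUG: the guest-supplied offset is never validated, so a negative
    # offset lands in the hypervisor's part of memory.
    memory[HYPERVISOR_SIZE + offset] = value


def checked_store(memory, offset, value):
    # The fix: refuse anything outside the guest's own region.
    if not 0 <= offset < GUEST_SIZE:
        raise ValueError("store outside guest memory")
    memory[HYPERVISOR_SIZE + offset] = value


memory = new_memory()
buggy_store(memory, -12, 0xFF)   # guest picks both the address and the data
print(memory[4])                 # 255: a byte in the hypervisor's region changed

try:
    checked_store(new_memory(), -12, 0xFF)
except ValueError as err:
    print("refused:", err)       # refused: store outside guest memory
```

In a real hypervisor the overwritten bytes might be a function pointer or a permissions table, which is how a humble overwrite can turn into code execution.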
What to do?
• If you use Xen, patch right away.
As far as we can tell, this bug was not a zero-day, meaning that the hole was not known to crooks before the patch was available to the good guys.
Don’t leave the bad guys a gap to exploit you now that the patch is out.
• If you use VMs run by a hosting provider, ask what hypervisor they use.
Get assurance that they have patched already if they are a Xen shop.
Numerous hosting providers were alerted before disclosure and were able to patch up front, including at least Amazon (the service used by Sophos Cloud) and Rackspace.
Many other providers patched proactively, too, but it pays to ask!