Meltdown and Spectre: What about drivers?

The week the Meltdown and Spectre vulnerabilities were disclosed was one of the most fun weeks I’ve had in a while. Not only were the vulnerabilities mind-bendingly clever, but that week had just about everything you could possibly want in a story: mystery, intrigue, heroes, villains, and probably the greatest patch ever submitted.

After consuming as much information as I could stomach about the issues, I was left with three thoughts:

  1. Kudos to all the folks that worked on these vulnerabilities. I hope someone bought them beer.
  2. Kudos to all the folks that worked on patching these vulnerabilities. I hope someone bought them scotch.
  3. I write loadable kernel modules for a living…I really hope that my code isn’t subject to these vulnerabilities.

While I can’t do anything to address #1 & #2, I went into a bit of a panic about #3. Here at OSR, we take writing secure kernel mode code seriously. Not only is this important for the code that we write, but we also teach people all over the world how to write drivers. It’s important for us to understand the types of vulnerabilities that drivers can introduce so we can teach best practices and keep the world a little safer.

So, the question that I’m hoping to answer in this post is simple: do Meltdown and Spectre impact Windows driver development?

Note that I’m going to assume that the reader is at least casually aware of the vulnerabilities. I’ll describe the details here only enough to orient ourselves. There’s plenty of reading out there if you want the actual details.

In thinking about this question, I realized that “impact” can mean two different things:

  1. Do drivers need to do anything special to avoid leaking information?
  2. Do the Windows mitigations for Meltdown and Spectre break existing drivers in any way?

I’ll go down the list of vulnerabilities and answer the above for each, leading from least impactful to drivers to most impactful.

Variant 1: Bounds Check Bypass (CVE-2017-5753/Spectre)

This vulnerability occurs because the processor may speculatively execute an out of bounds array access, even if the code correctly performs a bounds check:

    if (index < data->MaxCount) {
        // May speculatively execute out of bounds access
        foo = data->Elements[index];
    }

If the attacker can control the index, and you return data to user mode based on the contents at that index, then there is a chance for exploitation. Finding exactly the right sequence and conditions in an existing binary appears to be tricky. The most susceptible victims appear to be modules that JIT code from untrusted sources, which lets the attacker create a sequence that would otherwise be difficult to find.
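To make the pattern concrete, here is a minimal sketch of the kind of sequence that is interesting to an attacker: a bounds-checked load whose result selects the address of a second load. The names GADGET_TABLE, Probe, PROBE_STRIDE, and ReadDependent are hypothetical and exist only for illustration, and the 64-byte stride is an assumed cache line size. Architecturally the function is correct and never reads out of bounds; the concern is only what the processor may do speculatively before the bounds check resolves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROBE_STRIDE 64   /* assumed cache line size; illustrative only */

/* Hypothetical table, mirroring the "data" object in the snippet above. */
typedef struct {
    size_t  MaxCount;
    uint8_t Elements[16];
} GADGET_TABLE;

/* Second array; which line gets touched depends on the value read above. */
uint8_t Probe[256 * PROBE_STRIDE];

/*
 * Architecturally safe: returns 0 for any out-of-range index.
 * Speculatively, both dependent loads may run with an out-of-range index
 * before the comparison resolves, leaving a cache footprint in Probe.
 */
uint8_t ReadDependent(const GADGET_TABLE *data, size_t index)
{
    assert(data != NULL);
    if (index < data->MaxCount) {
        uint8_t value = data->Elements[index];   /* first load */
        return Probe[value * PROBE_STRIDE];      /* second, dependent load */
    }
    return 0;
}
```

A single bounds-checked load on its own is much less interesting; it is the second, data-dependent access that turns the speculative read into something observable.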

Ignoring the fact that this creates a cache side channel attack, the processor is working “as designed” in this case. We haven’t told the processor that it must determine the result of the if before it performs the array access. To do that, we would have to insert a fence instruction:

    if (index < data->MaxCount) {
        // May NOT speculatively execute out of bounds access
        _mm_lfence();
        foo = data->Elements[index];
    }

Do drivers need to do anything special to avoid leaking information?

Technically, yes, we need fences to avoid the speculative execution.

However, these fences should only be used in the cases where the code is vulnerable to attack. In the words of Intel in their white paper on the subject, “…the insertion of LFENCE must be done judiciously; if it is used too liberally, performance may be significantly compromised.”

As an aside, I probably would have avoided the use of the word, “compromised”, but that’s just me…
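As a rough illustration of what “judiciously” might look like, the sketch below fences only the path where the index arrives from an untrusted caller and leaves an internal loop alone. Everything here is an assumption made for the example: ELEMENT_TABLE, SumElements, ReadUntrustedIndex, and the SPECULATION_BARRIER macro are hypothetical names, and the macro expands to nothing on non-x64 targets so that the sketch still builds there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h>                       /* _mm_lfence */
#define SPECULATION_BARRIER() _mm_lfence()
#else
#define SPECULATION_BARRIER() ((void)0)      /* placeholder: no fence emitted */
#endif

/* Hypothetical table type used only for this example. */
typedef struct {
    size_t  MaxCount;
    uint8_t Elements[16];
} ELEMENT_TABLE;

/* The index is our own loop counter, not attacker-chosen: left unfenced here. */
uint32_t SumElements(const ELEMENT_TABLE *data)
{
    uint32_t sum = 0;
    assert(data != NULL);
    for (size_t i = 0; i < data->MaxCount; i++) {
        sum += data->Elements[i];
    }
    return sum;
}

/* The index comes from an untrusted source (e.g. a user request): fence it. */
int ReadUntrustedIndex(const ELEMENT_TABLE *data, size_t index, uint8_t *out)
{
    assert(data != NULL && out != NULL);
    if (index < data->MaxCount) {
        SPECULATION_BARRIER();   /* the load below waits for the bounds check */
        *out = data->Elements[index];
        return 1;
    }
    return 0;                    /* out of range: *out is left untouched */
}
```

Whether a given path really counts as untrusted is exactly the judgment call that is hard to make by hand.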

It’s probably unreasonable for developers to properly identify the places in their code where these fences are required. Luckily, the Visual Studio compiler was recently updated to include the /Qspectre switch. The compiler will now take care of locating the offending code sequences in any compiler generated code and adding the fences in as necessary.

Our recommendation is to recompile all of your kernel mode code with this switch at your earliest convenience. There should be minimal performance impact, and there is very little concern about backwards compatibility because fence instructions have been available for quite some time now.

Do the Windows mitigations for Meltdown and Spectre break existing drivers in any way?

No.

Presumably the Windows mitigation for this issue will be to insert fences where necessary in the OS. That should have zero impact on our driver code.

Variant 2: Branch Target Injection (CVE-2017-5715/Spectre)

This vulnerability occurs because the processor may speculatively execute an indirect call or jump if the target location is not known:

    call [rax] ; Where does this go? Don’t know, let’s guess!

The trick is that the processor records indirect call and jump locations so that it can guess where to go on the next call. However, it’s possible to pollute this history and trick the processor into speculatively executing something that wouldn’t otherwise have executed.

Even worse is that this pollution can cross protection domains. So, for example, a user application can use its own execution history to steer the kernel to a predetermined location.

Pair this with Variant 1 and you’re off and running.
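For a sense of where indirect branches come from in driver-style C code, here is a small sketch of a function pointer dispatch table. The handler type, the table, and the Dispatch function are hypothetical and not a real Windows interface. A call through a table like this is typically compiled to an indirect call, because the target is only known at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler signature; not a real Windows interface. */
typedef int (*REQUEST_HANDLER)(int input);

int HandleRead(int input)  { return input + 1; }
int HandleWrite(int input) { return input * 2; }

REQUEST_HANDLER HandlerTable[] = { HandleRead, HandleWrite };
#define HANDLER_COUNT (sizeof(HandlerTable) / sizeof(HandlerTable[0]))

int Dispatch(size_t code, int input)
{
    if (code >= HANDLER_COUNT) {
        return -1;               /* unknown request code */
    }
    assert(HandlerTable[code] != NULL);
    /* The target depends on "code", so the processor has to predict it. */
    return HandlerTable[code](input);
}
```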

Do drivers need to do anything special to avoid leaking information?

No.

While this would seemingly affect drivers, the Windows mitigation will make this problem go away. Basically, Windows will no longer allow the execution history of user mode to affect indirect branch prediction in kernel mode. This is done either by using the new Indirect Branch Restricted Speculation (IBRS) and Indirect Branch Predictor Barrier (IBPB) processor controls or by a software construct such as a “return trampoline” (retpoline) that leaves the branch prediction logic dazed and confused.

In either case, the mitigation is performed as part of handling OS entry. The branch prediction history is no longer polluted by user mode by the time we reach driver code. Thus, nothing for drivers to worry about.

Do the Windows mitigations for Meltdown and Spectre break existing drivers in any way?

No.

Variant 3: Rogue Data Cache Load (CVE-2017-5754/Meltdown)

This vulnerability occurs because the processor may speculatively access privileged memory from an unprivileged application:

    foo = *kernelAddress;
    bar = baz[foo];

If baz[foo] is speculatively executed, then we can measure the effects and thus deduce the value of *kernelAddress. Oops.
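The “measure the effects” step can be pictured with a toy model. The sketch below is not an exploit and does no timing at all: a plain array, LineIsHot, stands in for the cache state of baz, ModelSpeculativeTouch stands in for the speculative access, and RecoverByte stands in for the attacker probing each line. All three names are hypothetical, and the model assumes one cache line of baz per possible byte value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PROBE_LINES 256   /* one line of "baz" per possible byte value */

/* Stand-in for cache state: nonzero means that line of baz is cached. */
uint8_t LineIsHot[PROBE_LINES];

/* Models the footprint left behind by the speculative "bar = baz[foo]". */
void ModelSpeculativeTouch(const uint8_t *kernelAddress)
{
    assert(kernelAddress != NULL);
    memset(LineIsHot, 0, sizeof(LineIsHot));   /* models flushing baz first */
    uint8_t foo = *kernelAddress;              /* the privileged read */
    LineIsHot[foo] = 1;                        /* the dependent access */
}

/* Models probing every line; a real attack would time each access instead. */
int RecoverByte(void)
{
    for (int i = 0; i < PROBE_LINES; i++) {
        if (LineIsHot[i]) {
            return i;   /* the index of the hot line is the leaked byte */
        }
    }
    return -1;          /* nothing was touched */
}
```

The point of the model is only that the value of foo is never returned directly; it is recovered from which line ended up hot.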

The only way to address this is to ensure that the kernel virtual address space is not mapped while running in user mode. This is entirely different than how things have always worked, which is to have the kernel address space mapped but not accessible.

Do drivers need to do anything special to avoid leaking information?

No.

This attack is carried out entirely from user mode. The existence of a driver just means that there are some additional kernel virtual addresses that you might want to read from. However, the driver can’t do anything to protect its memory from being speculatively fetched by the processor.

Do the Windows mitigations for Meltdown and Spectre break existing drivers in any way?

No, assuming the drivers are doing things that are architecturally valid for Windows.

The Windows mitigation for this issue is the biggest change of them all. While running in user mode, the process will use a set of page tables that do not map the kernel virtual address space. When we switch to kernel mode, the OS will switch to a different set of process specific page tables that map both the kernel and user virtual address spaces.

We can see an example by using the !vtop WinDbg command. This takes the base (physical) address of the virtual memory tables to use along with a virtual address to translate. We can see that the current process has two separate page directories:

    6: kd> ??@$proc->Pcb.DirectoryTableBase
    unsigned int64 0xab10000
    6: kd> ??@$proc->Pcb.UserDirectoryTableBase
    unsigned int64 0xab80000

And if we feed in a valid kernel virtual address to !vtop, we can see that the address is mapped in the kernel address space:

    6: kd> !vtop 0xab10000 0xfffff8038e087270
    Amd64VtoP: Virt fffff8038e087270, pagedir 000000000ab10000
    Amd64VtoP: PML4E 000000000ab10f80
    Amd64VtoP: PDPE 0000000001089070
    Amd64VtoP: PDE 000000000108a380
    Amd64VtoP: PTE 0000000001016438
    Amd64VtoP: Mapped phys 0000000002c87270
    Virtual address fffff8038e087270 translates to physical address 2c87270.

But not in the user address space:

    6: kd> !vtop 0xab80000 0xfffff8038e087270
    Amd64VtoP: Virt fffff8038e087270, pagedir 000000000ab80000
    Amd64VtoP: PML4E 000000000ab80f80
    Amd64VtoP: PDPE 00000000012a2070
    Amd64VtoP: PDE 00000000012a1380
    Amd64VtoP: zero PDE
    Virtual address fffff8038e087270 translation fails, error 0xD0000147.
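The entry addresses that !vtop prints can be reproduced by hand. With x64 4-level paging and 4 KB pages, each level of the walk is indexed by 9 bits of the virtual address and each entry is 8 bytes. The helper names below are hypothetical, and the intermediate table bases used in the usage note are inferred by rounding the reported entry addresses down to a page boundary, so treat this as a sketch rather than a description of how the debugger does it.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_SIZE 8u        /* each x64 page table entry is 8 bytes */
#define INDEX_MASK 0x1FFu    /* 9 bits of index per level */
#define PAGE_MASK  0xFFFull  /* low 12 bits: offset within a 4 KB page */

/* level: 3 = PML4, 2 = PDPT, 1 = PD, 0 = PT */
uint64_t TableIndex(uint64_t va, unsigned level)
{
    assert(level <= 3);
    return (va >> (12 + 9 * level)) & INDEX_MASK;
}

/* Physical address of the entry consulted at a level, given that table's base. */
uint64_t EntryAddress(uint64_t tableBase, uint64_t va, unsigned level)
{
    return (tableBase & ~PAGE_MASK) + TableIndex(va, level) * ENTRY_SIZE;
}
```

For the address above, TableIndex(0xfffff8038e087270, 3) is 0x1f0, so EntryAddress with the base 0xab10000 gives 0xab10f80 and with the base 0xab80000 gives 0xab80f80, matching the two PML4E lines. The same virtual address selects the same slot in both tables; the user table simply has nothing mapped further down the walk.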

Of course, there has to be some kernel address space mapped in the user tables so that we can do things like handle system service calls. These code paths quickly switch over to the isolated kernel address space after some initial dispatching code.

While this is all interesting, it should still have zero impact on device drivers. The isolated kernel address space still maps user mode as normal. Thus, while in kernel mode we still have access to the same page tables that map user memory. We can still read/write user memory, map MDLs into user space, etc.

About the only thing that wouldn’t work is a driver that had somehow managed to hook into the OS before the kernel has a chance to switch to the isolated kernel address space. For example, hooking the SYSENTER MSR would lead to disaster, as the driver’s hook would not be mapped into the user application’s address space. This would be architecturally invalid anyway, though.

Summary

After all that, in the end there’s very little to see here in terms of driver impact.  Yay!  To summarize our guidance:

  • The Windows patches designed to mitigate the Meltdown and Spectre (variants 2 and 3) vulnerabilities should handle these issues without any code or logic changes in drivers, file systems, or file system filters.
  • The Windows Meltdown and Spectre mitigation patches should not have any adverse effect on drivers, file systems, or file system filters.
  • All Windows kernel-mode code should be recompiled with the /Qspectre switch at your earliest convenience. This switch is available starting in VS 2017 Update 5. This doesn’t require an emergency fix. Rather, we recommend you use this switch when you build the next update of your product.