Windows System Software -- Consulting, Training, Development -- Engineering Excellent, Every Time.

How L1 Terminal Fault (L1TF) Mitigation and WinDbg Wasted My Morning (a.k.a. Yak Shaving: WinDbg Edition)

I’ve been doing some research into the Windows Filtering Platform and the information available at each of the various filtering layers. In particular, I’ve been focusing on the information available in Windows 7 as that predates some ETW trace points that contain interesting network event data.

After attaching a filter to the FWPM_LAYER_ALE_AUTH_CONNECT_V4 layer, I started poking around at the various values supplied to the classify function. While the various network data is interesting (e.g. source/destination IP, payload, etc.), I’m primarily interested in the FWPS_INCOMING_METADATA_VALUES structure provided with the callback. For example, this structure provides information about the process making the network connect request:

1: kd> ??inMetaValues->processId
 unsigned int64 0x484

1: kd> ??(wchar *)inMetaValues->processPath->data
 wchar_t * 0xfffffa80`1b8f1550
  "\device\harddiskvolume1\windows\system32\svchost.exe"

In looking at the rest of the structure, I became interested in the token field of the structure and what exactly it represented:

1: kd> ??inMetaValues->token
unsigned int64 0xffffffff`800004e8

It’s clearly a kernel HANDLE, though it wasn’t clear to what exactly. Checking the documentation didn’t exactly clarify it for me:


A handle for the token used to validate the permissions for the user. This member contains valid data only if the FWPS_METADATA_FIELD_TOKEN flag is set in the currentMetadataValues member.
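As an aside, the "clearly a kernel HANDLE" observation can be checked with a little bit arithmetic. The sketch below is illustrative only: the mask value is an assumption about how x64 Windows tags kernel handles (it is not taken from the documentation quoted above), and the function name is made up for this example.

```python
# Assumption: on x64 Windows, a kernel handle has all of the upper 33 bits set.
# This mask is assumed for illustration; it does not come from the WFP docs.
KERNEL_HANDLE_MASK = 0xFFFFFFFF80000000


def is_kernel_handle(handle: int) -> bool:
    """Return True if the handle value carries the (assumed) kernel handle tag."""
    return (handle & KERNEL_HANDLE_MASK) == KERNEL_HANDLE_MASK


token = 0xFFFFFFFF800004E8  # inMetaValues->token from the debugger output

print(is_kernel_handle(token))
# Strip the tag bits to see the part of the value that is left over.
print(hex(token & ~KERNEL_HANDLE_MASK & 0xFFFFFFFFFFFFFFFF))
```

A user-mode style value such as 0x4e8 fails the same check, which is what makes the sign-extended 0xffffffff`8... pattern stand out in the debugger.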


Wanting to know who exactly “the user” was in this case, I attempted to run !handle on the token HANDLE value:

1: kd> !handle @@c++(inMetaValues->token)

PROCESS fffffa801b8da410
    SessionId: 0  Cid: 0484    Peb: 7fffffda000  ParentCid: 0268
    DirBase: 67f17000  ObjectTable: fffff8a006fcf600  HandleCount: 396.
    Image: svchost.exe

fffff8a000001780: Unable to read handle table

Denied! My immediate thought was that the 19H1 version of WinDbg no longer worked properly with the Windows 7 HANDLE table structure. Thus began my great yak shaving adventure…

  1. I found a copy of the RS2 version of WinDbg on an old drive in my system. Same problem. Need to go back farther!
  2. I installed the Windows 7 SP1 DDK to get the Windows 7 version of WinDbg. Same problem. Went back too far!
  3. I installed the Windows 8.1 WDK to get the Windows 8.1 version of WinDbg. Same problem. Uh, what now?

I thought about just giving up at this point, but it’s a short week due to the July 4th holiday in the US, so it seemed like a good idea to invest (read: waste) time in figuring out what exactly was going on.

Let’s see that error text again:

fffff8a000001780: Unable to read handle table

Assuming that 0xfffff8a000001780 was a virtual address, I tried to dump it out and saw that it was indeed invalid:

1: kd> dc fffff8a000001780
fffff8a0`00001780  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`00001790  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017a0  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017b0  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017c0  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017d0  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017e0  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017f0  ???????? ???????? ???????? ????????  ????????????????

Let’s check out the PTE and see why:

Aha! The PTE for this virtual address has been put in the Transition state. While in this state, the physical page containing the data is resident in memory (i.e. it’s not paged out), but the virtual address is marked as invalid to the hardware. If the processor touches this virtual address it will generate a soft fault, where the page fault will be satisfied immediately without having to go to disk. In this case, the data for the virtual address is at physical address 0x7C1FE780 ((0x7C1FE * 0x1000) + byte offset 0x780 from the virtual address).
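The arithmetic here is easy to reproduce outside the debugger. The following is a minimal sketch, not anything WinDbg actually runs: it assumes the fixed x64 PTE base of 0xFFFFF68000000000 that Windows 7 uses (newer releases randomize it), and the helper names are invented for this example.

```python
# Assumption: Windows 7 x64 maps the page tables at this fixed virtual base.
PTE_BASE = 0xFFFFF68000000000
PAGE_SHIFT = 12          # 4 KB pages
PAGE_OFFSET_MASK = 0xFFF


def pte_address(va: int) -> int:
    """Virtual address of the PTE that maps va (8 bytes per PTE)."""
    virtual_page_number = (va >> PAGE_SHIFT) & ((1 << 36) - 1)
    return PTE_BASE + virtual_page_number * 8


def physical_address(pfn: int, va: int) -> int:
    """Combine a page frame number with the byte offset taken from va."""
    return (pfn << PAGE_SHIFT) | (va & PAGE_OFFSET_MASK)


va = 0xFFFFF8A000001780
print(hex(pte_address(va)))                # where the PTE itself lives
print(hex(physical_address(0x7C1FE, va)))  # 0x7c1fe780, as computed above
```

Under that assumption the first value works out to 0xfffff6fc50000008, which is the PTE address that gets handed to dt later in this post.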

While the transition state is used throughout the O/S for various purposes, one of the most common reasons to see it is if you have Driver Verifier’s Force IRQL Checking option enabled. With this option enabled Verifier will periodically put all pageable addresses into the transition state so that it can crash the machine if you touch a pageable address at DISPATCH_LEVEL or above. And, of course, I had Verifier enabled while running my experiments so that I could avoid making stupid mistakes right out of the gate.

So, that’s great and all, but WinDbg is supposed to implicitly translate virtual addresses that are in the transition state. After all, WinDbg knows the physical address (it’s in the PTE), so why not just translate the address and show you the data? At this point, I turned on debugging info by pressing Ctrl+Alt+D and tried to dump the memory again:

1: kd> dc fffff8a000001780
Amd64VtoP: Virt fffff8a000001780, pagedir 0000000067f17000
Amd64VtoP: PML4E 0000000067f17f88
Amd64VtoP: PDPE 0000000005d84400
Amd64VtoP: PDE 000000007c1fa000
Amd64VtoP: PTE 000000007c1f9008
Amd64VtoP: Mapped phys 000008007c1fe780
Physical Memory Address 00000800`7c1fe780 is greater than MaxPhysicalAddress
fffff8a0`00001780  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`00001790  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017a0  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017b0  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017c0  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017d0  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017e0  ???????? ???????? ???????? ????????  ????????????????
fffff8a0`000017f0  ???????? ???????? ???????? ????????  ????????????????
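For anyone wanting to follow the Amd64VtoP lines by hand, the walk can be sketched as below. This is only an illustration of standard x64 four-level paging (nine index bits per level, eight bytes per entry); the function names are hypothetical and it does not reflect how the debugger engine is actually implemented.

```python
def table_indices(va: int) -> tuple:
    """Split a virtual address into its PML4, PDPT, PD, and PT indices."""
    return tuple((va >> shift) & 0x1FF for shift in (39, 30, 21, 12))


def entry_address(table_base: int, index: int) -> int:
    """Physical address of an entry, given its table's physical base."""
    return table_base + index * 8


va = 0xFFFFF8A000001780
pml4_i, pdpt_i, pd_i, pt_i = table_indices(va)

# The PML4 base is the "pagedir" value from the trace.
print(hex(entry_address(0x67F17000, pml4_i)))  # 0x67f17f88, the PML4E line
# The remaining indices give the low 12 bits of the PDPE, PDE, and PTE lines.
print(hex(pdpt_i * 8), hex(pd_i * 8), hex(pt_i * 8))
```

The offsets 0x400, 0x0, and 0x8 line up with the low bits of the PDPE, PDE, and PTE addresses in the trace, so the walk itself is fine; it is the final "Mapped phys" value that is off.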

WTF? Based on the output of !pte, the physical address should be 0x7c1fe780, so where did 0x8007c1fe780 come from? From here I went and dumped the raw PTE structure and looked at the PTE in the transition state (n.b. that the PTE address came from the !pte output):

1: kd> dt nt!_MMPTE FFFFF6FC50000008 u.Trans.
   +0x000 u        : 
      +0x000 Trans    : 
         +0x000 Valid    : 0y0
         +0x000 Write    : 0y1
         +0x000 WriteThrough : 0y0
         +0x000 CacheDisable : 0y0
         +0x000 SwizzleBit : 0y0
         +0x000 Protection : 0y00100 (0x4)
         +0x000 Prototype : 0y0
         +0x000 Transition : 0y1
         +0x000 PageFrameNumber : 0y000010000000000001111100000111111110 (0x8007c1fe)
         +0x000 Unused   : 0y1111000010000000 (0xf080)

Aha! So, even though !pte said the PFN was 0x7C1FE, the actual PFN in the PTE is 0x8007C1FE. So, what’s going on here?

Well, remember how I said that WinDbg will implicitly decode PTEs in the transition state and show us the data anyway? It turns out that processors do this as well in order to speculatively execute instructions that would otherwise page fault. This creates yet another speculative execution side channel, known as L1 Terminal Fault (L1TF). To mitigate this, Windows doesn’t put the actual PFN in the transition PTE and instead sets an unused high-order bit so that the PTE points to an invalid page. Now if the processor tries to speculatively execute through it, the transition PTE points off to invalid memory and the processor is stopped in its tracks. You can read more about L1TF in Matt Miller’s excellent blog post here.
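To see which bit is involved on this particular machine, the two PFN values can simply be compared. The sketch below only describes what the numbers in this debug session imply; which bit Windows chooses on other systems is not something the values here can tell us, and the `unswizzle` helper is a made-up name for illustration.

```python
PAGE_SHIFT = 12

reported_pfn = 0x7C1FE   # what !pte displayed
raw_pfn = 0x8007C1FE     # PageFrameNumber field from dt nt!_MMPTE

# XOR isolates the bit(s) that differ between the two values.
swizzle_mask = raw_pfn ^ reported_pfn
pfn_bit = swizzle_mask.bit_length() - 1
physical_bit = pfn_bit + PAGE_SHIFT


def unswizzle(pfn: int, mask: int) -> int:
    """Clear the mitigation bit(s) to recover the real page frame number."""
    return pfn & ~mask


print(hex(swizzle_mask), pfn_bit, physical_bit)
print(hex(unswizzle(raw_pfn, swizzle_mask)))
print(hex((raw_pfn << PAGE_SHIFT) | 0x780))  # the bogus "Mapped phys" value
```

In this session the difference is a single bit, bit 31 of the PFN, which is bit 43 of the physical address. That is what pushes the translated address past MaxPhysicalAddress in the debugger output above.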

So, the debugger is being tripped up due to an O/S change and is not able to properly translate the given transition PTE. Checking the date on the kernel image shows that this version of Windows 7 is very new and so it’s not entirely surprising:

1: kd> lmt mnt
Browse full module list
start             end                 module name
fffff800`02a1c000 fffff800`02ff9000   nt        Thu May 16 10:34:59 2019 (5CDD7513)

I decided to see if the problem was solved in WinDbg v.Next, as it gets much more frequent updates than the traditional WinDbg. Much to my great joy, it does indeed work!

1: kd> !handle @@c++(inMetaValues->token)

PROCESS fffffa801b8da410
    SessionId: 0  Cid: 0484    Peb: 7fffffda000  ParentCid: 0268
    DirBase: 67f17000  ObjectTable: fffff8a006fcf600  HandleCount: 396.
    Image: svchost.exe

Kernel handle table at fffff8a000001780 with 647 entries in use

800004e8: Object: fffff8a006fcf8b0  GrantedAccess: 000f01ff Entry: fffff8a0029f93a0
Object: fffff8a006fcf8b0  Type: (fffffa8018d5c9c0) Token
    ObjectHeader: fffff8a006fcf880 (new version)
        HandleCount: 2  PointerCount: 20

Of course, the new version of WinDbg still uses the same debugger engine as the traditional WinDbg. So, please don’t try this at home, but I couldn’t stop myself from copying the DbgEng libraries delivered with WinDbg v.Next from:

C:\Program Files\WindowsApps\Microsoft.WinDbg_1.1906.12001.0_neutral__8wekyb3d8bbwe\amd64

To the traditional WinDbg installation folder:

C:\Program Files (x86)\Windows Kits\10\Debuggers\x64

So now I have two working WinDbgs 😊

With the yak officially shaved naked, I was able to finally determine that the token field is a HANDLE to the token of the caller making the connection. Yes, I probably should have just assumed that before getting out the shears, but what fun would that have been?