Last reviewed and updated: 10 August 2020
Can you count the number of WinDBG commands you know on one hand? Been meaning to learn some commands other than !analyze -v but been too busy to crack the docs open? Well then, this article is for you! I’m going to break down ten WinDBG commands that I couldn’t live without.
System Information Commands
Sometimes as part of your analysis, you’d like a bit more detailed information about the target system that generated the crash dump. The commands in this section are going to let you find out critical details about your system that just might be the clues you need to perform your analysis.
Don’t be fooled by the name: the !vm command gives you a great quick view into both the virtual and physical memory usage on a system. When I run !vm I like to use a flags value of 0x21, which omits some process-specific memory usage information and adds some extra info about the kernel address space on platforms that support it:
    kd> !vm 0x21
    *** Virtual Memory Usage ***
        Physical Memory:        261886 (   1047544 Kb)
        Page File: \??\C:\pagefile.sys
          Current:   1572864 Kb  Free Space:   1571132 Kb
          Minimum:   1572864 Kb  Maximum:      3145728 Kb
        Available Pages:        211575 (    846300 Kb)
        ...
        Free System PTEs:       231247 (    924988 Kb)
        ...
        NonPagedPool Usage:          0 (         0 Kb)
        NonPagedPoolNx Usage:     2969 (     11876 Kb)
        NonPagedPool Max:        52691 (    210764 Kb)
        ...
        PagedPool Usage:          4904 (     19616 Kb)
        PagedPool Maximum:       51200 (    204800 Kb)
        ...
NOTE: The !vm output currently has a bug where the non-paged pool usage will always be listed as zero. The actual non-paged pool usage is listed as “NonPagedPoolNx Usage” in the output.
Note here that we see the amount of physical memory in the system as well as how much memory is currently free. We then get to note the current usage of the system PTEs as well as the pools. If we suspect some sort of resource exhaustion going on in the system, we can use this command to quickly pinpoint which resource is being consumed.
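The Kb figures in the output are simply the page counts multiplied by the page size, so you can sanity check them or turn them into percentages yourself. Here is a minimal Python sketch of that arithmetic, assuming the 4 KB page size of the x86 target shown above. The helper names are my own invention for illustration, not anything the debugger provides.

```python
PAGE_SIZE_KB = 4  # assumption: 4 KB pages, as on the x86 target in this dump


def pages_to_kb(pages):
    """Convert a page count reported by !vm into kilobytes."""
    return pages * PAGE_SIZE_KB


def percent_used(usage_pages, max_pages):
    """Return how much of a resource limit is consumed, as a percentage."""
    return 100.0 * usage_pages / max_pages


# Values copied from the !vm output above.
print(pages_to_kb(261886))  # physical memory, in Kb
print(pages_to_kb(211575))  # available pages, in Kb

# Non-paged pool: use the NonPagedPoolNx line because of the bug noted above.
print(round(percent_used(2969, 52691), 1))
print(round(percent_used(4904, 51200), 1))  # paged pool
```

On this system both pools are at a small fraction of their maximums, so pool exhaustion would not be my first suspect.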
Do you have a customer that can repeatedly reproduce a problem but you just can’t reproduce it with the exact same procedure? Maybe you’re not using a fast enough processor or the right BIOS version, but in any event, how can you tell what system configuration the customer is using from just a dump file? Enter !sysinfo, a command that can tell you just about anything you’d want to know about your system using information cached on the target. For example, let’s see what kind of processor is in this system:
    kd> !sysinfo cpuinfo
    [CPU Information]
        ~MHz = REG_DWORD 1779
        Component Information = REG_BINARY 0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0
        Configuration Data = REG_FULL_RESOURCE_DESCRIPTOR ff,ff,ff,ff,ff,ff,ff,ff,0,0,0,0,0,0,0,0
        Identifier = REG_SZ x86 Family 15 Model 1 Stepping 2
        ProcessorNameString = REG_SZ Intel(R) Pentium(R) 4 CPU 1.80GHz
        Update Signature = REG_BINARY 0,0,0,0,2d,0,0,0
        Update Status = REG_DWORD 0
        VendorIdentifier = REG_SZ GenuineIntel
        MSR8B = REG_QWORD 2d00000000
        CPUID1 = REG_BINARY 12,f,0,0,8,8,1,0,0,0,0,0,ff,fb,eb,3f
How about the BIOS version and other platform information?
    kd> !sysinfo machineid
    Machine ID Information [From Smbios 2.3, DMIVersion 35, Size=2982]
    BiosVendor = Dell Computer Corporation
    BiosVersion = A05
    BiosReleaseDate = 10/05/2001
    SystemManufacturer = Dell Computer Corporation
    SystemProductName = OptiPlex GX400
    BaseBoardManufacturer = Dell Computer Corporation
    BaseBoardProduct = OptiPlex GX400
    BaseBoardVersion =
There’s more here as well if you go exploring the documentation for the command. For example, you can even query information about which RAM slots are populated using the smbios switch (e.g. !sysinfo smbios -memory).
Suspected Race Condition Commands
Race conditions are the worst. They’re difficult to track, difficult to reproduce, and when you get a crash it may be too late. The race has already happened and when the system crashes you’re dealing with the secondary failure, so there’s nothing that can be done, right? Wrong! WinDBG has a couple of commands that can make you feel like you’ve won the lottery and pinpoint the racing thread with ease.
If you’re lucky, the thread that is racing with your crashing thread is still running on another processor. This is where !running comes in, which will show you information about each thread that is currently running on a processor in the system. Whenever I run this command I like to specify the -ti switch, to include thread stacks in the output as well as idle threads:
    1: kd> !running -ti
    ..
      0  f7857120  85ed2da8  ................
    ChildEBP RetAddr
    ba9be270 804f961f nt!KeBugCheckEx+0x19
    ba9be62c 805310dd nt!KiDispatchException+0x307
    ba9be694 8053108e nt!CommonDispatchException+0x4d
    ba9be6a4 f6cd768d nt!Kei386EoiHelper+0x18e
    ba9be6b4 f6c0675a ks!KsReleaseIrpOnCancelableQueue+0x5b
    ba9be758 f6c15264 portcls!CIrpStream::ReleaseUnmappingIrp+0xd0
    ba9be780 f6c21760 portcls!UpdateActivePinCount+0xb
    f6cd7553 10c2c95e portcls!CPortPinWavePci::DistributeDeviceState+0x4d
      1  f7867120  86fb5b30  ................
    ChildEBP RetAddr
    f7a1eba0 f6c0d445 portcls!CIrpStream::GetMapping+0x17
    f7a1ebc8 f6c31ce1 portcls!CPortPinWavePci::GetMapping+0x2a
    …
If the thread isn’t actively running, you might think that you would have to go the long way and try finding a racing thread with !process 0 7. However, WinDBG also provides us a way to look at threads that are ready to run, with the !ready command. Maybe the current thread pre-empted another thread and that’s the reason for the race, in which case the other thread will be in the ready state. Whenever using !ready, I like to use the 0xF flags value so that I can see the call stacks of the threads, though I won’t do that here just to keep the output short:
    kd> !ready
    Processor 0: Ready Threads at priority 8
        THREAD 8543cd48  Cid 0004.0b58  Teb: 00000000 Win32Thread: 00000000 READY
    Processor 0: Ready Threads at priority 1
        THREAD 85367020  Cid 0004.0008  Teb: 00000000 Win32Thread: 00000000 READY
Have an address and want to know what it is? Is it a pool allocation? Is it paged out? Here are a couple of commands that will get you the information that you need.
!pool is a standard command for any toolbox, so I suspect that most of you know it and love it already. However, for those that might not be aware, !pool will take an arbitrary virtual address and let you know if it is a pool allocation or not. If it is indeed a pool allocation, you’ll be told some details about it, such as whether it’s allocated or freed, the length of the allocation, the tag, etc. When I use !pool, I like to specify a flags value of 2 to suppress information about other allocations surrounding the address:
    kd> !pool 8539da40 2
    Pool page 8539da40 region is Nonpaged pool
    *8539da40 size:    8 previous size:  148  (Free)       Io
            Pooltag Io : general IO allocations, Binary : nt!io
Before moving on, I’d like to note something in the output here that often confuses people. The previous size value is not the previous size of this allocation. Instead, it tells you the size of the allocation that precedes this entry in the pool page. The Memory Manager uses it as part of a consistency check to validate that the page of memory has not been corrupted by buffer overruns or underruns.
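To make that consistency check concrete, here is a minimal Python sketch of the idea. It is a deliberately simplified model that I made up for illustration: each block is just a (size, previous size) pair, which is not the real pool header layout, and the function name is hypothetical. Only the (0x8, 0x148) entry mirrors the !pool output above.

```python
def find_corruption(blocks):
    """Return the index of the first block whose previous size does not
    match the size of the block before it, or None if the page is consistent.

    blocks: list of (size, previous_size) tuples in page order.
    """
    expected_previous = 0  # the first block in a page has no predecessor
    for index, (size, previous_size) in enumerate(blocks):
        if previous_size != expected_previous:
            return index
        expected_previous = size
    return None


# Hypothetical page contents for illustration.
good_page = [(0x148, 0x0), (0x8, 0x148), (0x40, 0x8)]

# The same page after an overrun of the first block scribbled on the
# header of the second one.
bad_page = [(0x148, 0x0), (0x8, 0x41414141), (0x40, 0x8)]

print(find_corruption(good_page))  # None
print(find_corruption(bad_page))   # 1
```

Note that the block flagged by the check is the victim. The culprit is usually the owner of the allocation just before it.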
Sometimes you’d like to view the virtual memory structures for a given virtual address, such as the PDE and PTE. In that case, you can use the !pte command, which will provide decoded information about a virtual address. Here’s some example output for a valid virtual address:
    kd> !pte 9371a000
                   VA 9371a000
    PDE at C0300934            PTE at C024DC68
    contains 9B441863          contains 8B660121
    pfn 9b441 ---DA--KWEV      pfn 8b660 -G--A--KREV
We can also see what happens if we specify a virtual address that isn’t valid to the hardware, such as one with its backing page currently in transition:
    kd> !pte 93726000
                   VA 93726000
    PDE at C0300934            PTE at C024DC98
    contains 9B441863          contains 8B5A0860
    pfn 9b441 ---DA--KWEV      not valid
                               Transition: 8b5a0
                               Protect: 3 - ExecuteRead
Now we have some further details as to why the address is invalid, which may be invaluable to our investigation.
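If you are curious where those numbers come from, the following Python sketch reproduces them by hand. It assumes the non-PAE x86 layout that this particular target appears to use (4-byte entries, page tables mapped at 0xC0000000, page directory at 0xC0300000) and a software PTE format with the protection in bits 5 through 9, the prototype flag in bit 10, and the transition flag in bit 11. Those constants are my assumptions, chosen because they match the output above. They do not apply to PAE or x64 systems, and the function names are hypothetical.

```python
PTE_BASE = 0xC0000000   # assumption: non-PAE x86 page table self-map
PDE_BASE = 0xC0300000   # assumption: non-PAE x86 page directory
PAGE_SHIFT = 12         # 4 KB pages
PDE_SHIFT = 22          # each PDE maps 4 MB without PAE
ENTRY_SIZE = 4          # 32-bit entries without PAE


def pte_address(va):
    """Virtual address of the PTE that maps va."""
    return PTE_BASE + (va >> PAGE_SHIFT) * ENTRY_SIZE


def pde_address(va):
    """Virtual address of the PDE that maps va."""
    return PDE_BASE + (va >> PDE_SHIFT) * ENTRY_SIZE


def decode_pte(value):
    """Pull the fields discussed above out of a raw 32-bit PTE value."""
    valid = bool(value & 0x1)
    prototype = bool(value & 0x400)
    transition = (not valid) and (not prototype) and bool(value & 0x800)
    return {
        "valid": valid,
        "pfn": value >> PAGE_SHIFT,
        "transition": transition,
        # The protection field is only meaningful for invalid (software) PTEs.
        "protect": None if valid else (value >> 5) & 0x1F,
    }


print(hex(pde_address(0x9371A000)), hex(pte_address(0x9371A000)))
print(decode_pte(0x8B660121))   # the valid PTE from the first example
print(decode_pte(0x8B5A0860))   # the transition PTE from the second example
```

Running the arithmetic yourself is rarely necessary, but it helps when you need to double check a PTE that looks suspicious or has been corrupted.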
Viewing O/S Trace Information
The O/S has some built in trace facilities that you can turn on to collect data that might be useful during analysis. Unfortunately these facilities need to be turned on before the problem happens, but knowing that this information is available can be useful in some situations.
We’re all using Driver Verifier, right? Well, what you might not realize is that, starting in Windows Vista, Verifier has been enhanced to keep a log of interesting events that happen in your driver. Assuming that you’ve enabled Driver Verifier on your driver, you can now extract valuable information with the following !verifier commands:
- !verifier 0x80 Address – This command dumps the allocate and free log, which logs each pool allocate and free made by your driver. Included in the output is the call stack of the operation, which can be invaluable when you’re trying to track down use after free or double free bugs. Optionally, the command takes an address value that will limit the output to only include allocation ranges including that address.
- !verifier 0x100 Address – This command dumps the IRP log, which logs each call to IoAllocateIrp, IoCancelIrp, and IoCompleteRequest made by your driver.
- !verifier 0x200 – This command dumps the critical region log, which logs each call to KeEnterCriticalRegion and KeLeaveCriticalRegion made by your driver.
!htrace and !obtrace
Handle leaks and object reference leaks can be very tricky to track down, especially when working with a large code base. Luckily, the O/S has built in facilities for logging handle and reference count activities. All you need to do is enable them and be aware of the commands available for extracting the logs, which in this case are !htrace and !obtrace.
Handle tracing needs to be enabled on a per-process basis, which can be done by using Application Verifier. As driver writers, however, we’re typically only interested in kernel handles. By implementation, kernel handles are actually just handles from the handle table of the System process. And, as luck would have it, if you enable Driver Verifier, handle tracing is automatically turned on for the System process. Thus, as long as Driver Verifier is enabled on the target, you can dump the handle tracing log for all kernel handles with !htrace 0 PEPROCESS:
    1: kd> !htrace 0 85e0a170
    Process 0x847c6530
    ObjectTable 0x85c01aa8
    --------------------------------------
    Handle 0x281C - CLOSE
    Thread ID = 0x00000ab4, Process ID = 0x00000408
    0x82a63f72: nt!ObpCloseHandle+0x7F
    0x82a98bf0: nt!ObCloseHandle+0x40
    0x828cef7b: nt!ExpWorkerFactoryCreateThread+0xFC
    0x828bf02e: nt!NtSetInformationWorkerFactory+0x56D
    ...
    --------------------------------------
    Handle 0x281C - OPEN
    Thread ID = 0x00000ab4, Process ID = 0x00000408
    0x82a97fde: nt!ObOpenObjectByPointerWithTag+0xC1
    0x82a98043: nt!ObOpenObjectByPointer+0x24
    0x82a9cdf0: nt!PspCreateObjectHandle+0x2E
    0x82a742ff: nt!PspInsertThread+0x685
    0x82a9392e: nt!PspCreateThread+0x244
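On a busy system this log gets long, so I sometimes save the debugger output to a text file and post-process it. The Python sketch below is one rough way to do that, assuming the "Handle 0x... - OPEN" and "Handle 0x... - CLOSE" line format shown above. It simply counts opens minus closes per handle value. The function name and the 0x2820 handle in the sample are hypothetical, and because handle values are reused and the log may not reach back to every open, treat the result as a list of candidates rather than proof of a leak.

```python
import re
from collections import Counter

# Matches the event header lines in the !htrace output shown above.
EVENT_RE = re.compile(r"Handle (0x[0-9A-Fa-f]+) - (OPEN|CLOSE)")


def unbalanced_handles(log_text):
    """Return {handle: count} for handles opened more often than closed."""
    balance = Counter()
    for handle, event in EVENT_RE.findall(log_text):
        balance[handle] += 1 if event == "OPEN" else -1
    return {handle: count for handle, count in balance.items() if count > 0}


# Trimmed sample: 0x281C comes from the log above, 0x2820 is made up.
sample = """
Handle 0x281C - CLOSE
Handle 0x2820 - OPEN
Handle 0x281C - OPEN
"""

print(unbalanced_handles(sample))  # {'0x2820': 1}
```

Once you have a candidate handle, go back to the full log and read the OPEN call stack for it to see who created the handle.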
Object reference tracing, on the other hand, needs to be enabled on a system wide basis with GFlags. Due to the volume of tracing generated, when you enable tracing you must specify the pool tag of the object you want to trace (e.g. ‘File’) and you can also limit the tracing to only apply to a single process’ objects. Once you have enabled tracing via GFlags, you can view the trace for a given object with !obtrace:
    0: kd> !obtrace 9f6aca50
    Object: 9f6aca50
     Image: notepad.exe
    Sequence   (+/-)   Tag   Stack
    --------   -----   ----  ---------------------------------------------------
          f3    +1     Dflt  nt!ObCreateObject+1c4
                             nt!IopAllocRealFileObject+50
                             nt!IopParseDevice+ac4
                             nt!ObpLookupObjectName+4fa
                             nt!ObOpenObjectByName+159
                             nt!IopCreateFile+673
                             nt!NtOpenFile+2a
                             nt!KiFastCallEntry+12a
          f4    +1     Dflt  nt!ObfReferenceObjectWithTag+27
                             nt!ObfReferenceObject+12
                             nt!IopParseDevice+1395
                             nt!ObpLookupObjectName+4fa
                             nt!ObOpenObjectByName+159
                             nt!IopCreateFile+673
                             nt!NtOpenFile+2a
                             nt!KiFastCallEntry+12a
          f5    -1     Dflt  nt!ObfDereferenceObjectWithTag+22
                             nt!NtOpenFile+2a
                             nt!KiFastCallEntry+12a
    --------   -----   ---------------------------------------------------
    References: 2, Dereferences 1
    Tag: Dflt References: 2 Dereferences: 1 Over reference by: 1
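The summary at the bottom of the trace is just a tally of the (+/-) column. As a rough illustration, here is a Python sketch that recomputes it from saved output, assuming the column layout shown above (a hex sequence number, a signed delta, then the tag). The function name is hypothetical and the sample is trimmed to one stack frame per entry.

```python
import re

# Sequence number, signed delta, and tag at the start of each trace entry.
ENTRY_RE = re.compile(
    r"^[ \t]*([0-9a-f]+)[ \t]+([+-]\d+)[ \t]+(\S+)", re.MULTILINE
)


def tally(trace_text):
    """Return (references, dereferences, over_reference) for a trace."""
    deltas = [int(delta) for _seq, delta, _tag in ENTRY_RE.findall(trace_text)]
    references = sum(d for d in deltas if d > 0)
    dereferences = -sum(d for d in deltas if d < 0)
    return references, dereferences, references - dereferences


# Trimmed copy of the trace entries from the output above.
sample = """
      f3    +1     Dflt  nt!ObCreateObject+1c4
                         nt!IopAllocRealFileObject+50
      f4    +1     Dflt  nt!ObfReferenceObjectWithTag+27
      f5    -1     Dflt  nt!ObfDereferenceObjectWithTag+22
"""

print(tally(sample))  # (2, 1, 1)
```

The result matches the "References: 2 Dereferences: 1 Over reference by: 1" line in the debugger output. When the over reference count is not what you expect, the stacks attached to the unmatched +1 entries are where I would start looking.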
Plug and Play and Power Issues
Nothing is more annoying than when the system hangs during a plug and play or power operation. Luckily, the debugger provides a quick way to identify the threads participating in the operation so that you can get right to resolving the issue.
!pnptriage is a nifty command that combines the output of several PnP related debugging commands. It will identify any of your devnodes with problems as well as dump out any PnP worker threads that are currently executing, which will give you the ability to quickly identify the threads in the system that might be of interest to you:
    0: kd> !pnptriage
    …
    ********************************************************************************
    Dumping devnodes with problems...
    ********************************************************************************
    Dumping IopRootDeviceNode (= 0x86c05c08)
    DevNode 0x8a131e78 for PDO 0x8a1af6a8
      InstancePath is "USB\VID_0403&PID_6001\7&2363c875&0&1"
      State = DeviceNodeInitialized (0x302)
      Previous State = DeviceNodeUninitialized (0x301)
      Problem = CM_PROB_FAILED_INSTALL
    ...
    ********************************************************************************
    Dumping currently active PnP thread (if any)...
    ********************************************************************************
    Dumping device action thread...
    THREAD 847f8798  Cid 0004.0044  Teb: 00000000 Win32Thread: 00000000 WAIT: (Executive) KernelMode Non-Alertable
        8712b944  NotificationEvent
    ...
    nt!KiSwapContext+0x26
    nt!KiSwapThread+0x266
    nt!KiCommitThreadWait+0x1df
    nt!KeWaitForSingleObject+0x393
    nothing!NothingAddDevice+0xa9
    nt!PpvUtilCallAddDevice+0x45
    nt!PnpCallAddDevice+0xb9
    nt!PipCallDriverAddDevice+0x565
    nt!PipProcessDevNodeTree+0x15d
    nt!PiRestartDevice+0x8a
    nt!PnpDeviceActionWorker+0x1fb
    nt!ExpWorkerThread+0x10d
    nt!PspSystemThreadStartup+0x9e
    nt!KiThreadStartup+0x19
!poaction is the essential command for debugging any of your power related issues. Most importantly, !poaction will show any outstanding query or set power operations and the driver to which they were sent, which can be used to quickly identify which devices are preventing the power operations from occurring. Great for getting insight into what’s going on when the system will mysteriously refuse to enter or resume from a lower power state:
    1: kd> !poaction
    PopAction: 8296ea60
      State..........: 3 - Set System State
      Updates........: 0
      Action.........: Sleep
      Lightest State.: Hibernate
      Flags..........: 80000004 OverrideApps|Critical
      Irp minor......: SetPower
      System State...: Hibernate
      Hiber Context..: 89dd5978
    Allocated power irps (PopIrpList - 82978480)
      IRP: 8e1d8f00 (set/D0,), PDO: 89c0a248, CURRENT: 89fde028
      IRP: 9d722e48 (set/D0,), PDO: 89c08818, CURRENT: 89f92620
      IRP: 9fe7ee70 (set/D0,), PDO: 89c08940, CURRENT: 89f917a0
    ...
Did I Miss Any?
Got your own favorite command that wasn’t represented here? Send me an email at firstname.lastname@example.org and let me know!
Analyst’s Perspective is a column by OSR Consulting Associate, Scott Noone. When he’s not root-causing complex kernel issues, he’s leading the development and instruction of OSR’s Kernel Debugging seminar. Comments or suggestions for this or future Analyst’s Perspective columns can be addressed to email@example.com.