Last reviewed and updated: 10 August 2020
You can think of just about any driver performing three separate types of processing:
1. Receiving and managing Requests from the OS.
2. Creating and managing internal state and data structures.
3. Interacting with other components in the system – hardware, other drivers, or other random pieces of the OS – to complete the processing of Requests that have been received.
For certain classes of drivers, item number 2 – the creation and management of the driver’s internal state and data structures – represents the majority of the code and complexity in the project. Developing and testing such drivers can pose unique challenges due to the amount of code and the complexity of the algorithms involved.
Over the past few years, a new and effective strategy has evolved for more easily and efficiently developing and testing these types of drivers. That strategy is to develop and test the code in user mode first. Then, after the code is proven to be solid, move it to kernel mode.
Here at OSR, we’ve used this approach in our own development projects, and we’re aware of it being used for complex driver projects at Microsoft as well. Universally, people start out being uber-skeptical of the value of this approach: Won’t it be a yuuuge PITA to create the infrastructure necessary to write and test my driver code in user mode? Will I really save enough time by doing this to make it worthwhile? Won’t I wind up rewriting a lot of code when I eventually move to kernel mode? Can’t I just get on with writing my kernel mode code and get my project done?
In every case I’ve heard of or been involved with, the effort required to develop and test in user mode has been well worthwhile. Developing and testing complex code in user mode has actually saved tons of time, not cost time as you might guess. And, more importantly, the quality that’s resulted has been much better than would likely have been possible in the same amount of time if development and testing had taken place strictly in kernel mode.
Not convinced? I don’t blame you. I was skeptical at first as well. In this article, based on experiences in a few recent projects we’ve done, I’ll provide a number of hints and tips that should help you succeed if you decide to give this technique a try.
Some Recent Experience
As a real-life example where the user mode development and testing strategy was helpful, we’ll use a driver we wrote recently here at OSR. This driver required a caching package that temporarily stored disk writes in memory. We initially wrote and tested this caching package entirely in user mode.
The caching package allocates an appropriately sized memory buffer, stores the data being written to disk into that buffer, keeps track of the disk sector range to which that (now cached) data was to have been written, and then completes the write operation without ever sending it to disk. When a disk read occurs, the caching package checks to see if data for all or part of the sought range is in cache. If any of the sought sectors are in cache, the cache package satisfies the disk read using the cached data (plus any data that was not cached, read directly from disk). While there were a few other wrinkles in the real project, those are the functional requirements at their most basic.
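To make the shape of that logic concrete, here’s a deliberately tiny sketch of the write and read paths. Everything here is hypothetical and vastly simplified: a flat, direct-mapped sector-to-buffer table stands in for the real package’s (far more elaborate) range tracking, allocation, and locking.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512
#define CACHE_SLOTS  1024          /* toy capacity; the real package grows */

/* One cached sector: its disk LBA and a copy of the data. Hypothetical. */
typedef struct CACHE_ENTRY {
    int      Valid;
    uint64_t Lba;
    uint8_t  Data[SECTOR_SIZE];
} CACHE_ENTRY;

CACHE_ENTRY g_Cache[CACHE_SLOTS];

/* Trivial direct-mapped lookup; a real package needs a proper index. */
CACHE_ENTRY *CacheFind(uint64_t Lba)
{
    CACHE_ENTRY *e = &g_Cache[Lba % CACHE_SLOTS];
    return (e->Valid && e->Lba == Lba) ? e : NULL;
}

/* Write path: capture the sector in cache and "complete" the request
 * without ever sending it to disk. */
void CacheWrite(uint64_t Lba, const uint8_t *Buffer)
{
    CACHE_ENTRY *e = &g_Cache[Lba % CACHE_SLOTS];
    e->Valid = 1;
    e->Lba   = Lba;
    memcpy(e->Data, Buffer, SECTOR_SIZE);
}

/* Read path: satisfy from cache when present. Returns 0 on a miss,
 * where the real driver would read the missing sectors from disk. */
int CacheRead(uint64_t Lba, uint8_t *Buffer)
{
    CACHE_ENTRY *e = CacheFind(Lba);
    if (e == NULL) {
        return 0;                  /* miss: caller goes to disk */
    }
    memcpy(Buffer, e->Data, SECTOR_SIZE);
    return 1;                      /* hit: satisfied entirely from cache */
}
```

Note that nothing in this sketch touches a Request or a device: that separation is exactly what made the real package testable in user mode.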
This project was a perfect choice for user mode development and testing, because the major complexity in the driver was the caching code. The algorithms to track which blocks were stored in cache, locate the buffers associated with the blocks, allocate new cache storage buffers and their associated tracking structures if necessary, and to allow blocks of stored data to be read and written comprised almost all of the driver’s complexity. Both data structures and locking strategies had to be designed to ensure as many things as possible could happen in parallel (for performance reasons) while also ensuring the integrity of data structures and data.
Any good design would surely isolate the caching code from the Request processing code. And, of course, that’s how this driver was designed. This design made developing and testing the caching code in user mode a pretty easy task. The task was made easier still by the fact that Request processing code had dependencies on the caching code, but there were no dependencies from the caching code on the Request processing code. Thus, it was easy to treat the caching code as a stand-alone entity. By the way, there is a moral to this story: A good design, with good isolation of functional components, can go a long way toward facilitating user mode development and testing.
With that background, let’s look at a few hints and tips that might help you adopt this strategy for your own use.
Write a Regular Program, Not a User-Mode Driver
When I say, “develop and test your driver code in user mode”, I’m not suggesting that you should try writing your driver using UMDF. Now, let me hasten to add that UMDF is definitely a good thing. And it is certainly true that writing and testing a driver using UMDF for certain classes of devices can definitely be quicker and easier than writing the same driver using KMDF. But for the class of driver we’re focusing on here – drivers with lots of complex internal processing and data structures – I’m not suggesting you start with UMDF. I’m actually suggesting you code-up and test all the complex processing and data structure management as an ordinary user mode program from within Visual Studio.
The primary thing that writing your code as an ordinary user mode program achieves is dramatically faster “cycle time.” Think about it. When you want to make a change to the code in your kernel mode driver, you need to edit the code, stop the device, copy the executable, and restart the device. If there’s a crash involved (or your driver needs to start at boot time), you’ll need to reboot the system. You need to fight with WinDbg when it doesn’t handle your breakpoints properly or when your client and target don’t sync up properly. This all takes a lot longer than making a change and restarting your program under the Visual Studio debugger. In my experience, the time you save really adds up. You simply have to experience it to realize how dramatic the difference is.
It’s important to understand that the code you want to focus on in your user mode development and testing is not the code that directly performs device initialization, or directly receives Requests and processes them. Rather, the code you want to focus on testing is your algorithmically complex code. So, in the case of the example disk caching driver we described previously, the only code we wrote and tested in user mode was the disk caching code. Code to get Requests and process them, and code to deal with things like power state changes, was left for standard kernel mode development and testing.
Throw-Away or Sustainable Infrastructure?
When you decide to code and test parts of your driver in user mode, perhaps the biggest decision you’ll face is whether you want the user-mode infrastructure that you create to be sustainable after your initial development and testing process has ended. Creating a “throw away” infrastructure in user mode is easy. Creating a sustainable infrastructure that you can use for modifying and testing later versions of your driver code is harder, but brings additional advantages.
Restricting the use of user mode to only the initial design and development of your driver is the simpler alternative. Because the infrastructure you’re creating isn’t permanent, you’re pretty much free to hack and whack whatever you need. No need to be tidy. No need to write comments that are comprehensible by third parties. No need to be embarrassed by writing ugly code. All you need to do is to make something work temporarily, while your algorithms and data structures are being developed and validated.
Building a sustainable development and testing infrastructure in user mode tends to be much more work. But it can also yield much bigger returns on your invested time and effort. As needs change or bugs are found, you can return to your quick and convenient user mode infrastructure to rapidly code and/or test those changes.
The biggest problem that we’ve seen in maintaining a sustainable user mode infrastructure is that it’s very easy for code paths to diverge and “rot.” That is, once the initial user mode portion of your project has been completed and you’ve moved on to kernel mode development and testing, you’ll certainly be making changes to your driver’s code modules. It can be pretty easy for these changes to unintentionally break – in small or large ways – the user mode infrastructure that you previously created. If you don’t build and exercise the user mode infrastructure regularly, resurrecting it after a significant series of kernel mode changes can be quite a challenge. Worse yet, while you’re in the midst of your development cycle driving to a release, maintaining the user mode infrastructure can feel like an unnecessarily burdensome task.
For the projects that I’ve done, I almost always favor the “use once and throw away” approach. I find if I don’t take that approach, I spend way too much time thinking about my user mode infrastructure. If it’s going to be a persistent part of the project, I feel like I have to “design” it; I feel like I have to do a professional job of it. If it’s something that’s going to be used once and thrown away, I feel more at ease with just “making it work.” Hand me that chainsaw, please.
First Practical Steps
Whether you decide to create a one-time or a lasting infrastructure, there are some simple, practical, hints that we can provide that’ll make your project easier.
First, unless processor architecture (x64, x86, or ARM) plays a crucial role in the algorithms you’ll be developing and testing in user mode, decide on one processor architecture for your user-mode infrastructure and just stick to that. You ever look at the code in NTDDK.H that’s conditionalized based on processor architecture? Yeah… you don’t want to try to recreate that and keep it maintained unless absolutely necessary.
#ifndef _KERNEL_MODE
#pragma once

#ifdef _DEBUG
#define DBG 1
#endif

#ifdef __cplusplus
extern "C" {
#endif

#define CLONG ULONG

#ifndef _AMD64_     // UM supports x64 only
#define _AMD64_ 1
#endif

#define UMDF_USING_NTSTATUS 1

#include <stdio.h>
#include <windows.h>
#include <ntstatus.h>

#define DbgPrint printf

//
// Interrupt Request Level (IRQL)
//
typedef UCHAR KIRQL;
typedef KIRQL *PKIRQL;

//
// AMD64 Specific portions of Mm component.
//
// Define the page size for the AMD64 as 4096 (0x1000).
//
#define PAGE_SIZE 0x1000

//
// Define the number of trailing zeroes in a page aligned
// virtual address. This is used as the shift count when
// shifting virtual addresses to virtual page numbers.
//
#define PAGE_SHIFT 12L

#define NT_SUCCESS(Status) (((NTSTATUS)(Status)) >= 0)

// … File continues …

Figure 1 – The start of a dedicated user-mode header
Next, we recommend that you create a dedicated header file that will contain most of the definitions for your project that are specific to user mode. Conditionalize this header on the code not being built in kernel mode. Our user-mode specific headers usually start somewhat as shown in Figure 1.
Looking at Figure 1, you’ll notice that we qualify the entire include file by the symbol _KERNEL_MODE not being defined. To get the NTSTATUS values defined, we include NTSTATUS.H, but only after defining UMDF_USING_NTSTATUS to 1. We don’t recommend you try to #include any of the other WDK header files. What we’ve found works best is including WINDOWS.H and then just copying the definitions you need from NTDDK.H or WDM.H into your dedicated user mode header file. Sure it’s ugly, but you won’t lose any points for including stuff your code doesn’t really need. So, copy and paste some stuff when you start, and then as you find you need things defined – be they function prototypes, macros, or typedefs – just add them to your user mode header file and you’re good to go.
There’s one set of macros that’s used by almost every driver that’s been written: The list manipulation macros. These include InsertTailList, RemoveHeadList, and all their friends. These are defined in WDM.H. A quick hint is that it’ll be simpler to copy the versions from WDM.H that are defined when the symbol NO_KERNEL_LIST_ENTRY_CHECKS is defined. This leaves out the dynamic checks for list consistency, which you probably won’t need.
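For the flavor of what gets copied, here’s a compilable sketch of a few of the list primitives in their no-check form, written as plain inline functions. The names and semantics match the kernel’s; the exact spelling in WDM.H differs slightly, so treat this as an illustration rather than a verbatim copy.

```c
#include <stddef.h>

//
// The classic doubly linked list, as used throughout kernel code.
//
typedef struct _LIST_ENTRY {
    struct _LIST_ENTRY *Flink;
    struct _LIST_ENTRY *Blink;
} LIST_ENTRY, *PLIST_ENTRY;

// An empty list is a head that points to itself in both directions.
void InitializeListHead(PLIST_ENTRY ListHead)
{
    ListHead->Flink = ListHead->Blink = ListHead;
}

int IsListEmpty(const LIST_ENTRY *ListHead)
{
    return ListHead->Flink == ListHead;
}

// Link Entry in just before the head (i.e., at the tail of the list).
void InsertTailList(PLIST_ENTRY ListHead, PLIST_ENTRY Entry)
{
    PLIST_ENTRY Blink = ListHead->Blink;

    Entry->Flink    = ListHead;
    Entry->Blink    = Blink;
    Blink->Flink    = Entry;
    ListHead->Blink = Entry;
}

// Unlink and return the first entry after the head.
PLIST_ENTRY RemoveHeadList(PLIST_ENTRY ListHead)
{
    PLIST_ENTRY Entry = ListHead->Flink;
    PLIST_ENTRY Flink = Entry->Flink;

    ListHead->Flink = Flink;
    Flink->Blink    = ListHead;
    return Entry;
}
```

Because these are pure pointer manipulation with no kernel dependencies, they work identically in a user mode test harness.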
Once you get the basics defined, I think you’ll be surprised at how little you need to cut/paste from the WDK headers. If your code is focused on algorithms and data structure manipulation, most of that type of code doesn’t tend to use a lot of kernel mode specific functions.
Allocating Memory, Locks, and Stuff
The most common question that we encounter in building a user-mode driver testing infrastructure is how to handle basic functions that differ between kernel mode and user mode. For example, the code that you’re writing, and that you’ll eventually be moving to kernel mode, needs to allocate memory and acquire locks. Calls to these functions tend to be spread all through your driver code. How do you handle the fact that the names of the routines that you need to call in user mode are different from the names in kernel mode?
There are two possible approaches, and both have been used by teams here at OSR. The easiest approach is to use macros to define the kernel mode function name as some reasonable user mode equivalent. So, in terms of allocating memory, you might put the following definition in your dedicated user mode header file:
#define ExAllocatePoolWithTag(type, size, tag) malloc(size)
You then write your code to use the kernel mode function name, and your dedicated user mode header “does the right thing” by defining it for user mode use.
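As a minimal, compilable illustration of the macro approach: the shims below live where the dedicated user-mode header would put them, the pool-type stand-in and the free-side shim are assumptions added so the example is self-contained, and the “driver” function is written purely against the kernel names.

```c
#include <stdlib.h>

/* In a real project these shims live in the dedicated user-mode
 * header, guarded by #ifndef _KERNEL_MODE. */
#define NonPagedPoolNx 0    /* stand-in for the kernel POOL_TYPE value */
#define ExAllocatePoolWithTag(type, size, tag)  malloc(size)
#define ExFreePoolWithTag(ptr, tag)             free(ptr)

#define CTX_TAG 0x43747874u /* pool tag value; arbitrary for this sketch */

/* Driver code is then written against the kernel names, unchanged. */
void *AllocateContext(size_t Size)
{
    return ExAllocatePoolWithTag(NonPagedPoolNx, Size, CTX_TAG);
}
```

When the same file is later compiled for kernel mode, the shims simply aren’t defined and the calls resolve to the real Ex functions.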
Alternatively, you might choose to create private functions that perform the necessary operation(s), and “do the right thing” within those functions based on the mode for which the code is being compiled. For memory allocation, you might define a function like the following (see Figure 2).
__forceinline
PVOID
CachePackageAllocateMemory(_In_ SIZE_T NumberofBytes,
                           _In_ ULONG Tag)
{
#ifdef _KERNEL_MODE
    return(ExAllocatePoolWithTag(NonPagedPoolNx,
                                 NumberofBytes,
                                 Tag));
#else
    UNREFERENCED_PARAMETER(Tag);

    return(VirtualAlloc(NULL,
                        NumberofBytes,
                        MEM_COMMIT,
                        PAGE_READWRITE));
#endif
}

Figure 2 – A private, mode-conditional memory allocation function
Personally, I tend to favor the pattern of defining private functions for things like memory allocation even when I’m developing and testing my code strictly in kernel mode. There’s something about isolating the underlying mechanism from the need for that mechanism that just appeals to me. Of course, if you just want to get something running so you can test in user mode, it’s hard to argue with the simplicity of just writing your code to reference the familiar kernel mode function names you already know and love, and then redefining those names as their chosen user mode equivalents. The choice is really up to you, and is mostly related to the coding practice you prefer.
The same alternatives apply to locking. You could choose to simply #define the kernel mode function names as some appropriate user mode equivalent. But, once again, I often choose to define private locking functions in my driver code that’s particular to specific structures. Whichever method you choose, it’s usually pretty simple to find user mode locks that have similar semantics to the type of lock you choose in kernel mode. In Figure 3, you can see how we’ve handled a shared reader/writer lock that would be usable at elevated IRQL in kernel mode.
_Requires_lock_not_held_(GCLock->Lock)
_Acquires_shared_lock_(GCLock->Lock)
__forceinline
VOID
OSRCachePackageAcquireLockShared(_In_ PGC_LOCK GCLock,
                                 _Out_ PKIRQL OldIrql)
{
#ifdef _KERNEL_MODE
    *OldIrql = ExAcquireSpinLockShared(&GCLock->Lock);
#else
    AcquireSRWLockShared(&GCLock->Lock);
    *OldIrql = 0;
#endif
}

_Requires_shared_lock_held_(GCLock->Lock)
_Releases_shared_lock_(GCLock->Lock)
__forceinline
VOID
OSRCachePackageReleaseLockShared(_In_ PGC_LOCK GCLock,
                                 _In_ KIRQL OldIrql)
{
#ifdef _KERNEL_MODE
    ExReleaseSpinLockShared(&GCLock->Lock, OldIrql);
#else
    UNREFERENCED_PARAMETER(OldIrql);
    ReleaseSRWLockShared(&GCLock->Lock);
#endif
}

Figure 3 – Mode-conditional shared reader/writer lock functions
Figure 3 only shows one pair of lock/unlock functions, but that should be enough to illustrate the overall idea without making things too boring. Note that we define our own, private, data type for the lock (GC_LOCK in the example). The definition of this data type will vary, based on whether the code is compiled for user mode or kernel mode. This allows us to allocate storage for the lock in our data structures without having to hand-code changes in those structures. In the example, you can see that we use reader/writer spin locks in kernel mode, and in user mode we’ve chosen Slim Reader/Writer locks as a reasonable analogue. Note again that we conditionalize the code in the private function (which typically lives in a component-specific header file, along with the mode-specific definition of the lock data type itself).
Whatever type of kernel mode lock you need, you’ll be able to easily find a user mode analogue. And, again, whether you choose to acquire and release those locks using appropriately conditionalized private functions that you define or just #define the kernel function names to their user mode equivalents, is probably more to do with your engineering style than whether one method can be considered objectively superior.
Support Routines
One major advantage of building your test infrastructure in user mode is that you have access to all the “stuff” that coding in user-mode provides. Need random numbers, some type of collection to store test data, or the ability to sort your results? You have everything that the C++ Standard Library (std::) has to offer available to you. Need to allocate enormous data structures? Just malloc/new to your heart’s content. Need to read or record test data? Easy access to file I/O is a good thing. Need six million local variables, including an array with a zillion entries? Just declare it all and let the user mode stack grow to accommodate it. I’m not saying it’s impossible to handle these things in kernel mode. I’m just saying things such as these are a lot easier in user mode.
But wait, there’s more! Not only do you have the benefit of access to all the support routines that user mode has to offer, you can still use your familiar friends from kernel mode, the Run Time Library (RTL) functions. Most of the RTL functions are available for use in your user mode test harness by linking with NTDLL.LIB. So, for example, if you decide to use the RTL’s Generic Table Package there’s no problem with calling those functions. Love the RTL-defined bit map routines? They’re there for you to call. Of course, you will have to copy the prototypes for these functions from the appropriate WDK header file to your dedicated user mode definition file.
Other Miscellaneous Advantages
I hate to say it, but programming and debugging in user mode using Visual Studio brings with it a lot of life-simplifying features. The user-mode debugger knows a few tricks that the kernel mode debugger hasn’t learned yet. When running in user mode, you can also use the lovely little “performance profiler” that’s integrated into Visual Studio to evaluate both CPU and memory usage. I actually found a bug using the memory profiler recently, so it has demonstrated its worth to me (at least once).
It’s a Win!
For the right driver project, developing and testing algorithm and data structure intensive routines in user mode can save time while increasing the thoroughness of your testing. By writing a bit of “bridge logic” to ensure your favorite functions and data structures are available in user mode, you can avail yourself of all the features that coding in user mode provides: Richer support routines, a slightly nicer debugging experience, and – perhaps most importantly of all – a much faster time for each iteration through the “build/test/find-bug/fix-bug” cycle.
Who would think that one of the newest things in the world of kernel mode driver development is… user mode development and testing!