A Disktest program to crash Windows

99/08/13

This disktest program was written in Delphi 3. The full project source and EXE file (149 KB) are available for download.

Introduction

Want to show the network administrator who really runs the server? Need to test a real-time process to see if the drivers can run well over long periods during local or network disk access? Want to show your congressman why Windows NT shouldn't be used to run battleships? (Yes, the Navy is really trying.) Well, this disk testing program simply writes to and reads from disk drives just like normal programs do, but it also has a darker side. It seems that there may be some bugs in the IDE disk drivers and/or Windows NT kernel that Microsoft hasn't fixed yet. There are also bugs in many popular I/O drivers for specialized boards such as A/D converters. They can be a real show stopper, creating blue screens for your enjoyment.

Where I now work, we needed to convert an old DOS real-time application to Windows NT. We purchased the proper drivers from one of the top A/D board suppliers and went to work. Shortly after the first installation we started to get reports that Windows NT was crashing about every week. Further testing with this program increased the crash frequency to about every hour. Now, about 3 months later, the supplier is still saying they will get the bug fixed soon. Meanwhile, we have switched to a more expensive solution with drivers that work better, but even they can crash after several days of testing with this program. It should be noted that this bug only occurs if you do your A/D operations at high speed in a high priority thread. None of the A/D board supplier's test programs would fail. The Windows 95/98 drivers also wouldn't crash.

Quite simply, the disk I/O system and cache in Windows NT/95/98 are problematic. Microsoft has taken the stance that what you can't control won't hurt you. The following is from the latest Windows NT Resource Kit:

Tuning the Cache

This is going to be a short section, because there is not much you can do to tune the Windows NT file system cache. The tuning mechanisms are built into the Virtual Memory Manager to save you time for more important things.

In the days of DOS you could completely control the disk cache or even replace it with one from another company if you didn't like the Microsoft version. Because disk cache operations occur at the kernel level, they tend to block other processes and can delay real-time operations. Even if they don't delay operations, they can cause excessive swapping of your program pages and slow down performance of applications. I have personally found that this lack of control over the cache, and the tendency of the VMM to swap your program out of memory to make room for more disk cache, is the one main factor that makes Windows NT difficult to use for real-time control. Sure, programs run fine if you don't have much user interaction (and fewer page faults), but just run this disk test program and see your system slow down to a crawl. If you could limit the amount of physical memory used by the cache you could tune the system for your applications, but with Windows NT, there is no easy way to do this.

A properly sized disk cache can really increase the performance of a program, up to a point. If the cache becomes too big then the time required to find the information in it can exceed the benefits of using one. I don't really understand why Microsoft would allow a disk cache to grow up to 64 MB on my system with 128 MB of RAM, but they do, and their Task Manager and Performance Monitor don't even properly report the cache size. The screen shots below will show how they hide the actual usage of the cache (this may have worked correctly before I installed SP4 - this may be a "feature" added by the new service pack).

You will find some information on adjusting the Windows NT cache at http://technet.microsoft.com/en-us/sysinternals/bb897561 but my experience has been that this utility doesn't give you the total control needed.

Update 1/2006: O&O Software (http://www.oo-software.com/en/products/ooclevercache) has a cache control program that appears to work.

Disktest Program Operation

I will now discuss using the disk test program. This first screen shot shows the Task Manager (you can get it by pressing Ctrl-Alt-Del and selecting the Task Manager button) with the system resting. I have a total of 128 MB of RAM, which shows up as 130488K of available physical memory. The free physical memory and file cache are also indicated. As you can see, they don't quite add up. This is because some of the memory is in use by Windows and applications.

Figure 1: Task Manager with system resting. 

The basic operation of the disk test program is as follows:

  1. Pick a drive and directory where you want to run the test.
  2. Compressed drives and network drives will run much slower than high speed local drives. Network performance will depend on your cable speed and also on the raw server disk speed. To get a good reading of the actual disk performance you must let the test run until it writes a file at least as large as the amount of RAM you have.
  3. Check the Run Continuous box if you want the test to run repeated loops until you stop it (for overnight testing).
  4. The buffer size can be set with the slider. The 512 K setting works well in most cases (this means that we pass off a 512 K block of memory to Windows for each I/O request).
  5. Click the Run Test button to start the test.
  6. The program creates a file named Disk Speed Test File and starts to write to it.
  7. If the program blue screens Windows and leaves a copy of the file on your disk, you will need to delete this file before it will run again. If you do crash Windows I suggest you run a CHKDSK on all drives to correct potential errors and then delete the test file.
  8. The write operation runs until the disk is full and reports an error.
  9. You can click the Cancel button once to cancel the write operation before the disk is full.
  10. The read operation runs until the complete file has been read.
  11. The Cancel button can be clicked again to cancel the read test.
  12. The program then deletes the file and stops operation (unless you checked Run Continuous).
  13. The read and write speeds are reported.
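
The original program is a Delphi 3 project and its source is not reproduced here. As a rough illustration only, the write-then-read cycle in the steps above might be sketched in Python as follows. The function name run_disk_test, the zero-filled buffer, and the total_bytes limit are assumptions made for this sketch; the real program writes until the disk is full or Cancel is clicked. Only the file name and the 512 K buffer size come from the text.

```python
import os
import tempfile
import time

TEST_FILE_NAME = "Disk Speed Test File"  # file name taken from the article
MB = 1024 * 1024


def run_disk_test(directory, buffer_size=512 * 1024, total_bytes=8 * MB):
    """Write a test file in fixed-size blocks, read it back, delete it.

    Returns (write_mb_per_s, read_mb_per_s). Hypothetical helper: unlike the
    original program, it stops at total_bytes instead of filling the disk.
    """
    path = os.path.join(directory, TEST_FILE_NAME)
    block = bytes(buffer_size)  # zero-filled buffer; the real fill pattern is unknown
    written = 0
    read = 0
    try:
        # Write phase: hand one buffer to the OS per I/O request.
        start = time.perf_counter()
        with open(path, "wb", buffering=0) as f:
            while written < total_bytes:
                written += f.write(block)
        write_secs = max(time.perf_counter() - start, 1e-9)

        # Read phase: read the complete file back in the same block size.
        start = time.perf_counter()
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(buffer_size)
                if not chunk:
                    break
                read += len(chunk)
        read_secs = max(time.perf_counter() - start, 1e-9)
    finally:
        # Delete the test file even if the test fails part way through.
        if os.path.exists(path):
            os.remove(path)
    return written / MB / write_secs, read / MB / read_secs


if __name__ == "__main__":
    write_speed, read_speed = run_disk_test(tempfile.gettempdir())
    print(f"write: {write_speed:.1f} MB/s, read: {read_speed:.1f} MB/s")
```

Because the sketch goes through the operating system's file cache, a small total_bytes value will mostly measure RAM speed, which is the same effect the tests below demonstrate. To approach raw disk speed, total_bytes would need to be at least as large as installed RAM, as noted in step 2.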


To run this first test I selected an uncompressed NTFS drive and directory and let the disk write test run up to about 64 MB. Pressing the Cancel button once during the write test will stop it and start the read test. The Performance Monitor doesn't show much change in the physical memory used by the file cache. At most it goes up or down by about 2 MB.

Well, this really can't be the case because the read speed is too high. There is no way the physical disk could have read the data back at 38 MB per second. Most hard disks can only go about 2 to 3 MB per second on Windows NT (they run faster on Windows 95/98). What has actually happened here is that Windows NT used up about 64 MB of my RAM to cache the file and then read it directly out of RAM during the read test. Neither the file cache nor available physical memory showed any indication of this memory being used.
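
To put the numbers above side by side, here is a small arithmetic sketch in Python. It uses only the figures quoted in the text (a file of about 64 MB, 38 MB per second observed, 2 to 3 MB per second for the physical disk). The helper name read_seconds is made up for this illustration.

```python
def read_seconds(file_mb, mb_per_s):
    """Time in seconds to read file_mb megabytes at a steady mb_per_s."""
    return file_mb / mb_per_s


FILE_MB = 64  # approximate size written in the first test

observed = read_seconds(FILE_MB, 38)    # speed reported by the read test
disk_fast = read_seconds(FILE_MB, 3)    # upper end of typical disk speed
disk_slow = read_seconds(FILE_MB, 2)    # lower end of typical disk speed

print(f"observed read:   {observed:.2f} s")
print(f"physical disk:   {disk_fast:.2f} s to {disk_slow:.2f} s")
print(f"speed-up factor: {38 / 3:.1f}x to {38 / 2:.1f}x")
```

A read that should take 21 to 32 seconds from the disk finished in under 2 seconds, which only makes sense if the data came from RAM.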

Figure 2: disk test with all file data cached.

Now I run the same test on a FAT drive with less available space on it. The Task Manager still doesn't show much memory usage, but look at how much faster the FAT drive operates. If I let it run on the NTFS drive and write about 128 MB, the write and read speeds are both about 2 MB per second. The FAT disk is about 2 to 3 times faster than NTFS even on the same physical disk. One would expect some slowing down due to the increased NTFS overhead, but this test says that perhaps we should be running our servers on FAT partitions (too bad they don't go above 2 GB).

Figure 3: disk test with only some file data cached on a FAT drive.

This screen shot of the Task Manager shows how the physical memory values change very little when the disk test is running.

Figure 4: Task Manager performance during a disk test.

We now run the test on a FAT drive with a little more free space. For this test I write and read about 190 MB which is a little larger than the amount of physical RAM I have.

Figure 5: disk test on FAT drive with less use of cache.

Now for something strange. Finally, if I let the disk test program run for a long time, the Task Manager starts to show some physical memory being used for the disk cache, but only during the final stages of the read test.

Figure 6: Task Manager finally showing some cache use.

In reality, we know most of the physical memory is being used for the cache. Run a large application (like MS Word) during a disk test and look at the physical memory. You should still see a lot of available physical memory provided you have 128 MB to start with like I do. Switch to the Processes page and make a note of the Page Fault count for WINWORD. Let the disk test run and minimize and then maximize Word. The Page Fault count will jump up by several hundred when you maximize Word. If there is so much physical memory available, why did the VMM swap Word's pages out? Here is a classic example of poor performance caused by improper cache and memory manager design if we are willing to believe the numbers reported by the Task Manager. In reality, Windows NT is filling all the available RAM up with a file cache and swapping or removing application pages from memory to make room. If only we could set the disk cache at a reasonable 10 MB, I bet that we would see much better multitasking performance as well as decreased real-time response times.

Crashing Windows

Finally, here are some tips on how to crash (or blue screen) Windows NT. Run the disk test program on a local drive in the continuous mode while your other I/O or real-time application is running. It is best to use a small FAT partition with at least 256 MB free for the disk test. If you use the Windows system drive where the paging file is kept you may cause application failures when the disk is written to zero free bytes during the test. Another good reason not to run it on the paging disk is that it may crash with zero free space and then not boot the system. We ran our A/D program at high priority so it always got CPU time during the disk test and ran at full speed. With luck you will see a blue screen in a few hours. If your I/O driver is working properly it will just keep running.

At the very least, this program will show you how well (or not) Windows NT works in real-time applications. When you run this program it will force other programs to swap out of memory to make room for a larger disk cache. If you switch between applications you will start to notice that it takes a long time for them to respond to commands because they need to load most of their code back in and must wait for the cache to empty and free the necessary memory. If your custom I/O driver doesn't lock down the I/O buffers properly, or your application creates a page fault at the wrong time, you can end up with a blue screen.

Run it on a network drive and you will make the whole company love you. This program can easily hog all the bandwidth on a network if the server can keep up with it. I find it interesting that any user running this program can tie up the network or a server so that other users start to see very slow response times.

© James S. Gibbons 1987-2015