Microsoft diskspd. Part 1: Preparing to test.

Installing DiskSpd (diskspd).

Overview

diskspd operates on Windows filesystems and will read/write one or more files concurrently.
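
For a sense of what a run looks like, here is a sketch of a small mixed read/write test (the drive letter and file names are just placeholders; the flags are standard diskspd options):

diskspd.exe -d60 -b8K -r -t2 -o4 -w30 D:\testfile1.dat D:\testfile2.dat

This runs for 60 seconds (-d60), issuing random (-r) 8KB I/Os (-b8K) from two threads per target (-t2) with four outstanding I/Os per thread (-o4) and a 30% write mix (-w30), against both files at once.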

The NULL byte problem

By default, when diskspd creates a file, the file is full of NULL bytes. Many storage systems (at least NetApp and Nutanix, the ones I know of) will optimize the layout of NULL-byte files. This means that test results from NULL-byte files will not reflect the performance of real applications that write actual data.

Suggested work-around

To avoid overly optimistic results, first create the file, then write a randomized data pattern to the file before doing any testing.

Create a file using diskspd -c, e.g. a 32GB file on drive D:, then overwrite it with random data.

diskspd.exe -c32G D:\testfile1.dat

This will create a 32GB file full of NULL bytes.
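
To verify the file really is all zeroes, one quick sketch is to read the first few bytes in PowerShell (Windows PowerShell 5.1 syntax; newer PowerShell versions replace -Encoding Byte with -AsByteStream):

Get-Content D:\testfile1.dat -Encoding Byte -TotalCount 16

Every value printed should be 0.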

The newly created file contains only NULL bytes.

Then overwrite with a random pattern

diskspd.exe -w100 -Zr D:\testfile1.dat
The same file after being overwritten with diskspd -w100 -Zr.
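
Since -c creates the target before the workload starts, the create and overwrite steps can also be combined into one command. A sketch, assuming the duration is long enough for the sequential write pass to cover the whole 32GB file (check the bytes written in the diskspd output):

diskspd.exe -c32G -d120 -w100 -Zr D:\testfile1.dat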

Available write patterns.

diskspd provides a variety of options when writing out datafiles. I strongly recommend using random patterns for testing underlying storage, unless you are specifically trying to understand how the storage handles particular patterns.

For purposes of demonstration, create three 2GB files.

diskspd.exe -c2G F:\testfile2.dat
diskspd.exe -c2G F:\testfile3.dat
diskspd.exe -c2G F:\testfile4.dat
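
A quick way to confirm the sizes (an illustrative PowerShell one-liner):

Get-ChildItem F:\testfile?.dat | Select-Object Name, Length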

Option 1 – Repeating pattern

Use -w100 (write 100%) with no additional flags to generate a repeating pattern.

diskspd.exe -w100 F:\testfile2.dat

Option 2 – Write NULL (Zero byte) pattern

Use -w100 with -Z to generate NULL bytes.

diskspd.exe -w100 -Z F:\testfile3.dat

Option 3 – Write Random pattern (Recommended)

Use -w100 with -Zr to generate a random pattern.

diskspd.exe -w100 -Zr F:\testfile4.dat
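
If generating new random data per I/O with -Zr costs too much CPU, diskspd also accepts a sized form, -Z<size> (e.g. -Z1M), which pre-fills a source buffer of that size with random data and writes from it. The output repeats at the buffer boundary, so it is less random than -Zr but much cheaper to produce. A sketch, assuming your diskspd build supports the sized form:

diskspd.exe -w100 -Z1M F:\testfile4.dat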

Here are the resulting patterns:

testfile2.dat (-w100) == repeating pattern. Writes the values 0x00, 0x01 … 0xFE, 0xFF, then repeats.
testfile3.dat (-w100 -Z) == NULL bytes
testfile4.dat (-w100 -Zr) == random pattern

On-disk format of the written files.
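
To inspect the patterns yourself, PowerShell's Format-Hex can dump the first few 16-byte rows of each file (a quick sketch):

Format-Hex -Path F:\testfile2.dat | Select-Object -First 4
Format-Hex -Path F:\testfile3.dat | Select-Object -First 4
Format-Hex -Path F:\testfile4.dat | Select-Object -First 4

Expect the incrementing 00 01 02 … sequence in testfile2.dat, all zeroes in testfile3.dat, and random bytes in testfile4.dat.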

A simple test of randomness in data

A really effective test of data randomness is to see how much a file can be compressed. The built-in zip compression in Windows File Explorer is good enough. We see that the repeating pattern compresses almost as well as the NULL-byte pattern. So on an intelligent storage platform, although the special-case optimization for NULL bytes will be defeated, the storage engine will probably compress the repeating pattern internally anyway. This is good in most use-cases, but not if you are trying to test the underlying storage with a particular file size.

testfile2.dat (repeating pattern) 2GB -> 97KB
testfile3.dat (NULL bytes) 2GB -> 74KB
testfile4.dat (random) 2GB -> 2GB (note: the “compressed” file is actually a bit larger!)

A simple “compression” test will show how random the data really is in a file.
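
For a command-line version of the same check, NTFS compression via the built-in compact.exe gives a similar signal (a sketch; the ratios will differ from zip, but the relative ordering should hold):

compact /c F:\testfile2.dat
compact /c F:\testfile3.dat
compact /c F:\testfile4.dat

Each command compresses the file in place and reports its compression ratio. Decompress with compact /u afterwards if you intend to reuse the files for testing.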
