How to install and set up diskspd before starting your first performance tests, and how to avoid wrong results due to NULL byte issues.
Installing Disk-Speed (diskspd).
- Get the diskspd binary from Microsoft: http://aka.ms/diskspd
- Manual is here: https://github.com/Microsoft/diskspd/wiki
- Agree to the license
- Extract the Zip file (\Documents\Diskspd-2.0.21a)
- Open Terminal/Command Prompt
- cd to the extracted directory
- cd to AMD64
diskspd operates on Windows filesystems, and will read from / write to one or more files concurrently.
The NULL byte problem
By default, when diskspd creates a file, it is a file full of NULL bytes. Many storage systems (at least NetApp and Nutanix that I know of) will optimize the layout of NULL byte files. This means that test results from NULL byte files will not reflect the performance of real applications that write actual data.
To avoid overly optimistic results, first create the file, then write a randomized data pattern to the file before doing any testing.
Create a file using diskspd -c, e.g. a 32G file on drive D:, then overwrite it with random data.
diskspd.exe -c32G D:\testfile1.dat
This creates a 32G file full of NULL bytes.
Then overwrite it with a random pattern:
diskspd.exe -w100 -Zr D:\testfile1.dat
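diskspd write runs are bounded by a duration, so on a large file the overwrite may stop before it reaches the end, leaving part of the file still full of NULL bytes. One way to check is to scan the file for blocks that are still all zeros. Here is a minimal Python sketch of that idea. The function name count_zero_blocks and the 64 KiB block size are my own choices for illustration and are not part of diskspd.

```python
def count_zero_blocks(path, block_size=64 * 1024):
    """Return (zero_blocks, total_blocks) for the file at `path`.

    A block counts as "zero" when every byte in it is 0x00.
    block_size is an arbitrary choice for this sketch.
    """
    zero_blocks = 0
    total_blocks = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:  # end of file
                break
            total_blocks += 1
            # bytes.count(0) counts the 0x00 bytes in the block
            if block.count(0) == len(block):
                zero_blocks += 1
    return zero_blocks, total_blocks
```

For example, count_zero_blocks(r"D:\testfile1.dat") returning a non-zero first value would suggest that some regions were never overwritten with random data. Note that this reads the whole file, so it takes a while on a 32G file.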
Available write patterns.
diskspd provides a variety of options when writing out data files. I strongly recommend using random patterns for testing underlying storage, unless you are specifically trying to understand how the storage handles particular patterns.
For purposes of demonstration, create three files of 2GB each.
diskspd.exe -c2G F:\testfile2.dat
diskspd.exe -c2G F:\testfile3.dat
diskspd.exe -c2G F:\testfile4.dat
Option 1 – Repeating pattern
Use -w100 (write 100%) with no additional flags to generate a repeating pattern.
diskspd.exe -w100 F:\testfile2.dat
Option 2 – Write NULL (Zero byte) pattern
Use -w100 with -Z to generate NULL bytes.
diskspd.exe -w100 -Z F:\testfile3.dat
Option 3 – Write Random pattern (Recommended)
Use -w100 with -Zr to generate a random pattern.
diskspd.exe -w100 -Zr F:\testfile4.dat
Here are the resulting patterns
testfile2.dat (-w100) == repeating pattern. Writes the values 0x00, 0x01 ... 0xFE, 0xFF, then repeats.
testfile3.dat (-w100 -Z) == NULL bytes
testfile4.dat (-w100 -Zr) == random pattern
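To make the three patterns concrete, here is a small Python sketch that builds each one in memory. It only models the patterns as described above and is not how diskspd itself fills its buffers. The function names are hypothetical, and os.urandom stands in for whatever random source diskspd uses.

```python
import os

def repeating_pattern(n):
    """n bytes cycling through 0x00, 0x01, ..., 0xFF, then starting over."""
    return bytes(i % 256 for i in range(n))

def zero_pattern(n):
    """n NULL (0x00) bytes."""
    return bytes(n)

def random_pattern(n):
    """n random bytes (os.urandom is an assumption of this sketch)."""
    return os.urandom(n)

# Show the first 16 bytes of each pattern as hex
for name, make in [("repeating", repeating_pattern),
                   ("zeros", zero_pattern),
                   ("random", random_pattern)]:
    print(f"{name:10s} {make(16).hex()}")
```

The random line will differ on every run, which is the point: there is nothing for the storage layer to predict.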
A simple test of randomness in data
A really effective test of data “randomness” is to see how much a file can be compressed. The built-in compression tool in Windows File Explorer is good enough. We see that the repeating pattern compresses almost as well as the NULL byte pattern. So on an intelligent storage platform, although the special-case optimization for NULL bytes will be defeated, the storage engine will probably still compress the file internally. This is good in most use cases, but not if you’re trying to test the underlying storage with a particular file size.
testfile2.dat (repeating patterns) 2GB -> 97KB
testfile3.dat (NULL bytes) 2GB -> 74KB
testfile4.dat (random) 2GB -> 2GB (note the “compressed” file is actually a bit larger!)
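The same compressibility check can be scripted rather than done by hand. The sketch below compresses a sample from the start of a file with Python's zlib module and reports compressed size divided by original size. The function name and the 16 MiB sample size are my own assumptions, and zlib will not give exactly the same numbers as the Windows zip tool, so treat the result as a rough indicator only.

```python
import zlib

def sample_compression_ratio(path, sample_bytes=16 * 1024 * 1024):
    """Compress the first `sample_bytes` of a file with zlib.

    Returns compressed_size / original_size for the sample:
    close to 0.0 means highly compressible (e.g. NULL bytes),
    close to (or slightly above) 1.0 means effectively incompressible.
    """
    with open(path, "rb") as f:
        data = f.read(sample_bytes)
    if not data:
        raise ValueError("file is empty, nothing to compress")
    return len(zlib.compress(data)) / len(data)
```

Calling sample_compression_ratio(r"F:\testfile4.dat") on the random file should give a value around 1.0, while the NULL byte and repeating pattern files should come out very close to 0. Only the first part of the file is sampled, so a file that is partly random and partly zeros can be misjudged.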