Creating compressible data with fio.

binary-code-507785_1280

Today I used fio to create some compressible data to test on my Nutanix nodes.  I ended up using the following fio params to get what I wanted.

 

buffer_compress_percentage=50
refill_buffers
buffer_pattern=0xdeadbeef
  • buffer_compress_percentage does what you’d expect and specifies how compressible the data is
  • refill_buffers Is required to make the above compress percentage do what you’d expect in the large.  IOW, I want the entire file to be compressible by the buffer_compress_percentage amount
  • buffer_pattern  This is a big one.  Without setting this pattern, fio will use Null bytes to achieve compressibility, and Nutanix like many other storage vendors will suppress runs of Zero’s and so the data reduction will mostly be from zero suppression rather than from compression.

Much of this is well explained in the README for latest version of fio.

Also NOTE  Older versions of fio do not support many of the fancy data creation flags, but will not alert you to the fact that fio is ignoring them. I spent quite a bit of time wondering why my data was not compressed, until I downloaded and compiled the latest fio.

 

Specifying Drive letters with fio for Windows.

windows_logo_-_2012_derivative

Simple fio file for using Drive letters on Windows.

This will create a file called “fiofile” on the F:\ Drive in Windows.  Notice that the specification is “Driveletter” “Backslash” “Colon” “Filename”

In fio terms we are “escaping” the “:” which fio traditionally uses as a file separator.

[global]
bs=1024k
size=1G
time_based
runtime=30
rw=read
direct=1
iodepth=8

[job1]
filename=F\:fiofile

To run IO to multiple drives (Add the group_reporting) flag to make the output more sane.

[global]
bs=1024k
size=1G
time_based
runtime=30
rw=read
direct=1
iodepth=8
group_reporting

[job1]
filename=F\:fiofile

[job2]
filename=G\:fiofile

Download fio for windows here

 

Things to know when using vdbench.

Recently I found that vdbench was not giving me the amount of outstanding IO that I had intended to configure by using the “threads=N” parameter. It turned out that with Linux, most of the filesystems (ext2, ext3 and ext4) do not support concurrent directIO, although they do support directIO. This was a bit of a shock coming from Solaris which had concurrent directIO since 2001.

All the Linux filesystems I tested allow multiple outstanding IO’s if the IO is submitted using asynchronous IO (AKA asyncIO or AIO) but not when using multiple writer threads (except XFS). Unfortunately vdbench does not allow AIO since it tries to be platform agnostic.

fio however does allow either threads or AIO to be used and so that’s what I used in the experiments below.

The column fio QD is the amount of outstanding IO, or Queue Depth that is intended to be passed to the storage device. The column iostat QD is the actual Queue Depth seen by the device. The iostat QD is not “8” because the response time is so low that fio cannot issue the IO’s quickly enough to maintain the intended queue depth.

Device
fio QD
fio QD Type
direct
iostat QD
 ps -efT | grep fio | wc -l
/dev/sd
8
libaio
Yes
7
5
/dev/sd
8
Threads
Yes
7
12
ext2 fs (mke2fs)
8
Threads
Yes
1
12
ext2 fs (mke2fs)
8
libaio
Yes
7
5
ext3 (mkfs -t ext3)
8
Threads
Yes
1
12
ext3 (mkfs -t ext3)
8
libaio
Yes
7
5
ext4 (mkfs -t ext4)
8
Threads
Yes
1
12
ext4 (mkfs -t ext4)
8
libaio
Yes
7
5
xfs (mkfs -t xfs)
8
Threads
Yes
7
12
xfs (mkfs -t xfs)
8
libaio
Yes
7
5

At any rate, all is not lost – using raw devices (/dev/sdX) will give concurrent directIO, as will XFS. These issues are well known by Linux DB guys, and I found interesting articles from Percona and Kevin Closson after I finally figured out what was going on with vdbench.

fio “scripts”

For the “threads” case.

[global]
bs=8k
ioengine=sync
iodepth=8
direct=1
time_based
runtime=60
numjobs=8
size=1800m

[randwrite-threads]
rw=randwrite
filename=/a/file1

For the “aio” case

[global]
bs=8k
ioengine=libaio
iodepth=8
direct=1
time_based
runtime=60
size=1800m


[randwrite-aio]
rw=randwrite
filename=/a/file1