How to run vdbench benchmark on any HCI with X-Ray

How to use Nutanix X-Ray to run any vdbench workload at scale

Many storage performance testers are familiar with vdbench, and wish to use it to test Hyper-Converged (HCI) performance. To accurately performance test HCI you need to deploy workloads on all HCI nodes. However, deploying multiple VMs and coordinating vdbench can be tricky, so with X-ray we provide an easy way to run vdbench at scale. Here’s how to do it.

Step by step instructions to add vdbench to X-Ray.

  1. Download vdbench from the Oracle site
  2. Get the vdbench x-ray test scenario from github https://github.com/garyjlittle/xray.git
    • (You can clone the repository to a laptop, then upload to your X-Ray server)
  3. Rename the zip file downloaded from Oracle to vdbench.zip The X-ray scenerio relies on the zip file having exactly this name.
  4. Go to your X-ray server and upload the vdbench.zip file and the vdbench x-ray scenario files to the x-ray server.
  5. Ensure that VMs created on the cluster will have access to the internet, they will need to be able to install a JVM in order to run vdbench.
Continue reading

Things to know when using vdbench.

Recently I found that vdbench was not giving me the amount of outstanding IO that I had intended to configure by using the “threads=N” parameter. It turned out that with Linux, most of the filesystems (ext2, ext3 and ext4) do not support concurrent directIO, although they do support directIO. This was a bit of a shock coming from Solaris which had concurrent directIO since 2001.

All the Linux filesystems I tested allow multiple outstanding IO’s if the IO is submitted using asynchronous IO (AKA asyncIO or AIO) but not when using multiple writer threads (except XFS). Unfortunately vdbench does not allow AIO since it tries to be platform agnostic.

fio however does allow either threads or AIO to be used and so that’s what I used in the experiments below.

The column fio QD is the amount of outstanding IO, or Queue Depth that is intended to be passed to the storage device. The column iostat QD is the actual Queue Depth seen by the device. The iostat QD is not “8” because the response time is so low that fio cannot issue the IO’s quickly enough to maintain the intended queue depth.

Device
fio QD
fio QD Type
direct
iostat QD
 ps -efT | grep fio | wc -l
/dev/sd
8
libaio
Yes
7
5
/dev/sd
8
Threads
Yes
7
12
ext2 fs (mke2fs)
8
Threads
Yes
1
12
ext2 fs (mke2fs)
8
libaio
Yes
7
5
ext3 (mkfs -t ext3)
8
Threads
Yes
1
12
ext3 (mkfs -t ext3)
8
libaio
Yes
7
5
ext4 (mkfs -t ext4)
8
Threads
Yes
1
12
ext4 (mkfs -t ext4)
8
libaio
Yes
7
5
xfs (mkfs -t xfs)
8
Threads
Yes
7
12
xfs (mkfs -t xfs)
8
libaio
Yes
7
5

At any rate, all is not lost – using raw devices (/dev/sdX) will give concurrent directIO, as will XFS. These issues are well known by Linux DB guys, and I found interesting articles from Percona and Kevin Closson after I finally figured out what was going on with vdbench.

fio “scripts”

For the “threads” case.

[global]
bs=8k
ioengine=sync
iodepth=8
direct=1
time_based
runtime=60
numjobs=8
size=1800m

[randwrite-threads]
rw=randwrite
filename=/a/file1

For the “aio” case

[global]
bs=8k
ioengine=libaio
iodepth=8
direct=1
time_based
runtime=60
size=1800m


[randwrite-aio]
rw=randwrite
filename=/a/file1