How to use Nutanix X-Ray to run any vdbench workload at scale
Many storage performance testers are familiar with vdbench, and wish to use it to test Hyper-Converged (HCI) performance. To accurately performance test HCI you need to deploy workloads on all HCI nodes. However, deploying multiple VMs and coordinating vdbench can be tricky, so with X-ray we provide an easy way to run vdbench at scale. Here’s how to do it.
Step by step instructions to add vdbench to X-Ray.
Nutanix X-Ray is well known for being able to model IO/Storage workloads, but what about workloads that are CPU bound?
X-Ray can run Ansible scripts on the X-Ray worker VMs, and by doing so we are able to provision almost any application. For our purposes we are going to use Postgres DB and the built-in benchmarking tool PGbench. I have deliberately created a very small DB which fits into the VM memory and does almost no IO.
Using standard X-Ray YAML we are able to pass custom parameters such as how many Postgres VMs to deploy, how many clients and how many threads for pgbench to run.
When X-Ray runs the workload the results are displayed in the X-Ray UI. This time though the metric is Database transactions per second not IOPS or Storage throughput.
By running a variety of experiments, altering the number of VMs running the workload, I was able to plot the aggregate transactions/s and the per-VM value, which as expected decreases once the host platform CPU is saturated.
I was able to use the CPU bound nature of this particular workload to take a look at scheduling and CPU usage characteristics of different hypervisors and CPU types. I found that one combination gave better performance at low loads, but the other combination generated better performance under higher loads.
These sorts of experiments are quite straight-forward using X-Ray and Ansible. A special shout-out goes to GV who created the custom exporter to send the postgres transaction/s results back to X-ray in realtime.
Specifically a customer wanted to see how performance changes (and how quickly) as data moves from HDD to SSD automatically as data is accessed. The access pattern is 100% random across the entire disk.
In a hybrid Flash/HDD system – “cold” data (i.e. data that has not been accessed for a long time) is moved from SSD to HDD when the SSD capacity is exhausted. At some point in the future – that same data may become “hot” again, and so we want to make sure that the “newly hot” data is quickly moved back to the SSD tier. The duration of the above chart is around 5 minutes – and we see that by, around the 3 minute mark the entire dataset is resident on the SSD tier.
This X-ray test uses a couple of neat tricks to demonstrate ILM behavior.
Edit container preferences to send sequential data immediately to HDD
Overwriting data with NUL/Zero bytes frees the underlying data on Nutanix filesystem
To demonstrate ILM from HDD to SSD (ad ultimately into the DRAM cache on the CVM) we first have to ensure that we have data on the HDD in the first place. By default Nutanix OS will always try to write new data to SSD. To circumvent that behavior we can edit the container preferences. We use the fact that the “prefill” will be a sequential workload, while the measured workload will be a random workload. To make the change, use “ncli” to change the ” Sequential I/O Pri Order” to be HDD only.
In my case I happened to call my container “xray” since I didn’t want to change the default container. Now, when X-Ray executes the prefill stage, the data will land on HDD. As a second requirement, we want to see what happens when IO with different size blocks are issued so that we can get a chart similar to this: To achieve the desired behavior, we need to make sure that, at the beginning of each test, the data, again resides on HDD. The problem is that the data is up-migrated during the test. To do this we do an initial overwrite of the entire disk with “NULL” bytes using a parameter in fio “zero_buffers”. This causes the data to be freed on the Nutanix filesystem. Then we issue a normal profile with random data. Once the data is freed, then we know that the new initial writes will go to HDD – because we edited the container to do so. The overall test pattern looks like this
Create and clone VMs
Prefill with random data (Data will reside on HDD due to container edit)
Read disk with 16KB block size
Zero out the disks – to remove/free the up-migrated data