2M IOPS on a single VM with Nutanix HCI
How to generate a LOT of IOPS to a single VM
The Recipe
- Use Load Balanced Volume Groups (VGLB), which allow the disks for a single VM to be hosted across the cluster. In traditional HCI, the disks for a given user virtual machine (UVM) are owned by the CVM on the same host. While this is great for saving network bandwidth, it means we cannot use the power of all the CVMs in the cluster to serve a single VM.
- Make sure the user VM has multiple vdisks and at least as many CPUs as vdisks to drive the IO. It turns out that at high IO rates we need a lot of CPU cycles to move the data around.
- Increase the number of FRODO threads that are available to connect the user VM to its virtual disks. The default is 2 FRODO threads per VM, which is good for about 850,000 IOPS, which is already quite a lot. To get to 2M IOPS we need to raise that value; in my experiments I used one FRODO thread per user VM CPU:
12 Disks on the User VM
12 CPUs on the User VM
12 FRODO threads in AHV
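As a rough sanity check on the thread count, the figures above imply a per-thread rate. The sketch below assumes that IOPS scale linearly with the number of FRODO threads, which is a simplification rather than a measured result, and the function and constant names are purely illustrative.

```python
import math

# Figures from the text: the default of 2 FRODO threads sustains about 850,000 IOPS.
DEFAULT_THREADS = 2
DEFAULT_IOPS = 850_000
TARGET_IOPS = 2_000_000


def min_threads_for(target_iops: int) -> int:
    """Smallest thread count that reaches target_iops.

    ASSUMPTION: throughput scales linearly with thread count.
    """
    per_thread = DEFAULT_IOPS / DEFAULT_THREADS  # about 425,000 IOPS per thread
    return math.ceil(target_iops / per_thread)


print(min_threads_for(TARGET_IOPS))
```

Under that assumption, 2M IOPS needs more than the default two threads by a wide margin; using one thread per vCPU (12 here) leaves headroom for scaling that is less than linear.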
We can see that the disks are spread around the cluster:
$ iscsi_migrator --print
Found 4 nodes on cluster
iSCSI Targets on SVM Node 10.57.255.26
============================================
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt15
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt7
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt11
iSCSI Targets on SVM Node 10.57.255.27
============================================
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt0
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt12
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt8
iSCSI Targets on SVM Node 10.57.255.28
============================================
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt13
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt9
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt4
iSCSI Targets on SVM Node 10.57.255.29
============================================
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt6
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt10
iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt14
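To check the spread without counting lines by eye, the listing can be tallied per node. This is a minimal sketch that parses text in the format shown above; the function name `targets_per_node` is made up for illustration and the parser only assumes the "iSCSI Targets on SVM Node" header and "iqn." target lines seen in this output.

```python
import re
from collections import defaultdict

NODE_RE = re.compile(r"^iSCSI Targets on SVM Node (\S+)$")
PREFIX = "iqn.2010-06.com.nutanix:testvg6-100-1e71d359-c7c4-4d85-a7d6-c3faa576e7f7-tgt"

# Target numbers per node, copied from the iscsi_migrator output above.
LISTING = {
    "10.57.255.26": [15, 7, 11],
    "10.57.255.27": [0, 12, 8],
    "10.57.255.28": [13, 9, 4],
    "10.57.255.29": [6, 10, 14],
}


def render(listing: dict[str, list[int]]) -> str:
    """Rebuild the text layout of the output shown above."""
    lines = [f"Found {len(listing)} nodes on cluster"]
    for node, tgts in listing.items():
        lines.append(f"iSCSI Targets on SVM Node {node}")
        lines.append("=" * 44)
        lines.extend(f"{PREFIX}{t}" for t in tgts)
    return "\n".join(lines)


def targets_per_node(output: str) -> dict[str, list[str]]:
    """Group iSCSI target names by the node that hosts them."""
    placement: dict[str, list[str]] = defaultdict(list)
    node = None
    for line in output.splitlines():
        line = line.strip()
        match = NODE_RE.match(line)
        if match:
            node = match.group(1)
        elif line.startswith("iqn.") and node is not None:
            placement[node].append(line)
    return dict(placement)


placement = targets_per_node(render(LISTING))
for node, targets in placement.items():
    print(node, len(targets))
```

For the output above this reports three targets on each of the four nodes, so the 12 disks are evenly balanced across the cluster.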
Then I use fio to create 12 jobs and pin them to the 12 CPUs, one job per CPU:
[global]
numjobs=12
cpus_allowed=0-11
cpus_allowed_policy=split
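Only the [global] fragment is shown above. For readers who want a complete file to start from, here is a minimal sketch that generates one possible layout. Everything beyond the three settings shown above is an assumption: the I/O engine, direct I/O, the random-read pattern, the 4k block size, the per-disk sections and the device names are all placeholders, and this is not necessarily the layout used for the results in this post.

```python
def build_job_file(devices, oio_per_disk=8):
    """Return fio job-file text with one job section per block device."""
    lines = [
        "[global]",
        # These three settings come from the job file fragment shown above.
        "numjobs=12",
        "cpus_allowed=0-11",
        "cpus_allowed_policy=split",
        # ASSUMPTIONS: none of the settings below appear in the original fragment.
        "ioengine=libaio",
        "direct=1",
        "rw=randread",
        "bs=4k",
        "runtime=60",
        "time_based",
        f"iodepth={oio_per_disk}",
        "group_reporting",
    ]
    for index, device in enumerate(devices):
        # One section per disk; numjobs in [global] clones each section.
        lines += ["", f"[disk{index}]", f"filename={device}"]
    return "\n".join(lines) + "\n"


# Hypothetical device names; substitute the 12 volume-group disks seen by the guest.
DEVICES = [f"/dev/sd{chr(ord('b') + i)}" for i in range(12)]

job_text = build_job_file(DEVICES)
print(job_text)
```

Writing `job_text` to a file and passing that file to fio would run it. Because `numjobs=12` sits in [global], each per-disk section is cloned 12 times, so adjust `numjobs` or `iodepth` if you want a different number of IOs in flight.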
When I run the job with 8 outstanding IOs (OIO) per disk, I get 2 million IOPS with an average response time of about 570 microseconds (a little over 0.5 milliseconds):
read: IOPS=2018k, BW=7881MiB/s (8264MB/s)(462GiB/60001msec)
slat (nsec): min=1576, max=17391k, avg=2243.99, stdev=6132.82
clat (nsec): min=587, max=156573k, avg=568022.77, stdev=403663.76
lat (usec): min=91, max=156574, avg=570.38, stdev=403.68
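The IOPS, bandwidth and latency figures can be cross-checked against each other. The sketch below derives the average IO size from bandwidth divided by IOPS, and uses Little's law (IOs in flight = IOPS × average latency) to estimate concurrency. Reading the in-flight count as 12 jobs × 12 disks × 8 OIO is an inference from the numbers, not something the fio output states.

```python
# Values copied from the fio output above.
iops = 2_018_000        # IOPS=2018k
bw_mib_s = 7881         # BW=7881MiB/s
avg_lat_s = 570.38e-6   # lat avg=570.38 usec

# Average IO size in KiB: bandwidth divided by IO rate.
io_size_kib = bw_mib_s * 1024 / iops

# Little's law: average concurrency = throughput * average latency.
in_flight = iops * avg_lat_s

# ASSUMPTION: 12 jobs, each keeping 8 IOs outstanding on each of 12 disks.
expected_in_flight = 12 * 12 * 8

print(f"average IO size: {io_size_kib:.2f} KiB")
print(f"IOs in flight:   {in_flight:.0f} (12 x 12 x 8 = {expected_in_flight})")
```

The numbers agree with each other: the IO size works out to roughly 4 KiB, and the implied concurrency is within a fraction of a percent of 1,152 outstanding IOs.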
