Today we will use the simplest workload that X-Ray provides: the “Four Corners” benchmark. This is the classic storage benchmark of random read/write and sequential read/write. Most people understand that this workload tells us very little about how storage will behave under real workloads, but most people also want to know how fast the storage will go.
Here’s a video of the same process:
First, select the “Four Corners Microbenchmark” from the test list. The “Four Corners” test is supplied with X-Ray; of course, you can edit the parameters if you wish.
Then select the target cluster to run the test on, and add it to the test queue for execution.
The results will update in real time. X-Ray first creates the test VMs and powers them on…
To compare different runs, X-Ray provides the “Analyze” button. In my case I am using an engineering build of the product and comparing the same platform with different tuning. The compare/analyze feature can also be useful for comparing different platforms, hypervisors, or HCI vendors, since X-Ray can run on pretty much anything that presents a datastore to vCenter, as well as on Nutanix AOS/Prism.
This result would seem to show that the tuning performed in experiment #2 gave a large improvement in Random Write IOPS and did not negatively affect the other results (Random Read, Sequential Read, and Sequential Write).
I can also look at the particular parameters of this test by selecting Actions->Test Logs.
For instance, I can look at the Random Read parameters (these are standard fio configuration files).
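Since the workloads are plain fio job files, they take the familiar ini-style form. The sketch below shows the general shape of a random-read job; the specific values (block size, queue depth, runtime, target device) are illustrative assumptions, not the exact file that ships with X-Ray:

```ini
; Illustrative fio job in the style of a random-read "corner".
; All parameter values here are assumptions, not X-Ray's shipped config.
[global]
ioengine=libaio
direct=1
time_based=1
runtime=60

[random-read]
rw=randread
bs=8k
iodepth=32
filename=/dev/sdb   ; hypothetical target device
```

Because the jobs are standard fio syntax, they can be copied out and run by hand against any Linux block device for a quick sanity check.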
I can also look at the overall “Four Corners” test configuration, which is specified as YAML.
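The YAML is what ties the individual fio jobs together into one scenario. A minimal sketch of the general shape such a scenario takes is shown below; the field names and file names are illustrative assumptions, not X-Ray’s exact schema:

```yaml
# Illustrative sketch only; X-Ray's actual scenario schema may differ.
name: four_corners
display_name: "Four Corners Microbenchmark"
workloads:
  - name: random_read
    fio_config: random_read.fio
  - name: random_write
    fio_config: random_write.fio
  - name: sequential_read
    fio_config: sequential_read.fio
  - name: sequential_write
    fio_config: sequential_write.fio
```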
Storage bus speeds with example storage endpoints:

    Bus                 Theoretical Bandwidth (MB/s)   Example
    SATA-3 (6Gbit)      ~600                           HBA <-> Single SATA Drive
    SAS-3 (12Gbit)      ~1,200                         HBA <-> Single SAS Drive
    SAS-3 (4 lanes)     ~4,800                         HBA <-> SAS/SATA Fanout: 4 Lane HBA to Breakout (6 SSD)
    SAS-3 (8 lanes)     ~9,600                         HBA <-> SAS/SATA Fanout: 8 Lane HBA to Breakout (12 SSD)
    PCIe3 (1 lane)      ~1,000                         PCIe <-> SAS HBA or NVMe
    PCIe3 (4 lanes)     ~4,000                         PCIe <-> SAS HBA or NVMe (enough for a single NVMe device)
    PCIe3 (8 lanes)     ~8,000                         PCIe <-> SAS HBA or NVMe (enough for SAS-3 4 lanes)
    PCIe3 (40 lanes)    ~40,000                        PCIe Bus <-> Processor Socket (Xeon direct connect to PCIe bus)
All figures here are the theoretical maximums for the buses, using rough/easy calculations for the bits/s <-> bytes/s conversion: enough to figure out where the throughput bottlenecks are likely to be in a storage system.
SATA devices contain a single SAS/SATA port (connection), and even when they are connected to a SAS-3 HBA, the SATA protocol limits each SSD device to ~600MB/s (single port, 6Gbit).
SAS devices may be dual ported (two connections to the device from the HBA(s)), each with a 12Gbit connection, giving a potential bandwidth of 2x12Gbit, or roughly 2.4GB/s per SSD device.
An NVMe device directly attached to the PCIe bus has access to a bandwidth of 4GB/s by using 4 PCIe lanes – or 8GB/s using 8 PCIe lanes. On current Xeon processors, a single socket attaches to 40 PCIe lanes directly (see diagram below) for a total bandwidth of 40GB/s per socket.
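The rough conversions behind these figures can be captured in a few lines. A handy rule of thumb for SAS/SATA is 10 bits per byte (which absorbs the 8b/10b encoding overhead), while a PCIe gen3 lane delivers roughly 1 GB/s (8 GT/s with the much lighter 128b/130b encoding). A small sketch, with function names of my own invention:

```python
# Rough bits/s <-> bytes/s conversions behind the figures in the table above.
# SAS/SATA links use 8b/10b encoding, so ~10 wire bits carry one data byte.
# A PCIe gen3 lane runs at 8 GT/s with 128b/130b encoding: ~1000 MB/s per lane.

def sas_sata_mbs(gbit, ports=1):
    """Approximate MB/s for a SAS/SATA link: divide Gbit/s by 10."""
    return int(gbit * ports * 1000 / 10)

def pcie3_mbs(lanes):
    """Approximate MB/s for a PCIe gen3 link: ~1000 MB/s per lane."""
    return lanes * 1000

assert sas_sata_mbs(6) == 600             # single-ported SATA SSD
assert sas_sata_mbs(12, ports=2) == 2400  # dual-ported SAS-3 SSD
assert pcie3_mbs(4) == 4000               # x4 NVMe device
assert pcie3_mbs(40) == 40000             # 40 lanes per Xeon socket
```

The asserts reproduce the per-device numbers quoted above, which is all the precision needed for bottleneck hunting.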
I first started down the road of finally coming to grips with all the different buses and lane types after reading this excellent LSI paper. I omitted the SAS-2 figures from this article since modern systems use SAS-3 exclusively.