HCI Performance testing made easy (Part 1)

In this short series I will describe how to perform performance and resiliency tests on an HCI cluster using X-Ray.

X-Ray can do the following for the performance tester.

  • Model IO workloads using standard fio format (see the example job file after this list)
  • Create VMs based on user-specified criteria (CPUs, Memory, Number & Size of disks)
  • Provision the VMs to an HCI cluster (Nutanix AHV, ESXi, Hyper-V)
  • Execute the workloads
  • Display and store the results
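
The IO pattern for each VM group is described in a standard fio job file. Here is a minimal sketch only – the job names, device paths, block sizes and run time are illustrative placeholders, not taken from any shipped X-Ray test:

```
# Illustrative fio job file: an 8k random read/write mix plus a large-block
# sequential read stream. Device paths and values are placeholders.
[global]
ioengine=libaio
direct=1
time_based=1
runtime=600

[oltp-like]
filename=/dev/sdb
bs=8k
rw=randrw
rwmixread=70
iodepth=32

[seq-read]
filename=/dev/sdc
bs=1m
rw=read
iodepth=8
```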

In particular, X-Ray provides additional benefits that most workload generators do not:

  • Specify and deploy workloads with different IO patterns and characteristics
    • Most workload generators create a uniform workload on all workers
  • Execute and terminate sub-workloads on a user-specified timeline (see the scenario sketch after this list)
    • e.g. Begin workload 1, then introduce workload 2 and measure the interference
  • Introduce failure scenarios and measure the impact on performance
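
The timeline itself is expressed in the test's scenario definition file. The following is only a sketch of the idea, not a copy of a real test – the step and field names are approximations of the X-Ray scenario format from memory and may not match it exactly:

```
# Sketch only: step and field names approximate the X-Ray scenario format.
run:
  - workload.Start:
      workload_name: workload_1
      runtime_secs: 3600
      async: true           # leave workload_1 running while later steps execute
  - test.Wait:
      duration_secs: 1200   # workload_1 runs alone for the first 20 minutes
  - workload.Start:
      workload_name: workload_2
      runtime_secs: 1800    # workload_2 is introduced; measure the interference
  - workload.Wait:
      workload_name: workload_1
```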

Here’s a video of X-Ray in action: I export an existing X-Ray test, edit it to create a new test, then upload and execute the new test.

The files are in my X-Ray GitHub.

Storage Bus Speeds 2018

Storage bus speeds with example storage endpoints.

| Bus    | Lanes | End-Point                     | Theoretical Bandwidth (MB/s) | Note                                 |
|--------|-------|-------------------------------|------------------------------|--------------------------------------|
| SAS-3  | 1     | HBA <-> Single SATA Drive     | 600                          | SAS3<->SATA 6Gbit                    |
| SAS-3  | 1     | HBA <-> Single SAS Drive      | 1200                         | SAS3<->SAS3 12Gbit                   |
| SAS-3  | 4     | HBA <-> SAS/SATA Fanout       | 4800                         | 4 Lane HBA to Breakout (6 SSD) [2]   |
| SAS-3  | 8     | HBA <-> SAS/SATA Fanout       | 8400                         | 8 Lane HBA to Breakout (12 SSD) [1]  |
| PCIe-3 | 1     | N/A                           | 1000                         | Single Lane PCIe3                    |
| PCIe-3 | 4     | PCIe <-> SAS HBA or NVMe      | 4000                         | Enough for Single NVMe               |
| PCIe-3 | 8     | PCIe <-> SAS HBA or NVMe      | 8000                         | Enough for SAS-3 4 Lanes             |
| PCIe-3 | 40    | PCIe Bus <-> Processor Socket | 40000                        | Xeon direct connect to PCIe bus      |

Notes


All figures here are the theoretical maximums for the busses, using rough/easy calculations to convert bits/s to bytes/s. That is enough to figure out where the throughput bottlenecks are likely to be in a storage system.

  1. SATA devices contain a single SAS/SATA port (connection), and even when they are connected to a SAS3 HBA, the SATA protocol limits each SSD device to ~600MB/s (single port, 6Gbit)
  2. SAS devices may be dual ported (two connections to the device from the HBA(s)) – each with a 12Gbit connection giving a potential bandwidth of 2x12Gbit == 2.4Gbyte/s (roughly) per SSD device.
  3. An NVMe device directly attached to the PCIe bus has access to a bandwidth of 4GB/s by using 4 PCIe lanes – or 8GB/s using 8 PCIe lanes.  On current Xeon processors, a single socket attaches to 40 PCIe lanes directly (see diagram below) for a total bandwidth of 40GB/s per socket.
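
As a quick worked example of the rough arithmetic behind the table: SAS-3 signals at 12Gbit/s per lane, and dividing by 10 to absorb encoding overhead gives ~1200MB/s per lane, so a 4-lane HBA tops out around 4 x 1200 = 4800MB/s. Each PCIe-3 lane is likewise roughly 1000MB/s, so the 40 lanes attached to a single Xeon socket give roughly 40,000MB/s.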

  • I first started down the road of finally coming to grips with all the different busses and lane types after reading this excellent LSI paper.  I omitted the SAS-2 figures from this article since modern systems use SAS-3 exclusively.
LSI SAS PCI Bottlenecks

Intel Processor & PCI connections