How does database performance behave as more DBs are consolidated?
What impact does running the CVM have on available host resources?
The cluster was able to achieve ~90% of the theoretical maximum.
CVM overhead was 5% for this workload.
The goal was to establish how database performance is affected as additional database workloads are added to the cluster. As a secondary metric, we measure the overhead of running the virtual storage controller (the CVM) on the same hosts as the database servers themselves. We use a Postgres database with the pgbench workload and measure the total transactions per second.
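As a rough illustration of how a "total transactions per second" figure could be collected, the sketch below pulls the `tps = ...` summary line out of each database's pgbench output and sums the values. The function names and the sample output strings are hypothetical, and the sample numbers are made up rather than taken from this experiment.

```python
import re
from typing import Iterable

# pgbench ends its report with one or more lines starting "tps = <number>".
# The text after the number differs between PostgreSQL versions, so only the
# number itself is matched here.
TPS_PATTERN = re.compile(r"^tps = ([0-9]+(?:\.[0-9]+)?)", re.MULTILINE)


def parse_tps(pgbench_output: str) -> float:
    """Return the last tps value found in one pgbench run's output."""
    matches = TPS_PATTERN.findall(pgbench_output)
    if not matches:
        raise ValueError("no 'tps = ' line found in pgbench output")
    return float(matches[-1])


def total_tps(outputs: Iterable[str]) -> float:
    """Sum the per-database TPS values into one cluster-wide figure."""
    return sum(parse_tps(text) for text in outputs)


# Hypothetical output from two databases (illustrative values only).
sample_outputs = [
    "number of clients: 4\ntps = 1000.5 (without initial connection time)\n",
    "number of clients: 4\ntps = 999.5 (without initial connection time)\n",
]
print(total_tps(sample_outputs))
```

Summing per-database results assumes all pgbench runs cover the same time window; runs that start or stop at different times would need to be aligned first.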
4-node Nutanix cluster, with 2x Xeon CPUs per host and 20 cores per socket.
Each database is identically configured with:
8GB of memory
pgbench benchmark, running the “simple” query set.
The database is sized so that it fits entirely in memory. This is a test of CPU and memory, not I/O.
The experiment starts with a single database on a single host. We add more databases to the cluster until we reach 40 databases in total. At 40 databases, with 4 vCPUs each and a CPU-bound workload, we use all 160 CPU cores in the cluster.
Because each database fits in host DRAM, the benchmark runs as fast as it can and is CPU-bound.
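The core-count arithmetic behind the 40-database limit can be checked with a short sketch. It assumes one vCPU maps to one physical core (no hyper-threading or overcommit is counted) and ignores any cores used by the CVM. The constant and function names are illustrative.

```python
# Cluster shape as described in the setup.
NODES = 4
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 20
VCPUS_PER_DB = 4

# 4 nodes x 2 sockets x 20 cores = 160 physical cores.
physical_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET


def vcpus_in_use(num_databases: int) -> int:
    """vCPUs consumed by a given number of identically sized databases."""
    return num_databases * VCPUS_PER_DB


def core_utilization(num_databases: int) -> float:
    """Fraction of physical cores in use, assuming 1 vCPU == 1 core."""
    return vcpus_in_use(num_databases) / physical_cores


for n in (1, 10, 20, 40):
    print(n, vcpus_in_use(n), f"{core_utilization(n):.1%}")
```

Under that assumption, 40 databases is exactly the point where the vCPU demand equals the number of physical cores.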
Below are the measured results from running 1 to 40 databases on the 4-node cluster.
Performance scales almost linearly from 4 to 160 CPUs, with no obvious bottlenecks until all of the CPU cores on the hosts are saturated at 40 databases.
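One way to put a number on "almost linearly" is to compare the measured aggregate TPS with an ideal figure. The sketch below assumes the theoretical maximum is simply the single-database TPS multiplied by the number of databases; that definition is an assumption made for illustration, and the input values are placeholders, not the measured results.

```python
def scaling_efficiency(measured_total_tps: float,
                       single_db_tps: float,
                       num_databases: int) -> float:
    """Measured aggregate TPS as a fraction of perfectly linear scaling.

    Assumes the ideal is single_db_tps * num_databases.
    """
    if single_db_tps <= 0 or num_databases <= 0:
        raise ValueError("single_db_tps and num_databases must be positive")
    ideal_total_tps = single_db_tps * num_databases
    return measured_total_tps / ideal_total_tps


# Placeholder inputs chosen only to show the calculation.
example = scaling_efficiency(measured_total_tps=36000.0,
                             single_db_tps=1000.0,
                             num_databases=40)
print(f"{example:.0%}")
```

A value near 1.0 means each added database contributes about as much throughput as the first one did; lower values indicate contention for shared resources.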
How to use Nutanix X-Ray to run any vdbench workload at scale
Many storage performance testers are familiar with vdbench and wish to use it to test Hyper-Converged Infrastructure (HCI) performance. To performance test HCI accurately, you need to deploy workloads on all HCI nodes. However, deploying multiple VMs and coordinating vdbench across them can be tricky, so with X-Ray we provide an easy way to run vdbench at scale. Here’s how to do it.
Step-by-step instructions to add vdbench to X-Ray.
In this video I migrate a Postgres DB running the pgbench benchmark. The DB is running on a host which is CPU constrained. Once the VM is migrated to a less busy host, the transaction rate immediately increases from ~15,000 to ~20,000 TPS. As the DB continues to run on the new host, the Nutanix storage detects the access patterns and “localizes” the data that the DB is accessing. Over the subsequent minutes the transaction rate increases to ~30,000 TPS.
The variation in the transaction rate is due to the benchmark itself; the transaction rate is not expected to be uniform. Many different queries are executing in parallel, some hitting RAM cache, some hitting storage.
N.B. The Postgres DB is totally untuned and uses purely default settings. The aim of the experiment was to see how data locality might affect a running database workload, not to generate the maximum TPS.
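To put the approximate transaction rates quoted above in relative terms, here is a small sketch of the arithmetic. The inputs are the rounded figures from the text, so the results are approximate as well. The function name is illustrative.

```python
def relative_gain(before_tps: float, after_tps: float) -> float:
    """Fractional change in TPS going from before_tps to after_tps."""
    if before_tps <= 0:
        raise ValueError("before_tps must be positive")
    return (after_tps - before_tps) / before_tps


# Approximate rates from the text: on the CPU-constrained host, immediately
# after migration, and after the data has been localized.
constrained, migrated, localized = 15_000, 20_000, 30_000

print(f"migration alone: {relative_gain(constrained, migrated):+.0%}")
print(f"localization:    {relative_gain(migrated, localized):+.0%}")
print(f"overall:         {relative_gain(constrained, localized):+.0%}")
```

Read this way, moving off the busy host gives roughly a one-third improvement, and localization adds about half again on top of that, for roughly double the starting rate overall.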