Some versions of HammerDB (e.g. 3.2) may induce imbalanced NUMA utilization with SQL Server.
This can easily be observed with Resource monitor. When NUMA imbalance occurs one of the NUMA nodes will show much larger utilization than the other. E.g.
The cause and fix is well documented on this blog. In short HammerDB issues a short lived connection, for every persistent connection. This causes the SQL Server Round-robin allocation to send all the persistent worker threads to a single NUMA Node! To resolve this issue, simply comment out line #212 in the driver script.
If successful you will immediately see that the NUMA nodes are more balanced. Whether this results in more/better performance will depend on exactly where the bottleneck is.
HammerDB is a great tool for running Database benchmarks. However it is very easy to create an artificial bottleneck which will give a very poor benchmark result.
When setting up HammerDB to run against even a moderate modern server, it is important to avoid displaying the client transaction outputs in the HammerDB UI.
In my case just making this simple changed increased my HammerDB results by over 6X. The reason is that HammerDB spends more time updating its own UI, than it does sending transactions to the DB. When I run HammerDB, I select “Log Output to Temp” and “Use Unique Log Name”.
Un-check the “Show results” mark.
Or Ensure that the results are logged to file, not displayed on screen
Otherwise the workload generator will become the bottleneck.
TL;DR It’s pretty easy to get 1M SQL TPM running a TPC-C like workload on a single Nutanix node. Use 1 vDisk for Log files, and 6 vDisks for data files. SQL Server needs enough CPU and RAM to drive it. I used 16 vCPU’s and 64G of RAM.
Running database servers on Nutanix is an increasing trend and DBA’s are naturally skeptical about moving their DB’s to new platforms. I recently had the chance to run some DB benchmarks on a couple of nodes in our lab. My goal was to achieve 1M SQL transactions per node, and have that be linearly scalable across multiple nodes.
It turned out to be ridiculously easy to generate decent numbers using SQL Server. As a Unix and Oracle old-timer it was a shock to me, just how simple it is to throw up a SQL server instance. In this experiment, I am using Windows Server 2012 and SQL-Server 2012.
For the test DB I provision 1 Disk for the SQL log files, and 6 disks for the data files. Temp and the other system DB files are left unchanged. Nothing is tuned or tweaked on the Nutanix side, everything is setup as per standard best practices – no “benchmark specials”.
Load is being generated by HammerDB configured to run the OLTP database workload. I get a little over 1Million SQL transactions per minute (TPM) on a single Nutanix node. The scaling is more-or-less linear, yielding 4.2 Million TPM with 4 Nutanix nodes, which fit in a single 2U chassis . Each node is running both the DB itself, and the shared storage using NDFS. I stopped at 6 nodes, because that’s all I had access to at the time.
The thing that blew me away in this was just how simple it had been. Prior to using SQL server, I had been trying to set up Oracle to do the same workload. It was a huge effort that took me back to the 1990’s, configuring kernel parameters by hand – just to stand up the DB. I’ll come back to Oracle at a later date.
My SQL Server is configured with 16 vCPU’s and 64GB of RAM, so that the SQL server VM itself has as many resources as possible, so as not to be the bottleneck.
I use the following flags on SQL server. In SQL terminology these are known as traceflags which are set in the SQL console (I used “DBCC trace status” to display the following. These are fairly standard and are mentioned in our best practice guide.
One thing I did change from the norm was to set the target recovery time to 240 seconds, rather than let SQL server determine the recovery time dynamically. I found that in the benchmarking scenario, SQL server would not do any background flushing at all, and then suddenly would checkpoint a huge amount of data which caused the TPM to fluctuate wildly. With the recovery time hard coded to 240 seconds, the background page flusher keeps up with the incoming workload, and does not need to issue huge checkpoints. My guess is that in real (non benchmark conditions) SQL server waits for the incoming work to drop-off and issues the checkpoint at that time. Since my benchmark never backs off, SQL server eventually has to issue the checkpoint.