Benchmarking with Postgres PT1


In this example, we use Postgres and the pgbench workload generator to drive some load in a virtual machine. Assume a Linux virtual machine that has Postgres installed; specifically, I am using a Bitnami virtual appliance.

  • Once the VM has been started, connect to the console.
  • Allow access to port 5432 (the Postgres DB port), or allow ssh.
  • Note the postgres user password (cat ./bitnami_credentials).
  • Log in to psql from the console or over ssh.
  • Optionally change the password (the password prompted for is the one from bitnami_credentials, for the postgres database user).
  • Create a DB to run the pgbench workload against. In this case I name the DB pgbench-sf10 for “Scale Factor 10”. The scale factor determines the size of the database.
  • Initialise the DB with data ready to run the benchmark. The “createdb” step just creates an empty database.
    • -i means “initialize”
    • -s means “scale factor”, e.g. 10
    • pgbench-sf10 is the database to use; we use the one just created, pgbench-sf10
  • Now run a workload against the DB called pgbench-sf10 (a sketch of the full command sequence follows this list).
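Putting the steps together, here is a minimal sketch of the command sequence, assuming the Bitnami appliance described above; the client count, thread count and run time in the final step are placeholder values, not tuned settings:

    # run on the VM; the postgres password comes from ./bitnami_credentials
    createdb -U postgres pgbench-sf10                    # create an empty database for the benchmark
    pgbench -U postgres -i -s 10 pgbench-sf10            # -i initialize, -s scale factor 10
    pgbench -U postgres -c 8 -j 2 -T 60 pgbench-sf10     # run: 8 clients, 2 threads, 60s (placeholders)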

The workload pattern, and load on the system will vary greatly depending on the scale factor.  

Scale-Factor        Working Set Size
1                   23MB
10                  157MB
100                 1.7GB
1000                15GB
2500                37GB
5000                74GB
10000               147GB


Performance gains for postgres on Linux with hugepages

For this experiment I am using Postgres v11 on a Linux 3.10 kernel. The goal was to see what gains can be made from using hugepages. I use the “built in” pgbench benchmark to run a simple set of queries.

Since I am interested only in the gains from hugepages, I chose to use the “-S” parameter to pgbench, which performs only the “select” statements. Obviously this masks any costs that might be seen when dirtying hugepages, but it keeps the experiment from having to be concerned with writing to the filesystem.

Experiment

The workstation has 32GB of memory.
Postgres is given 16GB of memory using its shared-memory parameter (a sketch of the likely postgresql.conf settings follows).
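The exact parameter and value are not preserved in this copy of the post, so the following postgresql.conf lines are an assumption consistent with the description: shared_buffers is Postgres's shared-memory sizing parameter, and huge_pages controls whether that memory is backed by hugepages.

    # postgresql.conf (sketch; values assumed from the text)
    shared_buffers = 16GB   # give Postgres 16GB of shared memory
    huge_pages = try        # try | on | off; 'on' refuses to start if hugepages are unavailable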


pgbench creates a ~7.4GB database using a scale-factor of 500.

Run the experiment like this:
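The original command line is not shown in this copy, so this is an assumed invocation: select-only (-S) as described above, against a database initialised at scale factor 500. The database name, client count, thread count and duration are placeholders.

    # select-only run; names and numbers are placeholders
    pgbench -S -c 16 -j 4 -T 120 -U postgres pgbench-sf500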

Result

Default (no hugepages):
tps = 62190.452850 (excluding connections establishing)

2MB hugepages:
tps = 66864.410968 (excluding connections establishing)
+7.5% over default

1GB hugepages:
tps = 69702.358303 (excluding connections establishing)
+12% over default

Enabling hugepages

Getting the default-sized (2MB) hugepages is as easy as entering a value into /etc/sysctl.conf. To allow for 16GB of hugepages I used a value of 8400, followed by “sysctl -p”.
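A sketch of that change: vm.nr_hugepages is the sysctl that reserves default-sized (2MB) hugepages, and 8400 × 2MB ≈ 16.4GB, slightly more than the 16GB given to Postgres.

    # /etc/sysctl.conf
    vm.nr_hugepages = 8400      # 8400 x 2MB pages ~= 16.4GB

    sysctl -p                   # apply the setting
    grep -i huge /proc/meminfo  # confirm HugePages_Total / HugePages_Free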

To get 1GB hugepages, the kernel has to have them configured at boot time, e.g. on the kernel command line:
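A sketch of the boot-time settings; the hugepage count here (16 × 1GB = 16GB) is an assumption matching the 16GB target, and the command that regenerates the grub config varies by distribution.

    # /etc/default/grub -- append to the existing kernel command line
    GRUB_CMDLINE_LINUX="... default_hugepagesz=1G hugepagesz=1G hugepages=16"

    # regenerate the grub config (path and command vary by distro)
    grub2-mkconfig -o /boot/grub2/grub.cfg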

Then reboot so the new kernel parameters take effect.

I used these excellent resources:
  • How to modify the kernel command line
  • How to enable hugepages
  • and this great video on Linux virtual memory



SuperScalin’: How I learned to stop worrying and love SQL Server on Nutanix.

TL;DR  It’s pretty easy to get 1M SQL TPM running a TPC-C like workload on a single Nutanix node. Use 1 vDisk for log files and 6 vDisks for data files. SQL Server needs enough CPU and RAM to drive it; I used 16 vCPUs and 64GB of RAM.

Running database servers on Nutanix is an increasing trend, and DBAs are naturally skeptical about moving their DBs to new platforms. I recently had the chance to run some DB benchmarks on a couple of nodes in our lab. My goal was to achieve 1M SQL transactions per minute per node, and have that scale linearly across multiple nodes.


It turned out to be ridiculously easy to generate decent numbers using SQL Server. As a Unix and Oracle old-timer, it was a shock to me just how simple it is to stand up a SQL Server instance. In this experiment, I am using Windows Server 2012 and SQL Server 2012.

For the test DB I provision 1 disk for the SQL log files and 6 disks for the data files. Tempdb and the other system DB files are left unchanged. Nothing is tuned or tweaked on the Nutanix side; everything is set up as per standard best practices, with no “benchmark specials”.

SQL Server TPCC Scaling

Load is generated by HammerDB, configured to run its OLTP (TPC-C like) database workload. I get a little over 1 million SQL transactions per minute (TPM) on a single Nutanix node. The scaling is more-or-less linear, yielding 4.2 million TPM with 4 Nutanix nodes, which fit in a single 2U chassis. Each node runs both the DB itself and the shared storage using NDFS. I stopped at 6 nodes, because that’s all I had access to at the time.

The thing that blew me away was just how simple it had been. Prior to using SQL Server, I had been trying to set up Oracle to do the same workload. It was a huge effort that took me back to the 1990s, configuring kernel parameters by hand just to stand up the DB. I’ll come back to Oracle at a later date.

My SQL Server VM is configured with 16 vCPUs and 64GB of RAM, so that the SQL Server itself has as many resources as possible and is not the bottleneck.

I use the following flags on SQL Server. In SQL terminology these are known as trace flags, which are set in the SQL console (I used “DBCC TRACESTATUS” to display the following). These are fairly standard and are mentioned in our best practice guide.
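The flags themselves are only visible in the screenshot below, but the console commands involved are standard T-SQL; flag 1117 here is purely illustrative, not necessarily one used in this setup.

    -- list globally enabled trace flags (this is what the screenshot shows)
    DBCC TRACESTATUS(-1);

    -- enabling a trace flag globally looks like this (1117 is just an example)
    DBCC TRACEON(1117, -1);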

[Screenshot: DBCC TRACESTATUS output listing the enabled trace flags]

One thing I did change from the norm was to set the target recovery time to 240 seconds, rather than let SQL Server determine the recovery time dynamically. I found that in the benchmarking scenario, SQL Server would not do any background flushing at all, and then would suddenly checkpoint a huge amount of data, which caused the TPM to fluctuate wildly. With the recovery time hard-coded to 240 seconds, the background page flusher keeps up with the incoming workload and does not need to issue huge checkpoints. My guess is that in real (non-benchmark) conditions SQL Server waits for the incoming work to drop off and issues the checkpoint at that time. Since my benchmark never backs off, SQL Server eventually has to issue the checkpoint.
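Target recovery time is set per database; a sketch of the command, with the database name (tpcc) assumed rather than taken from the post:

    -- hard-code the target recovery time at 240 seconds so the background
    -- writer keeps pace with the workload instead of relying on big checkpoints
    -- ('tpcc' is an assumed database name)
    ALTER DATABASE [tpcc] SET TARGET_RECOVERY_TIME = 240 SECONDS;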
