[Guest Post] Why I became a performance engineer

Posted on February 21, 2022 (updated January 3, 2023) by Dan Chilton

First off, I want to thank Gary for giving me the opportunity to be a guest writer on his blog; it’s an honor. My name is Dan Chilton and I have worked in technology for the past 20 years. As an introduction, today I just want to tell the story of why I became a performance engineer. . .

Continue reading →

Nutanix Performance for Database Workloads

Posted on November 24, 2021 (updated September 14, 2022) by gary

We’ve come a long way, baby.

Full disclosure. I have worked for Nutanix in the performance engineering group since 2013. My opinions are likely biased, but that also gives me a decent amount of context when it comes to the performance of Nutanix storage over time. We already have a lot of customers running database workloads on Nutanix. But what about those high-performance databases still running on traditional storage?

I dug out a chart that I presented at .Next in 2017 and added to it the performance of a modern platform (AOS 6.0 and an NVMe+SSD platform). For this random read microbenchmark, performance has more than doubled. If you took a look at an HCI system even a few years back and decided that performance wasn’t where you needed it, there’s a good chance that the HW+SW systems shipping today could meet your needs.

Much more detail below.

Continue reading →

How to generate a new hostid for a Cassandra node.

Posted on July 21, 2021 by gary

If you clone a Cassandra VM with the goal of creating a Cassandra cluster, you may find that every Cassandra node has the same hostID.

Continue reading →

Using rwmixread and rate_iops in fio

Posted on July 14, 2021 (updated December 29, 2022) by gary

Creating a mixed read/write workload with fio can be a bit confusing. Assume we want to create a fixed-rate workload of 100 IOPS, split 70:30 between reads and writes.

Don’t mix rwmixread and rate_iops

TL;DR

Specify the rate directly with rate_iops=<read-rate>,<write-rate>; do not try to use rwmixread with rate_iops. For the example above, use:

rate_iops=70,30

Additionally, older versions of fio exhibit problems when using rate_poisson with rate_iops. fio version 3.7, which I was using, did not exhibit the problem.
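As a rough illustration of the arithmetic behind that setting, the sketch below turns a total rate and a read percentage into the explicit rate_iops pair. The helper name split_rate_iops is hypothetical and not part of fio; it only formats the option string.

```python
def split_rate_iops(total_iops: int, read_pct: int) -> str:
    """Build a fio rate_iops option from a total rate and a read percentage.

    Hypothetical helper for illustration only; it formats the
    rate_iops=<read-rate>,<write-rate> option and does not run fio.
    """
    if not 0 < read_pct < 100:
        raise ValueError("read_pct must be between 1 and 99 for a mixed workload")
    read_iops = total_iops * read_pct // 100
    write_iops = total_iops - read_iops
    if read_iops == 0 or write_iops == 0:
        raise ValueError("total_iops is too small to split at this ratio")
    return f"rate_iops={read_iops},{write_iops}"


# The example from the text: 100 IOPS split 70:30 between reads and writes.
print(split_rate_iops(100, 70))  # prints "rate_iops=70,30"
```

Computing the two rates up front keeps the split explicit, instead of relying on rwmixread to divide a single rate.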

Continue reading →

Understanding fio norandommap and randrepeat parameters

Posted on May 6, 2021 (updated December 29, 2022) by gary

The parameters norandommap and randrepeat significantly change the way that repeated random IO workloads are executed, and can also meaningfully change the results of an experiment because of the way that caching works on most storage systems.
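To see why this matters for caching, here is a toy model in Python. It is not fio's implementation; it assumes that a repeated seed (as with randrepeat) reproduces the same offset sequence on every run, and that dropping the random map (as with norandommap) lets offsets be picked with replacement, so some blocks repeat and others are never touched. The function name and block count are made up for the illustration.

```python
import random


def offsets(num_blocks, seed=None, use_random_map=True):
    """Toy model of a random IO pattern over num_blocks blocks."""
    rng = random.Random(seed)
    if use_random_map:
        # With a map, every block is visited exactly once per pass.
        order = list(range(num_blocks))
        rng.shuffle(order)
        return order
    # Without a map, blocks are drawn with replacement.
    return [rng.randrange(num_blocks) for _ in range(num_blocks)]


blocks = 1000

# Same seed on both runs: the second run replays the first run's offsets,
# so anything cached during run one can be hit again in run two.
print(offsets(blocks, seed=42) == offsets(blocks, seed=42))

# No random map: count how many distinct blocks one pass actually touches.
touched = len(set(offsets(blocks, seed=42, use_random_map=False)))
print(f"{touched} of {blocks} blocks touched without a random map")
```

In this model a repeated seed makes a second run cache-friendly, while drawing without a map shrinks the set of blocks touched in a pass; both can shift measured results.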

Continue reading →

How to drop tables for HammerDB TPC-C on SQL Server

Posted on April 20, 2021 (updated January 3, 2023) by gary

From the SQL window of SQL Server, issue these commands to drop the tables and procedures created by HammerDB. This will allow you (for instance) to re-create the database, or create a new database with more warehouses (larger size), while retaining the same name/DB layout.

Continue reading →

Understanding Concurrency Parameters in pgbench

Posted on January 12, 2021 (updated September 7, 2022) by gary

How to use the “jobs” and “clients” parameters in pgbench without going crazy.

Continue reading →

A Generalized workload generator for storage IO

Posted on December 22, 2020 (updated November 4, 2022) by gary

With help from the Nutanix X-Ray team, I have created an IO “benchmark” which simulates a “General Server Virtualization” workload. I call it the “Mixed Workload Simulator”.

Continue reading →

Advanced X-Ray: reducing runtime by re-using VMs.

Posted on October 5, 2020 (updated July 14, 2021) by gary

How to speed up your X-Ray benchmark development cycle by re-using/recycling benchmark VMs and, more importantly, data sets.

Continue reading →

Cross rack network latency in AWS

Posted on August 20, 2020 (updated January 3, 2023) by gary

I have VMs running on bare-metal instances. Each bare-metal instance is in a separate rack by design (for fault tolerance). The network is 25GbE; however, the response time between the hosts is so high that I need multiple streams to consume that bandwidth.

Compared to my local on-prem lab, I need many more streams to get the observed throughput close to the theoretical bandwidth of 25GbE.

# iperf Streams	AWS Throughput	On-Prem Throughput
1	4.8 Gbit	21.4 Gbit
2	9 Gbit	22 Gbit
4	18 Gbit	22.5 Gbit
8	23 Gbit	23 Gbit

Difference in throughput for a 25GbE network on-premises vs. AWS cloud (inter-rack)
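To put the measurements in terms of line rate, the short sketch below expresses each figure from the table as a percentage of the nominal 25 Gbit link. The numbers are copied from the table; the function name and the rounding to one decimal place are choices made for this illustration.

```python
LINE_RATE_GBIT = 25.0  # nominal bandwidth of the 25GbE link

# (iperf streams, AWS throughput, on-prem throughput), in Gbit, from the table
MEASUREMENTS = [
    (1, 4.8, 21.4),
    (2, 9.0, 22.0),
    (4, 18.0, 22.5),
    (8, 23.0, 23.0),
]


def pct_of_line_rate(gbit, line_rate=LINE_RATE_GBIT):
    """Return throughput as a percentage of line rate, rounded to 0.1."""
    return round(100 * gbit / line_rate, 1)


for streams, aws, on_prem in MEASUREMENTS:
    print(
        f"{streams} stream(s): AWS {pct_of_line_rate(aws)}% "
        f"vs on-prem {pct_of_line_rate(on_prem)}% of line rate"
    )
```

Seen this way, a single stream on AWS uses under a fifth of the link, and the two environments only converge at 8 streams.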

How to performance test Nutanix on AWS with X-ray

Posted on August 18, 2020 (updated April 2, 2021) by gary

End to End Creation of a Nutanix Cluster on AWS and Running X-Ray

Continue reading →

Postgres pgbench scale-factors and WSS

Posted on August 11, 2020 (updated April 2, 2021) by gary

Scale factor to working set size (WSS) lookup for tiny databases

Continue reading →

Nutanix X-Ray video Series

Posted on August 10, 2020 (updated January 3, 2023) by gary

A series of videos showing how to install, run, and modify the Nutanix X-Ray tool and use it to analyze HCI clusters.

Continue reading →

How to download and Install Nutanix X-ray on an AHV cluster

Posted on August 10, 2020 (updated April 2, 2021) by gary

Identifying Optane drives in Linux

Posted on July 15, 2020 (updated September 7, 2022) by gary

Optane drive. Image courtesy of Wikimedia.

How to identify Optane drives in Linux using lspci.

Continue reading →

How to drop tables for HammerDB TPC-H on SQL Server

Posted on June 9, 2020 (updated September 7, 2022) by gary

Use the following SQL to drop the tables and indexes in the HammerDB TPC-H schema, so that you can re-load it.

Continue reading →

Microsoft diskspd Part 3. Oddities and FAQ

Posted on April 22, 2020 (updated April 2, 2021) by gary

Tips and tricks for using diskspd, especially useful for those familiar with tools like fio.

Continue reading →

Microsoft diskspd. Part 2 How to bypass NTFS Cache.

Posted on April 7, 2020 (updated April 2, 2021) by gary

How to ensure performance testing with diskspd is stressing the underlying storage devices, not the OS filesystem.

Continue reading →

Microsoft diskspd. Part 1 Preparing to test.

Posted on March 30, 2020 (updated February 23, 2023) by gary

How to install and set up diskspd before starting your first performance tests, and how to avoid wrong results due to null byte issues.

Continue reading →