TL;DR – Some modern Linux distributions use a newer method of machine identification (/etc/machine-id) which, when combined with DHCP, can result in duplicate IP addresses when cloning VMs, even when the VMs have unique MAC addresses. To resolve, do the following (remove the machine-id file, run the systemd-machine-id-setup command, reboot): # rm /etc/machine-id # systemd-machine-id-setup # reboot When […]
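The fix is just the three commands from the excerpt, shown here as a block to run as root on each cloned VM:

# rm /etc/machine-id
# systemd-machine-id-setup
# reboot

After the reboot each clone presents a fresh machine-id, so the DHCP identity derived from it is no longer shared between clones.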
Following on from the previous [1] [2] experiments with Postgres & pgbench, here is a quick look at how the workload is seen from the Nutanix CVM. The Linux VM running Postgres has two virtual disks: one takes the transaction log writes, and the other handles reads and writes to the main datafiles. Since the database size […]
In this example we run pgbench with a scale factor of 1000, which equates to a database size of around 15GB. The Linux VM has 32GB RAM, so we don’t expect to see many reads. Using Prometheus with the Linux node exporter we can see the disk IO pattern from pgbench. As expected the write […]
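For reference, creating a database at that scale with pgbench looks roughly like this (the database name and the client/thread/duration values are illustrative assumptions; only the scale factor of 1000 comes from the post):

$ pgbench -i -s 1000 pgbench_db        # initialize at scale factor 1000 (~15GB of data)
$ pgbench -c 8 -j 2 -T 300 pgbench_db  # then drive load against it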
In this example, we use Postgres and the pgbench workload generator to drive some load in a virtual machine. Assume a Linux virtual machine that has Postgres installed, specifically a Bitnami virtual appliance. Once the VM has started, connect to the console. Allow access to Postgres port 5432 – which is the Postgres […]
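As a sketch of what “allow access to port 5432” usually involves on a Bitnami-style appliance (the file paths and the restart command are assumptions, so check the image’s actual layout):

# let Postgres listen on all interfaces and accept remote password-authenticated clients
echo "listen_addresses = '*'" >> /opt/bitnami/postgresql/conf/postgresql.conf
echo "host all all 0.0.0.0/0 md5" >> /opt/bitnami/postgresql/conf/pg_hba.conf
/opt/bitnami/ctlscript.sh restart postgresql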
One of the nice things about using public cloud is the ability to use pre-canned application virtual appliances created by companies like Bitnami. We can use these same appliance images on Nutanix AHV to easily run a Postgres database benchmark. Step 1. Get the Bitnami image: wget https://bitnami.com/redirect/to/587231/bitnami-postgresql-11.3-0-r56-linux-debian-9-x86_64.zip Step 2. Unzip the file and convert […]
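Steps 1 and 2 in shell form; the download URL is from the post, while the disk file name inside the zip and the qcow2 conversion target are assumptions (AHV can import a qcow2 image, and qemu-img handles the conversion):

wget https://bitnami.com/redirect/to/587231/bitnami-postgresql-11.3-0-r56-linux-debian-9-x86_64.zip
unzip bitnami-postgresql-11.3-0-r56-linux-debian-9-x86_64.zip
# the .vmdk name below is illustrative; use whatever disk file the zip actually contains
qemu-img convert -f vmdk -O qcow2 bitnami-postgresql-11.3-0-r56.vmdk bitnami-postgresql-11.3.qcow2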
A 2007 paper that still has a lot to say on the subject of benchmarking storage and filesystems. It is primarily aimed at researchers and developers, but is relevant to anyone about to embark on a benchmarking effort. The authors are clear on why benchmarks remain important: “Ideally, users could test performance in their own settings using real […]
For this experiment I am using Postgres v11 on a Linux 3.10 kernel. The goal was to see what gains can be made from using hugepages. I use the “built-in” benchmark pgbench to run a simple set of queries. Since I am interested only in the gains from hugepages, I chose to use the “-S” […]
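For context, a hugepages-plus-select-only run generally looks like the following; the hugepage count and pgbench parameters are illustrative, not the values used in the post:

# reserve 2MB hugepages for Postgres shared_buffers (the count depends on the buffer size)
sysctl -w vm.nr_hugepages=2048
# set huge_pages = on in postgresql.conf and restart Postgres, then run the select-only test
pgbench -S -c 16 -j 4 -T 300 pgbench_db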
DB Compression
How to improve large DB read performance by 2X. Nutanix AOS 5.10 ships with a feature called Autonomous Extent Store (AES). AES effectively provides metadata locality to complement the data locality that has always existed. For large datasets (e.g. a 10TB database with 20% hot data) we observe a 2X improvement in throughput for random […]
How to reduce database restore time by 50%. During .Next 2018 in London, Nutanix announced improvements in the core datapath said to give up to 2X better performance. Here’s a real-world example of that improvement in practice. I am using X-Ray to simulate a 1TB data restore into an existing database. Specifically, the IO sizes […]
In a previous post I showed a chart which plots concurrency (X-axis) against throughput in IOPS (Y-axis). Here is that plot again: Experienced performance chart oglers will notice the familiar pattern of Little's Law, whereby throughput (X) rises quickly as concurrency (N) is increased. As we follow the chart to the right, the slope flattens […]
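For reference, the relationship behind that curve is Little's Law; using the post's N for concurrency and X for throughput, with R for response time (standard notation, not named in the excerpt):

\[
N = X \cdot R \qquad\Longleftrightarrow\qquad X = \frac{N}{R}
\]

Throughput climbs almost linearly while R stays flat, and the curve bends over once response time starts to grow.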
The fio Pareto parameter allows us to create a workload which references a very large dataset but specifies a hotspot for the access pattern. Here’s an example using the same setup as the ILM experiment, but using a Pareto value of 0.8. My fio file looks like this: [global] ioengine=libaio direct=1 time_based norandommap random_distribution=pareto:0.8 The […]
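A complete job file built around that [global] section might look like the following; the [global] parameters are from the post, while the job section (device, access pattern, block size, queue depth, runtime) is an illustrative assumption:

[global]
ioengine=libaio
direct=1
time_based
norandommap
random_distribution=pareto:0.8

; the job section below is illustrative, not from the original post
[pareto-hotspot]
filename=/dev/sdb
rw=randread
bs=8k
iodepth=32
runtime=300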
At some point potential hyper-converged infrastructure (HCI) users want to know – “How fast does this thing go?”. The real question is “How do we measure that?”. The simplest test is to run a single VM, with a single disk, and issue a single IO at a time. We often see this sort of […]
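As a rough sketch of that simplest test, a queue-depth-1 random read against a single disk can be generated like this (device path, block size and runtime are illustrative assumptions):

# iodepth=1 keeps exactly one IO outstanding at a time
fio --name=qd1-randread --filename=/dev/sdb --rw=randread --bs=8k --iodepth=1 --direct=1 --ioengine=libaio --time_based --runtime=60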
Specifically, a customer wanted to see how performance changes (and how quickly) as data moves automatically from HDD to SSD as it is accessed. The access pattern is 100% random across the entire disk. In a hybrid flash/HDD system, “cold” data (i.e. data that has not been accessed for a long time) is moved […]
What happens when power is lost to all nodes of an HCI cluster? Ever wondered what happens when all power is simultaneously lost on an HCI cluster? One of the core principles of cloud design is that components are expected to fail, but the cluster as a whole should stay “up”. We wanted to […]
Creating an HCI benchmark to simulate multi-tenant workloads. HCI deployments are typically multi-tenant, and often different nodes will support different types of workloads. It is very common to have large resource-hungry databases separated across nodes using anti-affinity rules. As with traditional storage, applications are writing to a shared storage environment, which is necessary to […]
A simple benchmark for Random Reads, Random Writes, Sequential Reads, Sequential Writes.
How to create a customized performance test using X-Ray.
Storage bus speeds with example storage endpoints.

Bus     Lanes   End-Point                    Theoretical Bandwidth (MB/s)   Note
SAS-3   1       HBA <-> Single SATA Drive    600                            SAS3 <-> SATA 6Gbit
SAS-3   1       HBA <-> Single SAS Drive     1200                           SAS3 <-> SAS3 12Gbit
SAS-3   4       HBA <-> SAS/SATA Fanout      4800                           4-lane HBA to breakout (6 SSD) [2]
SAS-3   8       HBA <-> SAS/SATA Fanout      8400                           […]
It’s good to detect corrupted data. It’s even better to transparently repair that data and return the correct data to the user. Here we will demonstrate how the Nutanix filesystem detects and corrects corruption. Not all systems are created equal in this regard. The topic of corruption detection and remedy was the focus of this excellent […]