Impact of Data locality on DB workloads.
Published: (Updated: ) by .
Do database workloads benefit from data locality?
Following on from the previous [1] [2] experiments with Postgres and pgbench, here is a quick look at how the workload is seen from the Nutanix CVM. The Linux VM running Postgres has two virtual disks: one takes the transaction log writes, the other handles reads and writes to the main datafiles. Since the database size […]
One of the nice things about using public cloud is the ability to use pre-canned application virtual appliances created by companies like Bitnami. We can use these same appliance images on Nutanix AHV to easily run a Postgres database benchmark.
Step 1. Get the Bitnami image:
wget https://bitnami.com/redirect/to/587231/bitnami-postgresql-11.3-0-r56-linux-debian-9-x86_64.zip
Step 2. Unzip the file and convert […]
DB Compression
How to improve large DB read performance by 2X
Nutanix AOS 5.10 ships with a feature called Autonomous Extent Store (AES). AES effectively provides metadata locality to complement the data locality that has always existed. For large datasets (e.g. a 10TB database with 20% hot data) we observe a 2X improvement in throughput for random […]
How to reduce database restore time by 50%
During .Next 2018 in London, Nutanix announced performance improvements in the core datapath, said to give up to a 2X performance improvement. Here’s a real-world example of that improvement in practice. I am using X-Ray to simulate a 1TB data restore into an existing database. Specifically the IO sizes […]
In a previous post I showed a chart which plots concurrency on the X-axis against throughput (IOPS) on the Y-axis. Here is that plot again: Experienced performance chart oglers will notice the familiar pattern of Little’s Law, whereby throughput (X) rises quickly as concurrency (N) is increased. As we follow the chart to the right, the slope flattens […]
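As a rough illustration of the relationship described above, Little's Law can be written as N = X × R, so throughput is X = N / R. The sketch below is not taken from the original post: the 0.5 ms base latency and the 40,000 IOPS ceiling are made-up numbers, and the hard cap is an assumed, simplified model of why the slope flattens.

```python
def little_throughput(concurrency: int, response_time_ms: float) -> float:
    """Little's Law rearranged: X = N / R (R converted from ms, so X is in IOPS)."""
    return concurrency * 1000.0 / response_time_ms


def capped_throughput(concurrency: int, base_latency_ms: float, max_iops: float) -> float:
    """Assumed model: throughput follows Little's Law until a hard IOPS ceiling."""
    return min(little_throughput(concurrency, base_latency_ms), max_iops)


def implied_latency_ms(concurrency: int, iops: float) -> float:
    """Little's Law rearranged the other way: R = N / X, returned in ms."""
    return concurrency * 1000.0 / iops


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the shape of the curve.
    for n in (1, 2, 4, 8, 16, 32, 64):
        x = capped_throughput(n, base_latency_ms=0.5, max_iops=40000.0)
        r = implied_latency_ms(n, x)
        print(f"N={n:3d}  X={x:8.0f} IOPS  R={r:5.2f} ms")
```

In this simplified model, once the ceiling is reached, adding concurrency no longer adds throughput; it only increases the response time each IO experiences.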
The fio Pareto parameter allows us to create a workload which references a very large dataset, but specifies a hotspot for the access pattern. Here’s an example using the same setup as the ILM experiment, but using a Pareto value of 0.8. My fio file looks like this:
[global]
ioengine=libaio
direct=1
time_based
norandommap
random_distribution=pareto:0.8
The […]
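To get a feel for what a hotspot access pattern means, here is a small Python sketch that draws block numbers from a heavy-tailed Pareto distribution and measures how much of the IO lands on the hottest blocks. It is an illustration only: it uses Python's `random.paretovariate`, not fio's own generator, and the `shape` and `spread` parameters are hypothetical and are not claimed to map onto fio's `pareto:0.8` setting.

```python
import random
from collections import Counter


def pareto_block_accesses(num_blocks, num_ios, shape, spread=0.05, seed=42):
    """Draw block numbers with a heavy-tailed (Pareto) skew toward low blocks."""
    rng = random.Random(seed)
    accesses = []
    for _ in range(num_ios):
        # paretovariate() returns values >= 1.0; shift so the variate starts at 0.
        v = rng.paretovariate(shape) - 1.0
        block = int(v * spread * num_blocks)
        # Clamp the long tail onto the last block so every access is in range.
        accesses.append(min(block, num_blocks - 1))
    return accesses


def share_of_hottest(accesses, num_blocks, hot_fraction):
    """Fraction of all IOs that hit the most-accessed hot_fraction of blocks."""
    counts = Counter(accesses)
    hot_blocks = max(1, int(num_blocks * hot_fraction))
    hottest = counts.most_common(hot_blocks)
    return sum(count for _, count in hottest) / len(accesses)


if __name__ == "__main__":
    ios = pareto_block_accesses(num_blocks=1000, num_ios=20000, shape=1.2)
    share = share_of_hottest(ios, num_blocks=1000, hot_fraction=0.2)
    print(f"hottest 20% of blocks receive {share:.0%} of the IOs")
```

A uniform random workload would send about 20% of the IOs to any 20% of the blocks; the skewed workload sends far more to the hot set, which is what makes it useful for testing tiering and caching behaviour.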
At some point potential Hyper-converged infrastructure (HCI) users want to know – “How fast does this thing go?”. The real question is “how do we measure that?”. The simplest test is to run a single VM, with a single disk and issue a single IO at a time. We often see this sort of […]
Specifically, a customer wanted to see how performance changes (and how quickly) as data is automatically moved from HDD to SSD when it is accessed. The access pattern is 100% random across the entire disk. In a hybrid Flash/HDD system, “cold” data (i.e. data that has not been accessed for a long time) is moved […]
What happens when power is lost to all nodes of an HCI cluster?
Ever wondered what happens when all power is simultaneously lost on an HCI cluster? One of the core principles of cloud design is that components are expected to fail, but the cluster as a whole should stay “up”. We wanted to […]
Creating an HCI benchmark to simulate multi-tenant workloads
HCI deployments are typically multi-tenant, and often different nodes will support different types of workloads. It is very common to have large resource-hungry databases separated across nodes using anti-affinity rules. As with traditional storage, applications are writing to a shared storage environment which is necessary to […]
A simple benchmark for Random Reads, Random Writes, Sequential Reads, Sequential Writes.
How to create a customized performance test using X-ray.
It’s good to detect corrupted data. It’s even better to transparently repair that data and return the correct data to the user. Here we will demonstrate how the Nutanix filesystem detects and corrects corruption. Not all systems are created equal in this regard. The topic of corruption detection and remedy was the focus of this excellent […]
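The general idea of detect-then-repair can be sketched in a few lines of Python. This is a generic illustration, not the Nutanix implementation: the `Replica` class, the `read_with_repair` helper and the choice of CRC32 are all assumptions made for the example. Each replica stores a checksum alongside the data; a read verifies the checksum, falls back to another replica on a mismatch, and rewrites the bad copy.

```python
import zlib


class Replica:
    """Hypothetical in-memory replica: block_id -> (data, stored checksum)."""

    def __init__(self):
        self.blocks = {}

    def write(self, block_id, data):
        # Store the checksum computed at write time next to the data.
        self.blocks[block_id] = (data, zlib.crc32(data))

    def read(self, block_id):
        return self.blocks[block_id]


def read_with_repair(replicas, block_id):
    """Return verified data, repairing any replica whose checksum failed."""
    corrupt = []
    for replica in replicas:
        data, stored_checksum = replica.read(block_id)
        if zlib.crc32(data) == stored_checksum:
            # Found a good copy: overwrite the bad copies seen so far.
            for bad in corrupt:
                bad.write(block_id, data)
            return data
        corrupt.append(replica)
    raise IOError(f"block {block_id}: no replica passed checksum verification")


if __name__ == "__main__":
    r1, r2 = Replica(), Replica()
    for r in (r1, r2):
        r.write(7, b"hello")
    # Simulate silent corruption: data changes but the stored checksum does not.
    r1.blocks[7] = (b"hellx", r1.blocks[7][1])
    print(read_with_repair([r1, r2], 7))   # served from the healthy replica
    print(r1.read(7)[0])                   # the corrupt copy has been rewritten
```

The caller never sees the corrupt copy: the read is served from a replica that passes verification, and the damaged replica is fixed as a side effect.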
As an experiment, I wanted to (a) create an HDD-only container, and (b) measure the bandwidth I could achieve when backing up the SQL DB. This was performed on a standard hybrid platform with only 4 HDDs in the node. First create a container, but add the special options “sequential-io-priority-order=DAS-SATA random-io-priority-order=DAS-SATA” which means that […]
SATA on Nutanix. Some experimental data.
The question of why Nutanix uses SATA drives comes up sometimes, especially from customers who have experienced very poor performance using SATA on traditional arrays. I can understand this anxiety. In my time at NetApp we exclusively used SAS or FC-AL drives in performance test work. At the time there was a huge difference in performance between […]
One of the characteristics of a successful storage system for virtualized environments is that it must handle the IO blender. Put simply, when lots of regular-looking workloads are virtualized and presented to the storage, their regularity is lost, and the resulting IO stream starts to look more and more random. This is very similar to […]
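The IO blender effect is easy to reproduce in a toy simulation. In the Python sketch below, every VM issues a perfectly sequential stream within its own region, and the streams are interleaved at random on their way to shared storage. The function names, the random interleaving and the VM counts are assumptions chosen for illustration, not measurements from a real system.

```python
import random


def blended_stream(num_vms, ios_per_vm, region_blocks=1_000_000, seed=1):
    """Interleave per-VM sequential streams in the order storage would see them."""
    rng = random.Random(seed)
    # Each VM reads sequentially through its own, non-overlapping region.
    next_block = [vm * region_blocks for vm in range(num_vms)]
    remaining = [ios_per_vm] * num_vms
    stream = []
    while any(remaining):
        # Pick which VM's next IO arrives at the storage layer.
        vm = rng.choice([i for i, left in enumerate(remaining) if left])
        stream.append(next_block[vm])
        next_block[vm] += 1
        remaining[vm] -= 1
    return stream


def sequential_fraction(stream):
    """Fraction of IOs that directly follow the previous IO's block."""
    if len(stream) < 2:
        return 1.0
    sequential = sum(1 for a, b in zip(stream, stream[1:]) if b == a + 1)
    return sequential / (len(stream) - 1)


if __name__ == "__main__":
    for vms in (1, 2, 4, 16):
        fraction = sequential_fraction(blended_stream(vms, ios_per_vm=200))
        print(f"{vms:2d} VMs: {fraction:.0%} of IOs look sequential to storage")
```

With one VM the stream is fully sequential; as more VMs are blended together, the share of IOs that look sequential from the storage side drops sharply, even though no individual workload changed.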
I was speaking to one of our developers the other day, and he pointed me to the following paper: SEDA: An Architecture for Well-Conditioned, Scalable Internet Services as an example of the general philosophy behind the design of the Nutanix Distributed File System (NDFS). Although the paper uses examples of both a webserver and a Gnutella client, […]