The parameters norandommap and randrepeat significantly change the way that repeated random IO workloads are executed, and they can also meaningfully change the results of an experiment because of the way that caching works on most storage systems.
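As a rough illustration of why these two parameters matter, here is a toy Python model (not fio's actual implementation). It assumes fio's documented behaviour: by default fio keeps a random map so that every block is covered, norandommap picks each offset without looking at past IO history, and randrepeat seeds the generator predictably so the pattern repeats across runs. The function names and the block count are hypothetical.

```python
import random


def random_offsets(num_blocks, seed, use_random_map):
    """Return the block numbers touched by one pass of a random workload.

    Toy model only: `use_random_map=True` mimics a random map (each block
    is visited exactly once), `use_random_map=False` mimics norandommap
    (each offset is drawn independently, so blocks can repeat).
    """
    rng = random.Random(seed)
    if use_random_map:
        blocks = list(range(num_blocks))
        rng.shuffle(blocks)
        return blocks
    return [rng.randrange(num_blocks) for _ in range(num_blocks)]


def coverage(offsets, num_blocks):
    """Fraction of distinct blocks touched by the workload."""
    return len(set(offsets)) / num_blocks


NUM_BLOCKS = 1000  # hypothetical device size in blocks

with_map = random_offsets(NUM_BLOCKS, seed=1, use_random_map=True)
without_map = random_offsets(NUM_BLOCKS, seed=1, use_random_map=False)

print("coverage with random map:   ", coverage(with_map, NUM_BLOCKS))
print("coverage without random map:", coverage(without_map, NUM_BLOCKS))

# A fixed seed (the randrepeat idea) replays the same offsets on a second
# run, so the second run can be served from whatever the first run cached.
print("same pattern on re-run:", without_map == random_offsets(NUM_BLOCKS, 1, False))
```

In this model the run without a random map touches only part of the device and revisits some blocks, which is the kind of behaviour that lets a cache flatter the results.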
From the SQL window of SQL Server, issue these commands to drop the tables and procedures created by HammerDB. This will allow you (for instance) to re-create the database, or create a new database with more warehouses (larger size), while retaining the same name/DB layout.
How to use the “jobs” and “clients” parameters in pgbench without going crazy.
How to speed up your X-ray benchmark development cycle by re-using/re-cycling benchmark VMs and more importantly data-sets.
I have VMs running on bare-metal instances. Each bare-metal instance is in a separate rack by design (for fault tolerance). The bandwidth is 25GbE; however, the response time between the hosts is so high that I need multiple streams to consume that bandwidth.
Compared to my local on-prem lab, I need many more streams to get the observed throughput close to the theoretical bandwidth of 25GbE.
| # iperf Streams | AWS Throughput (Gbit/s) | On-Prem Throughput (Gbit/s) |
|---|---|---|
| 1 | 4.8 | 21.4 |
| 2 | 9 | 22 |
| 8 | 23 | 23 |
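To make the gap easier to see, the figures in the table can be expressed as a fraction of the nominal 25 Gbit/s line rate. This is a minimal sketch. The throughput numbers are copied from the table, and the helper name `utilization` is just an illustration.

```python
LINE_RATE_GBIT = 25.0  # nominal 25GbE line rate

# streams: (AWS throughput, on-prem throughput), both in Gbit/s,
# copied from the table above.
measurements = {
    1: (4.8, 21.4),
    2: (9.0, 22.0),
    8: (23.0, 23.0),
}


def utilization(throughput_gbit, line_rate_gbit=LINE_RATE_GBIT):
    """Return throughput as a fraction of the nominal line rate."""
    return throughput_gbit / line_rate_gbit


for streams, (aws, on_prem) in sorted(measurements.items()):
    print(
        f"{streams} stream(s): "
        f"AWS {utilization(aws):.0%}, on-prem {utilization(on_prem):.0%}"
    )
```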
End to End Creation of a Nutanix Cluster on AWS and Running X-Ray
Scale factor to workingset size lookup for tiny databases
A series of videos showing how to install, run, modify and analyze HCI clusters with the Nutanix X-ray tool
How to identify Optane drives in Linux using lspci.
Use the following SQL to drop the tables and indexes in the HammerDB TPC-H schema, so that you can re-load it.
Tips and tricks for using diskspd, especially useful for those familiar with tools like fio.
How to ensure performance testing with diskspd is stressing the underlying storage devices, not the OS filesystem.
How to install and set up diskspd before starting your first performance tests, and how to avoid wrong results due to null byte issues.
How can database density be measured?
- How does database performance behave as more DBs are consolidated?
- What impact does running the CVM have on available host resources?
- The cluster was able to achieve ~90% of the theoretical maximum.
- CVM overhead was 5% for this workload.
The goal was to establish how database performance is affected as additional database workloads are added to the cluster. As a secondary metric, we measure the overhead of running the virtual storage controller on the same host as the database servers themselves. We use the Postgres database with the pgbench workload and measure the total transactions per second.
- 4-node Nutanix cluster, with 2x Xeon CPUs per host and 20 cores per socket.
Each database is identically configured with:
- Postgres 9.3
- Ubuntu Linux
- 4 vCPU
- 8GB of memory
- pgbench benchmark, running the “simple” query set.
The database is sized so that it fits entirely in memory. This is a test of CPU/Memory not IO.
The experiment starts with a single Database on a single host. We add more databases into the cluster until we reach 40 databases in total. At 40 databases with 4 vCPU each and a CPU bound workload we use all 160 CPU cores on the cluster.
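The core-count arithmetic behind the 40-database limit can be checked with a short sketch. The node, socket, core and vCPU counts come from the configuration listed above. Mapping one vCPU to one physical core (ignoring hyperthreading and the cores used by the CVM) is a simplifying assumption of this example.

```python
NODES = 4
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 20
VCPUS_PER_DB = 4


def total_cores(nodes=NODES, sockets=SOCKETS_PER_NODE, cores=CORES_PER_SOCKET):
    """Physical cores available across the whole cluster."""
    return nodes * sockets * cores


def vcpu_demand(databases, vcpus_per_db=VCPUS_PER_DB):
    """vCPUs requested by a given number of identically sized databases."""
    return databases * vcpus_per_db


def cpu_subscription(databases):
    """Requested vCPUs as a fraction of physical cores (1.0 = fully used)."""
    return vcpu_demand(databases) / total_cores()


for dbs in (1, 10, 20, 40):
    print(
        f"{dbs:2d} databases: {vcpu_demand(dbs):3d} vCPUs, "
        f"{cpu_subscription(dbs):.1%} of {total_cores()} cores"
    )
```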
The database is configured to fit into the host DRAM memory, and the benchmark runs as fast as it can – the benchmark is CPU bound.
Below are the measured results from running 1-40 databases on the 4 node cluster.
Performance scales almost linearly from 4 to 160 CPUs, with no obvious bottlenecks before all of the CPU cores in the hosts are saturated at 40 databases.
Many storage performance testers are familiar with vdbench, and wish to use it to test Hyper-Converged (HCI) performance. To accurately performance test HCI you need to deploy workloads on all HCI nodes. However, deploying multiple VMs and coordinating vdbench can be tricky, so with X-ray we provide an easy way to run vdbench at scale. Here’s how to do it.
First things First
Why do we tend to use 1MB IO sizes for throughput benchmarking?
To achieve the maximum throughput on a storage device, we will usually use a large IO size to maximize the amount of data transferred per IO request. The idea is to make the ratio of data transferred to IO requests as large as possible, reducing the CPU overhead of issuing IO requests so we can get as close to the device bandwidth as possible. To take advantage of pre-fetching, and to reduce the need for head movement in rotational devices, a sequential pattern is used.
For historical reasons, many storage testers will use a 1MB IO size for sequential testing. A typical fio command line might look something like this:
fio --name=read --bs=1m --direct=1 --filename=/dev/sda
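The data-to-request ratio argument can be made concrete by counting how many IO requests per second are needed to sustain a given bandwidth at different IO sizes. This is a minimal sketch. The 500 MB/s target is a hypothetical figure chosen for illustration, and MB is treated as 1024 KB.

```python
def requests_per_second(bandwidth_mb_s, io_size_kb):
    """IO requests per second needed to move bandwidth_mb_s at io_size_kb."""
    return bandwidth_mb_s * 1024 / io_size_kb


TARGET_MB_S = 500  # hypothetical device bandwidth, for illustration only

for io_size_kb in (4, 64, 1024):
    rate = requests_per_second(TARGET_MB_S, io_size_kb)
    print(f"{io_size_kb:5d} KB IOs: {rate:>9,.0f} requests/s")
```

At a 1MB IO size the same bandwidth needs far fewer requests than at 4KB, which is why the per-request CPU cost matters so much less for large sequential IO.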
The real-world achievable SSD performance will vary depending on factors like IO size, queue depth and even CPU clock speed. It’s useful to know what the SSD is capable of delivering in the actual environment in which it’s used. I always start by looking at the performance claimed by the manufacturer. I use these figures to bound what is achievable. In other words, treat the manufacturer specs as “this device will go no faster than…”.
Start by identifying the exact SSD type using lsscsi. Note that the disks we are going to test are connected via the ATA transport, so the maximum queue depth that each device will support is 32.
[1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.5+ /dev/sr0
[2:0:0:0] disk ATA SAMSUNG MZ7LM1T9 404Q /dev/sda
[2:0:1:0] disk ATA SAMSUNG MZ7LM1T9 404Q /dev/sdb
[2:0:2:0] disk ATA SAMSUNG MZ7LM1T9 404Q /dev/sdc
[2:0:3:0] disk ATA SAMSUNG MZ7LM1T9 404Q /dev/
The marketing name for these Samsung SSDs is “SSD 850 EVO 2.5″ SATA III 1TB”.
Identify device specs
The spec sheet for this SSD claims the following performance characteristics.
|Sequential Read (QD=8)||540 MB/s||534|
|Sequential Write (QD=8)||520 MB/s||515|
|Read IOPS 4KB (QD=32)||98,000||80,00|
|Write IOPS 4KB (QD=32)||90,000||67,000|
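One way to sanity-check IOPS figures like these is Little's law, which ties together queue depth, IOPS and mean latency (queue depth = IOPS × latency). The sketch below uses the first IOPS figure from each 4KB row above and the queue depth of 32. The function names are illustrative, and the result is the mean per-IO latency implied by those numbers, not a measured value.

```python
def mean_latency_us(queue_depth, iops):
    """Mean per-IO latency (microseconds) implied by Little's law."""
    return queue_depth / iops * 1_000_000


def max_iops(queue_depth, latency_us):
    """IOPS achievable at a given queue depth and mean latency."""
    return queue_depth / (latency_us / 1_000_000)


QUEUE_DEPTH = 32  # maximum queue depth for the ATA-attached devices

# First IOPS figure from the 4KB read and write rows of the table above.
for label, iops in (("read", 98_000), ("write", 90_000)):
    latency = mean_latency_us(QUEUE_DEPTH, iops)
    print(f"4KB {label}: {iops:,} IOPS at QD={QUEUE_DEPTH} "
          f"implies ~{latency:.0f} us mean latency")
```

The same relationship works in reverse: if the measured latency at QD=32 is higher than the implied figure, the device cannot reach the quoted IOPS at that queue depth.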