Why do we tend to use 1MB IO sizes for throughput benchmarking?
To achieve the maximum throughput on a storage device, we will usually use a large IO size to maximize the amount of data is transferred per IO request. The idea is to make the ratio of data-transfers to IO requests as large as possible to reduce the CPU overhead of the actual IO request so we can get as close to the device bandwidth as possible. To take advantage of and pre-fetching, and to reduce the need for head movement in rotational devices, a sequential pattern is used.
For historical reasons, many storage testers will use a 1MB IO size for sequential testing. A typical fio command line might look like something this.
TL;DR – Some modern Linux distributions use a newer method of identification which, when combined with DHCP can result in duplicate IP addresses when cloning VMs, even when the VMs have unique MAC addresses.
To resolve, do the following ( remove file, run the systemd-machine-id-setup command, reboot):
# rm /etc/machine-id
When hypervisor management tools make clones of virtual machines, the tools usually make sure to create a unique MAC address for every clone. Combined with DHCP, this is normally enough to boot the clones and have them receive a unique IP. Recently, when I cloned several Bitnami guest VMs which are based on Debian, I started to get duplicate IP addresses on the clones. The issue can be resolved manually by following the above procedure.
To create a VM template to clone from which will generate a new machine-id for very clone, simply create an empty /etc/machine-id file (do not rm the file, otherwise the machine-id will not be generated)
# echo "" | tee /etc/machine-id
The machine-id man page is a well written explanation of the implementation and motivation.
For this experiment I am using Postgres v11 on Linux 3.10 kernel. The goal was to see what gains can be made from using hugepages. I use the “built in” benchmark pgbench to run a simple set of queries.
Since I am interested in only the gains from hugepages I chose to use the “-S” parameter to pgbench which means perform only the “select” statements. Obviously this masks any costs that might be seen when dirtying hugepages – but it kept the experiment from having to be concerned with writing to the filesystem.
The workstation has 32GB of memory Postgres is given 16GB of memory using the parameter
shared_buffers = 16384MB
pgbench creates a ~7.4gb database using a scale-factor of 500
We have started seeing misaligned partitions on Linux guests runnning certain HDFS distributions. How these partitions became mis-aligned is a bit of a mystery, because the only way I know how to do this on Linux is to create a partition using old DOS format like this (using -c=dos and -u=cylinders) Continue reading →
When changing SCSI devices in an ESX based VM, it’s easy to screw up the ability to boot. The simple fix is to add
bios.hddOrder = “scsi0:0”
to the end of the .vmx file. This has always worked for me. The problem with this solution is that any OVF/OVA that is created from the VM will not include the .vmx file hack, and of course VM’s created from the template will not boot until their .vmx file is hand edited.
The solution that worked for me was to simply make the “boot drive” the first .vmdk file that is listed in the .vmx file. In my case, the Linux OS is stored on the VMDK named “disk.vmdk”
In the before case, this disk is listed last (even though it has SCSI ID 0:0:0) and the VM does not boot.
I simply change the filename from disk_6.vmdk to disk.vmdk (and change the last item from disk.vmdk to disk_6.vmdk).
The beauty of this method is that the ordering is maintained when creating an OVF/OVA.
When the VM boots, the /dev/sd devices may change since the vmdk’s are now attached to different SCSI devices – so mounting using UUID’s in Linux helps keep things sane.
Note: I tried editing the .ovf file and adding a key:value pair to the file, and regenerating the SHA1 and stashing the SHA1 in the .mf file. The process worked, but the VM still did not boot, and the bios.hddOrder param was not in the .vmx file of the VM that was created from the template.