As performance analysts we often have to summarize large amounts of data in order to make engineering decisions or understand existing behavior. This paper will help you do exactly that! Many analysts know that using statistics can help, but statistical analysis is a huge field in itself and has its own complexity. The article below distills the essential techniques that can help you with typical performance analysis tasks.
Statistics for the performance analyst
We have started seeing misaligned partitions on Linux guests running certain HDFS distributions. How these partitions became misaligned is a bit of a mystery, because the only way I know how to do this on Linux is to create a partition using the old DOS format like this (using -c=dos and -u=cylinders)
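Whether a given partition is misaligned can be checked from its starting sector alone. The following is a minimal sketch, not the tooling used in the post: it assumes 512-byte logical sectors and a 4 KiB alignment boundary, and the function name `is_aligned` is hypothetical. DOS-compatible partitioning traditionally starts the first partition at sector 63, while modern tools default to sector 2048 (a 1 MiB offset).

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def is_aligned(start_sector: int,
               boundary_bytes: int = 4096,
               sector_bytes: int = SECTOR_BYTES) -> bool:
    """Return True if the partition's first byte falls on the given boundary."""
    return (start_sector * sector_bytes) % boundary_bytes == 0


# 63 is the traditional DOS-compatible start sector; 2048 is the 1 MiB default.
for start in (63, 2048):
    status = "aligned" if is_aligned(start) else "misaligned"
    print(f"start sector {start}: {status}")
```

On a Linux guest the starting sector of a partition can be read from sysfs, for example `/sys/block/sda/sda1/start`, and passed to the function above.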
Often we are presented with a vCenter screenshot, and an observation that there are “high latency spikes”. In the example, the response time is indeed quite high – around 80 ms.
One way of categorizing hyperconverged filesystems (or any filesystem, really) is by how data is distributed across the nodes and by the method used to track and retrieve that data. The following is based on knowledge of the internals of Nutanix and publicly available information for the other systems.
| Category | Data placement and lookup | System |
|---|---|---|
| Distributed | Distributed data & metadata | Nutanix |
| Hash | Random data distribution, hash-lookup (object store) | VSAN |
| Dedupe | Data stored in HA pairs, lookup by fingerprint | Simplivity |
| Dedupe | Random data distribution, lookup by fingerprint | Springpath/Hyperflex |
| Pseudo Distributed | Data stored in HA pairs, unified namespace via redirection | NetApp C-Mode |
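To make the lookup methods in the table concrete, here is a toy sketch of three of them. It is an illustration only and does not reflect any vendor's implementation: the node names, class names, and the use of SHA-256 are all assumptions, and a plain in-memory dict stands in for what would be a distributed metadata service in a real cluster.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster


def hash_placement(object_id: str, nodes=NODES) -> str:
    """Hash lookup: the owning node is computed from the object ID,
    so no placement record has to be stored or consulted."""
    digest = hashlib.sha256(object_id.encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class FingerprintStore:
    """Fingerprint lookup: blocks are indexed by a hash of their content,
    so identical blocks are stored only once (dedupe)."""

    def __init__(self):
        self.blocks = {}

    def write(self, data: bytes) -> str:
        fingerprint = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(fingerprint, data)  # keep the first copy only
        return fingerprint

    def read(self, fingerprint: str) -> bytes:
        return self.blocks[fingerprint]


class MetadataMap:
    """Explicit metadata: placement is a stored record that can be updated,
    which is what lets data be moved to another node later."""

    def __init__(self):
        self.location = {}

    def place(self, extent_id: str, node: str) -> None:
        self.location[extent_id] = node

    def lookup(self, extent_id: str) -> str:
        return self.location[extent_id]
```

The trade-off the sketch highlights is that computed placement needs no lookup table but fixes where data lives, whereas an explicit metadata record costs a lookup but allows the placement to change.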
- Nutanix uses a fully distributed metadata layer that allows the cluster to decide where to place data depending on the location of the VM accessing it. The data can move around to follow the VM. The Nutanix FS uses a lot of ideas from distributed systems research and implementation, rather than taking a classic filesystems approach and applying it to HCI.