Hyperconverged File Systems, Part 1: Taxonomy

One way to categorize hyperconverged filesystems (or any filesystem, really) is by how data is distributed across the nodes and by the method used to track and retrieve that data. The following is based on knowledge of Nutanix internals and on publicly available information for the other systems.

Metadata           | Characteristics                                            | Implemented by
Distributed        | Distributed data & metadata                                | Nutanix
Hash               | Random data distribution, hash lookup (object store)       | VSAN
Dedupe             | Data stored in HA pairs, lookup by fingerprint             | Simplivity
Dedupe             | Random data distribution, lookup by fingerprint            | Springpath/Hyperflex
Pseudo-distributed | Data stored in HA pairs, unified namespace via redirection | NetApp C-Mode

Nutanix uses a fully distributed metadata layer that allows the cluster to decide where to place data depending on the location of the VM accessing it. The data can move around to follow the VM. The Nutanix FS uses a lot of ideas from distributed systems research and implementation, rather than taking a classic filesystems approach and applying it to HCI.
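
To make two of the lookup styles named in the taxonomy a little more concrete, here is a toy Python sketch. It is only an illustration of the general ideas of "hash lookup" and "lookup by fingerprint", not how any of the listed products actually work; the node names, `place_by_hash`, and `FingerprintStore` are all hypothetical.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster


def place_by_hash(object_id, nodes=NODES):
    """Hash lookup: the owning node is computed from the object's name,
    so no central metadata lookup is needed to find it."""
    digest = hashlib.sha256(object_id.encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class FingerprintStore:
    """Lookup by fingerprint: blocks are keyed by a hash of their content,
    so identical blocks are stored only once (dedupe)."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block data

    def write(self, data):
        fingerprint = hashlib.sha256(data).hexdigest()
        # Only store the block if this content has not been seen before.
        self.blocks.setdefault(fingerprint, data)
        return fingerprint

    def read(self, fingerprint):
        return self.blocks[fingerprint]
```

In the first style the location follows from the name alone, which is why the data distribution looks random. In the second, the key follows from the content, which is what makes deduplication fall out naturally.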

Creating compressible data with fio.

Today I used fio to create some compressible data to test on my Nutanix nodes.  I ended up using the following fio params to get what I wanted.

 

buffer_compress_percentage=50
refill_buffers
buffer_pattern=0xdeadbeef
  • buffer_compress_percentage does what you’d expect: it specifies how compressible the data should be.
  • refill_buffers is required to make the compression percentage hold across the whole file. In other words, I want the entire file to be compressible by the buffer_compress_percentage amount.
  • buffer_pattern is a big one. Without this pattern set, fio uses null bytes to achieve compressibility. Nutanix, like many other storage vendors, suppresses runs of zeros, so the data reduction would come mostly from zero suppression rather than from compression.
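
To see why the fill pattern matters, here is a rough Python sketch. It is a simplification and not fio's actual buffer-generation code: it assumes a buffer is simply random bytes followed by fill bytes, and it uses zlib as a stand-in for whatever compression the storage system applies. The helper names are made up for this example.

```python
import os
import zlib


def make_buffer(size, compress_percentage, pattern=None):
    """Build a buffer of random bytes followed by fill bytes.

    Assumption: this only approximates the 'random data plus fixed
    pattern' mix; fio's real layout is more involved.
    """
    fill_len = size * compress_percentage // 100
    random_part = os.urandom(size - fill_len)
    if pattern is None:
        fill = bytes(fill_len)  # all zero bytes
    else:
        fill = (pattern * (fill_len // len(pattern) + 1))[:fill_len]
    return random_part + fill


def zero_fraction(buf):
    """Fraction of the buffer that is zero bytes."""
    return buf.count(0) / len(buf)


def compressed_fraction(buf):
    """Compressed size divided by original size, using zlib."""
    return len(zlib.compress(buf)) / len(buf)


zero_filled = make_buffer(1 << 20, 50)
patterned = make_buffer(1 << 20, 50, pattern=bytes.fromhex("deadbeef"))

for name, buf in (("zero fill", zero_filled), ("0xdeadbeef fill", patterned)):
    print(name, round(compressed_fraction(buf), 3), round(zero_fraction(buf), 3))
```

Both buffers compress to roughly half their size, but only the zero-filled one is about half zero bytes. A system that suppresses zeros before compressing would get most of its savings on that buffer without the compressor doing any work.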

Much of this is well explained in the README for the latest version of fio.

Also note: older versions of fio do not support many of the fancy data-creation flags, and will not alert you to the fact that fio is ignoring them. I spent quite a bit of time wondering why my data was not compressed, until I downloaded and compiled the latest fio.
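
Since an old fio can silently ignore these flags, one way to catch the problem early is to check the file fio wrote before pointing it at the storage system. The following is a minimal sketch of that idea; `measure_file` is a hypothetical helper, and zlib is only a proxy for the compression the storage system really uses.

```python
import zlib


def measure_file(path, chunk_size=1 << 20):
    """Return (compressed_fraction, zero_fraction) for the file at path.

    The file is read and compressed one chunk at a time so that large
    test files do not need to fit in memory.
    """
    total = compressed = zeros = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
            compressed += len(zlib.compress(chunk))
            zeros += chunk.count(0)
    if total == 0:
        raise ValueError("file is empty")
    return compressed / total, zeros / total
```

If the compressed fraction comes back close to 1.0, the compression flags were probably ignored. If the zero fraction is close to the requested compress percentage, the pattern flag probably was.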