Nutanix AOS 5.10 ships with a feature called Autonomous Extent Store (AES). AES effectively provides metadata locality to complement the existing data locality. For large datasets (e.g. a 10TB database with 20% hot data) we observe a 2X improvement in throughput for random access across the 2TB hot dataset.
In our experiment we deliberately size the active working set so that it does not fit into the metadata cache. We access the 2TB uniformly with a 100% random pattern and record the time taken to touch all 2TB. On the same hardware with AES enabled, the time is cut in half, so, as can be seen in the chart, the throughput doubles.
It is the localization of metadata from AES that contributes to the 2X improvement. AES keeps most of the metadata local to the node, so there is no need to fetch it across the wire. Additionally, AES reduces the need to cache metadata in DRAM since local access is so fast. For very large datasets, retrieving metadata can account for a large proportion of the access time. This is true for all storage, so speeding up metadata resolution can make a dramatic improvement to overall throughput, as we demonstrate.
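To make the reasoning concrete, here is a minimal sketch of how the metadata lookup cost behaves when a uniformly accessed working set is larger than the metadata cache. The function names and every latency figure are hypothetical placeholders chosen for illustration, not measurements from this test; the only modelling assumption is that, under uniform random access, the cache hit ratio is roughly the fraction of the working set whose metadata is cached.

```python
TB = 1024 ** 4


def hit_ratio(working_set_bytes, cached_bytes):
    """Approximate metadata cache hit ratio for uniform random access."""
    return min(1.0, cached_bytes / working_set_bytes)


def mean_lookup_us(ratio, hit_us, miss_us):
    """Expected metadata lookup latency given a hit ratio."""
    return ratio * hit_us + (1.0 - ratio) * miss_us


# Hypothetical: cached metadata covers 0.5TB of a 2TB working set.
ratio = hit_ratio(2 * TB, 0.5 * TB)

# Hypothetical latencies in microseconds. The only difference between
# the two cases is where a cache miss is resolved.
remote_miss = mean_lookup_us(ratio, hit_us=5.0, miss_us=400.0)  # miss goes across the wire
local_miss = mean_lookup_us(ratio, hit_us=5.0, miss_us=100.0)   # miss stays on the node

print(f"hit ratio: {ratio:.2f}")
print(f"mean lookup, remote misses: {remote_miss:.2f} us")
print(f"mean lookup, local misses:  {local_miss:.2f} us")
```

With a low hit ratio the miss cost dominates the average, which is why making misses cheap matters more than growing the cache in this scenario.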
During .Next 2018 in London, Nutanix announced improvements to the core datapath that were said to deliver up to 2X better performance. Here’s a real-world example of that improvement in practice.
I am using X-Ray to simulate a 1TB data restore into an existing database. The IO sizes are large, an even split of 64K, 128K, 256K and 1MB, and the pattern is 100% random across the entire 1TB dataset.
Normally, storage benchmarks that use large IO sizes issue them sequentially, because that is easier on the storage back-end. That may be realistic for an initial load, but in this case we want to simulate a restore where the pattern is 100% random.
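The access pattern described above can be sketched as a small generator. This is only an illustration of the pattern, not the X-Ray scenario itself: the name `random_ios` is hypothetical, "even split" is assumed to mean an equal share by IO count, and aligning each offset to its IO size is my own assumption.

```python
import random

KIB = 1024
IO_SIZES = [64 * KIB, 128 * KIB, 256 * KIB, 1024 * KIB]
DATASET_BYTES = 1024 ** 4  # 1TB dataset, taken here as 1 TiB


def random_ios(count, dataset_bytes=DATASET_BYTES, sizes=IO_SIZES, seed=0):
    """Yield (offset, size) pairs spread randomly over the whole dataset."""
    rng = random.Random(seed)
    for _ in range(count):
        # Each IO size is equally likely (assumed meaning of "even split").
        size = rng.choice(sizes)
        # Pick a size-aligned slot so the IO never runs past the end.
        slots = dataset_bytes // size
        offset = rng.randrange(slots) * size
        yield offset, size


for offset, size in random_ios(5):
    print(f"offset={offset:>14} size={size // KIB}K")
```

A sequential load would simply advance the offset by the IO size each time; the point of the random version is that consecutive writes land far apart in the dataset.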
In this case the time to ingest 1TB drops by half when using Nutanix AOS 5.10 with Autonomous Extent Store (AES) enabled, compared with the traditional extent store.
This improvement is possible because with AES, inserting directly into the extent store is much faster.
For throughput-sensitive, random workloads, AES can detect that it will be faster to skip the oplog. Skipping the oplog allows AES to eliminate a network round trip to a remote oplog and instead make only an RF2 copy for the extent store. By contrast, when sustained, large random IO is funneled into the oplog, the 10Gbit network can become the bottleneck. Even with faster networks AES will still be a benefit, because the CPU and SSD resource usage is also lower. Unfortunately I only have 10Gbit networking in my lab!
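A back-of-the-envelope calculation shows why the network can become the limit. The model below is a deliberate simplification and an assumption on my part: all replication traffic is treated as leaving over a single 10Gbit link, each ingested byte crosses the network twice when it goes through the oplog (once for the remote oplog copy, once for the RF2 extent store copy) and once when the oplog is skipped, and protocol overhead is ignored. The function name `min_ingest_seconds` is hypothetical and the results are lower bounds from the model, not measurements.

```python
GBIT = 1_000_000_000  # bits per second


def min_ingest_seconds(ingest_bytes, link_bits_per_s, network_copies):
    """Lower bound on ingest time when the network link is the only limit."""
    link_bytes_per_s = link_bits_per_s / 8
    return ingest_bytes * network_copies / link_bytes_per_s


INGEST_BYTES = 10 ** 12  # 1TB, decimal
LINK = 10 * GBIT

via_oplog = min_ingest_seconds(INGEST_BYTES, LINK, network_copies=2)
skip_oplog = min_ingest_seconds(INGEST_BYTES, LINK, network_copies=1)

print(f"via oplog:  at least {via_oplog:.0f} s")
print(f"skip oplog: at least {skip_oplog:.0f} s")
print(f"ratio: {via_oplog / skip_oplog:.1f}x")
```

Under these assumptions, halving the bytes sent over the wire halves the network-bound ingest time, which is in line with the improvement observed in the test.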
The X-Ray files needed to run this test are on GitHub.