A 2007 paper, that still has lots to say on the subject of benchmarking storage and filesystems. Primarily aimed at researchers and developers, but is relevant to anyone about to embark on a benchmarking effort.
Understand what you are testing, cached results are fine – as long as that is what you had intended.
The authors are clear on why benchmarks remain important:
“Ideally, users could test performance in their own settings using real work- loads. This transfers the responsibility of benchmarking from author to user. However, this is usually impractical because testing multiple systems is time consuming, especially in that exposing the system to real workloads implies learning how to configure the system properly, possibly migrating data and other settings to the new systems, as well as dealing with their respective bugs.”
We cannot expect end-usersto be experts in benchmarking. It is out duty as experts to provide the tools (benchmarks) that enable users to make purchasing decisions without requiring years of benchmarking expertise.
In a previous post I showed a chart which plots concurrency [X-axis] against throughput (IOPS) on the Y-Axis. Here is that plot again:
Experienced performance chart ogglers will notice the familiar pattern of Littles Law, whereby throughput (X) rises quickly as concurrency (N) is increased. As we follow the chart to the right, the slope flattens out and we achieve a lower increase in throughput, even as we increase concurrency by the same amount at each stage. The flattening of the curve is best understood as Amdahls Law.
Anyone who follows Dr. Neil Gunther and his Universal Scalability Law, will also recognize this curve.
The USL states that taking the values of concurrency and throughput as inputs, we can in fact calculate the scalability of the system. Specifically we are able to calculate the key factors of contention and crosstalk – which limit absolute linear scalability and eventually result in less throughput as additional load is submitted even as the capacity of the system is saturated.
Using his Excel spreadsheet, I was able to input the numbers from my test and derive values that determine scalability.
Taking the largest number (0.074%) the “contention value” (i.e the impact we expect due to Amdahls law) as the limit to linear scaling – we can say that for this particular cluster, running this particular (simplistic/synthetic) workload the Nutanix cluster scales 99.926% linear. Although I did not crank up the concurrency beyond 576, the model shows us that this cluster will start to degrade performance if we try to push concurrency beyond 600 or so. Again, the USL model is for this particular workload – on this particular cluster. Doubling the concurrency of the offered load to 1200 will only net us 500,000 IOPS according to the model.
The high linearity (99.926%) is expected. The workload is 100% read, and with the data-locality feature of Nutanix filesystem – we expect close to 100% scalability.
We will return to these measures of scalability in the future to look at more realistic workloads.
As performance analysts we often have to summarize large amounts of data in order to make engineering decisions or understand existing behavior. This paper will help you do exactly that! Many analysts know that using statistics can help, but statistical analysis is a huge field in itself and has its own complexity. The article below distills the essential techniques that can help you with typical performance analysis tasks.