Flash, Flash, Flash. It seems that every storage manager has a new favorite question to ask about Flash storage. Do we need to move to Flash? How much of our workload can we move to Flash? Can we afford to move to Flash? Can we afford NOT to move to Flash?
Whether or not Flash is going to magically solve all our problems (it’s not), it’s here to stay. We know Flash has super-fast response times as well as other benefits, but for a little while yet, it’s still going to end up costing you more money. If you subscribe to the notion that it’s good to make sure you only purchase as much Flash as your unique workload needs, read on.
One of our enterprise customers asked us: “How much Flash do I really need?” The discussion started because they were about to place a purchase order for new storage hardware and wanted to know how their current workload would behave on the future hardware configuration. They already had SSDs in their current system and were looking to increase the share of capacity on Flash. One interesting result of this modeling exercise was that their projected response times did not decrease with more Flash capacity, nor did they increase with less Flash capacity than the customer was planning to purchase. That brought us back to the question: just “How much Flash do I really need?”
Step 1: Determine the Active Data
Our first step in determining how much Flash they really needed was to figure out what percentage of their capacity contained active data. Note that when we talk about active data in the context of workload to put on Flash, we focus on the small random-read I/Os that directly influence front-end response times. Large sequential read I/O is normally pre-staged into cache, so there is essentially no performance value in keeping this data on Flash. Writes go to cache, which is already super-fast, so it is the small-block random reads where Flash really shines.
This customer was already using auto-tiering (in this case IBM Easy Tier), so it was not difficult to obtain information on their most active data. Using IBM’s Stat Tool, we generated the so-called “skew curve” showing the percentage of small I/O workload versus the percentage of capacity (see Figure 1).
The shape of this curve visualizes how the I/O activity is distributed over all the available capacity; typically, this distribution is very skewed. For most workloads, there is a large portion of capacity that contains data that never gets any small reads or writes at any point during the day. The yellow and blue curves, which are almost identical, represent the workload distributions from the system’s two storage pools. The red vertical line shows that 100% of their small I/O workload takes place on only about 13% of their logical capacity.
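To make the idea of a skew curve concrete, here is a minimal sketch of how such a curve could be built from per-extent small I/O counts. It is not how IBM's tooling works internally; the function name `skew_curve`, the assumption of equally sized extents, and the sample counts are all hypothetical and purely illustrative.

```python
def skew_curve(extent_io_counts):
    """Return (capacity_pct, io_pct) points, hottest extents first.

    Assumes every extent has the same size, so each extent
    represents an equal slice of total capacity (an assumption).
    """
    counts = sorted(extent_io_counts, reverse=True)
    total_io = sum(counts)
    n = len(counts)
    points = [(0.0, 0.0)]
    running = 0
    for i, count in enumerate(counts, start=1):
        running += count
        io_pct = 100.0 * running / total_io if total_io else 0.0
        points.append((100.0 * i / n, io_pct))
    return points


# Made-up small I/O counts for ten extents (illustration only).
sample_counts = [500, 300, 100, 50, 30, 20, 0, 0, 0, 0]
curve = skew_curve(sample_counts)

for capacity_pct, io_pct in curve:
    print(f"{capacity_pct:5.1f}% of capacity -> {io_pct:5.1f}% of small I/O")
```

In this made-up sample, the hottest 20% of capacity carries 80% of the small I/O, and the last 40% of capacity carries none, which is the same kind of skew described above.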
Law of Diminishing Returns
When it comes to determining how much workload to put on Flash, it is important to understand the law of diminishing returns. Adding Flash to your storage environment greatly improves performance. However, at a certain point the performance benefit of putting more data on Flash becomes too small to justify the additional cost. Therefore, if you want a mixed tier environment, then as a general rule of thumb you should target between 80% and 90% of the small I/O on Flash, depending on the response time requirements.
The 80% and 90% targets are marked by the two green vertical lines in Figure 1. We can see that 80% of the small I/O workload activity for this installation took place on only 3.2% of the capacity, and 90% of it took place on 6.0% of the total capacity.
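As a rough sketch, the lookup behind those vertical lines can be expressed as an interpolation on the skew curve. The three data points below are the ones read from Figure 1 in this post (plus the origin); the helper name `capacity_for_target` and the use of straight-line interpolation between points are assumptions, so values between the known points are only approximate.

```python
def capacity_for_target(curve, target_io_pct):
    """Estimate the capacity % needed to cover target_io_pct of small I/O.

    `curve` is a list of (capacity_pct, io_pct) points sorted by
    capacity. Linear interpolation between points is an assumption.
    """
    for (c0, w0), (c1, w1) in zip(curve, curve[1:]):
        if w0 <= target_io_pct <= w1:
            if w1 == w0:
                return c0
            return c0 + (c1 - c0) * (target_io_pct - w0) / (w1 - w0)
    raise ValueError("target is outside the range covered by the curve")


# Points read from Figure 1: (capacity %, small I/O %).
figure1_points = [(0.0, 0.0), (3.2, 80.0), (6.0, 90.0), (13.0, 100.0)]

for target in (80.0, 90.0, 100.0):
    needed = capacity_for_target(figure1_points, target)
    print(f"{target:.0f}% of small I/O -> about {needed:.1f}% of capacity")
```

With more points from the real curve, the same lookup would give a finer estimate for intermediate targets such as 85%.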
Step 2: Compare Current Configuration with Future Configuration
Now that we understand this workload distribution curve, we can compare this to the percentage of SSD capacity that they currently have, as well as to the proposed amount of Flash.
But what about the proposed new hardware? The customer was planning to purchase 14.9% of their total capacity on Flash. Based on Figure 1 above, we had already concluded that 13% Flash capacity would cover 100% of the active data, and that 6.0% would cover 90% of the small I/O. Remembering the law of diminishing returns, and using the tried-and-true calibrated eyeball technique on the skew curve, 14.9% Flash looks like more than twice the amount this unique workload requires.
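The arithmetic behind "more than twice" can be checked in a few lines. The percentages are the ones quoted in this post; the function name `overprovision_ratio` is a hypothetical helper introduced only for this illustration.

```python
def overprovision_ratio(proposed_pct, needed_pct):
    """Ratio of proposed Flash capacity to the capacity a target needs."""
    if needed_pct <= 0:
        raise ValueError("needed_pct must be positive")
    return proposed_pct / needed_pct


proposed_flash_pct = 14.9   # proposed Flash share of total capacity
needed_for_90_pct = 6.0     # capacity covering 90% of small I/O
needed_for_100_pct = 13.0   # capacity covering 100% of small I/O

ratio_90 = overprovision_ratio(proposed_flash_pct, needed_for_90_pct)
ratio_100 = overprovision_ratio(proposed_flash_pct, needed_for_100_pct)

print(f"Proposed vs. 90% target:  {ratio_90:.2f}x")
print(f"Proposed vs. 100% target: {ratio_100:.2f}x")
```

Against the 90% target, the proposed purchase is roughly two and a half times what the curve suggests, and it still exceeds what is needed to hold all of the active data.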
So How Much Flash Do You Really Need?
Besides the workload distribution that we discussed above, there are two other factors that need to be considered:
- How much faster is the response time going to be if you go from 90% to 100% of workload on Flash?
- How valuable are these faster response times of Flash to your business?
In this customer’s situation, they were more than happy to cut the proposed Flash purchase in half to realize significant savings once we showed them that almost identical response times would result. In Part 2 of this blog, we will show how we used modeling techniques to project the hardware component utilizations and expected response times of their workloads.