Storage Performance Analysts never get a break. They only seem to get credit when there is some sort of crisis, and their reward is often 12-hour work days and intense management scrutiny. When things are running smoothly, they tend to be ignored or redeployed to other projects. With SSDs becoming ever more pervasive and affordable for Enterprise-class applications, storage vendors are suggesting that storage performance issues are a thing of the past. But let’s take a closer look at how SSDs are used in today’s Enterprise storage before we decide whether this is really true.
Today, there is a whole spectrum of SSD use cases in the Enterprise. Generally, SSDs have been deployed in the same storage systems that support HDDs, although lately a number of All Flash Arrays (AFAs) have come to market. An AFA is, for the most part, simply a collection of flash memory packaged with hardware tuned to the characteristics of flash. Since an SSD is itself just flash memory plus a controller, at a very basic level you can think of an AFA as a super-sized SSD!
In traditional storage systems, SSDs often serve as a low-latency HDD replacement that enables either manual or automated tiering. If tiering is done right, adding some number of SSDs to the storage system will indeed improve average I/O response times, and your I/O-bound applications will run faster. None of this should be a big surprise. The leap in logic is concluding that, because your I/O now runs faster, all of your potential performance problems have disappeared forever!
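Tiering pays off because I/O access is typically skewed: a small, "hot" fraction of the data receives most of the I/Os. A back-of-the-envelope sketch of the averaging involved (all latencies and skew figures below are assumed for illustration, not measurements):

```python
# Assumed skew: the hot 10% of capacity receives 80% of the I/Os,
# so tiering only that slice onto SSD lifts most reads off the HDDs.
hdd_ms, ssd_ms = 8.0, 0.5          # assumed average read latencies
hot_io_fraction = 0.8              # share of I/Os served by the SSD tier

all_hdd = hdd_ms
tiered = hot_io_fraction * ssd_ms + (1 - hot_io_fraction) * hdd_ms

print(f"all-HDD average: {all_hdd:.1f} ms, tiered average: {tiered:.1f} ms")
# 0.8 * 0.5 + 0.2 * 8.0 = 2.0 ms -- a solid average improvement,
# yet 20% of I/Os still see full HDD latency.
```

The average improves substantially, but note that the 20% of I/Os still landing on HDD are untouched; the average hides them.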
HDDs have been getting faster for decades, and cache has been used effectively to blunt the impact of latency on applications. For example, cached writes almost always complete at electronic speeds, so SSDs have little beneficial impact on write response time. Certain applications, especially in the mainframe space, have strong locality of reference and thus exhibit high read hit ratios, again limiting the latency effect. For sequential I/Os, throughput is the name of the game, and HDDs and SSDs are not so different in their throughput capabilities. The result is that disk drive latency really affects only a limited subset of I/Os, so deploying SSDs may yield only evolutionary, not revolutionary, improvements in overall I/O performance.
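The read-cache effect can be quantified with a simple weighted average. A minimal sketch, assuming a 0.2 ms cache-hit time and 8 ms HDD versus 0.5 ms SSD backend latencies (illustrative numbers only):

```python
def effective_latency_ms(hit_ratio, cache_ms, backend_ms):
    """Average read latency: hits served from cache, misses from the backend."""
    return hit_ratio * cache_ms + (1 - hit_ratio) * backend_ms

# The higher the read hit ratio, the less the backend drive type matters.
for h in (0.0, 0.5, 0.9, 0.99):
    hdd = effective_latency_ms(h, 0.2, 8.0)
    ssd = effective_latency_ms(h, 0.2, 0.5)
    print(f"hit ratio {h:.0%}: HDD {hdd:.2f} ms, SSD {ssd:.2f} ms, "
          f"speedup {hdd / ssd:.1f}x")
```

With these assumed numbers, the SSD advantage shrinks from 16x with no cache hits to well under 2x at a 99% hit ratio, which is why high-hit-ratio workloads see only modest gains from SSDs.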
Now consider some typical problems seen in modern Enterprise Storage and assess whether the reduced latency of SSDs can actually do much to avoid them.
Front-end Fabric limitations – Front-end contention may occur because there are not enough host ports or front-end adapters, because the fabric speed is insufficient, or because port usage is imbalanced. SSDs won’t help here.
File Contention – This can happen when there are multiple applications vying for the same datasets. Reduced disk drive latency may help but will not eliminate this potential issue.
Long-running I/Os – Some applications are very sensitive to the occasional long-running I/O. With HDDs, it is not uncommon for an occasional I/O to have latency measured in the hundreds of milliseconds. On the surface it seems that SSDs would solve this problem, but in reality SSD background processes such as garbage collection can exacerbate it.
Overrunning the Batch Window – Since currently available enterprise SSDs don’t really boost throughput much compared to HDDs, it is unlikely that they will solve batch window problems. The bottleneck would likely be in the storage system, not the drives.
Remote Replication – Having low latency drives will not help if remote replication is causing delays. This may occur due to limited inter-site bandwidth, insufficient links, or network contention.
Drive Imbalance – Since SSDs can handle many more I/Os per second than HDDs, it would seem that they insulate you from imbalances. However, aggressive auto-tiering can create a hot spot on an SSD array and at times even exceed the drives’ capabilities.
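The long-running I/O concern above is really a tail-latency problem: averages hide the occasional stalled I/O, such as a read queued behind garbage collection. A toy illustration using assumed, deterministic numbers (not measured data):

```python
# Assumed distribution: 99.9% of reads at 0.3 ms, 0.1% stalled at 20 ms
# (e.g. queued behind SSD garbage collection). Numbers are illustrative.
latencies = [0.3] * 99_900 + [20.0] * 100

def percentile(sorted_vals, p):
    """Nearest-rank percentile of an already-sorted list."""
    idx = min(len(sorted_vals) - 1, int(p * len(sorted_vals)))
    return sorted_vals[idx]

avg = sum(latencies) / len(latencies)
print(f"average {avg:.2f} ms, p99 {percentile(latencies, 0.99):.1f} ms, "
      f"p99.9 {percentile(latencies, 0.999):.1f} ms")
# The ~0.32 ms average looks excellent, yet the p99.9 still sits at 20 ms --
# exactly the I/Os that latency-sensitive applications notice.
```

This is why monitoring percentiles, not just averages, remains essential even on all-flash storage.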
The bottom line is that although SSDs provide many benefits, they do not eliminate the need to keep a careful eye on storage performance. Indeed, if you drive a Ferrari, you are likely much more concerned about performance than if you drive a Toyota. To keep your storage tuned like a Ferrari, use IntelliMagic Vision to find and eliminate performance risk in your storage environment.