Brett Allison - 16 April 2021

What’s in a storage system response time? Every storage system vendor has its own methods and definitions for the response times it reports, but unfortunately most of the explanations do not make clear which components those response times include.

In this blog we will examine the components of the response times for I/O on a Fibre Channel fabric with block storage.

4 Steps Required for an I/O To Be Completed

Figure 1 displays the steps that occur for an I/O to be completed:

  1. The host initiates the I/O request.
  2. The switch routes the request to the target address.
  3. The storage system receives the request (starting an internal processing counter) and performs the processing needed to gather the required data. It then sends the data back to the host; on send, it stops the internal processing counter and starts the network timer.
  4. The host receives and processes the data and sends an acknowledgement of receipt. The storage system receives the acknowledgement and stops the network timer.

Figure 1: Steps required for an I/O to be completed

 

All response time measurements observed at the storage array are limited in visibility. For example, the storage array cannot know when the host has sent the request, nor does it know the duration of time between the initial host send and the receipt at the storage array.

That is, the time accumulated in step 1 and step 2 above is opaque to the storage system. One could wager that the amount of time in step 1 is roughly equivalent to the amount of time in step 4, but one cannot be certain without timing measurements from the host or the switches.
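To make the blind spot concrete, here is a minimal sketch in Python. The event names, the shared clock, and the microsecond values are all hypothetical and chosen only for illustration; no real array exposes its timers this way.

```python
from dataclasses import dataclass


@dataclass
class IoEvents:
    """Event times for one I/O, in microseconds on a hypothetical shared clock."""
    host_send: float       # step 1: host initiates the request (invisible to the array)
    array_receive: float   # step 3: array receives the request, internal counter starts
    array_send: float      # step 3: array sends the data, internal counter stops
    array_ack: float       # step 4: array receives the host acknowledgement


def array_view(io: IoEvents) -> tuple[float, float]:
    """Return (internal processing, network time) as the array would measure them."""
    internal = io.array_send - io.array_receive
    network = io.array_ack - io.array_send
    return internal, network


def inbound_delay(io: IoEvents) -> float:
    """Time spent in steps 1 and 2; computing it needs host or switch timing."""
    return io.array_receive - io.host_send


# Two made-up I/Os that differ only in how long the request took to arrive.
quiet_fabric = IoEvents(host_send=0.0, array_receive=50.0,
                        array_send=250.0, array_ack=320.0)
congested_inbound = IoEvents(host_send=0.0, array_receive=900.0,
                             array_send=1100.0, array_ack=1170.0)

print(array_view(quiet_fabric), inbound_delay(quiet_fabric))
print(array_view(congested_inbound), inbound_delay(congested_inbound))
```

Both I/Os produce identical array-side measurements even though the second request spent far longer on its way to the array, which is exactly the part the storage system cannot observe.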

Various Response Time Definitions

Now that we understand the basic steps, we can define and categorize the different types of response times, also referred to as latencies, that may be available. In practice, which of them are reported depends on what the vendor considered important, but generally the following variations are encountered in storage system performance data:

Internal processing

This is the time between receiving the host I/O request and sending back the response.

Round-trip time

Also called end-to-end response time. This is the time between reception of the host I/O request and reception of the host acknowledgement after sending the response.

Network time

This is the difference between round-trip time and internal processing. In addition to the actual time spent in the fabric, this includes the time spent by the host to ingest the response of the storage system. While this is referred to as network time, it does not include the time spent between when the host has initiated a request and the moment the request is received by the storage system.

Full

Internal processing and round-trip time are both available.
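Because network time is defined as round-trip time minus internal processing, any two of the three values determine the third. The following Python sketch shows that arithmetic; the function name and the dictionary keys are illustrative assumptions, not part of any vendor interface.

```python
from typing import Optional


def complete_metrics(internal: Optional[float] = None,
                     round_trip: Optional[float] = None,
                     network: Optional[float] = None) -> dict:
    """Fill in the missing metric using round_trip = internal + network.

    All values must share one unit (for example microseconds). Raises
    ValueError when fewer than two metrics are supplied. If all three are
    supplied they are returned unchanged, without a consistency check.
    """
    known = [v for v in (internal, round_trip, network) if v is not None]
    if len(known) < 2:
        raise ValueError("need at least two of internal, round_trip, network")
    if network is None:
        network = round_trip - internal
    elif round_trip is None:
        round_trip = internal + network
    elif internal is None:
        internal = round_trip - network
    return {"internal": internal, "round_trip": round_trip, "network": network}


# A "Full" data source reports internal processing and round-trip time,
# so network time follows directly (values are made up).
print(complete_metrics(internal=200, round_trip=500))
```

The relationship holds per I/O by definition; whether it carries over to the averaged values a given platform reports is an assumption worth checking against that platform's documentation.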

Data Sources for Storage Hardware Platforms

The measurements available for a few of the platforms supported by IntelliMagic Vision are:

Platform              Volume metrics        Port metrics
IBM DS8000            Round-trip time       Network time
HPE 3PAR              Internal processing   Round-trip time
INFINIDAT InfiniBox   Full                  Full
Huawei OceanStor      Internal processing   Internal processing
Pure FlashArray       Full                  N/A

 

Having at least two of internal processing, round-trip time, and network time available can be very helpful in diagnosing the root cause of an observed increase in response time.

For example, if the internal processing time is high, the cause is likely in the storage system configuration, whereas if the internal processing time is good, and only the network time and/or round-trip time is high, the cause is likely in the fabric or at the host.
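This rule of thumb can be sketched as a small Python function. The threshold defaults below are placeholders invented for the example; they are not recommended values and not the thresholds IntelliMagic Vision applies.

```python
def likely_cause(internal_ms: float, network_ms: float,
                 internal_limit_ms: float = 1.0,
                 network_limit_ms: float = 1.0) -> str:
    """Point to the most likely problem area for an elevated response time.

    Both limits are hypothetical placeholders; pick values that suit the
    hardware and workload being examined.
    """
    # High internal processing points at the storage system itself.
    # This check comes first, so it also wins when both values are high.
    if internal_ms > internal_limit_ms:
        return "storage system"
    # Internal processing is fine, so the extra time is outside the array.
    if network_ms > network_limit_ms:
        return "fabric or host"
    return "no elevated component"


print(likely_cause(internal_ms=4.0, network_ms=0.3))
print(likely_cause(internal_ms=0.4, network_ms=3.0))
```

The first call points at the storage system and the second at the fabric or host. When only round-trip time and internal processing are available, the network value can be obtained by subtracting one from the other.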

Having both timers available for the volumes and the ports is best, but even in situations such as the IBM DS8000 and HPE 3PAR series, where each level reports a different metric, being able to compare the metrics for the ports and volumes provides valuable insights into the health of the storage infrastructure.
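As a rough illustration, the platform table above can be encoded as data and checked for whether at least two distinct timers are available across volumes and ports. The dictionary layout and the function name are assumptions made for this Python sketch, and "Full" is expanded into internal processing plus round-trip time as defined earlier.

```python
# Metrics per platform, transcribed from the table in this article.
AVAILABLE = {
    "IBM DS8000": {"volume": {"round_trip"}, "port": {"network"}},
    "HPE 3PAR": {"volume": {"internal"}, "port": {"round_trip"}},
    "INFINIDAT InfiniBox": {"volume": {"internal", "round_trip"},
                            "port": {"internal", "round_trip"}},
    "Huawei OceanStor": {"volume": {"internal"}, "port": {"internal"}},
    "Pure FlashArray": {"volume": {"internal", "round_trip"}, "port": set()},
}


def has_two_timers(platform: str) -> bool:
    """True when volumes and ports together expose at least two distinct timers."""
    levels = AVAILABLE[platform]
    combined = levels["volume"] | levels["port"]
    return len(combined) >= 2


for name in AVAILABLE:
    print(name, has_two_timers(name))
```

Combining a volume-level timer with a port-level timer is only an approximation, since the two are aggregated over different sets of I/Os, so the result is better read as a hint about where to look than as an exact decomposition.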

Conclusion

As you can tell, vendors provide varying measurements at the volume and port level. When visualizing the response time data, it is very important to understand whether you are looking at the internal processing time, the round-trip time, or the difference between the two.

Once you understand what you are looking at, you can also judge whether it is good or bad.

IntelliMagic Vision automatically applies hardware-specific and measurement-specific thresholds to intelligently highlight areas of your infrastructure that have performance risks. For a multi-vendor environment, it is especially important to properly interpret the response times so that you can manage any risk in the host, fabric, or storage array.

If you would like to see intelligent interpretation in action, start a free trial of IntelliMagic Vision so that you can see for yourself.

This article's author

Brett Allison
Director of Technical Services