Do you have any VMware connectivity risks? Chances are you do.
Unfortunately, they are hard to see. Tracing the real end-to-end risks from the VMware guest through the SAN fabric to the storage LUN is difficult in practice because it requires correlating many relationships from a variety of sources.
A complete end-to-end picture requires:
- VMware guests to ESX hosts
- ESX host initiators to targets
- ESX hosts and datastores, VM guests and datastores, and ESX datastores to LUNs
- Zone sets
- Target ports to host adapters and LUNs and storage ports
For seasoned SAN professionals, none of this information is very difficult to comprehend. The trick is tying it all together in a cohesive way so you can visualize these relationships and quickly identify any asymmetry.
Why is asymmetry important? Let’s look at an actual example:
Notice the ESX Hosts layer, where I have highlighted the asymmetry. This layer consists of four hosts: php10404, php00201, php00203, and php10203. Host php10404 has a single path, through the DCX13 switch, while the other three ESX hosts each have two paths through the fabric: one through DCX13 and one through DCX14.
You probably also noticed that the logical switch DCX14 has only three ports while its counterpart, DCX13, has four.
Because a VM has relationships with multiple ESX hosts, it is not surprising that many other VMs had the same issue. Should the single path from php10404 fail, any VM residing on that ESX host at the time of failure would lose connectivity to its data on the SAN fabric.
Further investigation revealed a problem with a switch port on DCX14.
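To make the idea of asymmetry concrete, here is a minimal Python sketch of the check applied to the example above. The host and switch names come from the example; the dictionary layout and the function names are my own assumptions for illustration, not the output format of any particular tool.

```python
from collections import Counter

# Host-to-switch paths as described in the example above.
# The dictionary layout is an assumption made for illustration.
host_paths = {
    "php10404": {"DCX13"},
    "php00201": {"DCX13", "DCX14"},
    "php00203": {"DCX13", "DCX14"},
    "php10203": {"DCX13", "DCX14"},
}


def find_asymmetric_hosts(paths):
    """Return {host: missing switches} for hosts lacking a path their peers have."""
    # Every switch that at least one host in the group can reach.
    all_switches = set().union(*paths.values())
    return {
        host: sorted(all_switches - switches)
        for host, switches in paths.items()
        if switches != all_switches
    }


def ports_per_switch(paths):
    """Count how many host paths land on each switch."""
    return Counter(switch for switches in paths.values() for switch in switches)


asymmetric = find_asymmetric_hosts(host_paths)
port_counts = ports_per_switch(host_paths)

print(asymmetric)         # hosts with fewer paths than their peers
print(dict(port_counts))  # host-facing port count per switch
```

With this data, the check flags php10404 as missing a path through DCX14, and the per-switch count shows four ports on DCX13 against three on DCX14, which matches what the diagram shows.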
Let’s take a look at another example:
In this example, we see a couple of issues. I have highlighted the asymmetry at the ESX host level. Host xvd086 has three active paths to the fabric, while the other ESX hosts associated with this VM have four paths (one through each logical switch at the next level). Note that DCX11 has only three ports. This is the switch that does not have a connectivity relationship with host xvd086.
The challenge with this scenario is that when switch ports fail or misbehave, the host may not be aware of the failure until the ESX host is restarted or a rescan for hardware occurs, which leaves a huge blind spot for VMware administrators.
Lastly, we see that there is asymmetry at the Target Ports level. SP-A contains four storage target ports, but SP-B has only two. This turned out to be a zoning issue: some of the target ports for SP-B were not included in the zone set. This was easily resolved.
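The zoning problem in this second example comes down to a set difference: target ports that exist on the array but appear in no zone. Below is a rough Python sketch of that check. All port names, zone names, and the assumption that SP-B has four ports in total are hypothetical; the example only tells us that SP-A showed four target ports and SP-B showed two.

```python
# Hypothetical inventory of target ports per storage processor.
# Assumes SP-B has four ports like SP-A; the example does not state the total.
array_target_ports = {
    "SP-A": {"A0", "A1", "A2", "A3"},
    "SP-B": {"B0", "B1", "B2", "B3"},
}

# Hypothetical zone set: zone name -> member ports (initiators and targets).
zone_set = {
    "zone_esx_spa": {"hba0", "A0", "A1", "A2", "A3"},
    "zone_esx_spb": {"hba1", "B0", "B1"},
}


def unzoned_target_ports(target_ports, zones):
    """Return {storage processor: target ports that appear in no zone}."""
    zoned = set().union(*zones.values())
    return {
        sp: sorted(ports - zoned)
        for sp, ports in target_ports.items()
        if ports - zoned
    }


missing = unzoned_target_ports(array_target_ports, zone_set)
print(missing)  # storage processors with target ports left out of the zone set
```

With this made-up data, the check reports that two SP-B ports are missing from the zone set, which is the kind of gap that produced the four-versus-two imbalance at the Target Ports level.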
Free SAN Connectivity Audit
For a free connectivity audit of your SAN environment, please send me an email at email@example.com
You can view the capabilities described in this blog in the video below: