For many mainframe performance shops, January resembles the calm after the storm of the busy holiday shopping season. Sometimes you can look around with surprise that the storm managed to leave you and your home (or organization) unscathed, and other times it’s a process of assessing the damage, putting out any ongoing fires, and preparing for the next storm.
For z/OS performance management, a stormy holiday season can mean excessive application downtime, lost revenue, customer frustration, and more. Even after excluding the direct financial costs, once we consider the true cost of downtime it’s clear why avoiding or preventing service disruptions is the number 1 priority for most mainframe performance teams.
When looking ahead at 2019, there’s a lot to consider. Mainframe transaction volumes continue to grow for most sites, and the size and complexity of the z/OS environment are increasing, yet the number of deep z/OS infrastructure performance experts continues to shrink.
Narrowing down priorities for the year can be difficult, but these three top most lists:
- Ensure Optimal Application Performance and Availability (Zero Downtime)
- Reduce Mainframe Costs
- Overcome the Skills Gap and Improve Staff Efficiency
Utilizing the right strategic and technical plan with modernized solutions can mean the difference between achieving these goals in 2019 or missing them.
Ensure Optimal Application Performance and Availability
When z/OS technical experts are looking to manage and monitor their z/OS infrastructure performance, it’s logical to seek out a real-time performance monitor. After all, doesn’t it make sense to want to know about a service disruption or application downtime as soon as it occurs?
When the only options are either knowing right away or after end-users report the issue, then yes, knowing right away is the better option. But knowledge about upcoming disruptions that can be prevented before they affect the end-user is much more valuable.
White-box analytics use built-in expert knowledge about the hardware and a site’s specific workloads to identify potential issues before they ever impact application availability.
The image above taken from IntelliMagic Vision for z/OS shows an exception table for all warnings and exceptions for Coupling Facility and includes a prioritized rating of each issue with built-in recommendations. This leads us to our first best practice.
Best Practice #1 Utilize a Predictive Analytics Solution to Eliminate Disruptions
Reduce Mainframe Costs
When it comes to reducing mainframe costs, usually the first culprit is MLC (Monthly License Charges), and rightly so. MLC costs consume up to 30% of some mainframe budgets and if left unchecked can skyrocket out of control.
There are many options available to lower or reduce a site’s MLC costs, from capping and MLC-specific solutions to simple tuning activities that just require the right visibility and knowledge of the critical areas.
From a z/OS performance management perspective, the real key is to lower costs without negatively impacting performance – and those two rarely cooperate with each other.
Our resident MLC expert Todd Havekost has written extensively on the effect that processor cache has on MLC costs. And in addition to processor cache, we typically examine 5 key areas when performing our MLC reduction opportunity assessments to see if there is additional room for MLC cost savings.
Rather than having multiple tools or solutions to optimize or reduce MLC costs and another tool to monitor your z/OS performance, save yourself the money and additional headache and look for an end-to-end performance solution that provides built-in MLC visibility.
The three images above taken from IntelliMagic Vision for z/OS represent visibility that is critical to understanding what is causing the peaks that drive MLC costs.
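To make the idea of cost-driving peaks concrete: under sub-capacity pricing, MLC charges are generally tied to the peak rolling four-hour average (R4HA) of MSU consumption rather than to the instantaneous peak. The following is a minimal sketch of how that peak could be located in a series of interval samples. The function name, the hourly sample values, and the four-sample window are illustrative assumptions, not output from IntelliMagic Vision or from any IBM reporting tool.

```python
def peak_rolling_average(msu_samples, window):
    """Return (peak_average, end_index) for the rolling mean over `window` samples.

    `msu_samples` is a list of MSU readings at a fixed interval (hypothetical
    data). With hourly samples, window=4 approximates a rolling 4-hour average.
    """
    if window <= 0 or len(msu_samples) < window:
        raise ValueError("need at least `window` samples and window > 0")

    # Sum of the first full window.
    running = sum(msu_samples[:window])
    best_avg, best_end = running / window, window - 1

    # Slide the window one sample at a time, updating the sum incrementally.
    for i in range(window, len(msu_samples)):
        running += msu_samples[i] - msu_samples[i - window]
        avg = running / window
        if avg > best_avg:
            best_avg, best_end = avg, i
    return best_avg, best_end


# Hypothetical hourly MSU readings for one LPAR.
hourly_msu = [100, 120, 300, 320, 310, 280, 150, 110]
peak, end_hour = peak_rolling_average(hourly_msu, window=4)
print(f"Peak 4-hour average: {peak} MSU, window ending at hour {end_hour}")
```

Once the peak window is known, the useful question is which workloads were running during those hours, since shifting or tuning them is what lowers the average that is billed.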
Best Practice #2 Use a z/OS Monitoring Tool that lets you manage your performance AND MLC costs
But MLC costs are not the only line item that can be lowered with the right visibility. Others include:
- Eliminating emergency hardware purchases
- Purchasing the right amount of hardware
- Purchasing the right amount of flash storage
- Rebalancing workloads and applications
Clear visibility into your z/OS infrastructure allows you to monitor its performance, but advanced analytical solutions can also forecast your workload growth so that you purchase only the storage you actually need, rather than playing it safe and overpurchasing.
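As a rough illustration of what forecasting growth means in practice, the sketch below fits a straight-line trend to monthly storage usage and estimates how many months remain before installed capacity is reached. It is a deliberately simple least-squares model with made-up numbers and hypothetical function names. A commercial solution would likely account for seasonality and workload mix, so treat this as a sketch of the concept rather than a description of how any product forecasts.

```python
def fit_linear_trend(values):
    """Least-squares fit of y = slope * x + intercept, with x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def months_until_capacity(values, capacity):
    """Estimate months from the last observation until the trend hits `capacity`.

    Returns None when usage is flat or shrinking (the trend never reaches it).
    """
    slope, intercept = fit_linear_trend(values)
    if slope <= 0:
        return None
    crossing = (capacity - intercept) / slope  # x position where trend == capacity
    return max(0.0, crossing - (len(values) - 1))


# Hypothetical monthly usage in TB and an assumed installed capacity of 60 TB.
used_tb = [40, 42, 44, 46, 48, 50]
print(months_until_capacity(used_tb, capacity=60))
```

Comparing the estimated runway against procurement lead time gives a defensible date and size for the next purchase, instead of a padded guess.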
Best Practice #3 Forecast Workload and Capacity Growth to Eliminate Guesswork from Storage Purchases
Overcome the z/OS Skills Shortage and Improve Staff Efficiency
Overcoming the mainframe performance and capacity skills shortage and improving IT staff efficiency is essential for the long-term viability of mainframe operations. Otherwise, it’s impossible to ensure optimal application performance.
Even with fewer heads to pay for, mainframe costs will likely go up rather than down because of lost efficiency and the cost of the service disruptions that are likely to follow.
If you’ve attended a SHARE conference in the past 2-3 years, then you know that there are now numerous initiatives underway to try to fill the gap left behind by a retiring workforce of deep performance experts. These initiatives are working to hire and train a new generation of mainframe performance analysts, but the process is slow and doesn’t solve the issue of faster skill acquisition.
The fastest way to train the incoming workforce is to equip them with tools that allow them to easily visualize a new environment, pick up on the critical areas, and easily navigate through the data. In his white paper, 5 Key Attributes of an Effective Solution to the z/OS Performance Skills Gap, Todd Havekost writes that such a solution must be:
- Fast and Current
- Visual and Interactive
- Predictive and Contextual
- Versatile with Expanded Applications
- Cloud-based and Collaborative
Having a powerful tool not only trains incoming staff but also makes even the deepest subject matter experts more efficient and effective with their time and energy. By eliminating the need for manual coding or excessive report creation, and by making the sharing of reports and knowledge quick and easy, experts can instead spend the bulk of their time reducing costs, optimizing performance, and preventing disruptions.
Best Practice #4 Ensure your z/OS Performance Monitoring Solution is Easy to Use and has an Intuitive, GUI-based Interface for Quick Skill Acquisition
Best Practices for z/OS Application Infrastructure Availability in 2019
2018 was a watershed year for how RMF (or CMF) and SMF data is used by mainframe performance and capacity teams. The capabilities and expectations for what a z/OS performance monitor can and should do have dramatically shifted. 2019 will solidify this momentum.
In this blog I covered the following Best Practices for z/OS Performance Monitoring:
- Predictive Analytics > Real-Time Monitor
- Use 1 solution with visibility into performance & MLC costs
- Save on unnecessary storage purchases by forecasting workload and capacity growth
- Make IT staff more efficient and easier to train with an easy to use performance solution
On January 31st, Brent Phillips and Jerry Street dove deeper into some of these issues and solutions and discussed how white-box analytics (as well as the more common black-box analytics) not only compensates for the discrepancy between the requirements of the job and the time of the experts but also creates capabilities not previously possible.
The webinar, Best Practices for z/OS Application Infrastructure Availability in 2019, also covered additional best practices for ensuring efficient application availability from the z/OS infrastructure.