By Brent Phillips

The mainframe skills gap is a well-known issue, but most of the focus is on mainframe application development. A large z/OS mainframe organization may have thousands of application developers but only 20 or fewer performance & capacity planning staff. Even though fewer in number, these IT staff have an outsized impact on the organization.

The problem, however, is not just about recruiting new IT staff members to the team. The road to becoming a true z/OS performance and capacity (perf/cap) expert is far longer and more difficult than what is necessary for a programmer to learn to code in a mainframe programming language like COBOL. Consequently, it is not feasible to fill the performance and capacity planning gap with new recruits, and recruiting experienced staff from the short supply is difficult. Even teams that have all the headcount positions filled very often exhibit at least some of the signs that they are being negatively impacted by insufficient levels of expert staff.

A primary contributor to the problem is the antiquated way of understanding the RMF and SMF performance data that most sites still use. The way this data is processed and interpreted not only makes it difficult for new IT staff to learn the job, but it also makes the job for the existing experts more difficult and time-consuming.

Here are six signs that your z/OS performance and capacity team would benefit from modernizing the analytics for your infrastructure performance and configuration data.

1.     Production Problems Take Too Long to Diagnose

An IT disruption or outage can cost organizations thousands of dollars per minute, so diagnosing and resolving the problem quickly is critical for production and management teams alike. If your production problems often take too long to diagnose, this is a clear sign that your team lacks the visibility and analytics necessary to quickly identify the root cause of application availability issues.

2.     Production Problems Are Not Predicted and Prevented

Resolving production problems quickly is important, but being able to identify areas that will lead to issues before they occur is even more valuable. For teams forced to rely on a catalogue of unrated static reports, deriving this type of predictive intelligence from the data is not feasible. It requires modernizing the analytics with artificial intelligence techniques such as embedded expert infrastructure knowledge.

When performance analysts are too busy fighting fires and trying to diagnose problems without the benefit of modernized analytics, they don’t have the time to proactively evaluate the risk in the infrastructure and address the problems that are developing. This can lead to more fires and increased turnover as staff members are regularly overworked and stressed.

3.     Problem Resolution Often Requires Outside Vendor Help

When infrastructure performance problems require outside expertise from infrastructure vendors to resolve, the process becomes even longer and costlier. This indicates a need for an analytics solution that includes infrastructure-specific ratings. A cloud-based analytics solution that includes expert performance services can also bridge the gap in expertise and be more effective than relying on infrastructure vendors.

4.     You Only Have One Expert for Critical Infrastructure Components

Only having a single expert, or sometimes none at all, for a critical part of the infrastructure creates a risk. The analytics solution you use should facilitate cross-learning. Automated ratings based on the specific infrastructure components in use should allow any team member, even new staff, to understand if there are issues of concern that someone needs to address.
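
As a rough illustration of what an automated rating can look like, the sketch below compares a few metrics against warning and critical thresholds and returns a simple color rating that a non-expert can read at a glance. The metric names and threshold values are hypothetical placeholders chosen for this example. They are not taken from IntelliMagic Vision or from any IBM guideline, and a real product would derive its thresholds from the specific hardware and configuration in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Warning and critical limits for one metric (higher values are worse)."""
    warning: float
    critical: float


# Hypothetical metrics and limits, for illustration only.
THRESHOLDS = {
    "cpu_busy_pct": Threshold(warning=85.0, critical=95.0),
    "disk_response_ms": Threshold(warning=2.0, critical=5.0),
}


def rate(metric: str, value: float) -> str:
    """Return 'green', 'yellow', or 'red' for a single metric value."""
    limits = THRESHOLDS[metric]
    if value >= limits.critical:
        return "red"
    if value >= limits.warning:
        return "yellow"
    return "green"


def rate_all(samples: dict[str, float]) -> dict[str, str]:
    """Rate every metric in a sample so exceptions stand out immediately."""
    return {metric: rate(metric, value) for metric, value in samples.items()}
```

Calling `rate_all({"cpu_busy_pct": 90.0, "disk_response_ms": 1.2})` would flag the CPU metric as yellow and leave the disk metric green, which is the kind of signal a newer team member can act on without knowing the underlying report by heart.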

5.     No Time to Optimize MSU-Based Software Costs

IBM’s Monthly License Charges (MLC) can often be optimized given the right visibility. However, the architectures have changed and there are new metrics to analyze daily. Most teams already have a focus on MLC optimization, but it is too time-consuming to identify issues manually using the status quo approach to the performance and capacity data. IntelliMagic’s approach to the RMF and SMF data with modernized analytics opens new views that make optimization easier and more feasible. IntelliMagic has identified significant MLC optimization opportunities in over 85% of the sites that have sent data to us for an assessment. However, an organization that is too busy putting out fires and resolving availability disruptions will never have the bandwidth to begin optimizing its environment and reducing costs. If your MSU-based software costs continue to skyrocket year after year, there’s a good chance you can significantly reduce these costs once you understand the cost drivers.
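
To make one common cost driver concrete: sub-capacity MLC charges are generally tied to the peak rolling four-hour average (R4HA) of MSU consumption, rather than to total usage. The sketch below computes that peak from a list of equally spaced MSU samples. The function name, the sample values, and the choice of window size are assumptions made for this example. It is not how IntelliMagic Vision or IBM's reporting tools calculate the figure, only a minimal illustration of the idea.

```python
from collections import deque
from typing import Iterable, Optional


def peak_rolling_average(msu_samples: Iterable[float], window: int) -> Optional[float]:
    """Return the highest average over any `window` consecutive samples.

    With 5-minute samples (an assumption), a four-hour window is 48 samples.
    Returns None when fewer than `window` samples are supplied.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of samples")

    recent = deque()   # samples currently inside the window
    running = 0.0      # sum of the samples in `recent`
    peak = None

    for sample in msu_samples:
        recent.append(sample)
        running += sample
        if len(recent) > window:
            running -= recent.popleft()   # drop the oldest sample
        if len(recent) == window:
            average = running / window
            if peak is None or average > peak:
                peak = average
    return peak
```

Seeing which intervals set the peak, and which workloads were running during them, is typically the first step toward deciding what could be moved, capped, or tuned.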

6.     Difficulty Testing Infrastructure Performance of New App Releases

The z/OS platform excels at efficiently executing transactions and is a critically important component for many business applications. Yet application development teams rarely have an understanding of how application program changes impact the performance and cost efficiency of the underlying z/OS infrastructure. Changing your approach to DevOps testing can significantly improve release quality as it relates to the z/OS infrastructure’s ability to efficiently deliver the service levels required by specific applications. If robust infrastructure performance testing for new application releases is not occurring today, it is an important gap that can be addressed with a modernized approach.


If you’re already experiencing some or all of these signs within your performance and capacity processes, it is feasible to address the problem without increasing experienced headcount. A modern solution to z/OS performance and capacity planning addresses many issues that prevent experts from realizing their true potential:

  • Shortage of experienced professionals
  • Lack of necessary time
  • Likelihood of human error in rote data analysis
  • Need for responsiveness and a proactive approach to availability
  • Lack of insights to reduce and optimize costs

Applying artificial intelligence techniques to unlock the power of the RMF/CMF and SMF data elevates what a single expert can accomplish, helping the team meet today’s new business requirements. IntelliMagic Vision has been designed to help your existing and new performance and capacity staff successfully meet these challenges in today’s operating environments.

This article's author

Brent Phillips
Worldwide IBM Z Performance Evangelist