2018 is gearing up to be a watershed year for z/OS performance and capacity professionals.
Industry analysts have been talking for some years now about Artificial Intelligence (AI) and the role it will play in our work. But what that truly means, and the value it offers in day-to-day operations, have not yet been understood or realized by most professionals in this field.
There are many different types of AI, and not all of them are suited to the kind of infrastructure performance and availability health assessment that is no longer feasible for human analysts to perform proactively every day. When properly designed and deployed, however, AI has proven very effective at automating decisions about what all the data means, identifying current or near-term performance problems and their root causes.
The computer can be more effective at this because it is far more efficient than humans at continuously checking application workloads against hundreds or thousands of the most common issues that cause service disruptions on the specific infrastructure components running those workloads.
It can answer questions such as: which z/OS best practices indicate performance risk or problems? Which z/OS components are nearing saturation, have lost redundancy, or are being used inefficiently? This automated application of domain-specific expert knowledge lets human analysts focus on the most important issues and root causes that affect, or will soon affect, the required application service levels.
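To make this concrete, the automated best-practice assessment described above can be sketched as a rule base evaluated against each measurement interval. This is a minimal illustration only: the metric names, thresholds, and messages below are hypothetical examples, not actual z/OS best-practice values or any vendor's product logic.

```python
# Hypothetical sketch of automated, rule-based health assessment.
# Metric names and thresholds are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class Rule:
    metric: str       # name of the interval metric being checked
    threshold: float  # best-practice limit for this metric
    message: str      # root-cause hint surfaced to the analyst

# A real expert knowledge base would hold hundreds or thousands of such rules.
RULES = [
    Rule("cpu_busy_pct", 90.0, "CPU nearing saturation; review WLM goals"),
    Rule("paging_rate", 10.0, "High paging; review real storage usage"),
    Rule("chpid_busy_pct", 80.0, "Channel path busy; possible lost redundancy"),
]

def assess(interval_metrics: dict) -> list:
    """Return the best-practice rules exceeded in one measurement interval."""
    findings = []
    for rule in RULES:
        value = interval_metrics.get(rule.metric)
        if value is not None and value > rule.threshold:
            findings.append((rule.metric, value, rule.message))
    return findings

for metric, value, message in assess({"cpu_busy_pct": 94.5, "paging_rate": 2.0}):
    print(f"{metric}={value}: {message}")
```

Run continuously over each new interval of RMF/SMF data, a rule base like this surfaces only the exceptions, which is what frees analysts from scanning every report themselves.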
2018 Predictions: AI and the IT Infrastructure
Applying AI techniques to IT infrastructure operations data has already proven effective for quite some time, at least in IntelliMagic solutions. Based on our experience in the market, as well as comments in the press and by industry analysts, we expect 2018 to be a watershed year in terms of mainstream recognition of the benefits of using AI to operate the IT infrastructure for optimal service levels.
In the bigger picture, this modernized, AI-driven analytics approach addresses issues such as:
- Enabling experienced staff to get the answers they need far more quickly
- Training new staff and bringing them up to speed sooner
- Analyzing and interpreting vast amounts of data
- Predictively detecting areas that represent performance risk
- Identifying inefficiency that helps safely reduce costs
- Understanding more quickly the IT operations data sources produced by new infrastructure technology
Closing the Performance and Capacity Skills Gap
Many organizations are hiring new staff to complement their deep z/OS performance and capacity planning experts who are due to retire in the coming years. Yet the required skills take years to develop, and in the meantime the team must deliver continuous availability for the production applications. Solutions whose algorithms embed deep, platform-specific expert knowledge help new staff learn faster what is important, and reveal the sometimes-obscure root causes behind the more easily visible performance problem symptoms.
Augmenting Human RMF/SMF Data Analysis with Artificial Intelligence
Manual, proactive analysis of vast amounts of performance metrics is neither effective nor feasible, given the complexity and scope of the infrastructure and the limited human resources most teams have today. Instead, teams typically dig into the data only after performance issues arise. Inviting AI “to the team” enables the entire team to be more productive, more quickly.
Predictive & Preventative Performance Intelligence
Organizations do not need more reports; they already have more than enough for their staff to look at. What they need is refined intelligence about what is important in all the data and what it means for the performance, capacity, and efficiency of the infrastructure. Reacting to application availability disruptions after the fact is fast becoming too expensive and unreliable, and even real-time monitors are too late to avoid the production problem.
The need for proactively predicting and preventing service disruptions will soon become a fundamental requirement for all organizations – not just the largest financial institutions. Only AI technology that utilizes platform-specific expert domain knowledge can provide effective predictive capabilities with minimal false positives (alerting about unimportant issues) and without false negatives (missing the important problems).
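One way to see why domain knowledge reduces false positives is to gate a purely statistical anomaly signal with an expert-defined significance floor: a value alerts only if it is both statistically unusual and in a range where an expert would consider it to matter. The function and numbers below are an illustrative assumption, not any product's actual algorithm.

```python
# Illustrative sketch: statistical deviation detection gated by domain knowledge.
# A metric alerts only if it is BOTH a statistical outlier versus its history
# AND above an expert-defined floor where the metric starts to matter.
# All names and thresholds are assumptions for illustration.
from statistics import mean, stdev

def predictive_alert(history, current, domain_floor):
    """Alert when `current` is a >3-sigma outlier vs. `history`
    and exceeds the domain-knowledge significance floor."""
    if len(history) < 2:
        return False  # not enough history to judge deviation
    mu, sigma = mean(history), stdev(history)
    statistically_unusual = sigma > 0 and (current - mu) / sigma > 3.0
    domain_significant = current > domain_floor
    return statistically_unusual and domain_significant

# A jump from ~20% to 35% busy is statistically unusual, but an expert knows
# 35% utilization is harmless, so the domain floor suppresses the alert.
history = [20.0, 21.0, 19.5, 20.5, 20.0]
print(predictive_alert(history, 35.0, domain_floor=70.0))  # False: not significant
print(predictive_alert(history, 85.0, domain_floor=70.0))  # True: unusual and significant
```

Without the domain floor, the first case would have been a false positive; without the statistical test, a workload that always runs at 75% busy would alert constantly. Combining both is what the expert-knowledge approach described above provides.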
Reducing Costs without Impacting Performance
Finding ways to reduce ever-rising costs has always been a priority for organizations, but not at the expense of performance and availability. AI-driven analysis can automatically and continuously assess whether common inefficiencies have arisen in the dynamic infrastructure operation.
Keeping up with Modern Technologies
The z/OS infrastructure continues to add to its already rich source of metrics with additional metrics about new technologies such as Pervasive Encryption, data compression, and other features. Properly analyzing these new data sources using antiquated reporting techniques and products requires custom coding and manual interpretation. Consequently, many sites today have significant gaps in visibility into the metrics required to support newer technologies.
Better intelligence means processing and assessing all of these new data types. Representing that information in an easy to understand manner that is flexible and interactive eliminates the need to invest resources to learn, develop, and maintain one’s own custom reports to understand and manage these new infrastructure components.
Moving Ahead with AIOps
The integration of artificial intelligence with IT operations analysis is now being referred to by some in the market as “AIOps”. 2018 is likely to see AIOps emerge on a much larger scale than in previous years. It offers a breakthrough in productivity and effectiveness at a time when human analysts face increased loads: past staff reductions have left fewer people to cope with growing workloads and infrastructure complexity.
View our recorded webinar, 2018 RMF/SMF Analytics – Status & Predictions, where we discussed many of these topics in greater detail and demonstrated ways you can apply modern strategies to current problems you may be facing.