Jerry Street - 16 April 2018

When I was growing up, long car rides were a bit challenging due to our car’s alerting system: smoke, steam, horrible clunking noises, or dead silence. Everything was great until Betsy (my mom always named our car Betsy) stopped moving. Then we had to get the car to a mechanic who was an expert at making us feel ignorant and who usually took a lot of our money to fix something simple.

Then cars started getting better at alerting the operator about simple problems, but you still had to take the car to a mechanic to fix the problem. Today, between YouTube, Google, and Internet forums, you can often find the steps to resolve a lot of these alerts for a whole lot less money. However, there is still a lot of work left between getting an alert from your car and fixing the issue.

What if your car could alert you to an issue, do an Internet search for you, and send a fixit video to your smartphone before you could even get to a safe place to check your smartphone? That kind of intelligence would be convenient. The same principle applies to alerts you get from your z/OS Operating System.

When I started working in Operations, back when we still called it “MVS”, an Operator would see an alert and call me (usually at night). I would sometimes have to drive into the office or call another Systems Programmer, analyze the alert, and act upon it. What if the alert could now automatically perform root cause analysis and send supporting reports to your smartphone?

One of the major problems with alerts in IT is that digitally oriented machines are generating so many of them that Operators become desensitized. Projects to “clean up” alerts may end up filtering out necessary ones. I know of one project that set out to improve alerting by reducing the number of alerts, and it created more problems than it solved. Many customers even ask for a single pane of glass to contain their alerts and want those alerts to be smarter. That can wind up being a single glass of pain that adds no value if the alerts don’t lead to actionable solutions.

What Does that z/OS Alert Do for You?

Alerts from z/OS are common. Some come with software packages, some are part of the operating system, and some are written by Systems Programmers or your Developers.

Perhaps a loved service class that was missing its goal caused a business impact at one point in time. Someone probably wrote some automation to alert the Operator about that. Maybe a batch job missing its window triggers an alert. Paging could issue alerts. There are many scenarios where alerts cause someone to act and investigate the alert manually.

So when you get an alert, what does that alert do for you? Does it just scratch the surface and require further manual analysis? Whenever you hear or see the word “manually”, you can usually read it as “expensive”, “difficult”, and “laborious”.

The alert is meant to help and benefit your business, yet it almost always results in a lot more work just to understand whether the alert is valid and important, not to mention the additional analysis needed to understand the root cause.

User-Friendly Alerts

Alerts should do more than cause more work. Alerts should be part of the solution and not part of the problem when monitoring and maintaining z/OS. If z/OS or your z/OS automation alerts you to something, it should also get you very close to solving that something.

Do not settle for complicated, poor, or confusing alerts. When you see or hear about an alert, ask the question, “What solution does the alert suggest?”
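To make that question concrete, here is a minimal sketch in Python of what an alert that suggests a solution might carry compared with a bare one. Everything in it is an assumption for illustration: the `Alert` record, its field names, the `ONLINE` service class, and the evidence and action text are all made up, and nothing here is a z/OS or vendor API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alert:
    """A hypothetical alert record; field names are illustrative only."""
    source: str
    message: str
    evidence: List[str] = field(default_factory=list)
    suggested_action: str = ""

    def is_actionable(self) -> bool:
        # Treat an alert as actionable only if it carries both
        # supporting evidence and a suggested next step.
        return bool(self.evidence) and bool(self.suggested_action.strip())


def render(alert: Alert) -> str:
    """Format the alert as the text an Operator would see."""
    lines = [f"[{alert.source}] {alert.message}"]
    for item in alert.evidence:
        lines.append(f"  evidence: {item}")
    if alert.suggested_action:
        lines.append(f"  suggested action: {alert.suggested_action}")
    else:
        lines.append("  suggested action: none (manual analysis required)")
    return "\n".join(lines)


# A bare alert: it states the symptom and nothing else.
bare = Alert(source="automation", message="Service class ONLINE missed its goal")

# The same alert with made-up evidence and a made-up next step attached.
enriched = Alert(
    source="automation",
    message="Service class ONLINE missed its goal",
    evidence=["made-up example: goal missed in 3 consecutive intervals"],
    suggested_action="made-up example: review CPU delay for this service class",
)

print(render(bare))
print(render(enriched))
```

The `is_actionable` check is one possible way to answer the question above in automation: an alert with no evidence and no suggested action is flagged as one that will still need manual analysis.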

Part 2 of this blog, Automating Analysis of z/OS Alerts, discusses how you can automatically take that next step to make alerting even more valuable.



This article's author

Jerry Street
Senior z/OS Performance Consultant