Dennis Moore - 18 October 2021

With more and more customers moving to IBM’s Enterprise Consumption (EC) solution within Tailored Fit Pricing (TFP), it may be time to revisit some of the old tried-and-true MSU tuning strategies. With EC, all MSUs contribute to your software bill, no matter when they are consumed. In the past, you probably focused on MSU usage during your peak 4HRA and didn’t worry too much about the rest. Now, paying more attention to MSUs at non-peak times literally “pays” in terms of saving real money.

In one sense, it’s déjà vu all over again. Performance analysts have always been focused on optimizing for peaks: peak 4HRA, and peak intervals that push the limits of physical hardware. The overall approach is still the same, but now we also care about unnecessary consumption off-peak. This means we have a lot more to look at, and a lot more opportunity for savings.

At the same time, new members with less tuning experience are joining our teams. It’s a good time to revisit processes, tools, and skills to be sure you’re optimizing as much as possible.

In this blog, I’ll remind you of a simple yet powerful approach to identify workload increases quickly. Remember, the sooner you spot them, the sooner you can remediate the issue and save money.

How to Triage MSU Consumption Changes

They say a picture is worth a thousand words. Figure 1 shows MSU consumption by Workload, for the same system, for two different days. Do you see anything different?

Figure 1: 24-hour Workload comparison for system SYSA

Figure 1: 24-hour Workload comparison for system SYSA

 

There are several differences in the chart on the right. In particular, the STC workload increased for several hours in the morning, and the IMS workload increased late in the day. Notice that you didn’t have to take much time, or think a lot, or crunch any numbers to get the big picture quickly.

Humans are a visual species, and we’re pretty good at pattern recognition. You’re probably already thinking things like:

  • The STC increase is about 50 MSUs – not huge, but worth looking into…
  • STC went back down, so they probably already took some action – we just want to make sure it stays fixed…
  • The IMS increase is bigger, and hasn’t dropped yet – better check on that one first…
  • Everything else looks pretty much the same – probably no other follow up needed on SYSA…
  • Is this my 2nd cup of coffee or my 3rd…?

Now that we’ve completed SYSA’s triage-at-a-glance, let’s investigate the IMS workload increase to see what’s going on.

IMS Workload Increase Root Cause Analysis

Drilling down to the region level, we see a clear picture in Figure 2.

Figure 2: IMS address spaces

Figure 2: IMS address spaces

 

One region stands out with a dramatic increase beginning at 8:00pm.

It’s easy to see that this is not typical by reviewing a longer timeframe. Figure 3 shows the service class utilization for these regions for the prior month. Notice the spike on 4/19 – clearly this is not typical.

Figure 3: Prior month view confirms outlier

Figure 3: Prior month view confirms outlier

 

You can perform further due diligence, reviewing prior months’ data as well as plotting transaction rates vs. CPU consumption. Once you’re satisfied this increase is not business-as-usual, it’s time to ask the IMS team to investigate.

The point of this exercise was to quickly identify an increase that could consume MSUs if left unchecked, and you’ve done that.

STC Increase Root Cause Analysis

Figure 1 also showed an increase in workload STC. Let’s drill down to the address space level there as well, as shown in Figure 4.

Figure 4: STC workload by address space

Figure 4: STC workload by address space

 

Figure 4 shows a spike of about 50 MSUs for a single address space. As with the IMS example, due diligence would show that this is not typical and not expected. Informing the appropriate team about this spike will help them investigate and remediate the issue before it happens again, in the spirit of preventing wasted MSUs.

Figure 5 shows another way you can compare CPU consumption from one day to another, for the top 20 address spaces. I like to do a quick check both ways, since some issues may be easier to spot in one format vs. another. For me however, the intuitive ease-of-use of the area charts is hard to beat.

Figure 5: Comparison of daily CPU time for top 20 address spaces

Figure 5: Comparison of daily CPU time for top 20 address spaces

 

Tip: At IntelliMagic, our consultants review charts like these as one element of proactive performance reviews. We typically do a daily review comparing a current day vs. one week ago. We compare Monday to Monday, Tuesday to Tuesday, etc., so that business workload patterns are comparable. We adjust for anomalies such as month-end or holidays and follow the data as needed, but comparing to the same day one week ago is a great place to start.

Utilizing New Ways to Reduce CPU

Looking for ways to reduce CPU consumption is nothing new. In fact, you could say this was considered more critical decades ago, when processor hardware was by far the most expensive component. With TFP, we are now circling back.

So, what is new? Off-peak savings matter, we have a new generation of performance analysts, and new ways to report the data. Visual comparison of CPU utilization at the workload/service class/report class/address space level is a tried-and-true approach that still applies. Leverage the power of visualization, and drill down for more detail.

There are many ways to trim MSU consumption, including those discussed in the tuning articles you’ll find here at IntelliMagic. Use them all to optimize as much as possible. Then repeat, because we all know that things change. It’s Déjà vu all over again

Job-level CPU Saving Opportunities in an Enterprise Consumption Environment

Analysts involved in leading CPU reduction efforts will have to focus limited staff resources on the top potential savings opportunities. This article focuses on job-level CPU reduction opportunities.

This article's author

Dennis Moore
Senior Consultant
See more from Dennis

Share this blog

Sub-Capacity Reporting Tool (SCRT) Filtering Capabilities & Benefits

You May Also Be Interested In:

Blog

Processor Topology – A Picture Says 1024 Words

We all know that our z/OS LPARs rely on processors to execute the code to run our business. We also need to plan on having sufficient resources available for our workload when we design our LPARs.

Read more
Blog

Is Some Automation Unnecessary Noise or Just Lower Priority Work?

Computers are great at doing what we tell them. They don’t get tired, bored, or complain about the endless work we have them doing. But sometimes automation is running in not so obvious places. Learn some practical guidance for managing those workloads.

Read more
Whitepaper

Potential Infrastructure CPU Reduction Opportunities in an Enterprise Consumption Environment

From a performance perspective, IBM's Tailored Fit Pricing opens up an array of options for reducing MSU consumption. This article summarizes tuning actions that you can take at the infrastructure level to realize CPU reductions.

Download

Discover CPU and MLC Reductions for Your Environment