This blog originally appeared as an article in Enterprise Executive.
For more than 50 years, computer professionals have been interested in how to make applications run faster and how to find the causes of slow-running applications. In the early days, measuring computer performance was in some ways easy because electronic components were soldered in place. To understand what was happening at any point in the circuitry, we simply attached a probe and examined the waveform on an oscilloscope.
Eventually, we were able to measure activity at key points in the computer circuitry to determine things like CPU utilization, channel utilization and input/output response times. However, this method still had many shortcomings. First, the number of probes was very small, usually fewer than 40. Second, it gave no insight into operating system functions or application operations that might be causing tremendous overhead. And of course, when integrated circuits were developed, the probe points went away.
In 1966 I joined an IBM team that was focusing on a better way to conduct benchmarks in what was then named an IBM Systems Center. Customers considering computer upgrades would come to our data center to determine how their programs would operate on newly released hardware. But it was simply not possible to host every customer in this way.
I was part of a team of three that took on the task of quantifying the work done by specific customer jobs in terms of CPU and I/O activity, analyzing that information, and then predicting results using simulation models. These simulations allowed us to model the impact on service levels of different CPU speeds, workload variations, I/O device speeds, and so on.
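To make the idea concrete, here is a minimal sketch of that kind of what-if calculation. It is not the model our team built. It is a deliberately simple illustration that assumes CPU and I/O time do not overlap and that there is no queueing. The names `JobProfile` and `predict_elapsed_seconds`, and all the numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class JobProfile:
    """Measured work for one job on the baseline system (illustrative fields)."""
    cpu_seconds: float    # CPU time consumed on the baseline processor
    io_count: int         # number of I/O operations issued
    io_service_ms: float  # average service time per I/O on the baseline devices


def predict_elapsed_seconds(job: JobProfile,
                            cpu_speedup: float = 1.0,
                            io_speedup: float = 1.0) -> float:
    """Estimate elapsed time on a system with faster CPU and/or I/O devices.

    Assumption: CPU and I/O time are strictly additive (no overlap, no queueing).
    """
    if cpu_speedup <= 0 or io_speedup <= 0:
        raise ValueError("speedup factors must be positive")
    cpu_time = job.cpu_seconds / cpu_speedup
    io_time = job.io_count * (job.io_service_ms / 1000.0) / io_speedup
    return cpu_time + io_time


# Hypothetical job: 2 minutes of CPU and 50,000 I/Os at 30 ms each.
job = JobProfile(cpu_seconds=120.0, io_count=50_000, io_service_ms=30.0)

baseline = predict_elapsed_seconds(job)
faster_cpu = predict_elapsed_seconds(job, cpu_speedup=2.0)
faster_io = predict_elapsed_seconds(job, io_speedup=2.0)

print(f"baseline:      {baseline:.0f} s")
print(f"2x faster CPU: {faster_cpu:.0f} s")
print(f"2x faster I/O: {faster_io:.0f} s")
```

For this I/O-heavy example, doubling the CPU speed barely changes the estimate, while doubling the I/O device speed nearly halves it. That is the sort of question a customer considering an upgrade wanted answered before buying hardware.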
We first determined the points in the systems software to measure. Then we quickly discovered that the captured data was extremely helpful, becoming even more important than the predictive models we had developed. We had, for the first time, infrastructure performance information that no one else had. For example, we knew immediately that I/O Channels were busy doing a lot of things besides reading and writing user data. The exact overhead activities to execute commands were below the level of data we were capturing, but it was still amazing to see the overhead contribution to I/O service times.
We were successful in building monitoring and modeling capabilities for both MVT and MFT, which were variants of IBM’s OS/360 operating system. The models were also very successful in predicting the impact of various changes to those environments.
We were tracing systems activity as it happened, and the amount of data being captured was significant. Our team and IBM Systems Engineers used this capability to assist customers in evaluating their systems and applications. We also found a newer way to evaluate computer performance: using software to sample system statistics at intervals, first for CPU utilization, then for I/O and other activity.
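The difference between tracing and sampling can be shown with a small sketch. It is only an illustration, not the software we wrote at the time: the busy intervals are made up, and the function names are hypothetical. A trace records every busy period and gives an exact answer. A sampler only checks whether the CPU is busy at fixed intervals, so it captures far less data and returns an estimate.

```python
def utilization_from_trace(busy_intervals, duration):
    """Exact utilization from a full trace of (start, end) busy intervals."""
    busy_time = sum(end - start for start, end in busy_intervals)
    return busy_time / duration


def utilization_from_samples(busy_intervals, duration, interval):
    """Estimate utilization by checking busy/idle state every `interval` seconds."""
    sample_count = int(duration / interval)
    busy_samples = 0
    for i in range(sample_count):
        t = i * interval  # sample instant; multiply to avoid accumulating float error
        if any(start <= t < end for start, end in busy_intervals):
            busy_samples += 1
    return busy_samples / sample_count


# Made-up CPU busy periods (in seconds) over a 10-second window.
busy = [(0.0, 2.0), (3.0, 4.5), (6.0, 9.0)]
window = 10.0

exact = utilization_from_trace(busy, window)
coarse = utilization_from_samples(busy, window, interval=1.0)
fine = utilization_from_samples(busy, window, interval=0.5)

print(f"trace:            {exact:.2f}")
print(f"sampled at 1.0 s: {coarse:.2f}")
print(f"sampled at 0.5 s: {fine:.2f}")
```

The sampled figure approaches the traced figure as the interval shrinks, at the cost of more samples to collect and store. Choosing the interval is a trade-off between accuracy and overhead.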
At this time, Boole & Babbage developed a product to provide visibility into this data, and the team I was then leading at IBM developed a very similar capability, but it was only available to assist mainframe hardware pre-sales teams. Our team developed this capability for multiple versions of operating systems for IBM marketing team use first, and then released some versions for customers to use for their own analysis.
By the early 1970s, IBM had developed MVS, which included the System Resources Manager. Because MVS was a more complex system, it became obvious that a performance monitor would be needed. The first monitor developed was known as MF/1, which provided some basic information such as CPU utilization and some I/O information. As MVS continued to grow, there was demand for functionality similar to what my team had developed for other IBM systems, and we worked with the MVS (and later z/OS) developers to add many features of SVS/PT to MF/1. The result was named RMF (Resource Measurement Facility) when it was finally announced as a program product in 1974.
I’ve broken this blog into two parts so it doesn’t become too long. You can read part 2 here.