When troubleshooting remote cluster issues or trying to determine if your remote clusters are receiving replication, a few key reports will provide most of the insights necessary. These reports include:
- Replication Backlog
- Logical Volumes for Copy
- Average Deferred Queue Age
- Average Immediate Queue Age
- Inbound Total Copy Data Rate
- Average CPU / Disk Utilization
- VTS Cache Utilization
- Compressed Data on Logical or Physical Volumes
- Data Flows in and out of Cache
In this video, we walk through these reports and provide some insight into what to look for when trying to determine if your remote clusters are receiving replication.
When trying to determine if your remote clusters are receiving replication, there are really just a handful of key reports you should check.
You likely won't need to check these reports very often, but it is still important to keep an eye on them to ensure that remote VTSs are operating properly and receiving replication data in a timely manner. Deferred copy throttle will delay replication to remote clusters, but it should only be active during periods of high host activity.
For this video I used IntelliMagic Vision to create a custom dashboard that has these key reports.
The first key report is actually a group of minicharts covering Replication data for Receiving Clusters. I can drill into any of these reports to explore them further, but having them all in this view makes it easy to assess the replication health at a glance.
If deferred copy throttle is not the cause, and there is Replication Backlog while the data rate is zero or lower than normal, the network links may need to be reviewed. If the data rate is 0, the cluster may be unable to receive data at all, either because its cache is full or because of a possible hardware issue.
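The checks just described can be sketched as a small decision function. This is only an illustration of the reasoning, not something IntelliMagic Vision provides: the `ReplicationSample` fields, the `triage` function, and the `low_fraction` cutoff for "lower than normal" are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class ReplicationSample:
    """One interval of metrics for a receiving cluster (hypothetical structure)."""
    backlog: float          # Replication Backlog waiting to be copied to this cluster
    inbound_rate: float     # Inbound Total Copy Data Rate for the interval
    throttle_active: bool   # True if deferred copy throttle is slowing replication


def triage(sample, normal_rate, low_fraction=0.5):
    """Return the areas to review, following the order of checks in the text.

    low_fraction is an assumed cutoff: a rate below this share of the
    normal rate is treated as "lower than normal".
    """
    if sample.backlog <= 0 or sample.throttle_active:
        return []  # nothing waiting, or the delay is explained by throttling
    if sample.inbound_rate == 0:
        # No data arriving at all: links, full cache, or hardware
        return ["network links", "cache utilization", "hardware"]
    if sample.inbound_rate < low_fraction * normal_rate:
        return ["network links"]
    return []


print(triage(ReplicationSample(backlog=100, inbound_rate=0, throttle_active=False), normal_rate=200))
```

In practice the normal rate would come from your own history for that cluster, and the cutoff would need tuning to your environment.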
IntelliMagic Vision makes it easy to spot warnings or exceptions with its built-in ratings, which you can see in the colors around the charts.
Another indication of a replication problem is when either Average CPU or Disk Utilization is near 0 and stays there while Replication Backlog is increasing. It is normal for CPU and Disk activity to be low during a period of no tape activity or if deferred copy throttle is slowing replication, but if the cluster utilization stays near 0 when data is awaiting replication to this cluster, then it should be investigated.
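As a rough sketch of that check, the function below flags a cluster whose utilization stays near zero over several consecutive intervals while its backlog keeps rising. The function name, the `near_zero` threshold of 2 percent, and the three-interval window are assumptions made for illustration; they are not values taken from the product.

```python
def is_stalled(backlog, utilization, near_zero=2.0, min_intervals=3):
    """Return True if backlog rose in each of the last min_intervals intervals
    while utilization (percent) stayed at or below near_zero.

    backlog and utilization are equal-length lists of samples, oldest first.
    """
    if len(backlog) != len(utilization):
        raise ValueError("backlog and utilization must have the same length")
    if len(backlog) < min_intervals + 1:
        return False  # not enough history to judge a trend

    recent_backlog = backlog[-(min_intervals + 1):]
    recent_util = utilization[-min_intervals:]

    # Backlog must increase across every recent interval
    rising = all(later > earlier
                 for earlier, later in zip(recent_backlog, recent_backlog[1:]))
    # Utilization must stay near zero for the same intervals
    idle = all(u <= near_zero for u in recent_util)
    return rising and idle


print(is_stalled([10, 20, 30, 40], [1.0, 0.5, 0.0, 1.0]))
```

A result of `True` only means the pattern described above is present. Whether deferred copy throttle explains it still has to be ruled out separately.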
VTS Cache Utilization can be an indication as to why the cluster in a remote location is unable to receive data. Obviously if the utilization reaches 100%, there is no space for data. If the cluster has back-end tape, the library, drives, and media should be investigated to ensure they are operational. If there is no back-end tape, then the investigation moves on to why too much data has been retained in the cluster.
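The branch in that investigation can be written down as a tiny helper. It is a sketch only: `next_step_for_full_cache` and its parameters are hypothetical, and the 100 percent default simply mirrors the "no space for data" case described above.

```python
def next_step_for_full_cache(cache_utilization_pct, has_backend_tape, full_at=100.0):
    """Suggest where to look when a remote cluster's cache is full.

    Returns None when the cache is below the full_at threshold.
    """
    if cache_utilization_pct < full_at:
        return None  # cache still has space; look elsewhere
    if has_backend_tape:
        return "check that the library, drives, and media are operational"
    return "investigate why too much data has been retained in the cluster"


print(next_step_for_full_cache(100.0, has_backend_tape=True))
```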
The Data Flows in and out of Cache report is a complete view of data movement within a cluster. If a tape-attached cluster is not writing to the tape pool, then the back-end tape system should be investigated.
For host attached clusters, Virtual Device Write Throughput shows data actively being written to the cluster. Outbound copy data rate shows data being replicated to other clusters in the grid. Inbound copy data rate shows data being replicated to this cluster. Write Rate to Pool shows data that is being pre-migrated. Various configuration parameters can influence data movement within the grid, but for a healthy grid, data should be flowing as your configuration allows.
The dashboard helps keep all of these reports in a single location, and any user can share the dashboard with their coworkers. If there were any issues, we could simply click on any of the reports and drill down into the root cause.
And there you go. Keep an eye on these reports to determine if your remote clusters are receiving replication or not. I hope this video helped.
Check us out at intellimagic.com to learn more about z/OS performance and IntelliMagic Vision.