This post looks at the impact of a zero Recovery Point Objective (RPO) on mainframe virtual tape replication, focusing on the replication capabilities of the IBM TS7700.
Have you ever thought about how much money you will need to save for retirement? I was talking with my financial advisor the other day and decided that whatever you think you need you should double. You can plan on having social security but if social security fails then retirement plans start to look not so rosy.
The same thing applies to computer systems. Customers spend a lot of time and money on disk replication to reduce both the RPO and the Recovery Time Objective (RTO). But what if an application corrupts the data or a virus is uploaded? Corrupted or infected data is replicated just as easily as good data. This is why offline backup copies of disk files are made, and those copies also need to be replicated.
Most IT departments make at least one tape backup for onsite recovery and one for offsite recovery. But tape failures do occur, and the worst time to discover one is during an actual recovery. Some make two onsite disk copies to TS7720/TS7760 clusters, possibly with a third physical tape copy on a TS7740. A growing number are configuring tape metro mirrors, similar to disk metro mirroring. This is easily accomplished by putting the TS7720/TS7760s at two different local data centers within a short distance of each other. Alongside the metro mirror, a distant third site receives the data. That site can be lights-out, hot, or active, depending on RTO requirements.
Backup jobs are usually run in limited batch windows. Can a zero RPO tape backup replicate fast enough without impacting the batch window? What does the IBM TS7700 offer in the way of replication modes?
TS7700 Replication Modes
The IBM TS7700 virtual tape system can be configured in a grid and allows for several copy modes.
- No Copy (N) – the volume will not be copied to another cluster. This copy mode might be used when the data is not needed in multiple clusters.
- Rewind Unload (R) – copies the tape volume when the host job issues the Rewind Unload (RUN) command. Depending on the size of the tape volume and the network latency, the impact on the host job can be many minutes. Zero RPO.
- Synchronous Copy (S) – a forked write sends the data to both the original cluster and the target cluster at the same time. When the host job issues a Sync Point, the TS7700 guarantees that all data is in both clusters. Zero RPO.
- Deferred Copy (D) – the default mode; the volume is copied at a later time, after the tape volume has been dismounted. There is no impact on the job that created the tape volume.
- Time Delayed Copy (T) – an extension of Deferred Copy in which a copy is scheduled only after a user-specified delay. The delay can run until after peak processing, for a few hours, or, for volumes with a short life cycle that may never need to be copied, for a few days.
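The list above can be condensed into a small lookup table. The following is a minimal Python sketch for illustration only: the names `CopyMode`, `ModeInfo`, and `zero_rpo_modes` are hypothetical and are not part of any TS7700 interface, and the flags simply restate the descriptions in the list.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


@dataclass(frozen=True)
class ModeInfo:
    """Illustrative summary of one copy mode (hypothetical structure)."""
    code: str              # single-letter code used in the list above
    zero_rpo: bool         # True if the copy exists before the job moves on
    delays_host_job: bool  # True if the host job waits on replication


class CopyMode(Enum):
    NO_COPY = ModeInfo("N", zero_rpo=False, delays_host_job=False)
    REWIND_UNLOAD = ModeInfo("R", zero_rpo=True, delays_host_job=True)
    SYNCHRONOUS = ModeInfo("S", zero_rpo=True, delays_host_job=True)
    DEFERRED = ModeInfo("D", zero_rpo=False, delays_host_job=False)
    TIME_DELAYED = ModeInfo("T", zero_rpo=False, delays_host_job=False)


def zero_rpo_modes() -> List[str]:
    """Return the codes of the modes described above as zero RPO."""
    return [mode.value.code for mode in CopyMode if mode.value.zero_rpo]


if __name__ == "__main__":
    for mode in CopyMode:
        info = mode.value
        print(f"{info.code}: zero RPO={info.zero_rpo}, "
              f"delays host job={info.delays_host_job}")
```

The table makes the trade-off explicit: in this summary, the modes that give a zero RPO are the same ones that can add time to the host job.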
Which of the backup copies need a zero RPO?
The metro mirror needs a zero RPO. Because metro mirrors have low replication latency and high bandwidth, Synchronous Copy should have the least impact on host run time.
What about the third local copy to the TS7740 physical tape? If two copies already exist locally, then Deferred Copy has the least performance impact on the TS7700 environment. Time Delayed Copy is even better, making the tape copy after all other replications are complete.
For most configurations, replication to the remote data center will not be zero RPO. The distance is too great and the latency too high, resulting in too much delay to the backup jobs. Deferred Copy and its extension, Time Delayed Copy, have no impact on host job run time but have a non-zero RPO. Besides the volume being copied only after the job has dismounted it, the copy can take considerable time to start if there is a replication backlog.
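To get a feel for the numbers, here is a rough back-of-envelope sketch in Python. It is not a TS7700 sizing tool. The function names, the `efficiency` factor (a stand-in for latency and protocol overhead), and the assumption that a replication backlog drains in order before the new volume are all illustrative assumptions.

```python
def copy_seconds(volume_gib: float, link_gbit_per_s: float,
                 efficiency: float = 0.7) -> float:
    """Rough time to push one volume across a replication link.

    efficiency is an assumed fraction of the nominal link speed that
    is actually achieved; it is not a measured TS7700 value.
    """
    if volume_gib < 0 or link_gbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("sizes must be >= 0, link > 0, 0 < efficiency <= 1")
    bits = volume_gib * 8 * 2**30
    return bits / (link_gbit_per_s * 1e9 * efficiency)


def deferred_exposure_seconds(backlog_gib: float, volume_gib: float,
                              link_gbit_per_s: float,
                              efficiency: float = 0.7) -> float:
    """Rough time from dismount until a deferred copy completes.

    Assumes the existing backlog is copied first, then the new volume.
    """
    return copy_seconds(backlog_gib + volume_gib, link_gbit_per_s, efficiency)


if __name__ == "__main__":
    # Hypothetical example: a 10 GiB volume on a 1 Gbit/s link.
    wait = copy_seconds(10, 1.0)
    exposed = deferred_exposure_seconds(500, 10, 1.0)
    print(f"Rewind Unload copy adds roughly {wait / 60:.1f} min to the job")
    print(f"Deferred copy leaves data exposed for roughly {exposed / 60:.1f} min")
```

In this simple model, the same transfer time appears either as added job run time (Rewind Unload) or as a window during which the remote copy does not yet exist (Deferred), and a backlog only lengthens the latter.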
Measuring the backup impact of RPO
Both the job impact and replication delays of zero RPO and non-zero RPO can be measured using IntelliMagic Vision for Tape. See Burt Loper’s blog “IBM TS7700 Replication – Is your data safe? (Part 2 of 2)”.
Also see the blog “Does your Disaster Recovery Plan meet its objective?” for more information about these measurements and how IntelliMagic Vision provides a deep analysis.
Deep analysis of the IBM TS7700 SMF records and BVIR (Bulk Volume Information Retrieval) data can be performed using IntelliMagic Vision. IntelliMagic Vision interprets the SMF data from the mainframe using built-in intelligence and ratings about the specific hardware and workloads, and puts the results into a database for detailed reporting and historical trending in easy-to-use graphical views.
Using IntelliMagic Vision for TS7700 Performance Analysis
IntelliMagic Vision for TS7700 automatically compares the hardware views (via BVIR data) with the workload metrics, providing you with insight into how the standalone or gridded hardware is handling the work and replication between boxes.