Does your Disaster Recovery Plan meet its objectives? Analyzing TS7700 Tape Replication (Part 1 of 2)
This is the first of two blog posts on mainframe virtual tape replication.
One of the challenges in IT is getting your data replicated to another location so that you have a recovery capability if your main operations center is compromised. IBM TS7700 Series Virtualization Engines support the copying of your tape data to other locations.
This article explores the various TS7700 replication modes.
The IBM TS7700 Virtualization Engine is commonly known as a cluster. When you connect two or more clusters together, the result is called a grid or composite library. The information here applies to both the TS7740 model (which uses backend tape drives and cartridges to store tape data) and the TS7720 model (which uses a large disk cache to store tape data).
In a multi-cluster grid, the clusters are interconnected via a set of 1 Gb or 10 Gb Ethernet links. The TS7700s use TCP/IP to communicate with each other and to copy tape data from one cluster to another.
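To get a feel for what these link speeds mean for copying a volume, here is a minimal sketch that computes an idealized lower bound on transfer time. The function name and the 6 GB volume size are hypothetical, chosen only for illustration. The calculation ignores latency, TCP/IP overhead, and competing traffic, so real copy times will be longer.

```python
def ideal_transfer_seconds(volume_bytes: int, link_gbps: float, links: int = 1) -> float:
    """Lower bound on copy time: payload bits divided by the aggregate link rate.

    Ignores latency, protocol overhead, and contention (illustrative only).
    """
    if volume_bytes < 0 or link_gbps <= 0 or links < 1:
        raise ValueError("volume_bytes must be >= 0, link_gbps > 0, links >= 1")
    bits = volume_bytes * 8
    return bits / (link_gbps * 1e9 * links)


if __name__ == "__main__":
    volume = 6 * 10**9  # hypothetical 6 GB tape volume
    for rate in (1, 10):
        seconds = ideal_transfer_seconds(volume, rate)
        print(f"{rate} Gb link: at least {seconds:.1f} s")
```

In practice, effective throughput over distance is often well below the nominal link rate, so treat this figure as a floor and not as a prediction.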
TS7700 Replication Modes
In a multi-cluster grid, the TS7700 can copy tape volume data from one cluster to one or more additional clusters in the grid. This process is known as replication.
The following Copy Modes are available:
- Deferred (D) – the volume will be copied at a later time after the tape volume is closed. The application will not wait for the volume to be copied. This copy mode might be the default, especially if there is significant latency in the network.
- Rewind Unload (R) – the volume will be copied when the Rewind Unload (RUN) command is received for the tape volume. The application will wait for the volume to be copied before it will proceed further. Depending upon the size of the tape volume and the latency in the network, this can be many minutes. This copy mode is also referred to as Immediate Copy and RUN Copy. This copy mode might be used for critical data set backups (e.g. the tape management catalog) where you want to make sure that the data is at the other location before the backup job completes. However, this has performance implications as the elapsed time of your jobs may be elongated.
- No Copy (N) – the volume will not be copied to this cluster. This copy mode might be used when the data is not needed in multiple clusters or when you only need copies in some but not all clusters in the grid.
- Synchronous Copy (S) – This is what I call a forked write. The data is buffered to the original cluster and the target cluster at the same time. When the application issues a Sync Point, the TS7700 guarantees that all data previous to the Sync Point is in both clusters.
This copy mode should be used with care as there are performance implications, especially if there is significant latency in communicating over a network to the target cluster.
This copy mode was originally designed for tape applications like HSM ML2. Customers may be mirroring their disk environment, but when HSM migrates a data set to tape, the data set is deleted from both the local disk and the mirrored disk. Synchronous mode copy ensures that the migrated data set is in both the local TS7700 and the remote TS7700 before the data set is deleted from disk. The other copy modes could not guarantee this because, even with immediate copy, replication did not occur until end-of-volume.
- Time Delayed Copy (T) – this is an extension of deferred copy mode where a copy is scheduled only after a user-specified delay. A copy can be delayed for up to 65,535 hours. This copy mode was introduced with Release 3.1 of the TS7700 microcode in December 2013.
This copy mode is often used in hybrid grid situations where there is both a TS7720 cluster and a TS7740 cluster at each location. You may not want to buy enough capacity in the TS7720 clusters to hold all of your tape data, since there is a fair amount of older archive tape data. The TS7720 capacity would be configured to hold some number of days of tape data with the TS7740 keeping all of the older archive tape data on cartridges.
In the past, new tape data would enter either the TS7720 or the TS7740 and be replicated to the other. However, for short retention data, this caused a lot of unnecessary migrations and reclaims in the TS7740 cluster. Let’s say that you know that most of your tape backups expire within a 30 day period. Now, you can have data enter the TS7720 and only be replicated if the data has not expired after 30 days. This relieves a portion of the replication, migration, and reclamation workload that the TS7740 would otherwise have to perform.
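The copy modes and the time delayed decision described above can be sketched in a few lines of Python. This is an illustrative model only, not TS7700 microcode or any IBM interface: the names `CopyMode` and `time_delayed_copy_due` are hypothetical, and the sketch assumes the delay is measured from volume creation.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional

MAX_DELAY_HOURS = 65_535  # upper limit stated in the article


class CopyMode(Enum):
    """The copy mode letters listed above."""
    DEFERRED = "D"
    RUN = "R"            # Rewind Unload, also called Immediate Copy
    NO_COPY = "N"
    SYNCHRONOUS = "S"
    TIME_DELAYED = "T"


def time_delayed_copy_due(created: datetime,
                          expires: Optional[datetime],
                          delay_hours: int,
                          now: datetime) -> bool:
    """Return True if a time delayed copy should be made at time `now`.

    Assumption: the delay runs from `created`. A volume that expires
    before the delay elapses is never copied.
    """
    if not 0 <= delay_hours <= MAX_DELAY_HOURS:
        raise ValueError(f"delay_hours must be between 0 and {MAX_DELAY_HOURS}")
    due_at = created + timedelta(hours=delay_hours)
    if now < due_at:
        return False  # delay has not elapsed yet
    if expires is not None and expires <= due_at:
        return False  # data expired first, so no copy is needed
    return True


if __name__ == "__main__":
    created = datetime(2024, 1, 1)
    day_31 = created + timedelta(days=31)
    delay = 30 * 24  # 30 days expressed in hours
    # Short-retention backup that expired after 7 days: never replicated.
    print(time_delayed_copy_due(created, created + timedelta(days=7), delay, day_31))
    # Archive volume with no expiration: replicated once 30 days have passed.
    print(time_delayed_copy_due(created, None, delay, day_31))
```

The short-retention volume is skipped entirely, which is the replication, migration, and reclamation work that the TS7740 avoids in the 30 day example.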
TS7700 Replication Performance
Each TS7700 cluster tracks many performance statistics about its operation. These statistics are accumulated internally over a 15-minute interval and, at the end of each interval, a set of statistics records is written to the TS7700’s disk. These records are known as Historical Statistics and are kept within the TS7700 for 90 days. The historical statistics can be retrieved from the TS7700 through a process known as Bulk Volume Information Retrieval (BVIR).
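As a quick piece of arithmetic, the 15 minute interval and 90 day retention imply the following number of statistics intervals per cluster. The variable names are illustrative only, and the sketch says nothing about record sizes or formats.

```python
INTERVAL_MINUTES = 15   # statistics interval from the article
RETENTION_DAYS = 90     # retention period from the article

intervals_per_day = 24 * 60 // INTERVAL_MINUTES
retained_intervals = intervals_per_day * RETENTION_DAYS

print(f"{intervals_per_day} intervals per day")
print(f"{retained_intervals} intervals retained per cluster")
```

Any analysis over the full history therefore works with a few thousand interval snapshots per cluster.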
Deep analysis of these BVIR performance metrics can be performed using IntelliMagic Vision. IntelliMagic Vision interprets the BVIR statistics, as well as SMF data from the mainframe, using built-in intelligence and ratings about the specific hardware and workloads. It stores the results in a database for detailed reporting and historical trending in easy-to-use graphical views. The next blog in the series will describe different ways to look at TS7700 replication health using IntelliMagic Vision.
Continue reading more in part 2: IBM TS7700 Replication – Is Your Data Safe?