Jan 21, 2013
So what did Santa bring your IT department this winter? If it was a shiny new disk-to-disk backup box to help protect your growing data storage estate, as it may have been for many organizations away from the bleeding edge, it is time to sort out how well this is going to fit with the rest of your infrastructure — especially the WAN part.
If, however, you are still debating or planning a move to disk-based backup, you are not alone: 2013 is likely to see an acceleration in the shift away from tape-based backup towards alternatives, whether they be disk-based, cloud, or hybrid. That is not to say tape is dead; it is very much alive, but it is no longer the best way to do system backups. Increasingly, the better option is off-site disk backup (or disk-to-disk replication), an ideal business continuity upgrade for the organization that needs the shortest possible recovery time when disaster strikes but cannot afford to replicate its entire systems in a duplicate data center.
One of the biggest benefits of off-site disk backup is speed: it is almost always faster than tape, because there are none of the media-mount or data-seek delays that tape entails. With tape, you must wait for the cartridge to be mounted and the drive to wind to the right position before you can access the data; with disk, the application can open multiple access streams at once.
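To illustrate that last point, here is a minimal Python sketch of the parallel-streams idea: where tape forces one sequential pass per mounted cartridge, a disk target lets a restore job pull several files concurrently. The `restore_file` function and the file names are placeholders, not any real product's API.

```python
from concurrent.futures import ThreadPoolExecutor

def restore_file(name: str) -> str:
    # Placeholder for reading one item back from the disk backup store;
    # with disk there is no mount or seek penalty per stream.
    return f"restored {name}"

# Hypothetical set of volume images to bring back after a disaster
files = [f"vol{i}.img" for i in range(8)]

# Four concurrent restore streams against the same disk target
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(restore_file, files))

print(results[0])  # → "restored vol0.img"
```

With a real backup appliance the streams would be bounded by disk and network throughput rather than by media handling, which is the point of the comparison.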
Also, off-site disk involves no media handling: replication technology sends the data across the WAN instead of tapes being physically moved around. Off-site disk backup can also take advantage of other technologies, such as snapshots and block-level backups, for finer-grained incremental backups.
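The block-level idea can be sketched very simply: fingerprint fixed-size blocks and ship only the ones whose fingerprints changed since the last backup. This is a toy illustration assuming 4 KB blocks and SHA-256 hashes; real products typically use variable-size chunking and their own manifests.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for illustration

def block_hashes(data: bytes) -> list[str]:
    """Split data into fixed-size blocks and fingerprint each one."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]

def changed_blocks(old_manifest: list[str], new_data: bytes) -> list[int]:
    """Return indices of blocks that differ from the previous backup."""
    new_manifest = block_hashes(new_data)
    return [
        i for i, h in enumerate(new_manifest)
        if i >= len(old_manifest) or old_manifest[i] != h
    ]

# Three-block volume; only the middle block is modified between backups,
# so only that block needs to cross the WAN.
original = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
manifest = block_hashes(original)
updated = b"A" * BLOCK_SIZE + b"X" * BLOCK_SIZE + b"C" * BLOCK_SIZE
print(changed_blocks(manifest, updated))  # → [1]
```

The payoff is that an incremental transfer scales with what changed, not with the size of the volume.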
For many users, the big advantage of disk-based backup over both tape and the cloud is that it enables faster and more reliable bare-metal restores. (It’s great for other things too, but let’s focus on disaster recovery for now.) Sure, you can back up to tape and send the cartridges off-site, or back up to a remote tape library, but getting those tapes back and then restoring from them is not going to be fast.
For those on the bleeding edge, restoring an entire system from cloud-based backups will also be slow: recovering individual files over your Internet connection is one thing, but pulling down terabytes of system image is quite another. It has got to the point where it will often be quicker — and can even have a lower carbon footprint, incidentally — for your backup service provider to bung the data onto a fat hard drive or a portable disk array and courier it to you.
All of this is why many organizations have moved to disk backup for business continuity. Arrays of fat disks are optimized for use as backup storage, typically with a layer of data deduplication in the controller, and can be targeted either directly as disk or as an emulated tape drive, the latter (a virtual tape library) easing the migration from tape-based backup processes. You can also add a layer of tape at the back end for long-term bulk storage, making it a hybrid backup, but this is less common.
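Deduplication is what makes these appliances practical: identical blocks are stored once and referenced many times. A minimal sketch of the idea, assuming a content-addressed store keyed by SHA-256 of fixed 4 KB blocks (real controllers use variable chunking and far more sophisticated indexing):

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for illustration

def dedupe_store(data: bytes, store: dict[str, bytes]) -> list[str]:
    """Write data into a content-addressed store; duplicate blocks are
    stored only once. Returns the recipe (ordered list of block hashes)
    needed to reassemble the data."""
    recipe = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # no-op if block already present
        recipe.append(digest)
    return recipe

def restore(recipe: list[str], store: dict[str, bytes]) -> bytes:
    """Reassemble the original data from its recipe."""
    return b"".join(store[h] for h in recipe)

store: dict[str, bytes] = {}
# Four logical blocks, but three of them are identical
backup = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
recipe = dedupe_store(backup, store)
print(len(recipe), len(store))  # → 4 logical blocks, 2 stored
assert restore(recipe, store) == backup
```

Repeated full backups of mostly-unchanged systems dedupe extremely well, which is why these boxes can hold many restore points in modest raw capacity.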
The key thing, though, for business continuity and disaster recovery is that the disk backup box does need to be off-site, either at a secondary site, or, if you don’t have one of those, at a co-location center. The challenge for many organizations will therefore be the WAN traffic involved in doing that disk-based offsite backup — and, of course, recovery.
While the recovery issue can be dealt with by shipping the disk backup box to your new server hardware, or by setting up the new servers at your co-lo or secondary site (which may of course be necessary anyway if the main site has been torched, flooded or otherwise rendered unusable), the same is not true of the backup itself, which has to run over the WAN day in, day out.
The WAN characteristics most likely to impact backup are limited bandwidth and high latency. You can buy yourself more bandwidth if you are feeling rich, but a better alternative in almost all cases will be WAN optimization technology, as this will cover all the bases: data compression and deduplication to shrink the amount of data to be transferred, plus data and link optimization to accelerate transfers and reduce latency. Add some bandwidth shaping or traffic prioritization to ensure routine loads don’t swamp key real-time application traffic, and your data protection really should shine like new.
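The compression part of that story is easy to demonstrate: backup traffic such as logs and database dumps tends to be highly redundant, so even generic compression shrinks it dramatically before it hits the wire. A quick sketch using Python's standard zlib (the payload here is an invented, deliberately repetitive example, not a claim about your data):

```python
import zlib

# Redundant, text-heavy payload of the sort backups are full of
payload = b"2013-01-21 03:00:00 backup job completed OK\n" * 10_000

compressed = zlib.compress(payload, level=6)
ratio = len(compressed) / len(payload)
print(f"{len(payload)} bytes -> {len(compressed)} bytes "
      f"({ratio:.1%} of original)")

# Lossless: the receiving end gets back exactly what was sent
assert zlib.decompress(compressed) == payload
```

WAN optimization appliances go further, deduplicating across transfers and tuning the transport layer, but the principle is the same: send fewer bytes and the latency and bandwidth you have go further.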
Image credit: pleasantpointinn (flickr)