When I write about tape drives, I usually get two very different sets of reactions. At one end are the jokes about lost data and 48-hour retrieval times. At the other are the comments, usually made in response to the first lot, along the lines of “When was the last time you used a tape drive — 20 years ago?”
That’s the scale of the challenge facing Spectra Logic, one of the few remaining companies committed to tape, as it tries to drag tape into the cloud generation. Yes, there is now a specification — called DS3, and based on Amazon S3 — for cloud-accessed tape libraries.
For many of the Web generation, tape is ancient history. It’s those spinning reels you see in 60s and 70s sci-fi, or the legacy backups that IT admins struggled with in the Dark Ages before cheap SATA disk arrays and online backup services. In short, it’s “Why are you asking me to use that old stuff?”
And yet, and yet… If you are one of the many who are struggling with Big Data and the information explosion, you’ll probably have realized that — attractive as it is — cloud storage is not quite the bargain some like to paint it as. Then again, it’s a damn sight more convenient than having to continually scale your on-site storage, as long as you have the tools in place to make the WAN bandwidth to the cloud affordable, of course.
Plus, those who really understand deep archiving also know that tape libraries have evolved immensely in the last 20 or 30 years. Modern versions can have multiple drives plus disk caching for faster data ingestion and recovery, can automatically scan tapes for health checking, and can even auto-migrate data to newer tape types as they become available. And while the latest LTO drives are scarily expensive, tape media is still cheaper than disk, and you can get petabytes of it in a single library, which still makes it the best choice for many types of data.
The problems are that the cost of such a library is way more than most organizations can afford, and tape needs storage skills — skills that are in increasingly short supply as students focus instead on skills for the Web. But those are exactly the sort of problems that cloud storage was invented to solve, so why not have cloud-based tape storage?
That’s exactly what the Spectra Logic technology aims to do, and I would be rather surprised if other developers don’t do something similar, whether it’s tape on the Web or a tape tier as part of a broader storage system. DS3 (deep simple storage service) is a very clever piece of technology: a cloud-style API that uses the REST (representational state transfer) architecture, plus a front-end caching device based on solid-state disk, to present a tape library as a Web-based object store.
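To make the "tape library as object store" idea concrete, here is a minimal sketch of what S3-style REST calls look like from a Web developer's side. It is not the DS3 API itself: the endpoint, bucket and object names are all made up for illustration, and the requests are only built, never sent.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical endpoint for illustration only; not a real Spectra Logic address.
ENDPOINT = "https://tape-archive.example.com"


def object_url(bucket: str, key: str) -> str:
    """S3-style addressing: every object lives at /<bucket>/<key>."""
    return f"{ENDPOINT}/{quote(bucket)}/{quote(key)}"


def put_object(bucket: str, key: str, data: bytes) -> Request:
    """Build (but do not send) an HTTP PUT that would store one object."""
    return Request(object_url(bucket, key), data=data, method="PUT")


def get_object(bucket: str, key: str) -> Request:
    """Build (but do not send) an HTTP GET that would fetch the object back."""
    return Request(object_url(bucket, key), method="GET")


# Assumed bucket and key names, purely illustrative.
upload = put_object("genome-archive", "run-0001.bam", b"...payload...")
download = get_object("genome-archive", "run-0001.bam")

for request in (upload, download):
    print(request.get_method(), request.full_url)
```

The point is that the client only ever sees HTTP verbs and URLs. Whether the bytes end up on disk or, via the solid-state cache, on a tape cartridge is hidden behind the interface.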
Spectra’s developers are keen to stress that although they are using a derivative of Amazon’s S3 interface, and existing S3 clients could be adapted for DS3 (which they also plan to open-source), they are not trying to do the same thing or be ‘S3 for tape’.
In particular, Web developers need to think of it differently: as an efficient conduit for offloading data. So if you have Big Data to archive, say, you can do it at between half and one-tenth the cost of cloud disk. If all you have is a few terabytes it’s not worth doing, but if you have petabytes then the savings can be significant.
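A quick back-of-the-envelope sketch shows why the scale matters. The only figures taken from above are the "half to one-tenth" ratios; the cloud-disk price is an invented placeholder, so treat the output as an illustration of the arithmetic rather than real pricing.

```python
# Hypothetical cloud-disk price in dollars per terabyte per month.
# This number is an assumption for illustration, not a quoted rate.
DISK_PRICE_PER_TB_MONTH = 25.0


def tape_cost_range(disk_cost: float) -> tuple[float, float]:
    """Tape archive cost at one-tenth (best case) to half (worst case) of disk."""
    return disk_cost / 10, disk_cost / 2


def monthly_saving_range(terabytes: float) -> tuple[float, float]:
    """Smallest and largest monthly saving versus keeping the data on cloud disk."""
    disk_cost = terabytes * DISK_PRICE_PER_TB_MONTH
    cheapest_tape, dearest_tape = tape_cost_range(disk_cost)
    return disk_cost - dearest_tape, disk_cost - cheapest_tape


for label, terabytes in (("5 TB", 5), ("5 PB", 5000)):
    low, high = monthly_saving_range(terabytes)
    print(f"{label}: saves between ${low:,.2f} and ${high:,.2f} a month")
```

The ratios are the same at both sizes, but the absolute saving grows a thousandfold between a few terabytes and a few petabytes. At the small end it is unlikely to justify rethinking how you archive.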
It also says something important about skillsets, of course: just as consumer tech is made more user-friendly, the industry must also adapt IT to meet the capabilities of those who will be managing and using it. If that means speaking the language of Web developers, then so be it.
But it is perceptions that will probably be the bigger challenge. I suspect that what DS3 needs is for someone to hide it behind a ‘low-cost cloud archiving’ or ‘cloud library’ website, and make an intellectual connection to the Big Data world. Tape in the cloud might not sell, but cheap Big Data archiving is a need that can only grow and grow.