Oxymoronic Disaster Recovery

I am still hearing a lot from vendors about how good their disaster recovery is.  Whether it is based on in-data center approaches or on off-site activities, the cry seems to be “Choose us: our solution can give you a vague chance of survival!”

This seems to be slightly wrong to me.  With all the technologies we have available to us these days, the real call should be “Choose us: we’ll stop disasters from happening!”

This is not just arguing around semantics.  There is a wealth of difference between trying to rescue an organization from the mire of a massive problem with its IT platform and trying to create a platform where any such problems can be effectively managed on the fly.

Let’s get this clear — “Disaster Recovery” is an attempt to stop your organization from going bust, and “Business Continuity” is an attempt to keep everything working, at least to some extent.  Disaster recovery means that the problem has become an issue to everyone — the organization cannot carry out the activities it needs to do.  Business continuity may involve IT people running around screaming incoherently, but as long as this is just happening down at the data center level, then the business can continue working.

Admittedly, business continuity used to be only for those with immensely deep pockets.  It depended on data center mirroring and mass replication of environments.  Banks, governments, and a few others used it; everyone else implemented faster backup and recovery software on the premise that getting an organization up and running again within one working day was better than taking four.

What can’t have passed anyone by is that the world is busy virtualizing.  Virtualization puts cost-effective business continuity within reach of far more organizations.

Application images can be used within the data center; if the data center fails, a “cold” image held in an externally hosted data center can be rapidly spun up.  This needs the data to be replicated in real time, which is becoming easier in the majority of cases where massive transactional throughput is not an issue.  Even with large transactional volumes, store-and-forward message buses (such as IBM WebSphere MQ) can ensure that transactional state is maintained and that few transactions are lost on any failure.
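To make the store-and-forward idea concrete, here is a minimal sketch in Python.  It is not how any particular message bus is implemented: the class names `StoreAndForward` and `FlakyLink` are invented for illustration, and the queue lives in memory, whereas a real bus would persist messages to disk so that they survive a crash of the sender itself.

```python
from collections import deque
from typing import Callable, Deque, List


class StoreAndForward:
    """Hold each message locally until the remote side accepts it.

    `send` is a hypothetical transport callback that returns True only
    when the remote site has acknowledged the message.
    """

    def __init__(self, send: Callable[[str], bool]) -> None:
        self._send = send
        self._pending: Deque[str] = deque()

    def submit(self, message: str) -> None:
        # Store first, then attempt to forward, so nothing is dropped
        # if the link to the second site is down.
        self._pending.append(message)
        self.flush()

    def flush(self) -> int:
        """Forward queued messages in order; stop at the first failure."""
        delivered = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break
            self._pending.popleft()
            delivered += 1
        return delivered

    @property
    def pending(self) -> int:
        return len(self._pending)


class FlakyLink:
    """Stand-in for the network link to the second data center."""

    def __init__(self) -> None:
        self.up = False
        self.received: List[str] = []

    def send(self, message: str) -> bool:
        if self.up:
            self.received.append(message)
        return self.up


link = FlakyLink()
bus = StoreAndForward(link.send)
bus.submit("debit account A")   # link is down: message is held locally
bus.submit("credit account B")
link.up = True                  # link restored
bus.flush()                     # held messages are forwarded in order
```

The point of the sketch is the ordering: a transaction is recorded locally before any attempt to forward it, so an outage delays delivery rather than losing the transaction.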

The cost of maintaining “cold” images is very low; essentially, it should be storage costs only.  The real cost kicks in when the image has to be spun up and used, because resources have to be allocated at that point.  However, on a shared cloud platform these resources should be available on a pay-as-you-go model: no direct payment for the hardware required, just a charge per kilowatt-hour of energy and per unit of CPU, storage, or network used.

The main cost will be on the data management side.  The replicated data store is not a “cold” image: it always has to be live, so you will be paying for continual use of active storage.  However, this second storage system is also an online, real-time backup system.  Essentially, it is also the disaster recovery solution.
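The shape of that cost model can be sketched in a few lines of Python.  Every figure below is a made-up placeholder rather than a real provider price, and the names `StandbyCosts` and `monthly_cost` are invented for illustration; the only thing the sketch is meant to show is which costs are always on and which appear only during a failover.

```python
from dataclasses import dataclass


@dataclass
class StandbyCosts:
    """Hypothetical monthly rates; none of these are real prices."""
    image_storage_per_month: float  # keeping the "cold" image on disk
    live_storage_per_month: float   # continuously replicated data store
    compute_per_hour: float         # pay-as-you-go rate once spun up


def monthly_cost(costs: StandbyCosts, failover_hours: float = 0.0) -> float:
    """Always-on storage plus compute only for hours actually used."""
    always_on = costs.image_storage_per_month + costs.live_storage_per_month
    on_demand = costs.compute_per_hour * failover_hours
    return always_on + on_demand


costs = StandbyCosts(
    image_storage_per_month=20.0,
    live_storage_per_month=300.0,
    compute_per_hour=4.0,
)

quiet_month = monthly_cost(costs)                        # no failover
failover_month = monthly_cost(costs, failover_hours=48)  # two days running
print(quiet_month, failover_month)
```

With these placeholder rates, the live data store dominates a quiet month, and compute only shows up on the bill in a month where the image is actually running.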

I would like to see the term “disaster recovery” consigned to the past, in the same way that “punch cards” and “thermionic valve computer” have been.

Indeed, “disaster recovery” is fast becoming an oxymoron: an increasing number of organizations will not make it out the other side if a disaster happens to them.  The key is to keep some level of capability running, so that the business can keep its own services and sales operating to some extent.  This is business continuity — a far better term in which the business may well want to invest.  Disaster recovery is, to all intents and purposes, an insurance policy — and insurance policies can always be cut in times of economic difficulties.