Oct 19, 2012
Picture the scene. You’re a sys admin, lying in bed at 4am in the UK. Suddenly, you’re awakened by the shrill ring of your smartphone. The remote office in Tokyo can’t access the sales system – and prospective customers are getting restless. The remote office has its own server and something has gone wrong with it, but the Tokyo staff are not techies – they are businesspeople.
Trouble is, you’re in London, at your house. Going back into the dim and dark past (say, five years), you would have had to get in your car, drive to the office, and use whatever tools were at your disposal to find the root cause of the problem. You would then have had to get someone in the Tokyo office online with you and talk them through rebooting the various parts of the system until it was back up and running. Then a major problem would become apparent: the copies of the files needed for full recovery of the software were stored in London, and provisioning these to the newly resurrected system in Tokyo was going to take some time.
Time to order in coffee and pizza to keep you going through the coming hours, with every other task pushed onto the back burner until Tokyo was sorted.
I recognise this: similar things happened to me when I was in charge of global systems covering the UK, US, Europe and the Far East. At times I would wake in a night sweat having dreamed about it happening – it didn’t even need to be a real emergency to be scary.
Today’s sys admins can have a relatively easy life – if they so choose. Virtualisation puts a layer of abstraction between the physical hardware and the software: if configured correctly, the failure of a single physical server is unlikely to cause the complete failure of a system. The use of virtual images – software stacks packaged as virtual machines or VMs – means that rebooting and rebuilding can be removed from the list of miserable tasks that a sys admin has to do when an application does go down.
Let’s replay the situation in two different ways:
First – the better-off sys admin, again lying in bed at 4am. You’re awakened – the remote office has the same problem. From your smartphone of choice, you bring up the systems management dashboard and identify the original problem. It may have been a virtual routing problem, or some such thing. No problem – you tick a few boxes on the screen, and a new virtual network topology is created. A new VM is provisioned from a golden image held in London, and will be up and running in a couple of hours once the image has been passed over the wide area network (WAN) from London to Tokyo. All you need to do is call in again when the transfer has completed to make the VM live. You can turn over and go back to sleep until that point.
Second – the modern, unconcerned sys admin. It’s still 4am and the same problem in Tokyo is picked up by the systems management software. It recognises that a virtual network route has gone down. It automatically creates a new virtual network topology, and provisions a new VM from London, using WAN acceleration to cut the time the provisioning takes. At 8am, you wake up fully refreshed, get ready, and go to the office. The systems management dashboard shows that Tokyo had a problem, but that it was only down for a few minutes and overall business impact was minimal. You look at the log message and turn back to today’s Dilbert strip.
Sipping your first skinny latte of the day, you wonder why your 50-year-old boss looks 65 and is always muttering about “the bad old days”. You just love your job….