Best Practices for Network Change Management

Evolving the network to a next-generation architecture is all the rage now. A digital business needs an agile, dynamic network to use as a foundation for innovation, so the network must now evolve. This is one of the reasons my fellow analysts and I are pushing the concepts of SDN, broadband WAN, and Network Functions Virtualization so hard – not evolving the network will start to cause companies to miss out on business opportunities.

However, most of the focus on network evolution tends to revolve around the technology, and I’ve seen very little on the topic of change management. While it’s not the most exciting topic, I can guarantee everyone one thing: it doesn’t matter what kind of new and sexy technologies are thrown at the network. If change management is done sloppily, many of the problems that exist with the legacy network will continue to exist with the next-generation network.

It’s more important than ever for businesses to develop best practices around network change management. Many of the details will be unique to an individual business, but the practices below are good starting points for any organization.

  • Never, ever make changes on the fly. When I was an engineer I made changes on the fly far too often. Most of the time this was fine, but every now and then an unforeseen issue would come up, and I would end up making more changes on the fly to counter the change I had just made. No matter how small or minor the change may seem, it’s just not a good idea. Businesses have become more reliant on the network, and the axiom of ‘measure twice and cut once’ has never been more true.
  • Implement a peer review process. No matter how small the change to the network, it’s always good to have a second or third set of eyes review the change plan. I know a lot of engineers review their own changes before making them, but sometimes it’s hard to see your own errors. Also, another experienced engineer may see an alternative approach to the problem being solved.
  • Create a change management advisory board. Most engineers I know don’t really like the concept of a change management advisory board (CAB), as it’s perceived to get in the way of making changes. However, in most large-scale or high-value networks, a CAB can be quite useful and add value. In verticals like healthcare, financial services, retail, or anywhere the network is the business, any change should be assessed against the impact it can have. A CAB should include a subset of the following members: engineers, customers, end-user representatives, application developers, facilities individuals, specialists, or other parties.
  • Document all changes. I’ll be the first to admit that documenting changes is never a fun thing. It’s time-consuming, and no one I know became an engineer because they like to document. However, good documentation can prove to be a valuable resource when troubleshooting or making future updates to the network. Documentation can significantly reduce the time taken to operate a network, so for anyone who wants to spend more time working on new and innovative things, spend the time to document network activities and save yourself hour upon hour of troubleshooting time later.
  • Keep backups and have a plan to back out of a change. As the expression goes, “stuff happens”. No matter how careful you are, no matter how much peer review is done and how many eyes look at a configuration change, sometimes things break. The question is, when things do break, are you ready for it and how quickly can you recover? This starts with keeping copies of the configuration for all network devices, backups of the software on them, and having the documentation readily available. The other part of recovery is having a plan and understanding the steps involved in rolling back a change. Preparation is the key here. Anyone who’s tried to recover a network device or make changes in a panic knows how hard it is, so be prepared, keep backups, and have a plan.
  • Automate where possible. My research has shown that human error has been the largest component of network downtime for the past decade. Everyone likes to blame telcos, the network device vendor, or even users for outages, but the fact remains that we engineers are responsible for more downtime than any other factor. Automation is one of the value propositions of software-defined networking, so if you’re on that path, take advantage of its orchestration and automation capabilities. However, even without SDN, scripts can be created and tested and the changes applied that way – the key is to remove the human element.
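
To make a few of these practices concrete, here is a minimal Python sketch of a scripted change that insists on a named reviewer, takes a backup first, verifies the result, backs out on failure, and records what happened. Everything in it is an assumption for illustration: `FakeDevice` is a hypothetical in-memory stand-in for a real device, and `apply_change`, `verify`, `reviewer`, and `change_log` are made-up names, not part of any vendor or SDN API. A real script would talk to devices through whatever tooling the organization already uses.

```python
import datetime


class FakeDevice:
    """Hypothetical in-memory stand-in for a network device."""

    def __init__(self, name, config):
        self.name = name
        self.config = dict(config)

    def get_config(self):
        # Return a copy so callers cannot edit the running config by accident.
        return dict(self.config)

    def load_config(self, config):
        self.config = dict(config)


def apply_change(device, change, verify, reviewer, change_log):
    """Back up, apply, verify, and roll back on failure; always log."""
    if not reviewer:
        raise ValueError("change needs a peer reviewer before it is applied")

    backup = device.get_config()        # copy kept for the back-out plan
    candidate = {**backup, **change}    # only the reviewed change is merged in
    device.load_config(candidate)

    try:
        ok = bool(verify(device))
    except Exception:
        ok = False                      # a crashing check counts as a failure
    if not ok:
        device.load_config(backup)      # back out to the known-good config

    # Document the change whether or not it succeeded.
    change_log.append({
        "device": device.name,
        "change": dict(change),
        "reviewer": reviewer,
        "result": "applied" if ok else "rolled back",
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })
    return ok


if __name__ == "__main__":
    log = []
    router = FakeDevice("edge-1", {"mtu": 1500})
    # The check here is deliberately simple: the MTU must stay at or above 576.
    apply_change(router, {"mtu": 100},
                 lambda d: d.get_config()["mtu"] >= 576, "peer", log)
    print(router.get_config(), log[-1]["result"])
```

The point of the design is that the backup, the verification step, and the log entry are part of the same code path as the change itself, so none of them can be skipped in a hurry.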