Mean Time to Repair (MTTR) is a common term in IT that represents the average time required to repair a failed component or device. In networking, MTTR is often longer than desired because there are many interdependencies, whereby an issue in one part of the network may cause a problem much farther downstream. Furthermore, a configuration change might appear to create a new issue, when in fact it just exposed something that was there all along but hidden.
It takes quite a bit of forensics to get to the root cause of a network problem. In the meantime (pun intended), there is plenty of blame to go around. The Wi-Fi network seems to be at the top of the list when the accusations fly – more so than any other section of the network. Why is that?
Maybe it is because Wi-Fi is notoriously flaky. Or maybe it is because Wi-Fi is on the “front line” – i.e. people are cognizant of the Wi-Fi network as it is right there in their face, whereas all the other components that go into a successful network experience are hidden, such as DHCP servers, DNS servers, WAN routers, and mobile devices. Furthermore, connecting to the Wi-Fi network is often the last activity taken before things go wrong, so it is natural to think the Wi-Fi network itself is to blame.
However, the Wi-Fi network is often not the actual source of the problem. The wired network, the internet connection, and the client devices themselves can be just as problematic. To clear Wi-Fi’s name and shorten what wireless admins call the “mean time to innocence,” IT departments have to sort through a ton of data to get to the actual root cause. This can be a challenging task. Here’s an example that puts things into perspective:
For six months, a Fortune 100 company had been plagued by an intermittent device connectivity issue: occasionally, the scanners didn’t scan. The company deployed onsite IT admins and network engineers and instrumented the warehouse with sniffers on all channels, just to catch the problem while it was happening. Six months went by. The problem persisted in the company’s warehouses all over the world. Frustration built among the business users, and the issue was costing the company millions in lost productivity. So how can modern Wi-Fi solve this?
Thanks to artificial intelligence (AI), the root cause of the problem described above was uncovered in minutes: the mean time to innocence dropped from months to minutes once the company replaced its legacy controller-based Wi-Fi with an AI-driven network. Modern technology makes it possible to collect huge amounts of data from every mobile client and move it to the cloud, where events are correlated using machine learning. Anomalies can then be detected in real time, and recommendations are given (or performed automatically) to correct issues before users even know they exist. In this particular case, the AI-driven wireless network took a dynamic packet capture the first time the issue occurred, within 45 minutes of the AI-driven cloud wireless being installed. And the packet capture doesn’t lie! It proved the problem was never with the “wireless network”: the company was setting its roaming request incorrectly. Within three days a fix was found, which was then rolled out globally.
IT administrators can also set up service levels to ensure user performance never drops below customized thresholds. For example, the IT department in the scenario above could have set a “roaming” threshold of sub-second for all Wi-Fi users. The minute a roaming threshold was violated, a series of automated events could have been triggered to alert IT of the issue and give them the actionable insight they needed to rapidly resolve it.
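To make the idea of a roaming service level concrete, here is a minimal sketch of how such a check might look. It is an illustration only, not any vendor's actual API: the `RoamEvent` record, the field names, the sample data, and the one-second `ROAMING_THRESHOLD_SECONDS` value are all assumptions chosen to mirror the "sub-second" threshold described above.

```python
from dataclasses import dataclass

# Hypothetical service-level threshold: a roam should finish in under one second.
ROAMING_THRESHOLD_SECONDS = 1.0


@dataclass
class RoamEvent:
    """One client roam between access points (illustrative fields only)."""
    client_id: str
    from_ap: str
    to_ap: str
    duration_seconds: float


def find_violations(events, threshold=ROAMING_THRESHOLD_SECONDS):
    """Return the roam events that took at least `threshold` seconds."""
    return [e for e in events if e.duration_seconds >= threshold]


def format_alert(event, threshold=ROAMING_THRESHOLD_SECONDS):
    """Build a human-readable alert message for one violating roam."""
    return (
        f"Roaming threshold violated: client {event.client_id} took "
        f"{event.duration_seconds:.2f}s roaming {event.from_ap} -> {event.to_ap} "
        f"(threshold {threshold:.2f}s)"
    )


# Made-up sample data standing in for telemetry collected from clients.
events = [
    RoamEvent("scanner-17", "ap-dock-1", "ap-dock-2", 0.12),
    RoamEvent("scanner-42", "ap-dock-3", "ap-aisle-9", 2.75),
    RoamEvent("scanner-08", "ap-aisle-4", "ap-aisle-5", 0.98),
]

violations = find_violations(events)
alerts = [format_alert(e) for e in violations]
for alert in alerts:
    print(alert)
```

In a real deployment the alert would presumably feed a ticketing or notification system and be paired with supporting evidence such as a packet capture; the sketch only shows the threshold comparison itself.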
AI replaces manual Wi-Fi management tasks with automated intelligence and provides deep insight that helps identify and fix problems faster than ever before. This is the direction that all IT is going – and Wi-Fi is right there on the front line.
This article is published as part of the IDG Contributor Network.