AI, machine learning and AIOps: the next frontier of networking

For telcos to offer next-gen connectivity to customers and end users, they need to get their own houses in order first by optimising their networks – and AI, machine learning and AIOps offer a smarter and quicker way to do it. We spoke to Beth Cohen, SDN Technology Strategist at Verizon, to discover what AI and ML mean to Verizon and other telcos, the cultural shifts the industry needs to make for them to succeed, and what to bear in mind when rolling out machine learning across a telco network.


AI, ML and AIOps mean different things to different organisations. How should we understand them in the telco context?

Within the networking and telecommunications industry, AI and machine learning are focused on how we can automate the optimisation of the network. There are two aspects to this: telcos doing it for their own infrastructure, to make sure networks, connections and so on are working optimally as traffic shifts; and telcos providing similar tools to customers so they can optimise their own networks.

Do these two aspects run side by side?

At Verizon, one led to the other. We started implementing AI and machine learning internally on our own infrastructure, and we have since started to extend these capabilities to our customers. Since SD-WAN networks have some optimisation built in as an integral part of their functionality, this has led us down that path with our customers too.

How is machine learning helping Verizon automate?

AI and machine learning are pretty new. At Verizon we have done a lot of optimisation by hand over the years – looking at historic data and tweaking networks – but what has changed recently is the ability to do it closer to real time. Speaking generally about the industry, I am involved in the End User Advisory Group (EUAG) of the Linux Foundation Networking project, which is creating a white paper on what’s happening in the AI and machine learning space within telco. We carried out a survey of both operators and vendors. The responses made it clear that it is still very early in the adoption cycle: both the technology and its adoption are still evolving. We at Verizon started our journey a few years ago, but even for us it’s still early days.

What cultural shifts are required within telcos to make AI and machine learning a success?

Twenty years ago, many companies had friction between their network engineers and their telecoms engineers – then someone had the brilliant idea: ‘They’re both doing the same thing, so why not put them in the same organisation?’ I see something similar happening in AI and ML as we go on this journey. In many cases, the applications and networks need to work together, because at the end of the day companies and people are interested in how the application performs. They don’t care about the network underneath, even though we as operators do.

I always say we are creating application-aware networks, even though I don’t see the app developers doing much to make network-aware applications. At Verizon we are creating more tools to allow developers to take advantage of the network, particularly driven by 5G initiatives. We are building a fantastic 5G network and the applications need the tools to take advantage of it.

What role is open networking playing in AI and machine learning?

Similar to what’s happening in 5G, which is emerging from Open RAN and other industry work within the open source and standards bodies, I see AI and ML coming out of open source consortia. I think this has to be the case – operators are used to working with each other, as we’ve been peering for years! Things are not quite the same on the vendor side.

What would you say is the most important principle of rolling out AI and machine learning within a telco?

Test! Don’t be afraid to go out there and do it. Performance testing is very important – this is not something the vendors particularly like, but at Verizon we find we need to do deep testing before we can roll anything out to our customers. We do extensive integration testing, putting the components into a lab and beating the hell out of them to see when failover occurs, which means that we can be comfortable supporting our customer SLAs.

Many vendors cannot do this kind of deep testing, as we are testing on real production networks. This is different from pumping data through in a lab. We have found that vendors converting from physical to virtual hardware often do not understand what needs to go into that virtualisation – you can’t just snap your fingers and put an ISO on there, you really have to tune it. You have to add DPDK, SR-IOV capability, and other types of tuning to make sure it works with the virtualised infrastructure.

Is there a culture of testing in our industry?

Telcos have been testing forever. We are doing more integration testing than we ever did before. Previously we relied on vendors testing the boxes they supplied, but when we receive a virtual machine application that acts as a router, we no longer trust that the vendor is able to do the full degree of testing that they were doing before, because they no longer have control over the infrastructure.

How can telcos balance speed and availability in their rollouts?

High availability can mean different things. Previously, it typically meant two boxes with failover. Now things are a lot more complex – some components are in the cloud, some are at the edge, and so on. It falls to us as operators to do the integration, do the testing, and make sure it works.

You will discuss AI, ML and AIOps at the Layer123 360 Network Automation Congress on 14 April. What can we expect from your talk?

I will talk about AI and ML as the next frontier of networking. We’re just starting on that journey at Verizon and we expect great things – but as with all tech, it ends up taking twists and turns and doing something totally different from what we first thought!
