...

  • Intelligent networking is rapidly moving out of the lab and being deployed directly into production
  • Operational maintenance and service assurance are still a priority, but there is increasing interest in using AI/ML to drive network optimization and efficiency
  • More research and development is needed to establish industry-wide best practices and a shared understanding of intelligent networking to support interoperability
  • There has been some work on developing common or shared data sources and standards, but it remains a challenge
  • LFN and the Open Source community are key contributors to furthering the development of intelligent networking now and in the future

Background and History

Beth Cohen 

0.5 page

At their heart, telecoms are technology companies driven by the need to scale their networks to serve millions of users reliably, transparently and efficiently. To achieve these ambitious goals, they need to optimize their networks by incorporating the latest technologies to feed the connected world's insatiable appetite for ever more bandwidth. To do this efficiently, the networks themselves need to become more intelligent. At the end of 2021, a bit over 2.5 years ago, LFN published its first white paper on the state of intelligent networking in the telecom industry. Based on a survey of over 70 of its telecom community members, the findings pointed to a still nascent field made up mostly of research projects and lab experiments, with a few operational deployments related to automation and faster ticket resolution. The survey did highlight the keen interest its respondents had in intelligent networking and machine learning, and their promise for the future of the telecom industry in general.

...

Challenges and Opportunities

Beth Cohen 

1 page - Focus on Telco pain points only

Ironically, as generative AI and LLM adoption becomes widespread in many industries, telecom has lagged somewhat due to a number of valid factors. As covered in the previous white paper, the overall industry challenges remain the same: constant pressure to increase the efficiency and capacity of operators’ infrastructure to deliver more services to customers at lower operational cost. The complexity of network traffic data, and the lack of a standard understanding of it, remain a barrier to faster adoption of AI/ML for optimizing network service delivery. Some of the challenges that are motivating continued research and adoption of intelligent networking in the industry include:

Common Telecom Pain Points

...

  • Operational Efficiency: The continuing need to reduce costs and errors, and potentially increase margins
  • Network Automation: Right-sizing network hardware and software, optimizing location placement
  • Availability: Identifying single points of failure in systems to improve equipment maintenance efficiency
  • Capacity Planning: Avoiding unnecessary upgrades or poor network performance from overloaded nodes

Challenges with Achieving Full Autonomy

0.5 page Lingli Deng Andrei Agapi 

...

0.5 page

  • Need for High Quality Structured Data: Communication networks are very different from general human-computer data sets, in that a large number of interactions between systems use structured data. However, due to the complexity of network systems and differences in vendor implementations, the degree of standardization of this structured data is currently very low, creating "information islands" that cannot be uniformly interpreted across systems. There is no "standard bridge" to establish correlation between them, so the data cannot be used as effective input to incubate modern data-driven AI/ML applications.
  • AI Trustworthiness: In order to meet carrier-level reliability requirements, network operation management needs to be strict, precise, and prudent. Although operators have introduced various automation methods in the process of building autonomous networks, organizations staffed by real people are still responsible for ensuring the quality of communication and network services. In other words, AI is still assisting people and is not advanced enough to replace them. This is partly because responsibility for losses caused by an AI algorithm cannot be clearly assigned (to the developer or to the user of the algorithm), and partly because deep learning models are based on statistical characteristics, so their behavior is uncertain and the credibility of their results is difficult to determine.
  • Uneconomic Margin Costs: According to the analysis and research in the Intelligent Networking, AI and Machine Learning White Paper, there are a large number of potential network AI technology application scenarios, but independently building a customized data-driven AI/ML model for each specific scenario is uneconomical and hence unsustainable, both for research and operations. Determining how to build an effective business model between basic computing power providers, general AI/ML capability providers and algorithm application consumers is an essential prerequisite for its effective application in the field.
  • Unsupportable Research Models: Compared with traditional data-driven dedicated AI/ML models in specific scenarios, the R&D and operation of large language models place higher demands on pre-training data scale, training compute cluster scale, fine-tuning engineering, energy consumption, management and maintenance. An open question is whether a shared research and development, application, operation and maintenance model can be built for the telecommunications industry, so that it can become a stepping stone for operators to realize high-end autonomous networks.
  • Contextual data sets: Another hurdle that is often overlooked is the need for the networking data sets to be understood in context. What that means is that networks need to work with all the layers of the IT stack, including but not limited to:
    • Applications: Making sure that customer applications perform as expected with the underlying network
    • Security: More important than ever as attack vectors expand and customers expect the networks to be protected
    • Interoperability: The data sets must support transparent interoperability with other operators, cloud providers and other systems in the telecom ecosystem
    • OSS/BSS Systems: The operational and business applications that support network services
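The "standard bridge" between vendor-specific structured data described above can be illustrated with a minimal Python sketch. It assumes two made-up vendor record formats and a made-up common schema; the names InterfaceSample, FIELD_MAPS, vendor_a, vendor_b and all field names are hypothetical and are not taken from any standard or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceSample:
    """Hypothetical vendor-neutral record for one interface counter sample."""
    device: str
    interface: str
    rx_bytes: int
    tx_bytes: int


# Hypothetical mappings: common field name -> vendor-specific field name.
FIELD_MAPS = {
    "vendor_a": {"device": "hostname", "interface": "ifName",
                 "rx_bytes": "inOctets", "tx_bytes": "outOctets"},
    "vendor_b": {"device": "node", "interface": "port",
                 "rx_bytes": "bytes_received", "tx_bytes": "bytes_sent"},
}


def normalize(vendor: str, record: dict) -> InterfaceSample:
    """Translate one vendor-specific record into the common schema."""
    mapping = FIELD_MAPS[vendor]  # raises KeyError for an unknown vendor
    return InterfaceSample(
        device=str(record[mapping["device"]]),
        interface=str(record[mapping["interface"]]),
        rx_bytes=int(record[mapping["rx_bytes"]]),
        tx_bytes=int(record[mapping["tx_bytes"]]),
    )


# Two differently shaped records describing the same measurement.
a = normalize("vendor_a", {"hostname": "r1", "ifName": "eth0",
                           "inOctets": "100", "outOctets": 250})
b = normalize("vendor_b", {"node": "r1", "port": "eth0",
                           "bytes_received": 100, "bytes_sent": "250"})
print(a == b)  # both records map to the same common representation
```

In practice an agreed industry data model would take the place of the ad hoc mapping table; the point is only that downstream AI/ML code sees one record type regardless of the source system.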

Emerging Opportunities

Sandeep Panesar 

Telecoms have been working on converged infrastructure for a while. Voice over IP has long been an industry standard, but there is far more that can be done to drive even more efficiencies in network and infrastructure convergence.

...

Achievements and Successes

Lingli Deng Andrei Agapi

  • The advent of transformer models and attention mechanisms [][], and the sudden popularity of ChatGPT [], LLMs, transfer learning and foundation models in the NLP domain have all sparked vivid discussions and efforts to apply generative models in many other domains [].

    Interestingly, all of: word embeddings [], sequence models such as LSTMs [] and GRUs [], attention mechanisms [], transformer models [] and pretrained LLMs [][] have long been around before the launch of the ChatGPT tool in late 2022. Pretrained transformers like BERT[] in particular (especially transformer-encoder models) were very popular and widely used in NLP for tasks like sentiment analysis, text classification [], extractive question answering [] etc, long before ChatGPT made chatbots and decoder-based generative models go viral.

  • That said, there has clearly been a spectacular explosion of academic research, commercial activity and ecosystems that have emerged since ChatGPT came out, in the area of both open [][][] and closed source [][][] LLM foundation models, related software, services and training datasets.

    Beyond typical chatbot-style applications, LLMs have been extended to generate code [][], solve Math problems (stated either formally or informally) [], pass science exams [][], or act as incipient "AGI"-style agents for different tasks, including advising on investment strategies, or setting up a small business [][]. Recent advancements to the basic LLM text generation model include instruction finetuning [], retrieval augmented generation using external vector stores [][], using external tools such as web search [], external knowledge databases or other APIs for grounding models [], code interpreters, calculators and formal reasoning tools [][]. Beyond LLMs and NLP, transformers have also been used to handle non-textual data, such as images [], sound [] and arbitrary sequence data [].

  • A natural question arises on how the power of LLMs can be harnessed for problems and applications related to Intelligent Networking, network automation and for operating and optimizing telecommunication networks in general, at any level of the network stack.

    Datasets encountered in telco-related applications have a few particularities. For one, data one might encounter ranges from fully structured (e.g. code, scripts, configuration, or time series KPIs), to semi-structured (syslogs, design templates etc), to unstructured data (design documents and specifications, Wikis, Github issues, emails, chatbot conversations).

  • Another issue is domain adaptation. Language encountered in telco datasets can be very domain specific (including CLI commands and CLI output, formatted text, network slang and abbreviations, syslogs, RFC language, network device specifications etc). Off-the-shelf performance of LLMs strongly depends on whether those LLMs have actually seen that particular type of data during training (this is true for both generative LLMs and embedding models). There exist several approaches to achieve domain adaptation and downstream task adaptation of LLMs. In general these rely on either 1) in-context learning, prompting and retrieval augmentation techniques; 2) finetuning the models; or 3) hybrid approaches. For finetuning LLMs, unlike for regular neural network models, several specialized techniques exist in the general area of PEFT (Parameter Efficient Fine Tuning), allowing one to finetune only a very small percentage of the many billions of parameters of a typical LLM. In general, the best techniques to achieve domain adaptation for an LLM will heavily depend on: 1) the kind of data we have and how much domain data we have available, 2) the downstream task, and 3) the foundation model we start from. In addition to general domain adaptation, many telcos will have the issue of multilingual datasets, where a mix of languages (typically English + something else) will exist in the data (syslogs, wikis, tickets, chat conversations etc). While many options exist for both generative LLMs [] and text embedding models [], not many foundation models have seen enough non-English data in training, thus options in foundation model choice are definitely restricted for operators working on non-English data.

  • In conclusion, while foundation models and transfer learning have been shown to work very well on general human language when pretraining is done on large corpora of human text (such as Wikipedia, or the Pile[]), it remains an open question whether domain adaptation and downstream task adaptation work equally well on the kinds of domain-specific, semi-structured, mixed modality datasets we can find in telco networks. To enable this, telcos should very likely focus on standardization and data governance efforts, such as standardized and unified data collection policies and high quality structured data, as discussed earlier in this white paper.

  • Deploying large models such as LLMs in production, especially at scale, also raises several other issues in terms of: 1) Performance, scalability and cost of inference, especially when using large context windows (most transformers scale poorly with context size); 2) Deployment of models in the cloud, on premise, multi-cloud, or hybrid; 3) Issues pertaining to privacy and security of the data for each particular application; 4) Issues common to many other ML/AI applications, such as ML-Ops, continuous model validation and continuous re-training.

  • High Quality Structured Data: Large language models can be used to understand large amounts of unstructured operation and maintenance data (for example, system logs, operation and maintenance work orders, operation guides and company documents, which are traditionally used in human-computer interaction or human-to-human collaboration scenarios). Effective knowledge extracted from this data can guide further automatic or intelligent operation and maintenance, thereby expanding the scope of application of autonomous mechanisms.
  • Uneconomic Margin Costs: Although equipment manufacturers can provide many domain AI solutions for professional networks or single-point equipment, these solutions are limited in "field of view" and cannot solve problems that require a "global view", such as end-to-end service quality assurance and rapid response to faults. Operators can aggregate management and maintenance data from various network domains by building a unified data sharing platform and, based on this, further provide a unified computing resource pool, basic AI algorithms and an inference platform (i.e. a cross-domain AI platform) to support scenario-specific AI applications for both end-to-end and intra-domain scenarios.
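As a rough illustration of the retrieval augmentation approach mentioned above under domain adaptation, the following Python sketch selects the most relevant domain snippet for a question and prepends it to the prompt. It is a toy under stated assumptions: the snippets are invented, the similarity measure is plain bag-of-words cosine similarity rather than a learned embedding model, and the resulting prompt is only printed, not sent to any LLM.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Bag-of-words term counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, docs: list, k: int = 1) -> list:
    """Return the k documents most similar to the query."""
    q = vectorize(query)
    return sorted(docs, key=lambda d: cosine(q, vectorize(d)), reverse=True)[:k]


def build_prompt(query: str, docs: list, k: int = 1) -> str:
    """Prepend the retrieved snippets to the question as context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Invented operational snippets, standing in for runbooks, wikis and tickets.
DOCS = [
    "Runbook: if a BGP session flaps, check the interface error counters and the hold timer.",
    "Wiki: the billing system exports invoices nightly at 02:00.",
    "Ticket: optical link degradation was fixed by cleaning the fiber connector.",
]

print(build_prompt("Why is the BGP session flapping on this interface?", DOCS))
```

A production system would typically replace vectorize with a domain-adapted embedding model and an external vector store, which is where the domain-specific and multilingual concerns described above come into play.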

...

Jason Hunt (at least on how foundation models can be applied to network data) Andrei Agapi 

1 page

LF Data (Thoth)

Sandeep Panesar Beth Cohen 

  • Thoth project - Telco Data Anonymizer Project

...

AI has the potential to create value in terms of enhanced workload availability and improved performance and efficiency for NFV use cases. This work aims to build machine learning models and tools that can be used by Telcos (typically by the operations team in Telcos). Each of these models aims to solve a single problem within a particular category. For example, the first category we have chosen is Failure prediction, and we aim to create 6 models: failure prediction of VMs, Containers, Nodes, Network-Links, Applications, and middleware services. This project also aims to define a set of data models for each of the decision-making problems, that will help both providers and consumers of the data to collaborate.
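To make the idea of a shared data model for a failure prediction problem more concrete, here is a minimal Python sketch. It is not the project's actual data model or algorithm: HealthSample, ResourceKind, the chosen metrics and the thresholds are all hypothetical, and the scoring function is a naive rule-based baseline standing in for a trained model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class ResourceKind(Enum):
    """The six resource types named in the text."""
    VM = "vm"
    CONTAINER = "container"
    NODE = "node"
    NETWORK_LINK = "network_link"
    APPLICATION = "application"
    MIDDLEWARE = "middleware"


@dataclass(frozen=True)
class HealthSample:
    """Hypothetical shared record exchanged by data providers and consumers."""
    resource_id: str
    kind: ResourceKind
    cpu_util: float    # fraction of capacity, 0.0 to 1.0
    mem_util: float    # fraction of capacity, 0.0 to 1.0
    error_rate: float  # errors per second


def failure_score(history: Sequence[HealthSample],
                  util_limit: float = 0.9,
                  error_limit: float = 1.0) -> float:
    """Fraction of samples that look stressed (placeholder for an ML model)."""
    if not history:
        return 0.0
    stressed = sum(
        1 for s in history
        if s.cpu_util > util_limit
        or s.mem_util > util_limit
        or s.error_rate > error_limit
    )
    return stressed / len(history)


def predicts_failure(history: Sequence[HealthSample],
                     threshold: float = 0.5) -> bool:
    """Flag a resource when at least `threshold` of its samples are stressed."""
    return failure_score(history) >= threshold


history = [
    HealthSample("vm-2", ResourceKind.VM, 0.95, 0.50, 0.0),
    HealthSample("vm-2", ResourceKind.VM, 0.50, 0.97, 0.0),
    HealthSample("vm-2", ResourceKind.VM, 0.50, 0.50, 5.0),
    HealthSample("vm-2", ResourceKind.VM, 0.10, 0.10, 0.0),
]
print(failure_score(history), predicts_failure(history))
```

Because the record type is agreed up front, a learned model could later replace failure_score without changing how data providers publish samples.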

LLM & GenAI

Sandeep Panesar 

LLM (Large Language Models)

...

How could Open Source Help?

Ranny Haiby 

2-3 pages

When considering the role of open source software in addressing the challenges of Network AI, it is important to understand the current landscape of projects and initiatives and how they came into existence. Several such initiatives have already laid the groundwork for building Network AI solutions, or are actively working on creating them. Building on these foundations, it is possible to envision what role open source software will play in unleashing the power of AI for future generations of networks. Some of the technologies required for Network AI are unique to the Networking industry, and will have to be addressed by the existing OSS projects on the landscape, or by the creation of additional ones. Other pieces of technology are more generic, and will come from the broad landscape of AI OSS. Here is a rough outline of the different layers of Networking AI and the source of the required technology:

...

Experience with OSS in other domains shows that whenever there is an OSS technology that powers commercial products or offerings, there is a need to validate the products to make sure they are properly using the OSS technology and are ready to serve the end users in a predictable manner. Such validation/verification programs have existed for a while as part of OSS ecosystems. They are often created and maintained by the same OSS community that develops the OSS projects themselves. The Cloud Native Computing Foundation (CNCF) has a successful "Certified Kubernetes" program that helps vendors and end users ensure that Kubernetes distributions provide all the necessary APIs and functionality. A similar approach should be applied to any OSS Networking AI projects. End users should have a certain level of confidence, knowing that the OSS based AI Networking solution they use will behave as expected. 

Common Vision:

...

Intelligence plane for XG Networks

Ranny Haiby Muddasar Ahmed 

1 page

In the dynamic realm of communication technologies, the fusion of artificial intelligence (AI) with networks promises to redefine connectivity, ushering in an era of unprecedented intelligence, efficiency, and adaptability. As we embark on the journey towards 6G and embrace the vision outlined by the International Telecommunication Union (ITU) for IMT-2030, it becomes clear that AI will play a pivotal role in reshaping network operations.

...

In conclusion, the future of networks in the era of 6G and beyond hinges on the transformative power of AI, fueled by open-source collaboration. By embracing AI-driven intelligence, networks can enhance situational awareness, performance, and capacity management, while enabling quick reactions to undesired states. As we navigate this AI-powered future, the convergence of technological innovation and open collaboration holds the key to unlocking boundless opportunities for progress and prosperity in the telecommunications landscape.

Call for Action

Beth Cohen 

0.25 page

The future of Intelligent Networks and AI adoption in the telecom industry is in the hands of the individuals and organizations who are already contributing to projects and initiatives, and those who will join them. If you are involved in building and operating networks, developing network technology or consuming network services, you are most heartily encouraged to get involved. Engaging with OSS communities is a way to shape the future of Networking. Your contribution could be small or large, and does not necessarily involve writing code. In fact, the community is very much in need of contributors of white papers such as this one, evangelists, and big thinkers who want to help bring some really cool and useful leading edge technologies to fruition. Some of the ways to contribute include:

...