...

A natural question is how the power of LLMs can be harnessed for problems and applications related to Intelligent Networking, network automation, and the operation and optimization of telecommunication networks in general, at any level of the network stack.

Datasets

Datasets in telco-related applications have a few particularities. For one, the data one might encounter ranges from fully structured (e.g. code, scripts, configuration, or time series KPIs), to semi-structured (syslogs, design templates, etc.), to unstructured (design documents and specifications, wikis, GitHub issues, emails, chatbot conversations).
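To make the distinction between these data types concrete, the following minimal Python sketch handles one sample of each. All sample values, field names (such as `cell_id` and `throughput_mbps`), the host name, and the syslog layout are hypothetical and chosen only for illustration; real telco data will differ in format and scale.

```python
import csv
import io
import re

# Structured: a KPI time series in CSV form. Every field has a fixed
# name and type, so it can be parsed without any guesswork.
# (Hypothetical sample data.)
KPI_CSV = """timestamp,cell_id,throughput_mbps
2024-01-01T00:00:00,cell-17,93.5
2024-01-01T00:15:00,cell-17,88.1
"""

# Semi-structured: a syslog-style line. The header follows a fixed
# layout, but the trailing message is free text. (Hypothetical sample.)
SYSLOG_LINE = (
    "<189>Jan  1 00:16:02 edge-router-1 LINK-3-UPDOWN: "
    "Interface ge-0/0/1 changed state to down"
)

# Unstructured: free-form prose with no schema at all. This is the kind
# of input that typically has to be handled by a language model.
TICKET_TEXT = "Customer reports intermittent packet loss on the uplink since last night."

# Assumed header layout: <priority>timestamp host tag: message
SYSLOG_PATTERN = re.compile(
    r"^<(?P<pri>\d+)>"
    r"(?P<timestamp>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<tag>[^:]+):\s"
    r"(?P<message>.*)$"
)


def parse_kpis(text):
    """Parse structured KPI rows and convert the numeric column to float."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["throughput_mbps"] = float(row["throughput_mbps"])
        rows.append(row)
    return rows


def parse_syslog(line):
    """Extract the fixed header fields; the message stays free text."""
    match = SYSLOG_PATTERN.match(line)
    if match is None:
        raise ValueError("line does not match the assumed syslog layout")
    fields = match.groupdict()
    fields["pri"] = int(fields["pri"])
    return fields


if __name__ == "__main__":
    print(parse_kpis(KPI_CSV))
    print(parse_syslog(SYSLOG_LINE))
    print(TICKET_TEXT)
```

The structured sample parses completely, the semi-structured one only up to its free-text message, and the unstructured one not at all, which is why the latter two are the more natural targets for LLM-based processing.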

...

Related Open Source Landscape

  • Network communities

Successful development of AI models for use in networking relies on the availability of data that can be shared under a common license. The Linux Foundation created the Community Data License Agreement (CDLA) for this purpose. Using this license, end users can share data and make it available to researchers, who in turn develop the models and applications that benefit those end users. In addition to providing a legal framework for sharing research and innovation, open source provides a forum for creating communities with a shared purpose.

Open Source Software (OSS) communities have been successfully building projects that provide the building blocks of networks for over a decade now. OSS projects provide the underlying technology for all layers of the network, including the data/forwarding plane, control plane, management and orchestration. A vibrant ecosystem of contributing companies exists around these projects, consisting of organizations that have realized the value of the principles of OSS for networking:

...

The same principles are now being applied to the shared development of Network AI technologies, where the open source community fosters innovation and stimulates business growth. AI innovation has been strongly propelled by OSS projects that were initiated following the same principles mentioned above. It is hard to imagine doing any modern AI development without relying heavily on OSS. OSS AI and ML projects range from frameworks for development to libraries and programming tools. This body of OSS work means that data scientists who develop domain-specific models, such as those for networking and telecommunications, can focus on innovation by leveraging OSS projects to jump-start their work. It would be almost impossible to mention all the relevant OSS AI projects here, as there are already so many of them and the list keeps growing quickly. The Linux Foundation AI & Data maintains a useful dynamic landscape of these projects.


When it comes to the popular subject of LLMs, there is considerable ongoing debate about what an "open AI model" really means. While it is out of the scope of this paper to try to settle any of those debates, there is a clear need for an agreed definition of an "open LLM". The sooner such a definition is created and accepted by the industry, the faster innovation can happen.

The area of open source LLMs, with respect both to generative models [11] and to more specialized, discriminative models such as text classifiers, QA, summarization, and text embedding models [12], has been particularly vibrant and rapidly evolving over the past 5 years. A number of widely used global platforms for sharing open models, code, datasets and accompanying research papers have been particularly instrumental in democratizing access to cutting-edge technologies and fostering an environment of global collaboration. Among these platforms, Hugging Face [30] has played a particularly pivotal role. At the time of this writing, Hugging Face hosts [31] over 350K models, 75K datasets and 150K demo apps (Spaces), in more than 100 languages. It also maintains Transformers, a popular open source library that facilitates integrating and modifying thousands of foundation models from this vast repository and adapting them to downstream tasks. It also provides the Datasets library, as well as several widely used benchmarks and leaderboards [11][12] that are instrumental for researchers and developers implementing LLM solutions. Other important platforms used by the AI/ML open source community in general (not necessarily LLM-focused) are Kaggle [32], used for public datasets and high-profile ML competitions in all areas, and Papers with Code [33], which links academic research papers to their code and implementations and provides benchmarks and leaderboards comparing competing solutions for a wide range of ML tasks.

...