...

In addition to general domain adaptation, many telcos face the challenge of multilingual datasets, where a mix of languages (typically English plus one or more others) appears in the data (syslogs, wikis, tickets, chat conversations, etc.). While many options exist for both generative LLMs [11] and text embedding models [12], few foundation models have seen enough non-English data during training, so the choice of foundation model is somewhat restricted for operators working with non-English data. One way to work around this issue is to apply automated language detection and translation models to the data as a preprocessing step, as sketched below.
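As a rough illustration, the preprocessing step can be as simple as detecting each record's language and routing non-English records through a translation model. The snippet below is a minimal sketch assuming the open-source langdetect package and Hugging Face transformers with OPUS-MT translation models; the model names, the normalize_to_english helper, and the record contents are illustrative assumptions, not prescriptions.

```python
# Minimal sketch of a language-normalization preprocessing step.
# Assumes the open-source `langdetect` and `transformers` packages;
# model choices are illustrative, not prescriptive.
from langdetect import detect
from transformers import pipeline

# Illustrative per-language translation models (Helsinki-NLP OPUS-MT family).
TRANSLATORS = {
    "de": pipeline("translation", model="Helsinki-NLP/opus-mt-de-en"),
    "fr": pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en"),
}

def normalize_to_english(record: str) -> str:
    """Detect the language of a log/ticket record and translate it to English."""
    lang = detect(record)  # e.g. "en", "de", "fr"
    if lang == "en" or lang not in TRANSLATORS:
        return record      # keep English (or unsupported-language) text as-is
    return TRANSLATORS[lang](record)[0]["translation_text"]

tickets = [
    "Customer reports intermittent packet loss on eNodeB 4711.",
    "Kunde meldet zeitweisen Paketverlust an eNodeB 4711.",
]
print([normalize_to_english(t) for t in tickets])
```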

Beyond these approaches, the emerging technique of pre-training foundation models directly on network data holds promise. In this technique, network data is essentially turned into a language via preprocessing and tokenization, which can then be used to pre-train a new "network foundation model" [34]. Initial research has demonstrated this technique on Domain Name System (DNS) data [35] and geospatial data [36]. As this area of research matures, it could yield general-purpose network foundation models that can be fine-tuned to answer a variety of questions about network data or configurations, without having to train individual models for bespoke network management tasks.
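To make this concrete, here is a minimal sketch of how network telemetry might be tokenized into a "language" for pre-training, in the spirit of [34] and [35]; the record fields, the tokenize_dns_record helper, and the token scheme are illustrative assumptions, not a standard format.

```python
# Minimal sketch: turn DNS query logs into token sequences that a standard
# masked- or causal-LM pre-training loop could consume, just like word tokens
# from a natural-language corpus. Field layout and vocabulary are assumptions.
def tokenize_dns_record(record: dict) -> list[str]:
    """Map one DNS query log record to a flat token sequence."""
    return [
        f"qtype_{record['qtype']}",   # e.g. qtype_A, qtype_AAAA
        f"rcode_{record['rcode']}",   # e.g. rcode_NOERROR, rcode_NXDOMAIN
        *[f"lbl_{label}" for label in record["qname"].split(".") if label],
    ]

corpus = [
    {"qname": "cdn.example.com", "qtype": "A", "rcode": "NOERROR"},
    {"qname": "xj3k9.badhost.net", "qtype": "A", "rcode": "NXDOMAIN"},
]
sequences = [tokenize_dns_record(r) for r in corpus]
print(sequences)
```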

In conclusion, while foundation models and transfer learning have been shown to work very well on general human language when pre-training is done on large corpora of human text (such as Wikipedia or the Pile [29]), it remains an open question whether domain adaptation and downstream task adaptation work equally well on the kinds of domain-specific, semi-structured, mixed-modality datasets found in the telecom industry. To enable this, telecom operators should focus on standardization and data governance efforts, such as unified data collection policies and the development of high-quality structured data with a common definition and understanding.

Projects and Research

3GPP Intelligent Radio Access Network (RAN)

ChangJin Wang 

Wireless access network intelligence is in a phase of rapid evolution and continuous innovation. In June 2022, 3GPP announced the freeze of Release 17 (R17) and, in TR 37.817, described the functional framework of an intelligent RAN, comprising data collection, model training, model inference, and execution (actor) modules, which together form the infrastructure of an intelligent RAN. This promotes the rapid implementation and deployment of 5G RAN intelligence and provides support for intelligent use cases such as energy saving, load balancing, and mobility optimization.
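As a rough illustration of how these four modules interact, the sketch below wires them into a closed loop; the RanIntelligenceLoop class and its method names are our own illustrative assumptions, not interfaces defined by 3GPP.

```python
# Minimal sketch of the TR 37.817 functional framework as a closed loop:
# data collection -> model training -> model inference -> execution (actor).
from dataclasses import dataclass, field

@dataclass
class RanIntelligenceLoop:
    """Illustrative closed loop over the four functional blocks."""
    history: list = field(default_factory=list)

    def collect(self, cell_kpis: dict) -> None:
        # Data collection: gather per-cell KPIs (load, energy, handover stats).
        self.history.append(cell_kpis)

    def train(self) -> float:
        # Model training: here, a trivial average-load "model" stands in
        # for a real learned model.
        loads = [kpi["load"] for kpi in self.history]
        return sum(loads) / len(loads)

    def infer(self, model: float, current_load: float) -> str:
        # Model inference: flag a cell trending well above its typical load.
        return "offload" if current_load > model * 1.2 else "hold"

    def act(self, decision: str) -> None:
        # Execution (actor): apply the decision, e.g. trigger load balancing.
        print(f"RAN action: {decision}")

loop = RanIntelligenceLoop()
for load in (0.4, 0.5, 0.45):
    loop.collect({"load": load})
model = loop.train()
loop.act(loop.infer(model, current_load=0.7))
```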

  • AI and Machine Learning Drive 5G RAN Intelligence

Artificial intelligence and machine learning technologies are playing an increasingly important role in 5G RAN intelligence. The application of these technologies enables the network to learn autonomously, self-optimize, and self-repair, thereby improving network stability, reliability, and performance. For example, by using machine learning algorithms to predict and schedule network traffic, more efficient resource allocation and load balancing can be achieved. By leveraging AI technologies for automatic network fault detection and repair, operation and maintenance costs can be greatly reduced while improving user experience. The intelligence of 5G wireless access networks also provides broad space for various vertical industry applications. For instance, in intelligent manufacturing, 5G can enable real-time communication and data transmission between devices, improving production efficiency and product quality. In smart cities, 5G can provide high-definition video surveillance, intelligent transportation management, and other services to enhance urban governance. Additionally, 5G has played a significant role in remote healthcare, online education, and other fields.
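For instance, a traffic-prediction-driven load-balancing decision can be prototyped in a few lines. The sketch below, assuming only NumPy, fits a trivial lag-1 autoregressive model per cell and steers new sessions toward the cell with the lowest forecast load; real deployments would use far richer features and models, and the cell names and figures are invented for illustration.

```python
# Minimal sketch: lag-1 autoregressive traffic forecast per cell, used to
# pick a target cell for load balancing. NumPy only; data is illustrative.
import numpy as np

def fit_ar1(series: np.ndarray) -> tuple[float, float]:
    """Least-squares fit of x[t] = a * x[t-1] + b."""
    x, y = series[:-1], series[1:]
    a, b = np.polyfit(x, y, deg=1)
    return a, b

traffic = {  # per-cell traffic samples (e.g. Mbps in 15-minute bins)
    "cell_A": np.array([40.0, 44.0, 50.0, 57.0, 66.0]),
    "cell_B": np.array([30.0, 29.0, 31.0, 30.0, 30.5]),
}

forecasts = {}
for cell, series in traffic.items():
    a, b = fit_ar1(series)
    forecasts[cell] = a * series[-1] + b  # one-step-ahead forecast

# Steer new sessions toward the cell forecast to be least loaded.
target = min(forecasts, key=forecasts.get)
print(forecasts, "-> steer to", target)
```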

...

[1] Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems 30 (2017).

[2] Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).

[3] Mikolov, Tomas, et al. "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781 (2013).

[4] Mikolov, Tomas, et al. "Distributed representations of words and phrases and their compositionality." Advances in neural information processing systems 26 (2013).

[5] Pennington, Jeffrey, Richard Socher, and Christopher D. Manning. "Glove: Global vectors for word representation." Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014.

[6] Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9.8 (1997): 1735-1780.

[7] Chung, Junyoung, et al. "Empirical evaluation of gated recurrent neural networks on sequence modeling." arXiv preprint arXiv:1412.3555 (2014).

[8] Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).

[9] Jigsaw Multilingual Toxic Comment Classification Kaggle Competition: https://www.kaggle.com/competitions/jigsaw-multilingual-toxic-comment-classification

." Advances in neural information processing systems 30 (2017).

[2] Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).

[3] Mikolov, Tomas, et al. "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781 (2013).

[4] Mikolov, Tomas, et al. "Distributed representations of words and phrases and their compositionality." Advances in neural information processing systems 26 (2013).

[5] Pennington, Jeffrey, Richard Socher, and Christopher D. Manning. "Glove: Global vectors for word representation." Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014.

[6] Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9.8 (1997): 1735-1780.

[7] Chung, Junyoung, et al. "Empirical evaluation of gated recurrent neural networks on sequence modeling." arXiv preprint arXiv:1412.3555 (2014).

[8] Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).

[9] Jigsaw Multilingual Toxic Comment Classification Kaggle Competition: https://www.kaggle.com/competitions/jigsaw-multilingual-toxic-comment-classification

[10] TensorFlow 2.0 Question Answering Kaggle Competition: https://www.kaggle.com/competitions/tensorflow2-question-answering

[11] Huggingface Open LLM Leaderboard: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

[12] Huggingface Massive Text Embedding Benchmark (MTEB) Leaderboard: https://huggingface.co/spaces/mteb/leaderboard

[13] https://chat.openai.com

[14] https://gemini.google.com

[15] https://grok.x.ai

[16] Li, Raymond, et al. "Starcoder: may the source be with you!." arXiv preprint arXiv:2305.06161 (2023).

[17] Lozhkov, Anton, et al. "StarCoder 2 and The Stack v2: The Next Generation." arXiv preprint arXiv:2402.19173 (2024).

[18] Nijkamp, Erik, et al. "Codegen: An open large language model for code with multi-turn program synthesis." arXiv preprint arXiv:2203.13474 (2022).

[19] Nijkamp, Erik, et al. "Codegen2: Lessons for training llms on programming and natural languages." arXiv preprint arXiv:2305.02309 (2023).

[20] Azerbayev, Zhangir, et al. "Llemma: An open language model for mathematics." arXiv preprint arXiv:2310.10631 (2023).

[21] Kaggle LLM Science Exam Competition: https://www.kaggle.com/competitions/kaggle-llm-science-exam

[22] BabyAGI: https://github.com/yoheinakajima/babyagi

[23] Ouyang, Long, et al. "Training language models to follow instructions with human feedback." Advances in neural information processing systems 35 (2022): 27730-27744.

[24] Borgeaud, Sebastian, et al. "Improving language models by retrieving from trillions of tokens." International conference on machine learning. PMLR, 2022.

[25] Izacard, Gautier, and Edouard Grave. "Leveraging passage retrieval with generative models for open domain question answering." arXiv preprint arXiv:2007.01282 (2020).

[26] Yao, Shunyu, et al. "Webshop: Towards scalable real-world web interaction with grounded language agents." Advances in Neural Information Processing Systems 35 (2022): 20744-20757.

[27] Gao, Luyu, et al. "Pal: Program-aided language models." International Conference on Machine Learning. PMLR, 2023.

[28] Wang, Ruoyao, et al. "Behavior cloned transformers are neurosymbolic reasoners." arXiv preprint arXiv:2210.07382 (2022).

[29] The Pile Dataset: https://pile.eleuther.ai/

[30] Huggingface platform: https://huggingface.co/

[31] Huggingface platform statistics: https://originality.ai/blog/huggingface-statistics

[32] Kaggle platform: https://www.kaggle.com/

[33] Paperswithcode platform: https://paperswithcode.com/

[34] Le, Franck, Mudhakar Srivatsa, et al. "Rethinking Data-driven Networking with Foundation Models." HotNets '22, November 14-15, 2022, Austin, TX, USA.

[35] Le, Franck, et al. "NorBERT: NetwOrk Representations Through BERT for Network Analysis & Management." 2022 30th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS). IEEE, 2022.

[36] Geospatial Foundation Model: https://research.ibm.com/blog/geospatial-models-nasa-ai