
Using Large Language Models to Understand Telecom Standards (2404.02929v2)

Published 2 Apr 2024 in cs.CL and cs.AI

Abstract: The Third Generation Partnership Project (3GPP) has successfully introduced standards for global mobility. However, the volume and complexity of these standards have increased over time, complicating access to relevant information for vendors and service providers. Use of Generative AI, and in particular LLMs, may provide faster access to relevant information. In this paper, we evaluate the capability of state-of-the-art LLMs to be used as Question Answering (QA) assistants for 3GPP document reference. Our contribution is threefold. First, we provide a benchmark and measuring methods for evaluating the performance of LLMs. Second, we perform data preprocessing and fine-tuning for one of these LLMs and provide guidelines, applicable to all LLMs, to increase the accuracy of responses. Third, we provide a model of our own, TeleRoBERTa, that performs on par with foundation LLMs but with an order of magnitude fewer parameters. Results show that LLMs can be used as a credible reference tool on telecom technical documents, and thus have potential for a range of applications, from troubleshooting and maintenance to network operations and software product development.
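The QA-assistant setup the abstract describes can be illustrated with a minimal extractive QA sketch. This is not the authors' TeleRoBERTa pipeline (that model is not assumed to be publicly released); it uses a publicly available RoBERTa checkpoint fine-tuned on SQuAD 2.0 via the Hugging Face transformers library, with a short passage paraphrasing 3GPP-style text as a stand-in context.

```python
# Minimal sketch of extractive QA over a telecom-spec passage.
# Assumes the Hugging Face `transformers` library; the checkpoint below is a
# public RoBERTa model fine-tuned on SQuAD 2.0, used here only as a stand-in
# for the paper's TeleRoBERTa model.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

# Illustrative context paraphrasing 3GPP-style wording (not an actual spec excerpt).
context = (
    "The Access and Mobility Management Function (AMF) terminates the N1 and N2 "
    "interfaces, handles registration and connection management, and forwards "
    "session management messages to the SMF."
)

result = qa(
    question="Which network function terminates the N1 and N2 interfaces?",
    context=context,
)
print(result["answer"], result["score"])  # extracted answer span plus a confidence score
```

In a document-reference setting like the one evaluated in the paper, the context passage would typically be retrieved from the 3GPP corpus (for example via an embedding index) rather than supplied by hand.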

Authors (9)
  1. Athanasios Karapantelakis (5 papers)
  2. Alexandros Nikou (28 papers)
  3. Farnaz Moradi (4 papers)
  4. Christian Orlog (1 paper)
  5. Fitsum Gaim (2 papers)
  6. Henrik Holm (22 papers)
  7. Doumitrou Daniil Nimara (3 papers)
  8. Vincent Huang (3 papers)
  9. Mukesh Thakur (1 paper)
Citations (5)