Towards Integrated Fine-tuning and Inference when Generative AI meets Edge Intelligence (2401.02668v1)

Published 5 Jan 2024 in cs.DC and cs.LG

Abstract: The high-performance generative artificial intelligence (GAI) represents the latest evolution of computational intelligence, while the blessing of future 6G networks also makes edge intelligence (EI) full of development potential. The inevitable encounter between GAI and EI can unleash new opportunities, where GAI's pre-training based on massive computing resources and large-scale unlabeled corpora can provide strong foundational knowledge for EI, while EI can harness fragmented computing resources to aggregate personalized knowledge for GAI. However, the natural contradictory features pose significant challenges to direct knowledge sharing. To address this, in this paper, we propose the GAI-oriented synthetical network (GaisNet), a collaborative cloud-edge-end intelligence framework that buffers contradiction leveraging data-free knowledge relay, where the bidirectional knowledge flow enables GAI's virtuous-cycle model fine-tuning and task inference, achieving mutualism between GAI and EI with seamless fusion and collaborative evolution. Experimental results demonstrate the effectiveness of the proposed mechanisms. Finally, we discuss the future challenges and directions in the interplay between GAI and EI.

Introduction

The evolution of generative artificial intelligence (GAI) has brought significant advances in AI-generated content across many fields. At the same time, edge intelligence (EI), propelled by future 6G network technologies, promises to reshape how distributed computing power is organized and used. The intersection of these two domains presents a unique set of opportunities and challenges. This paper introduces the GAI-oriented synthetical network (GaisNet), a framework that synergizes GAI and EI within a collaborative cloud-edge-end intelligence architecture.

GaisNet: A Collaborative Framework

GaisNet is designed to bridge the gap between centralized, resource-heavy GAI models and the lightweight, flexible EI models situated closer to end users. By employing a bidirectional knowledge flow, GaisNet enables efficient model fine-tuning and improves the inference capabilities of GAI models. Edge servers play a pivotal role as knowledge relays in this process, handling both the domain-specific knowledge coming from client devices and the foundational knowledge coming from cloud-based GAI models.
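
As a rough illustration of what such a relay might exchange, the sketch below separates the downlink payload (foundation knowledge from the cloud model) from the uplink payload (domain-specific adapter updates aggregated from client clusters). This is an assumption about the interface, not code from the paper, and the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict
import numpy as np

# Hypothetical payloads for the bidirectional knowledge flow (illustrative only).
@dataclass
class DownlinkKnowledge:
    """Foundation knowledge pushed from the cloud-hosted GAI model."""
    backbone_version: str                 # which frozen pre-trained backbone to use
    adapter_init: Dict[str, np.ndarray]   # initial weights of the tunable modules

@dataclass
class UplinkKnowledge:
    """Domain-specific knowledge relayed upward by the edge server."""
    cluster_id: str
    adapter_update: Dict[str, np.ndarray]  # aggregated tunable-module weights
    num_samples: int                       # weighting factor for cloud-side merging
    # Note: neither payload carries raw user data.
```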

The paper highlights that while GAI benefits from large-scale pre-training on massive datasets, its further growth is constrained by the exhaustion of public training data and the concentration of resources among a few large technology companies. EI, on the other hand, despite its proximity to users and the vast data produced by IoT devices, is limited by small model scales that lack prior knowledge. This is where GaisNet steps in, proposing an integrated cloud-edge-end approach that taps into the best of both worlds.

The Operations of GaisNet

GaisNet operates on a dual-level knowledge flow: cloud-edge subnetworks and edge-end subnetworks. The cloud-edge subnetworks handle large-scale transfer of generalized foundation knowledge, while the edge-end subnetworks handle small-scale transfer of domain-specific knowledge. The framework lets the edge server act as a relay without any raw data transfer, thereby safeguarding user privacy while still exploiting localized knowledge.
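
One plausible reading of this data-free relay is that the edge server only ever averages the small tunable-module updates reported by its client clusters, weighted by their sample counts. A minimal sketch under that assumption follows; the function name and weighting scheme are illustrative, not taken from the paper.

```python
from typing import Dict, List, Tuple
import numpy as np

def relay_aggregate(
    cluster_updates: List[Tuple[Dict[str, np.ndarray], int]]
) -> Dict[str, np.ndarray]:
    """Aggregate tunable-module weights at the edge server.

    Each element is (adapter_weights, num_samples). Only these small
    parameter dictionaries are exchanged; raw data never leaves the
    end devices, which is what makes the relay "data-free".
    """
    total = sum(n for _, n in cluster_updates)
    keys = cluster_updates[0][0].keys()
    return {
        k: sum(w[k] * (n / total) for w, n in cluster_updates)
        for k in keys
    }
```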

The operational workflow of GaisNet includes stages such as model segmentation, data embedding, computation and transmission of tunable modules, and aggregation of the enhanced models. With the tunable parts of the model distributed across client clusters, GaisNet supports simultaneous model fine-tuning and task inference while preserving privacy and reducing communication overhead.
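
Read together, these stages resemble a split, parameter-efficient fine-tuning round: the large backbone stays frozen, each client trains only a small tunable module (for example a LoRA-style adapter, one of the options suggested by the paper's references), and only that module is transmitted for aggregation. The PyTorch sketch below is a hedged reconstruction of one such client round; the layer shapes, learning rate, and residual wiring are placeholders rather than the paper's actual configuration.

```python
import torch
import torch.nn as nn

class LoRAAdapter(nn.Module):
    """Low-rank tunable module attached to a frozen backbone (placeholder sizes)."""
    def __init__(self, dim: int, rank: int = 8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(dim, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Low-rank correction added on top of the frozen features.
        return x @ self.A @ self.B

def client_round(frozen_backbone: nn.Module, adapter: LoRAAdapter,
                 data: torch.Tensor, labels: torch.Tensor) -> dict:
    """One local fine-tuning step: only the adapter is updated, and only
    its state_dict is sent back through the edge relay."""
    frozen_backbone.requires_grad_(False)
    opt = torch.optim.Adam(adapter.parameters(), lr=1e-3)
    # Model segmentation + data embedding: the frozen backbone produces features.
    with torch.no_grad():
        feats = frozen_backbone(data)
    logits = feats + adapter(feats)          # tunable-module correction
    loss = nn.functional.cross_entropy(logits, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Only these few tensors travel over the network.
    return {k: v.detach().clone() for k, v in adapter.state_dict().items()}
```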

Experimental Results and Future Directions

Experiments conducted to validate GaisNet indicate that pre-trained models achieve higher inference accuracy than models trained from scratch, and that parameter-efficient fine-tuning delivers strong performance with far fewer computing resources than full-parameter fine-tuning. The experiments also examine how non-IID (not independent and identically distributed) data and the number of client clusters participating in fine-tuning affect the model's convergence accuracy.
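
The non-IID setting referred to here is commonly simulated by drawing per-cluster label proportions from a Dirichlet distribution. Since the summarized paper does not spell out its partitioning scheme, the snippet below is only a generic illustration of how such skewed client-cluster splits are typically produced; the concentration parameter alpha controls the degree of heterogeneity.

```python
import numpy as np

def dirichlet_partition(labels: np.ndarray, num_clusters: int,
                        alpha: float = 0.5, seed: int = 0) -> list:
    """Split sample indices into non-IID client clusters.

    Smaller alpha -> more skewed per-cluster label distributions.
    """
    rng = np.random.default_rng(seed)
    clusters = [[] for _ in range(num_clusters)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Fraction of class c assigned to each cluster.
        props = rng.dirichlet(alpha * np.ones(num_clusters))
        cuts = (np.cumsum(props) * len(idx)).astype(int)[:-1]
        for cluster, part in zip(clusters, np.split(idx, cuts)):
            cluster.extend(part.tolist())
    return clusters
```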

Looking forward, the paper underscores several open challenges: privacy concerns in the use of GAI, the theoretical performance bounds of GAI under resource constraints, and the design of incentive mechanisms to encourage the participation of 6G end devices. These considerations are crucial to ensuring that, as GaisNet and similar frameworks evolve, they do so with a balanced view of ethical usage, resource optimization, and fair incentive distribution.

Authors (5)
  1. Ning Chen (128 papers)
  2. Zhipeng Cheng (16 papers)
  3. Xuwei Fan (8 papers)
  4. Xiaoyu Xia (15 papers)
  5. Lianfen Huang (13 papers)