
An Edge-Cloud Collaboration Framework for Generative AI Service Provision with Synergetic Big Cloud Model and Small Edge Models (2401.01666v1)

Published 3 Jan 2024 in cs.NI

Abstract: Generative artificial intelligence (GenAI) offers various services to users through content creation, which is believed to be one of the most important components in future networks. However, training and deploying big artificial intelligence models (BAIMs) introduces substantial computational and communication overhead. This poses a critical challenge to centralized approaches, due to the need for high-performance computing infrastructure and the reliability, secrecy, and timeliness issues of long-distance access to cloud services. Therefore, there is an urgent need to decentralize the services, partly moving them from the cloud to the edge and establishing native GenAI services to enable private, timely, and personalized experiences. In this paper, we propose a brand-new bottom-up BAIM architecture with a synergetic big cloud model and small edge models, and design a distributed training framework and a task-oriented deployment scheme for efficient provision of native GenAI services. The proposed framework can facilitate collaborative intelligence, enhance adaptability, gather edge knowledge and alleviate edge-cloud burden. The effectiveness of the proposed framework is demonstrated through an image generation use case. Finally, we outline fundamental research directions to fully exploit the collaborative potential of edge and cloud for native GenAI and BAIM applications.

Overview of the Framework

Recent advancements in Generative AI (GenAI) services, particularly those that produce content such as images, text, and videos, have called for more efficient deployment solutions. In light of the substantial compute and communication demands of big AI models (BAIMs), the paper underscores the urgency of decentralizing these services to the edge of the network. By integrating small edge models with a larger cloud model, the authors propose a novel edge-cloud collaboration framework designed for native GenAI service provision. This approach aims to reduce the computational load on centralized infrastructure, improve data security, ensure timely responses, and offer personalized services.

The Challenges Addressed

The proposed framework tackles three major challenges: adaptability, edge knowledge acquisition, and mitigation of cloud burden. The model adapts to varying communication, computation, and storage capacities across network nodes and can learn from local edge data, enabling the construction of more sophisticated BAIMs. Additionally, a distributed approach to training and deploying models lowers the demands on central servers for data storage, processing, and communication, aligning with ecological and economic goals.
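The summary does not spell out how edge updates are combined on the cloud. As a minimal sketch, assuming a federated-averaging-style aggregation (an illustrative assumption, not necessarily the paper's exact scheme), per-edge parameter updates can be merged without centralizing raw data:

```python
import numpy as np

def aggregate_edge_updates(edge_params, weights=None):
    """Weighted average of per-edge parameter vectors (FedAvg-style sketch).

    edge_params: list of 1-D numpy arrays, one per edge node.
    weights: optional per-node weights (e.g. local dataset sizes);
             defaults to a uniform average.
    """
    edge_params = [np.asarray(p, dtype=float) for p in edge_params]
    if weights is None:
        weights = np.ones(len(edge_params))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize to a convex combination
    return sum(w * p for w, p in zip(weights, edge_params))

# Three hypothetical edge nodes with different local data volumes
updates = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([0.0, 4.0])]
global_params = aggregate_edge_updates(updates, weights=[10, 30, 60])
```

Weighting by local data volume lets better-resourced edges contribute proportionally more, which is one common way to balance heterogeneous node capacities.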

Architecture and Model Training Considerations

Focused on the bottom-up BAIM architecture, the paper outlines a two-tiered gating network (HierGate) that selects the top-performing edge models for each user task. In this setup, the cloud server applies fine-tuning or parameter-freezing strategies to the BAIM, supported by update procedures such as continual learning, pruning, and few-shot learning that absorb emerging knowledge from different nodes. This keeps the system adaptable to evolving tasks without degrading its core performance.
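The summary names HierGate but gives no equations. The sketch below illustrates the general idea of two-tiered top-k gating as used in mixture-of-experts systems; the matrices, softmax scoring, and function names are all illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def softmax(x):
    z = x - x.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def hier_gate(task_feat, cluster_gates, expert_gates, k=2):
    """Two-tiered gating sketch: first pick a cluster of edge models,
    then the top-k expert models within it, with renormalized weights.

    task_feat: (d,) feature vector describing the user task.
    cluster_gates: (n_clusters, d) first-tier gating matrix.
    expert_gates: list of (n_experts_c, d) matrices, one per cluster.
    """
    cluster_scores = softmax(cluster_gates @ task_feat)
    c = int(np.argmax(cluster_scores))            # tier 1: choose a cluster
    expert_scores = softmax(expert_gates[c] @ task_feat)
    top_k = np.argsort(expert_scores)[::-1][:k]   # tier 2: top-k experts
    w = expert_scores[top_k]
    return c, top_k, w / w.sum()                  # renormalize over selected experts

# Hypothetical task routed through two clusters of edge experts
task_feat = np.array([1.0, 0.0])
cluster_gates = np.array([[1.0, 0.0], [0.0, 1.0]])
expert_gates = [np.array([[2.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
                np.array([[1.0, 0.0]])]
c, top_k, w = hier_gate(task_feat, cluster_gates, expert_gates, k=2)
```

Routing each task to only the top-k edge models is what keeps per-request compute bounded even as the pool of edge models grows.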

Demonstrating the Framework with Image Generation

An empirical validation through an image generation case study is presented, employing variational autoencoders (VAEs) across multiple edge nodes. Notably, fine-tuning the central model in the cloud, followed by edge personalization, stands out in improving the quality of generated images. Quantitative measurements, such as the Fréchet Inception Distance (FID), confirm the fine-tuning strategy's superior effectiveness over alternative training approaches.
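FID compares the Gaussian statistics of Inception features extracted from real and generated images. A small sketch of the standard FID formula follows (this is the generic evaluation metric, not the paper's training code; in practice the means and covariances come from an Inception network's features):

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(mu1, sigma1, mu2, sigma2):
    """Fréchet Inception Distance between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * (sigma1 @ sigma2)^(1/2))."""
    covmean = sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny imaginary parts from numerics
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Lower is better: identical feature distributions give an FID of zero, so a drop in FID after cloud fine-tuning plus edge personalization indicates generated images statistically closer to the real data.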

Concluding Prospects

Although the introduced framework marks a promising step toward a distributed BAIM architecture for native GenAI provisioning, it surfaces several research challenges. Future directions include managing user data securely, designing more robust model fusion schemes, adapting to dynamic edge network changes, and devising resilient defenses against security threats. The paper argues that progress in both technology and operational practice can realize the full potential of edge-cloud collaboration in GenAI services.

Authors (8)
  1. Yuqing Tian
  2. Zhaoyang Zhang
  3. Yuzhi Yang
  4. Zirui Chen
  5. Zhaohui Yang
  6. Richeng Jin
  7. Tony Q. S. Quek
  8. Kai-Kit Wong
Citations (7)