An Overview on Generative AI at Scale with Edge-Cloud Computing (2306.17170v2)

Published 2 Jun 2023 in cs.DC, cs.AI, cs.SY, and eess.SY

Abstract: As a specific category of AI, generative artificial intelligence (GenAI) generates new content that resembles what is created by humans. The rapid development of GenAI systems has created a huge amount of new data on the Internet, posing new challenges to current computing and communication frameworks. Currently, GenAI services rely on the traditional cloud computing framework due to the need for large computation resources. However, such services will encounter high latency because of data transmission and a high volume of requests. On the other hand, edge-cloud computing can provide adequate computation power and low latency at the same time through the collaboration between edges and the cloud. Thus, it is attractive to build GenAI systems at scale by leveraging the edge-cloud computing paradigm. In this overview paper, we review recent developments in GenAI and edge-cloud computing, respectively. Then, we use two exemplary GenAI applications to discuss technical challenges in scaling up their solutions using edge-cloud collaborative systems. Finally, we list design considerations for training and deploying GenAI systems at scale and point out future research directions.
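The abstract's central claim is that edge-cloud collaboration can balance computation power against transmission latency for GenAI requests. A minimal sketch of that trade-off, assuming a simple latency model (upload time plus compute time) with entirely hypothetical device and network parameters not taken from the paper:

```python
# Illustrative edge-vs-cloud placement decision for a GenAI request.
# The latency model and all default numbers below are hypothetical
# assumptions for illustration, not values from the paper.

def total_latency(payload_bytes, bandwidth_bps, work_flops, device_flops):
    """Latency in seconds = upload time + compute time."""
    return payload_bytes / bandwidth_bps + work_flops / device_flops

def choose_placement(payload_bytes, work_flops,
                     edge_flops=1e12, cloud_flops=1e14,   # edge is weaker
                     edge_bw=1e8, cloud_bw=1e7):          # edge link is faster
    """Pick whichever tier finishes the request sooner."""
    edge = total_latency(payload_bytes, edge_bw, work_flops, edge_flops)
    cloud = total_latency(payload_bytes, cloud_bw, work_flops, cloud_flops)
    return ("edge", edge) if edge <= cloud else ("cloud", cloud)

# Light requests stay at the edge; heavy generation offloads to the cloud.
print(choose_placement(1e6, 1e9))   # small model inference
print(choose_placement(1e6, 1e13))  # large generative workload
```

Under these assumptions the crossover point shifts with link bandwidth and device throughput, which is exactly the scheduling question the edge-cloud paradigm raises at scale.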

Authors (4)
  1. Yun-Cheng Wang
  2. Jintang Xue
  3. Chengwei Wei
  4. C.-C. Jay Kuo
Citations (14)