Papers

Topics

Authors

Recent

View all

Detailed Answer

Quick Answer

Concise responses based on abstracts only

Detailed Answer

Well-researched responses based on abstracts and relevant paper content.

Custom Instructions Pro

Preferences or requirements that you'd like Emergent Mind to consider when generating responses

Gemini 2.5 Flash

Gemini 2.5 Flash 100 tok/s

Gemini 2.5 Pro 51 tok/s Pro

GPT-5 Medium 26 tok/s Pro

GPT-5 High 33 tok/s Pro

GPT-4o 103 tok/s Pro

Kimi K2 200 tok/s Pro

GPT OSS 120B 447 tok/s Pro

Claude Sonnet 4 36 tok/s Pro

2000 character limit reached

GenAI at the Edge: Comprehensive Survey on Empowering Edge Devices (2502.15816v1)

Published 19 Feb 2025 in cs.DC

Abstract: Generative Artificial Intelligence (GenAI) applies models and algorithms such as LLM and Foundation Model (FM) to generate new data. GenAI, as a promising approach, enables advanced capabilities in various applications, including text generation and image processing. In current practice, GenAI algorithms run mainly on the cloud server, leading to high latency and raising security concerns. Consequently, these challenges encourage the deployment of GenAI algorithms directly on edge devices. However, the large size of such models and their significant computational resource requirements pose obstacles when deploying them in resource-constrained systems. This survey provides a comprehensive overview of recent proposed techniques that optimize GenAI for efficient deployment on resource-constrained edge devices. For this aim, this work highlights three main categories for bringing GenAI to the edge: software optimization, hardware optimization, and frameworks. The main takeaways for readers of this survey will be a clear roadmap to design, implement, and refine GenAI systems for real-world implementation on edge devices.