
A Survey of Resource-efficient LLM and Multimodal Foundation Models (2401.08092v2)

Published 16 Jan 2024 in cs.LG, cs.AI, and cs.DC

Abstract: Large foundation models, including LLMs, vision transformers (ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine learning lifecycle, from training to deployment. However, the substantial advancements in versatility and performance these models offer come at a significant cost in terms of hardware resources. To support the growth of these large models in a scalable and environmentally sustainable way, there has been a considerable focus on developing resource-efficient strategies. This survey delves into the critical importance of such research, examining both algorithmic and systemic aspects. It offers a comprehensive analysis and valuable insights gleaned from existing literature, encompassing a broad array of topics from cutting-edge model architectures and training/serving algorithms to practical system designs and implementations. The goal of this survey is to provide an overarching understanding of how current approaches are tackling the resource challenges posed by large foundation models and to potentially inspire future breakthroughs in this field.

Overview of Resource-Efficient Models

LLMs and multimodal foundation models have proven transformative across many domains of machine learning, delivering exceptional performance on tasks ranging from natural language processing to computer vision. Their versatility, however, comes with significant resource requirements, motivating research into resource-efficient strategies.

Algorithmic and Systemic Analysis

The survey examines research on resource efficiency for LLMs from both algorithmic and systemic perspectives. On the algorithmic side, it reviews model architectures and training/serving algorithms; on the systemic side, it covers practical system designs and implementations within computing infrastructure. Analyses are detailed for different types of models, including text, image, and multimodal variants.

The Architecture of Foundation Models

Language foundation models, for instance, have seen numerous architectural improvements, whether through optimized attention mechanisms or dynamic neural networks. These changes aim to improve processing efficiency without compromising the models' ability to learn from data. Similar advancements are observed for vision foundation models, where the emphasis is on efficient transformer pipelines and encoder-decoder structures.
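
As an illustration of one common direction in attention optimization (a generic sliding-window pattern, not a specific method from the survey), the sketch below restricts each query to a local window of past keys in PyTorch. The function name, tensor shapes, and window size are hypothetical, and this version still materializes the full score matrix, so it conveys the sparsity pattern rather than the memory savings of specialized kernels.

```python
import torch
import torch.nn.functional as F

def sliding_window_attention(q, k, v, window: int):
    """Illustrative sliding-window attention: each query attends only to
    itself and the previous `window - 1` keys, cutting the effective cost
    of full self-attention from O(n^2) toward O(n * window)."""
    n, d = q.shape[-2], q.shape[-1]
    scores = q @ k.transpose(-2, -1) / d ** 0.5     # (..., n, n) score matrix
    # Mask future positions and keys outside the local window.
    idx = torch.arange(n)
    dist = idx.unsqueeze(1) - idx.unsqueeze(0)      # query_pos - key_pos
    mask = (dist < 0) | (dist >= window)
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

# Example usage with random tensors: (batch, heads, seq_len, head_dim).
q = k = v = torch.randn(1, 8, 128, 64)
out = sliding_window_attention(q, k, v, window=32)
print(out.shape)   # torch.Size([1, 8, 128, 64])
```
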

Training and Serving Considerations

Lastly, the survey considers the entire life cycle of large foundation models, from training to serving. Strategies for distributed training, model compression, and knowledge distillation are discussed, highlighting the challenges of scaling up these models and potential solutions to mitigate resource demands. Serving systems for foundation models, which facilitate their practical usage, are also assessed for their efficiency in handling various deployment scenarios, including cloud and edge computing environments.
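
As a concrete illustration of knowledge distillation in its generic form (not a particular recipe covered by the survey), the following PyTorch sketch blends a temperature-softened KL term against a frozen teacher with the standard cross-entropy on ground-truth labels; the function name, temperature, and weighting coefficient are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    """Weighted sum of (a) KL divergence between the student's and teacher's
    temperature-softened distributions and (b) cross-entropy on hard labels."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The temperature^2 factor keeps gradient magnitudes comparable to CE.
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce

# Example usage with random logits for a 10-class problem.
student = torch.randn(4, 10)
teacher = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
print(distillation_loss(student, teacher, labels))
```
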

In conclusion, current research efforts are consistently pushing the boundaries of resource-efficiency in foundation models. As the field continues to evolve, future breakthroughs are expected to further enhance the effectiveness of these models while reducing their impact on computational resources.

Authors (18)
  1. Mengwei Xu (62 papers)
  2. Wangsong Yin (4 papers)
  3. Dongqi Cai (19 papers)
  4. Rongjie Yi (7 papers)
  5. Daliang Xu (9 papers)
  6. Qipeng Wang (15 papers)
  7. Bingyang Wu (7 papers)
  8. Yihao Zhao (10 papers)
  9. Chen Yang (193 papers)
  10. Shihe Wang (10 papers)
  11. Qiyang Zhang (16 papers)
  12. Zhenyan Lu (8 papers)
  13. Li Zhang (690 papers)
  14. Shangguang Wang (58 papers)
  15. Yuanchun Li (37 papers)
  16. Yunxin Liu (58 papers)
  17. Xin Jin (285 papers)
  18. Xuanzhe Liu (59 papers)
Citations (51)