LOTUS: Improving Transformer Efficiency with Sparsity Pruning and Data Lottery Tickets (2405.00906v1)

Published 1 May 2024 in cs.CV, cs.AI, and cs.LG

Abstract: Vision transformers have revolutionized computer vision, but their computational demands present challenges for training and deployment. This paper introduces LOTUS (LOttery Transformers with Ultra Sparsity), a novel method that leverages data lottery ticket selection and sparsity pruning to accelerate vision transformer training while maintaining accuracy. Our approach focuses on identifying and utilizing the most informative data subsets and eliminating redundant model parameters to optimize the training process. Through extensive experiments, we demonstrate the effectiveness of LOTUS in achieving rapid convergence and high accuracy with significantly reduced computational requirements. This work highlights the potential of combining data selection and sparsity techniques for efficient vision transformer training, opening doors for further research and development in this area.
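
The abstract describes two levers: selecting a compact, informative subset of the training data (a data lottery ticket) and pruning redundant weights. The sketch below is a minimal illustration of that combination, not the authors' implementation. It scores examples by per-example loss as a stand-in "informativeness" criterion, applies unstructured L1 magnitude pruning to linear layers via PyTorch's built-in pruning utility, and briefly fine-tunes a small placeholder classifier (standing in for a vision transformer) on the reduced subset. All function names, the scoring rule, and the keep/prune ratios are illustrative assumptions.

```python
# Hypothetical sketch, not the LOTUS code: pair data-subset selection with
# magnitude pruning on a tiny stand-in model. Names and ratios are assumptions.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune


def select_data_lottery_ticket(model, inputs, labels, keep_ratio=0.5):
    """Rank examples by per-example loss and keep the highest-loss fraction
    (a crude proxy for 'most informative'; the paper's criterion may differ)."""
    model.eval()
    with torch.no_grad():
        losses = nn.functional.cross_entropy(model(inputs), labels, reduction="none")
    k = max(1, int(keep_ratio * len(losses)))
    keep_idx = torch.topk(losses, k).indices
    return inputs[keep_idx], labels[keep_idx]


def apply_sparsity_pruning(model, amount=0.8):
    """Zero out the smallest-magnitude weights of every Linear layer
    (unstructured L1 pruning via torch.nn.utils.prune)."""
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=amount)
    return model


if __name__ == "__main__":
    # Small MLP classifier standing in for a vision transformer backbone.
    model = nn.Sequential(
        nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU(), nn.Linear(128, 10)
    )
    images = torch.randn(64, 3, 32, 32)
    labels = torch.randint(0, 10, (64,))

    # Keep the 25% of examples scored as most informative, then sparsify.
    sub_images, sub_labels = select_data_lottery_ticket(model, images, labels, keep_ratio=0.25)
    model = apply_sparsity_pruning(model, amount=0.8)

    # Train briefly on the reduced subset with the sparsified model.
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
    model.train()
    for _ in range(5):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(sub_images), sub_labels)
        loss.backward()
        opt.step()
    print(f"final loss on subset: {loss.item():.3f}")
```

Loss-based scoring is only one possible proxy for informativeness; the selection criterion and pruning schedule actually used by LOTUS should be taken from the paper itself.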

