Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 70 tok/s
Gemini 2.5 Pro 45 tok/s Pro
GPT-5 Medium 34 tok/s Pro
GPT-5 High 37 tok/s Pro
GPT-4o 102 tok/s Pro
Kimi K2 212 tok/s Pro
GPT OSS 120B 466 tok/s Pro
Claude Sonnet 4 39 tok/s Pro
2000 character limit reached

Stochastic Layer-Wise Shuffle for Improving Vision Mamba Training (2408.17081v2)

Published 30 Aug 2024 in cs.CV

Abstract: Recent Vision Mamba (Vim) models exhibit nearly linear complexity in sequence length, making them highly attractive for processing visual data. However, the training methodologies and their potential are still not sufficiently explored. In this paper, we investigate strategies for Vim and propose Stochastic Layer-Wise Shuffle (SLWS), a novel regularization method that can effectively improve the Vim training. Without architectural modifications, this approach enables the non-hierarchical Vim to get leading performance on ImageNet-1K compared with the similar type counterparts. Our method operates through four simple steps per layer: probability allocation to assign layer-dependent shuffle rates, operation sampling via Bernoulli trials, sequence shuffling of input tokens, and order restoration of outputs. SLWS distinguishes itself through three principles: \textit{(1) Plug-and-play:} No architectural modifications are needed, and it is deactivated during inference. \textit{(2) Simple but effective:} The four-step process introduces only random permutations and negligible overhead. \textit{(3) Intuitive design:} Shuffling probabilities grow linearly with layer depth, aligning with the hierarchical semantic abstraction in vision models. Our work underscores the importance of tailored training strategies for Vim models and provides a helpful way to explore their scalability.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-Up Questions

We haven't generated follow-up questions for this paper yet.