Gradient Shaping Beyond Clipping: A Functional Perspective on Update Magnitude Control (2510.01578v1)

Published 2 Oct 2025 in cs.LG

Abstract: Gradient clipping is widely used to stabilize deep network training, but its formulation as a hard, fixed threshold limits flexibility and ignores gradient distribution dynamics. We propose SPAMP (Statistical Per-layer Adaptive Modulation and Projection), a unified framework that generalizes clipping into smooth, per-layer gradient shaping. SPAMP tracks local gradient statistics, dynamically estimates thresholds, and applies power-based transformations to modulate update magnitudes in a differentiable manner. This perspective recasts clipping and warmup as dual mechanisms for controlling the effective update scale $\eta_t \|g_t\|$, offering a principled alternative to rigid heuristics. Extensive experiments across image and language tasks demonstrate that SPAMP improves stability, convergence, and robustness over existing methods.
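
To make the idea in the abstract concrete, here is a minimal pure-Python sketch of smooth, per-layer gradient shaping. It is not the paper's algorithm: the abstract does not give SPAMP's formulas, so the threshold rule (running mean plus `k` running standard deviations of the layer's gradient norm), the power-based scale `(1 + (norm / tau) ** p) ** (-1 / p)`, and the names `LayerShaper`, `smooth_scale`, `beta`, `k`, and `p` are all assumptions chosen for illustration. The one property the sketch is meant to show is that a power-based transform can behave like hard clipping for large `p` while staying smooth in the gradient norm.

```python
import math


def hard_clip_scale(norm, tau):
    """Standard norm clipping: multiply the gradient by min(1, tau / norm)."""
    return 1.0 if norm <= tau else tau / norm


def smooth_scale(norm, tau, p=4.0):
    """Assumed smooth surrogate for clipping (not taken from the paper).

    The shaped norm is norm * scale = norm / (1 + (norm / tau) ** p) ** (1 / p),
    which stays strictly below tau and approaches min(norm, tau) as p grows.
    At norm == tau the scale equals 2 ** (-1 / p), so typical gradients are
    damped slightly as well.
    """
    if norm == 0.0:
        return 1.0
    return (1.0 + (norm / tau) ** p) ** (-1.0 / p)


class LayerShaper:
    """Hypothetical per-layer helper: tracks running gradient-norm statistics
    and rescales each incoming gradient with smooth_scale."""

    def __init__(self, beta=0.9, k=2.0, p=4.0):
        self.beta = beta      # EMA decay for the running statistics (assumed)
        self.k = k            # std-dev multiplier in the threshold (assumed)
        self.p = p            # sharpness of the power transform (assumed)
        self.mean = None      # running mean of the gradient norm
        self.sq_mean = None   # running mean of the squared gradient norm

    def _update(self, norm):
        if self.mean is None:
            self.mean, self.sq_mean = norm, norm * norm
        else:
            b = self.beta
            self.mean = b * self.mean + (1.0 - b) * norm
            self.sq_mean = b * self.sq_mean + (1.0 - b) * norm * norm

    def shape(self, grad):
        """Return a rescaled copy of grad (a list of floats for one layer)."""
        norm = math.sqrt(sum(g * g for g in grad))
        if self.mean is None:
            # No statistics yet: record the first norm and pass it through.
            self._update(norm)
            return list(grad)
        # Threshold from statistics gathered before this step, so a spike
        # cannot raise its own threshold.
        var = max(self.sq_mean - self.mean ** 2, 0.0)
        tau = self.mean + self.k * math.sqrt(var)
        self._update(norm)
        if tau == 0.0:
            return list(grad)
        s = smooth_scale(norm, tau, self.p)
        return [s * g for g in grad]


if __name__ == "__main__":
    shaper = LayerShaper()
    for _ in range(5):
        shaper.shape([1.0, 0.0])        # steady gradients of norm 1
    spike = shaper.shape([100.0, 0.0])  # sudden 100x spike
    print(spike)                        # norm is pulled back to roughly tau
```

In this sketch each layer would own its own `LayerShaper`, so thresholds adapt to that layer's gradient history rather than to a single global constant. Because the scale factor only multiplies the gradient, the update direction is unchanged and only its magnitude is shaped.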