Role of trainable decay rates in EMA-based memory

Investigate whether learning element-wise decay rates in an Exponential Moving Average (EMA) memory lets the model separate fast and slow memory features, and whether that separation accounts for the observed performance gains on Mem‑RPE.

Background

The authors evaluate EMA memory with fixed versus trainable decay factors and find that trainable decay improves Mem‑RPE performance.

They hypothesize that element-wise trainable decay allows the model to represent multiple timescales via fast and slow features, explaining the improvement.
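The fast/slow intuition can be illustrated with a minimal sketch. It assumes the common EMA form m_t = λ ⊙ m_{t-1} + (1 − λ) ⊙ x_t with one decay value per memory feature; the paper's exact parameterization may differ. The function name `ema_memory`, the sigmoid parameterization, and the logit values are hypothetical and stand in for parameters that training would otherwise learn.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ema_memory(x, lam):
    """Run m_t = lam * m_{t-1} + (1 - lam) * x_t over a sequence.

    x:   array of shape (T, D), one input vector per time step.
    lam: scalar or array of shape (D,), decay per memory feature in (0, 1).
    Returns the memory state after every step, shape (T, D).
    """
    x = np.asarray(x, dtype=float)
    lam = np.broadcast_to(np.asarray(lam, dtype=float), x.shape[1:])
    m = np.zeros(x.shape[1])
    out = np.empty_like(x)
    for t in range(x.shape[0]):
        m = lam * m + (1.0 - lam) * x[t]
        out[t] = m
    return out

# Assumed stand-ins for learned parameters: one logit per feature,
# squashed into (0, 1) so each decay stays a valid EMA coefficient.
logits = np.array([-2.0, 4.0])
lam = sigmoid(logits)  # feature 0 decays quickly, feature 1 slowly

# A unit impulse at t = 0, followed by zeros.
T, D = 50, 2
x = np.zeros((T, D))
x[0] = 1.0
mem = ema_memory(x, lam)

# After an impulse, feature d holds lam[d]**t * (1 - lam[d]) at step t.
print("decay rates:", lam)
print("memory at t=0: ", mem[0])
print("memory at t=49:", mem[-1])
```

Under these assumptions the low-decay feature reacts strongly to the impulse and forgets it within a few steps, while the high-decay feature reacts weakly but still retains a trace at the end of the sequence. A single shared constant λ would force every feature onto the same timescale.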

References

Making $\lambda$ trainable has a positive impact, which can be explained quite easily: we conjecture that it allows the model to choose whether certain memory features are 'fast' or 'slow'.

Kinaema: a recurrent sequence model for memory and pose in motion (2510.20261 - Sariyildiz et al., 23 Oct 2025) in Appendix E.1, EMA w. constant λ vs. trainable λ (Table 8)