Effect of fixed training sequence lengths on generalization and state abstraction
Determine whether training Kinaema and related recurrent sequence models with a fixed sequence length causes them to conflate temporal state updates with abstraction layering, thereby hindering generalization to longer sequences, and identify the mechanisms responsible for this behavior.
References
We found that training with constant lengths hindered generalization to longer sequences. We conjecture that it leads to models confusing the notion of 'state' (in a control theory sense) with 'layer of abstraction', i.e., using recurrent updates not only to push representations forward in time, but also to make changes in abstraction levels as a neural network would do between layers.
— Kinaema: a recurrent sequence model for memory and pose in motion
(arXiv:2510.20261, Sariyildiz et al., 23 Oct 2025), Section 5, Ablation studies (Table 5: Kinaema ablations of different losses)
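To make the conjecture above concrete, the sketch below shows the standard remedy it suggests: randomizing the training sequence length per batch, so the number of recurrent state updates is decoupled from any fixed "depth" and the model is discouraged from treating recurrent steps as abstraction layers. This is a minimal illustration, not Kinaema's actual training code: a generic `nn.GRU` stands in for its recurrent core, `make_batch` is a hypothetical data helper, and the `MIN_LEN`/`MAX_LEN` range is an assumption not taken from the paper.

```python
import random
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.GRU(input_size=32, hidden_size=64, batch_first=True)
head = nn.Linear(64, 32)
opt = torch.optim.Adam(
    list(model.parameters()) + list(head.parameters()), lr=1e-3
)

def make_batch(batch_size: int, seq_len: int, dim: int = 32):
    """Hypothetical stand-in for a real loader: next-step prediction data."""
    x = torch.randn(batch_size, seq_len + 1, dim)
    return x[:, :-1], x[:, 1:]

MIN_LEN, MAX_LEN = 8, 128  # assumed range; the paper does not specify values

for step in range(1000):
    # Key point: sample a fresh length each batch instead of a constant,
    # so the recurrent state advances a variable number of times.
    seq_len = random.randint(MIN_LEN, MAX_LEN)
    inputs, targets = make_batch(batch_size=16, seq_len=seq_len)
    outputs, _ = model(inputs)  # state updated seq_len times
    loss = nn.functional.mse_loss(head(outputs), targets)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Under this view, a model trained only at one fixed length can "schedule" fixed-depth computation across its recurrent steps; varying the length at training time forces each step to perform a length-agnostic state update, which is what should generalize to longer test sequences.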