Regeneration of the simplicity-based density order under severe data restrictions

Determine why the observed simple-to-complex density ranking is regenerated across different neural network architectures and density estimators even when training data are severely restricted, including regimes such as training on only the lowest-density 10% of the dataset (LDT10) or on a single lowest-density sample (LDT1).

Background

The authors show that the simplicity-based density ranking persists when models are retrained using only the lowest-density 10% of the training data (LDT10) and, in the extreme case, when trained on a single lowest-density image (LDT1). For iGPT and Glow, the induced rankings remain highly correlated with those from full-data training, indicating that the ranking is not merely inherited from data frequency.

This regeneration across architectures and estimators, and even under drastic data scarcity, suggests a deeper inductive bias or organizing principle in how networks structure data, which the authors identify as an unresolved problem.
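The comparison described above, checking whether a restricted-data model induces the same density ranking as a full-data model, amounts to computing a rank correlation between the two models' per-sample log-densities. A minimal sketch is below; the log-density values, the helper names (`rank`, `spearman`), and the LDT10 labeling are all illustrative assumptions, not the paper's actual data or code.

```python
# Hypothetical sketch: compare density rankings from a full-data model and a
# model retrained on only the lowest-density 10% of the data (LDT10).
# The numbers are made up for illustration; they are not from the paper.

def rank(values):
    """Return ranks (0 = smallest) for a list of values; ties broken by index."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation via the sum-of-squared-rank-differences
    formula; assumes no ties among the values."""
    n = len(xs)
    rx, ry = rank(xs), rank(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Illustrative log-densities assigned to the same test images by each model.
full_data_logp = [-3.1, -2.4, -5.0, -1.2, -4.2]
ldt10_logp     = [-2.9, -2.5, -4.8, -1.0, -4.5]

rho = spearman(full_data_logp, ldt10_logp)
print(f"Spearman rho = {rho:.2f}")  # → Spearman rho = 1.00 (identical ordering)
```

A rho near 1 for models trained on drastically different data, as reported for iGPT and Glow, is what makes the regeneration of the ranking surprising: it cannot be explained by the models simply memorizing the frequency structure of their (very different) training sets.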

References

"Explaining why that order is regenerated across architectures, estimators, and even severely restricted training sets remains an open problem."

Deep Networks Favor Simple Data  (2604.00394 - Lu et al., 1 Apr 2026) in Conclusion