
Simplicity bias, algorithmic probability, and the random logistic map (2401.00593v2)

Published 31 Dec 2023 in cs.IT, math.DS, math.IT, nlin.CD, and stat.ML

Abstract: Simplicity bias is an intriguing phenomenon prevalent in various input-output maps, characterized by a preference for simpler, more regular, or symmetric outputs. Notably, these maps typically feature high-probability outputs with simple patterns, whereas complex patterns are exponentially less probable. This bias has been extensively examined and attributed to principles derived from algorithmic information theory and algorithmic probability. In a significant advancement, it has been demonstrated that the renowned logistic map and other one-dimensional maps exhibit simplicity bias when conceptualized as input-output systems. Building upon this work, our research delves into the manifestations of simplicity bias within the random logistic map, specifically focusing on scenarios involving additive noise. We discover that simplicity bias is observable in the random logistic map for specific ranges of $\mu$ and noise magnitudes. Additionally, we find that this bias persists even with the introduction of small measurement noise, though it diminishes as noise levels increase. Our studies also revisit the phenomenon of noise-induced chaos, particularly when $\mu=3.83$, revealing its characteristics through complexity-probability plots. Intriguingly, we employ the logistic map to illustrate a paradoxical aspect of data analysis: more data adhering to a consistent trend can occasionally lead to \emph{reduced} confidence in extrapolation predictions, challenging conventional wisdom. We propose that adopting a probability-complexity perspective in analyzing dynamical systems could significantly enrich statistical learning theories related to series prediction and analysis. This approach not only facilitates a deeper understanding of simplicity bias and its implications but also paves the way for novel methodologies in forecasting complex systems behavior.
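The experiment the abstract describes can be illustrated with a short, self-contained sketch. This is not the authors' code: it simulates the random logistic map x_{n+1} = mu*x_n*(1 - x_n) + xi_n with additive noise xi_n drawn uniformly from [-eps, eps], binarizes each trajectory at 0.5, and tallies how often each binary output pattern arises versus its Lempel-Ziv complexity. The noise level eps, trajectory length, burn-in, and the clipping rule that keeps the noisy orbit inside [0, 1] are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of a probability-complexity experiment for the random
# logistic map (illustrative parameters; not the authors' implementation).
import random
from collections import Counter

def lz76_complexity(s: str) -> int:
    """Number of Lempel-Ziv (1976) phrases, counted as in Kaspar-Schuster (1987)."""
    n = len(s)
    if n < 2:
        return n
    i, k, k_max, l, c = 0, 1, 1, 1, 1
    while True:
        if s[i + k - 1] == s[l + k - 1]:
            k += 1
            if l + k > n:          # matched through the end of the string
                return c + 1
        else:
            k_max = max(k, k_max)
            i += 1
            if i == l:             # no earlier match: close the current phrase
                c += 1
                l += k_max
                if l + 1 > n:
                    return c
                i, k, k_max = 0, 1, 1
            else:
                k = 1

def binarized_trajectory(mu, eps, n_steps, burn_in, rng):
    """One noisy orbit from a random initial condition, read off as a bit string."""
    x = rng.random()
    bits = []
    for t in range(burn_in + n_steps):
        x = mu * x * (1.0 - x) + rng.uniform(-eps, eps)
        x = min(max(x, 0.0), 1.0)  # clip so the perturbed orbit stays in [0, 1]
        if t >= burn_in:
            bits.append('1' if x > 0.5 else '0')
    return ''.join(bits)

rng = random.Random(0)
mu, eps, n_samples = 3.83, 1e-3, 20000   # period-3 window; weak additive noise
counts = Counter(binarized_trajectory(mu, eps, 25, 100, rng)
                 for _ in range(n_samples))

# Simplicity bias predicts P(x) <~ 2^(-a*K(x) - b) for constants a, b
# (cf. Dingle et al., Nature Communications 9, 761, 2018): the frequent
# outputs should be the low-complexity ones, with probability decaying
# roughly exponentially in complexity.
for pattern, n in counts.most_common(10):
    print(pattern, lz76_complexity(pattern), n / n_samples)
```

Plotting log-probability against lz76_complexity for all observed patterns gives the complexity-probability plots the abstract refers to; raising eps in this sketch should flatten the relationship, consistent with the abstract's observation that simplicity bias diminishes as noise levels increase.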
