The Causal Chambers: Real Physical Systems as a Testbed for AI Methodology
Abstract: In some fields of AI, machine learning and statistics, the validation of new methods and algorithms is often hindered by the scarcity of suitable real-world datasets. Researchers must often turn to simulated data, which yields limited information about the applicability of the proposed methods to real problems. As a step forward, we have constructed two devices that allow us to quickly and inexpensively produce large datasets from non-trivial but well-understood physical systems. The devices, which we call causal chambers, are computer-controlled laboratories that allow us to manipulate and measure an array of variables from these physical systems, providing a rich testbed for algorithms from a variety of fields. We illustrate potential applications through a series of case studies in fields such as causal discovery, out-of-distribution generalization, change point detection, independent component analysis, and symbolic regression. For applications to causal inference, the chambers allow us to carefully perform interventions. We also provide and empirically validate a causal model of each chamber, which can be used as ground truth for different tasks. All hardware and software is made open source, and the datasets are publicly available at causalchamber.org or through the Python package causalchamber.
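As a rough illustration of how interventional data can be checked against a ground-truth causal model, the sketch below simulates a toy system, not data from the chambers, and compares observational and interventional samples with a two-sample Kolmogorov-Smirnov statistic. The variables X, Y, Z, the linear mechanism, and the helper names `simulate` and `ks_statistic` are assumptions made for this example and are not part of the causalchamber package.

```python
import random


def simulate(n, intervene_x=None, seed=0):
    """Draw n samples from a toy model (hypothetical, for illustration only).

    X -> Y is the only causal edge; Z is independent of both.
    If intervene_x is given, X is fixed to that value (a do-intervention).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0) if intervene_x is None else intervene_x
        y = 2.0 * x + rng.gauss(0.0, 1.0)  # Y is a child of X
        z = rng.gauss(0.0, 1.0)            # Z is not affected by X
        samples.append((x, y, z))
    return samples


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the two empirical CDFs.
    """
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Advance both pointers past every value <= v, so ties are handled.
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Observational data versus data collected under do(X = 3).
obs = simulate(2000, seed=1)
itv = simulate(2000, intervene_x=3.0, seed=2)

d_y = ks_statistic([s[1] for s in obs], [s[1] for s in itv])
d_z = ks_statistic([s[2] for s in obs], [s[2] for s in itv])
print(f"KS statistic for Y: {d_y:.3f}, for Z: {d_z:.3f}")
```

Under these assumptions, the statistic for Y (a descendant of X) should be large, while the statistic for Z (a non-descendant) should stay close to zero. In practice a library routine such as `scipy.stats.ks_2samp` would also return a p-value for each comparison.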