
Statistically Efficient Bayesian Sequential Experiment Design via Reinforcement Learning with Cross-Entropy Estimators (2305.18435v2)

Published 29 May 2023 in cs.LG and stat.ME

Abstract: Reinforcement learning can learn amortised design policies for designing sequences of experiments. However, current amortised methods rely on estimators of expected information gain (EIG) that require a number of samples exponential in the magnitude of the EIG to achieve an unbiased estimate. We propose an alternative estimator based on the cross-entropy between the joint model distribution and a flexible proposal distribution, where the proposal approximates the true posterior over the model parameters given the experimental history and the design policy. Our method overcomes the exponential sample complexity of previous approaches and provides more accurate estimates of high EIG values. More importantly, it enables the learning of superior design policies, and it is compatible with continuous and discrete design spaces, non-differentiable likelihoods, and even implicit probabilistic models.
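
To make the idea concrete, below is a minimal sketch (not the paper's implementation) of a cross-entropy EIG estimate on a toy conjugate-Gaussian experiment, where a linear-Gaussian proposal is exact and the bound can be checked against the closed-form EIG. The model, the fixed design value, and all names are illustrative assumptions; the paper instead fits a flexible (e.g. flow-based) proposal conditioned on the experimental history produced by the design policy.

```python
# Minimal sketch (not the paper's code): estimating EIG with a cross-entropy
# lower bound on a toy conjugate-Gaussian experiment. The model, the design
# value, and the names below are illustrative assumptions.
import torch

torch.manual_seed(0)

sigma = 0.5    # observation noise std
design = 2.0   # a fixed design; the paper optimises designs via an RL policy

def simulate(n):
    """Sample (theta, y) from the joint: theta ~ N(0,1), y ~ N(design*theta, sigma^2)."""
    theta = torch.randn(n)
    y = design * theta + sigma * torch.randn(n)
    return theta, y

# Fit a proposal q(theta | y) on joint samples. A linear-Gaussian proposal is
# exact for this toy model; in general a richer conditional density model
# (e.g. a normalizing flow over the history) would be trained instead.
theta, y = simulate(20_000)
a = (theta * y).mean() / (y * y).mean()   # least-squares slope for E[theta | y]
q_std = (theta - a * y).std()             # residual std = proposal std

# EIG lower bound = prior entropy minus the cross-entropy term:
#   EIG >= H[p(theta)] + E_{p(theta, y)}[ log q(theta | y) ]
two_pi_e = 2 * torch.pi * torch.e
prior_entropy = 0.5 * torch.log(torch.tensor(two_pi_e))  # prior variance = 1
cross_entropy = 0.5 * torch.log(two_pi_e * q_std**2)     # -E[log q(theta | y)]
eig_ce = (prior_entropy - cross_entropy).item()

# Closed-form EIG for this conjugate model, for comparison.
eig_exact = 0.5 * torch.log(torch.tensor(1 + design**2 / sigma**2)).item()
print(f"cross-entropy bound: {eig_ce:.3f} nats   exact EIG: {eig_exact:.3f} nats")
```

Because the proposal family contains the true posterior here, the bound matches the exact value 0.5 * log(1 + design^2 / sigma^2) ≈ 1.417 nats; with a misspecified proposal it would underestimate the EIG. Unlike contrastive estimators such as PCE, however, the estimate is not capped at the log of the number of contrastive samples, which is what makes high EIG values reachable.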

