Amortized nonmyopic active search via deep imitation learning (2405.15031v1)
Abstract: Active search formalizes a specialized active learning setting where the goal is to collect members of a rare, valuable class. The state-of-the-art algorithm approximates the optimal Bayesian policy in a budget-aware manner and has been shown to achieve impressive empirical performance in previous work. However, even this approximate policy has superlinear computational complexity with respect to the size of the search problem, making it impractical in large spaces or in real-time systems where decisions must be made quickly. We study the amortization of this policy by training a neural network to learn to search. To circumvent the difficulty of learning from scratch, we appeal to imitation learning techniques to mimic the behavior of the expert, expensive-to-compute policy. Our policy network, trained on synthetic data, learns a beneficial search strategy that yields nonmyopic decisions carefully balancing exploration and exploitation. Extensive experiments demonstrate that our policy achieves competitive performance on real-world tasks, closely approximating the expert's performance at a fraction of the cost while outperforming cheaper baselines.
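To make the idea of mimicking an expensive expert concrete, the following is a minimal sketch of behavior cloning, one simple form of imitation learning. It is not the paper's architecture or training procedure: the expert here is a hypothetical stand-in (a fixed linear scorer, `w_expert`), the learner is a linear softmax scorer rather than a neural network, and all names, dimensions, and hyperparameters are assumptions chosen for illustration.

```python
import numpy as np

def expert_choice(features, w_expert):
    """Hypothetical stand-in for an expensive expert policy:
    pick the candidate with the highest expert score."""
    return int(np.argmax(features @ w_expert))

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def train_by_imitation(states, expert_actions, dim, lr=0.5, epochs=200):
    """Fit a linear scorer so that softmax over candidates matches
    the expert's choice (cross-entropy loss, plain gradient descent)."""
    w = np.zeros(dim)
    for _ in range(epochs):
        grad = np.zeros(dim)
        for feats, a in zip(states, expert_actions):
            p = softmax(feats @ w)
            # Gradient of -log p[a] with respect to w.
            grad += feats.T @ p - feats[a]
        w -= lr * grad / len(states)
    return w

# Synthetic training data: each "state" is a set of candidates,
# each candidate described by a small feature vector (assumed setup).
rng = np.random.default_rng(0)
dim, n_candidates, n_states = 3, 10, 200
w_expert = np.array([1.0, -2.0, 0.5])
states = [rng.normal(size=(n_candidates, dim)) for _ in range(n_states)]
expert_actions = [expert_choice(s, w_expert) for s in states]

w_learned = train_by_imitation(states, expert_actions, dim)

# Fraction of training states where the cheap learner picks the expert's candidate.
agreement = np.mean(
    [int(np.argmax(s @ w_learned)) == a for s, a in zip(states, expert_actions)]
)
print(f"agreement with expert: {agreement:.2f}")
```

Once trained, the learner selects a candidate with a single matrix-vector product, which is the sense in which the expert's cost is amortized: the expensive computation is paid during training rather than at each decision.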