
Amortized nonmyopic active search via deep imitation learning (2405.15031v1)

Published 23 May 2024 in cs.LG

Abstract: Active search formalizes a specialized active learning setting where the goal is to collect members of a rare, valuable class. The state-of-the-art algorithm approximates the optimal Bayesian policy in a budget-aware manner, and has been shown to achieve impressive empirical performance in previous work. However, even this approximate policy has superlinear computational complexity with respect to the size of the search problem, rendering its application impractical in large spaces or in real-time systems where decisions must be made quickly. We study the amortization of this policy by training a neural network to learn to search. To circumvent the difficulty of learning from scratch, we appeal to imitation learning techniques to mimic the behavior of the expert, expensive-to-compute policy. Our policy network, trained on synthetic data, learns a beneficial search strategy that yields nonmyopic decisions carefully balancing exploration and exploitation. Extensive experiments demonstrate that our policy achieves competitive performance on real-world tasks, closely approximating the expert's at a fraction of the cost while outperforming cheaper baselines.
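
The imitation-learning idea in the abstract can be illustrated with a toy behavior-cloning sketch. This is not the paper's method or architecture: the "expert" below is a hypothetical stand-in scoring rule, the learner is a two-weight linear softmax scorer rather than a neural network, and the candidate features (a posterior probability and a budget-scaled uncertainty) are assumptions made purely for illustration. It only shows the general pattern of recording an expensive policy's choices on synthetic states and fitting a cheap policy to reproduce them.

```python
import math
import random


def make_state(rng, n=10):
    """Synthetic search state: n candidates, each (probability, budget-scaled uncertainty).
    Both features are hypothetical and chosen only for illustration."""
    budget = rng.random()  # assumed fraction of the query budget remaining
    return [(rng.random(), rng.random() * budget) for _ in range(n)]


def expert_action(cands):
    """Stand-in for the expensive expert policy (NOT the paper's expert):
    pick the candidate with the largest budget-aware score."""
    return max(range(len(cands)), key=lambda i: cands[i][0] + 0.5 * cands[i][1])


def softmax(zs):
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    total = sum(es)
    return [e / total for e in es]


def train(states, epochs=200, lr=0.5):
    """Behavior cloning: maximize the log-likelihood of the expert's choice
    under a softmax over linear candidate scores w . x."""
    w = [0.0, 0.0]
    labels = [expert_action(s) for s in states]
    for _ in range(epochs):
        grad = [0.0, 0.0]
        for s, a in zip(states, labels):
            probs = softmax([w[0] * x[0] + w[1] * x[1] for x in s])
            for k in range(2):
                # Gradient of the log-likelihood: chosen features minus expected features.
                grad[k] += s[a][k] - sum(p * x[k] for p, x in zip(probs, s))
        w = [w[k] + lr * grad[k] / len(states) for k in range(2)]
    return w


def policy_action(w, cands):
    """Cheap learned policy: one linear score per candidate, then argmax."""
    return max(range(len(cands)), key=lambda i: w[0] * cands[i][0] + w[1] * cands[i][1])


def agreement(w, states):
    """Fraction of states where the learned policy matches the expert."""
    hits = sum(policy_action(w, s) == expert_action(s) for s in states)
    return hits / len(states)


if __name__ == "__main__":
    rng = random.Random(0)
    train_states = [make_state(rng) for _ in range(200)]
    test_states = [make_state(rng) for _ in range(200)]
    w = train(train_states)
    print("learned weights:", w)
    print("agreement with expert on held-out states:", agreement(w, test_states))
```

In this toy setting the learned policy needs only one dot product per candidate at decision time, which mirrors the amortization argument: the cost of the expert is paid once, during training, rather than at every query.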

