Observation-Augmented Contextual Multi-Armed Bandits for Robotic Search and Exploration (2312.12583v2)
Abstract: We introduce a new variant of contextual multi-armed bandits (CMABs) called observation-augmented CMABs (OA-CMABs), in which a robot uses extra outcome observations from an external information source, e.g., humans. In OA-CMABs, external observations are a function of context features and thus provide evidence, beyond the observed option outcomes, for inferring hidden parameters. However, if the external data are error-prone, measures must be taken to preserve the correctness of inference. To this end, we derive a robust Bayesian inference process for OA-CMABs based on recently developed probabilistic semantic data association techniques, which handle complex mixture model parameter priors and hybrid discrete-continuous observation likelihoods for semantic external data sources. To cope with the combined uncertainties in OA-CMABs, we also derive a new active inference algorithm for optimal option selection based on approximate expected free energy minimization. This generalizes prior work on CMAB active inference by accounting for faulty observations and non-Gaussian distributions. Results for a simulated deep space search site selection problem show that, even if incorrect semantic observations are provided externally, e.g., by scientists, efficient decision-making and robust parameter inference are still achieved in a wide variety of conditions.
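As a rough illustration of why a possibly faulty external observation calls for a hedged update, the sketch below applies a data-association-style mixture likelihood over a discrete grid of hidden parameter values. It is not the paper's inference process, which handles mixture priors and hybrid discrete-continuous likelihoods. The function name `robust_posterior`, the single fault probability `p_fault`, and the constant fault likelihood `lik_fault` are all assumptions made for this example.

```python
import numpy as np

def robust_posterior(prior, lik_valid, p_fault, lik_fault):
    """Posterior over a discrete parameter grid given one external
    observation that is faulty with probability p_fault (hypothetical model).

    prior     : prior probabilities over the grid (sums to 1)
    lik_valid : likelihood of the observation at each grid point,
                assuming the observation is valid
    p_fault   : assumed probability that the observation is faulty
    lik_fault : assumed likelihood of the observation if it is faulty
                (constant, i.e. uninformative about the parameter)
    """
    prior = np.asarray(prior, dtype=float)
    lik_valid = np.asarray(lik_valid, dtype=float)
    # Mixture likelihood: marginalize over the valid/faulty association.
    mixed = (1.0 - p_fault) * lik_valid + p_fault * lik_fault
    post = prior * mixed
    return post / post.sum()

# Illustrative numbers only: three candidate parameter values, uniform prior.
prior = np.full(3, 1.0 / 3.0)
lik_valid = np.array([0.9, 0.5, 0.1])

trusting = robust_posterior(prior, lik_valid, p_fault=0.0, lik_fault=0.5)
hedged = robust_posterior(prior, lik_valid, p_fault=0.5, lik_fault=0.5)
ignored = robust_posterior(prior, lik_valid, p_fault=1.0, lik_fault=0.5)
```

With `p_fault=0` the update reduces to the ordinary Bayes rule, with `p_fault=1` the observation is ignored and the prior is returned, and intermediate values move the posterior only part of the way toward what the observation suggests.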