
Observation-Augmented Contextual Multi-Armed Bandits for Robotic Search and Exploration (2312.12583v2)

Published 19 Dec 2023 in cs.RO and cs.LG

Abstract: We introduce a new variant of contextual multi-armed bandits (CMABs) called observation-augmented CMABs (OA-CMABs) wherein a robot uses extra outcome observations from an external information source, e.g. humans. In OA-CMABs, external observations are a function of context features and thus provide evidence on top of observed option outcomes to infer hidden parameters. However, if external data is error-prone, measures must be taken to preserve the correctness of inference. To this end, we derive a robust Bayesian inference process for OA-CMABs based on recently developed probabilistic semantic data association techniques, which handle complex mixture model parameter priors and hybrid discrete-continuous observation likelihoods for semantic external data sources. To cope with combined uncertainties in OA-CMABs, we also derive a new active inference algorithm for optimal option selection based on approximate expected free energy minimization. This generalizes prior work on CMAB active inference by accounting for faulty observations and non-Gaussian distributions. Results for a simulated deep space search site selection problem show that, even if incorrect semantic observations are provided externally, e.g. by scientists, efficient decision-making and robust parameter inference are still achieved in a wide variety of conditions.
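As an illustration of the idea of inferring hidden parameters from error-prone external observations, the following is a minimal sketch under strong simplifying assumptions and is not the paper's algorithm. It assumes binary outcomes with a Beta belief per option (the paper works with mixture priors and hybrid discrete-continuous likelihoods), a hypothetical fault model in which a faulty external report is a fair coin flip, and Thompson sampling in place of expected free energy minimization for option selection. The names `robust_update`, `thompson_select`, and `p_fault` are illustrative. Each report is treated under two association hypotheses (valid or faulty), and the resulting two-component posterior mixture is collapsed back to a single Beta by moment matching.

```python
import random


def beta_moments(a, b):
    """Return the mean and second moment of a Beta(a, b) distribution."""
    mean = a / (a + b)
    second = a * (a + 1) / ((a + b) * (a + b + 1))
    return mean, second


def robust_update(a, b, y, p_fault):
    """Update a Beta(a, b) belief with a binary external report y (0 or 1).

    Assumption (hypothetical fault model): with probability p_fault the
    report is a fair coin flip that carries no information about the
    hidden success probability.
    """
    # Evidence for the two data association hypotheses.
    lik_valid = a / (a + b) if y == 1 else b / (a + b)
    ev_valid = (1.0 - p_fault) * lik_valid
    ev_fault = p_fault * 0.5
    w = ev_valid / (ev_valid + ev_fault)  # posterior probability the report is valid

    # The exact posterior is a mixture: updated Beta (valid) and prior (faulty).
    m1, s1 = beta_moments(a + y, b + 1 - y)
    m0, s0 = beta_moments(a, b)
    mean = w * m1 + (1.0 - w) * m0
    second = w * s1 + (1.0 - w) * s0
    var = second - mean ** 2

    # Collapse the mixture to one Beta with the same mean and variance.
    strength = mean * (1.0 - mean) / var - 1.0
    return mean * strength, (1.0 - mean) * strength


def thompson_select(beliefs, rng):
    """Pick the option whose sampled success probability is largest."""
    samples = [rng.betavariate(a, b) for a, b in beliefs]
    return max(range(len(beliefs)), key=samples.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    beliefs = [(1.0, 1.0), (1.0, 1.0), (1.0, 1.0)]  # one Beta belief per search site
    # An external report claims success at site 2 but is wrong 30% of the time.
    beliefs[2] = robust_update(*beliefs[2], y=1, p_fault=0.3)
    print(beliefs[2], thompson_select(beliefs, rng))
```

With `p_fault = 0` the update reduces to the standard conjugate Beta update, and with `p_fault = 1` the report is ignored and the prior is returned unchanged. Intermediate values discount the report according to how plausible it is under the current belief.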
