Optimally Teaching a Linear Behavior Cloning Agent (2311.15399v1)
Abstract: We study optimal teaching of Linear Behavior Cloning (LBC) learners. In this setup, the teacher selects which states to demonstrate to an LBC learner. The learner maintains a version space of infinitely many linear hypotheses consistent with the demonstrations. The goal of the teacher is to teach a realizable target policy to the learner using the minimum number of state demonstrations. This number is known as the Teaching Dimension (TD). We present a teaching algorithm called "Teach using Iterative Elimination (TIE)" that achieves the instance-optimal TD. However, we also show that finding an optimal teaching set is NP-hard. We further provide an approximation algorithm that guarantees an approximation ratio of $\log(|A|-1)$ on the teaching dimension. Finally, we provide experimental results to validate the efficiency and effectiveness of our algorithm.
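To make the notion of a version space concrete, the following is a minimal sketch of a consistency check for a linear learner. It is an illustration under stated assumptions, not the paper's algorithm: it assumes the learner scores each state-action pair with a linear function `w · phi(state, action)` and acts greedily, so a hypothesis `w` stays in the version space only if it strictly prefers the demonstrated action in every demonstrated state. The feature map `phi`, the helper `is_consistent`, and the toy states and actions are all hypothetical names introduced here.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

Vector = Sequence[float]
Demo = Tuple[Hashable, Hashable]  # (state, demonstrated action)


def dot(u: Vector, v: Vector) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def is_consistent(
    w: Vector,
    demos: List[Demo],
    actions: Sequence[Hashable],
    phi: Callable[[Hashable, Hashable], Vector],
) -> bool:
    """Return True if w strictly prefers the demonstrated action in every demo.

    Assumption: the learner picks argmax over actions of w . phi(state, action),
    so consistency means a strictly positive score gap against each other action.
    """
    for state, demo_action in demos:
        demo_score = dot(w, phi(state, demo_action))
        for other in actions:
            if other == demo_action:
                continue
            if demo_score - dot(w, phi(state, other)) <= 0.0:
                return False
    return True


# Toy instance (hypothetical): two states, two actions, 2-d features.
actions = ["left", "right"]


def phi(state: int, action: str) -> Vector:
    # "left" is the zero vector; "right" depends on the state.
    return (float(state), 1.0) if action == "right" else (0.0, 0.0)


demos = [(0, "left"), (1, "right")]

print(is_consistent((1.0, -0.5), demos, actions, phi))
print(is_consistent((1.0, 0.5), demos, actions, phi))
```

In this toy instance the first weight vector agrees with both demonstrations, while the second prefers "right" in state 0 and is therefore eliminated. Each demonstrated state contributes one linear constraint per non-demonstrated action, which is one way to see why the choice of states to demonstrate determines how quickly the version space shrinks.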