The Role of Higher-Order Cognitive Models in Active Learning (2401.04397v1)

Published 9 Jan 2024 in cs.LG and cs.RO

Abstract: Building machines capable of efficiently collaborating with humans has been a longstanding goal in artificial intelligence. Especially in the presence of uncertainties, optimal cooperation often requires that humans and artificial agents model each other's behavior and use these models to infer underlying goals, beliefs, or intentions, potentially involving multiple levels of recursion. Empirical evidence for such higher-order cognition in human behavior is also provided by previous work in cognitive science, linguistics, and robotics. We advocate for a new paradigm for active learning from human feedback that utilises humans as active data sources while accounting for their higher levels of agency. In particular, we discuss how increasing levels of agency result in qualitatively different forms of rational communication between an active learning system and a teacher. Additionally, we provide a practical example of active learning using a higher-order cognitive model, accompanied by a computational study that underscores the unique behaviors this model produces.
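
To make the recursion levels concrete, below is a minimal sketch (not the paper's implementation) of a level-k teacher/learner loop on a toy goal-inference task: a level-0 teacher answers queries literally, a level-1 learner inverts that teacher with Bayes' rule, a level-1 teacher answers pedagogically by modeling that learner, and a level-2 learner inverts the pedagogic teacher. The task, function names, and the rationality parameter BETA are all illustrative assumptions.

```python
import numpy as np

# A minimal sketch of recursive (level-k) teacher/learner reasoning in an
# active-learning loop. The toy task, names, and BETA are illustrative
# assumptions, not taken from the paper.

GOALS = np.arange(3)      # hidden goal theta in {0, 1, 2}
QUERIES = np.arange(3)    # query q asks: "is the goal equal to q?"
ANSWERS = (True, False)
BETA = 3.0                # teacher rationality (inverse temperature)

def level0_teacher(q, theta, prior=None):
    """Literal teacher: noisy-rational preference for the truthful answer."""
    p_yes = 1.0 / (1.0 + np.exp(-BETA * (1.0 if theta == q else -1.0)))
    return {True: p_yes, False: 1.0 - p_yes}

def level1_learner(prior, q, answer):
    """Learner that inverts the literal teacher with Bayes' rule."""
    post = np.array([prior[t] * level0_teacher(q, t)[answer] for t in GOALS])
    return post / post.sum()

def level1_teacher(q, theta, prior):
    """Pedagogic teacher: favors the answer that moves a level-1 learner's
    belief toward the true goal (softmax over resulting posteriors)."""
    score = {a: level1_learner(prior, q, a)[theta] for a in ANSWERS}
    z = sum(np.exp(BETA * s) for s in score.values())
    return {a: np.exp(BETA * score[a]) / z for a in ANSWERS}

def level2_learner(prior, q, answer):
    """Learner that inverts the pedagogic teacher: one more recursion level."""
    post = np.array([prior[t] * level1_teacher(q, t, prior)[answer]
                     for t in GOALS])
    return post / post.sum()

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def best_query(prior, teacher, learner):
    """Active learning: ask the query with maximal expected information gain
    under the learner's current model of the teacher."""
    best_q, best_gain = None, -np.inf
    for q in QUERIES:
        gain = entropy(prior)
        for a in ANSWERS:
            p_a = sum(prior[t] * teacher(q, t, prior)[a] for t in GOALS)
            gain -= p_a * entropy(learner(prior, q, a))
        if gain > best_gain:
            best_q, best_gain = q, gain
    return best_q

true_theta = 2
belief = np.ones(3) / 3
for step in range(3):
    q = best_query(belief, level1_teacher, level2_learner)
    ans_dist = level1_teacher(q, true_theta, belief)
    ans = max(ans_dist, key=ans_dist.get)   # teacher's modal answer
    belief = level2_learner(belief, q, ans)
    print(f"step {step}: query={q}, answer={ans}, belief={np.round(belief, 3)}")
```

Note how the pedagogic teacher's modal answer to "is the goal 0?" is an informative "no" rather than a merely literal response, and that a learner inverting the wrong teacher level would miscalibrate its posterior; this is the kind of qualitative difference between agency levels the abstract alludes to.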

Authors (3)
  1. Oskar Keurulainen (3 papers)
  2. Gokhan Alcan (15 papers)
  3. Ville Kyrki (102 papers)