
Inverse Constraint Learning and Generalization by Transferable Reward Decomposition (2306.12357v2)

Published 21 Jun 2023 in cs.RO

Abstract: We present the problem of inverse constraint learning (ICL), which recovers constraints from demonstrations to autonomously reproduce constrained skills in new scenarios. However, ICL is inherently ill-posed, which can lead to inaccurate inference of constraints from demonstrations. To address this, we introduce a transferable constraint learning (TCL) algorithm that jointly infers a task-oriented reward and a task-agnostic constraint, enabling the generalization of learned skills. TCL additively decomposes the overall reward into a task reward and its residual, treated as a soft constraint, and maximizes the policy divergence between task- and constraint-oriented policies to obtain a transferable constraint. Evaluating our method and five baselines in three simulated environments, we show that TCL outperforms state-of-the-art IRL and ICL algorithms, achieving up to $72\%$ higher task-success rates with accurate decomposition compared to the next best approach in novel scenarios. Further, we demonstrate the robustness of TCL on two real-world robotic tasks.
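
The core idea can be sketched as follows (a schematic only; the notation and the specific divergence below are illustrative assumptions, not the paper's exact formulation). The overall reward recovered from demonstrations is split additively,

$$ r(s,a) = r_{\text{task}}(s,a) + r_{c}(s,a), $$

where $r_{\text{task}}$ captures the task objective and the residual $r_{c}$ acts as a soft constraint penalizing violating behavior. Among the decompositions consistent with the demonstrations, TCL selects the one that maximizes the divergence between the task-oriented policy $\pi_{\text{task}}$ (optimal under $r_{\text{task}}$) and the constraint-oriented policy $\pi_{c}$ (optimal under $r_{c}$), for example

$$ \max_{r_{\text{task}},\, r_{c}} \; D\!\left(\pi_{\text{task}} \,\Vert\, \pi_{c}\right) \quad \text{subject to} \quad r = r_{\text{task}} + r_{c}, $$

which pushes the constraint term to explain only the behavior that the task reward cannot, making $r_{c}$ task-agnostic and hence transferable to novel scenarios.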

