MEGA-DAgger: Imitation Learning with Multiple Imperfect Experts (2303.00638v3)

Published 1 Mar 2023 in cs.LG and cs.RO

Abstract: Imitation learning has been widely applied to autonomous systems thanks to recent developments in interactive algorithms that address the covariate shift and compounding errors induced by traditional approaches such as behavior cloning. However, existing interactive imitation learning methods assume access to a single perfect expert, whereas in practice it is more likely that multiple imperfect experts are available. In this paper, we propose MEGA-DAgger, a new DAgger variant suited to interactive learning with multiple imperfect experts. First, unsafe demonstrations are filtered out while aggregating the training data, so that imperfect demonstrations have little influence on the training of the novice policy. Next, experts are evaluated and compared on scenario-specific metrics to resolve conflicting labels among experts. Through experiments in autonomous racing scenarios, we demonstrate that a policy learned using MEGA-DAgger can outperform both the experts and policies learned using state-of-the-art interactive imitation learning algorithms such as Human-Gated DAgger. The supplementary video can be found at https://youtu.be/wPCht31MHrw.
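The two steps named in the abstract (filtering unsafe demonstrations during aggregation, then resolving conflicting labels by scoring experts) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the safety test or the metrics, so `is_safe`, `score`, the `Expert` type, and the `aggregate_labels` helper are all hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[float, ...]
Action = float
Sample = Tuple[State, Action]


@dataclass
class Expert:
    """Hypothetical wrapper around one (possibly imperfect) expert policy."""
    name: str
    policy: Callable[[State], Action]


def aggregate_labels(
    dataset: List[Sample],
    state: State,
    experts: List[Expert],
    is_safe: Callable[[State, Action], bool],   # assumed safety check
    score: Callable[[State, Action], float],    # assumed scenario metric
) -> List[Sample]:
    """Return a new dataset with at most one extra label for `state`.

    Step 1: drop labels that the (assumed) safety check rejects.
    Step 2: among the remaining labels, keep the highest-scoring one,
            which resolves disagreement between experts.
    """
    candidates = []
    for expert in experts:
        action = expert.policy(state)
        if is_safe(state, action):
            candidates.append((score(state, action), action))

    if not candidates:
        # Every expert label was filtered out; aggregate nothing.
        return list(dataset)

    _, best_action = max(candidates, key=lambda c: c[0])
    return list(dataset) + [(state, best_action)]


# Toy usage: steering labels, with |action| <= 1.0 treated as safe and
# smaller steering magnitude treated as better (both are assumptions).
toy_experts = [
    Expert("aggressive", lambda s: 2.0),
    Expert("moderate", lambda s: 0.5),
    Expert("smooth", lambda s: 0.1),
]
toy_data = aggregate_labels(
    [],
    (0.0, 0.0),
    toy_experts,
    is_safe=lambda s, a: abs(a) <= 1.0,
    score=lambda s, a: -abs(a),
)
```

In a full DAgger-style loop, a call like this would run on each state visited by the novice policy before retraining on the aggregated dataset; the sketch covers only the labeling step.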

Authors (5)
  1. Xiatao Sun (9 papers)
  2. Shuo Yang (244 papers)
  3. Mingyan Zhou (4 papers)
  4. Kunpeng Liu (54 papers)
  5. Rahul Mangharam (44 papers)
