
Distilling Privileged Information for Dubins Traveling Salesman Problems with Neighborhoods (2404.16721v1)

Published 25 Apr 2024 in cs.AI and cs.LG

Abstract: This paper presents a novel learning approach for the Dubins Traveling Salesman Problem (DTSP) with Neighborhoods (DTSPN), quickly producing a tour for a non-holonomic vehicle that passes through the neighborhoods of given task points. The method involves two learning phases: first, a model-free reinforcement learning phase leverages privileged information to distill knowledge from expert trajectories generated by the Lin-Kernighan heuristic (LKH) algorithm; second, a supervised learning phase trains an adaptation network to solve problems without access to privileged information. Before the first phase, a parameter-initialization technique based on the demonstration data is also devised to improve training efficiency. The proposed method produces solutions about 50 times faster than LKH and substantially outperforms other imitation learning and RL-with-demonstration schemes, most of which fail to sense all the task points.
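The two-phase recipe outlined in the abstract follows the general privileged-information distillation pattern (a teacher that sees privileged state, then a student-side module that works from deployable observations). Below is a minimal PyTorch sketch of that pattern under stated assumptions, not the paper's actual architecture: all class and function names (TeacherPolicy, AdaptationNet, pretrain_from_demonstrations, train_adaptation) are illustrative, the expert data stands in for LKH trajectories, and behavior cloning stands in here for the paper's combination of parameter initialization plus model-free RL.

```python
import torch
import torch.nn as nn

class TeacherPolicy(nn.Module):
    """Phase-1 policy: conditions on privileged information (illustrative)."""
    def __init__(self, obs_dim, priv_dim, act_dim, hidden=128):
        super().__init__()
        self.priv_encoder = nn.Sequential(nn.Linear(priv_dim, hidden), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(obs_dim + hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs, priv):
        z = self.priv_encoder(priv)                      # privileged latent
        action = self.head(torch.cat([obs, z], dim=-1))
        return action, z


class AdaptationNet(nn.Module):
    """Phase-2 network: predicts the privileged latent from plain observations."""
    def __init__(self, obs_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden)
        )

    def forward(self, obs):
        return self.net(obs)


def pretrain_from_demonstrations(policy, expert_batches, lr=1e-3):
    """Parameter initialization: behavior cloning on expert (e.g., LKH) tours.
    expert_batches yields (obs, priv, expert_action) tensors -- hypothetical data format."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for obs, priv, expert_action in expert_batches:
        pred_action, _ = policy(obs, priv)
        loss = nn.functional.mse_loss(pred_action, expert_action)
        opt.zero_grad()
        loss.backward()
        opt.step()


def train_adaptation(policy, adapter, batches, lr=1e-3):
    """Supervised phase: regress the frozen teacher's privileged latent."""
    opt = torch.optim.Adam(adapter.parameters(), lr=lr)
    for obs, priv, _ in batches:
        with torch.no_grad():
            _, z_target = policy(obs, priv)              # teacher is frozen here
        loss = nn.functional.mse_loss(adapter(obs), z_target)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

At deployment, the adaptation network's predicted latent would replace the privileged encoder's output, so the policy runs without privileged inputs, consistent with the abstract's second phase.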

