iMTSP: Solving Min-Max Multiple Traveling Salesman Problem with Imperative Learning (2405.00285v4)

Published 1 May 2024 in cs.AI, cs.LG, and cs.RO

Abstract: This paper considers a Min-Max Multiple Traveling Salesman Problem (MTSP), where the goal is to find a set of tours, one for each agent, to collectively visit all the cities while minimizing the length of the longest tour. Though MTSP has been widely studied, obtaining near-optimal solutions for large-scale problems is still challenging due to its NP-hardness. Recent efforts in data-driven methods face challenges of the need for hard-to-obtain supervision and issues with high variance in gradient estimations, leading to slow convergence and highly suboptimal solutions. We address these issues by reformulating MTSP as a bilevel optimization problem, using the concept of imperative learning (IL). This involves introducing an allocation network that decomposes the MTSP into multiple single-agent traveling salesman problems (TSPs). The longest tour from these TSP solutions is then used to self-supervise the allocation network, resulting in a new self-supervised, bilevel, end-to-end learning framework, which we refer to as imperative MTSP (iMTSP). Additionally, to tackle the high-variance gradient issues during the optimization, we introduce a control variate-based gradient estimation algorithm. Our experiments showed that these innovative designs enable our gradient estimator to converge 20% faster than the advanced reinforcement learning baseline and find up to 80% shorter tour length compared with Google OR-Tools MTSP solver, especially in large-scale problems (e.g. 1000 cities and 15 agents).
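The min-max objective described in the abstract can be illustrated with a small self-contained sketch: an allocation (standing in for the allocation network's output) assigns each city to an agent, each agent's subset is solved as a single-agent TSP, and the loss is the length of the longest tour. This is only a toy illustration under assumptions: a greedy nearest-neighbor heuristic replaces the paper's actual TSP solver, the allocation is random rather than learned, and all function names are hypothetical.

```python
import math
import random

def tour_length(points, order):
    """Total length of a closed tour visiting `points` in `order`."""
    total = 0.0
    for i in range(len(order)):
        ax, ay = points[order[i]]
        bx, by = points[order[(i + 1) % len(order)]]
        total += math.hypot(ax - bx, ay - by)
    return total

def nearest_neighbor_tsp(points, city_ids):
    """Greedy nearest-neighbor tour over a subset of cities
    (a stand-in for a real single-agent TSP solver)."""
    if not city_ids:
        return []
    unvisited = set(city_ids)
    current = city_ids[0]
    order = [current]
    unvisited.remove(current)
    while unvisited:
        nxt = min(unvisited, key=lambda c: math.hypot(
            points[c][0] - points[current][0],
            points[c][1] - points[current][1]))
        order.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    return order

def min_max_objective(points, allocation, num_agents):
    """Decompose the MTSP: each agent solves its own TSP over the
    cities allocated to it; the objective is the longest tour length."""
    lengths = []
    for agent in range(num_agents):
        cities = [c for c, a in enumerate(allocation) if a == agent]
        order = nearest_neighbor_tsp(points, cities)
        lengths.append(tour_length(points, order) if order else 0.0)
    return max(lengths)

random.seed(0)
points = [(random.random(), random.random()) for _ in range(30)]
num_agents = 3
# Stand-in for the allocation network: a random city-to-agent assignment.
allocation = [random.randrange(num_agents) for _ in range(len(points))]
print(min_max_objective(points, allocation, num_agents))
```

In the paper's framework this objective is what self-supervises the allocation network: the longest-tour length from the lower-level TSP solutions is fed back as the upper-level training signal.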
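The abstract also credits a control variate-based gradient estimator for taming high-variance gradients. The general control-variate idea can be shown on a toy Monte Carlo problem, which is not the paper's estimator: here we estimate E[exp(U)] for U ~ Uniform(0, 1), subtracting a scaled copy of U (whose mean 0.5 is known) to cancel noise; the coefficient 1.69 approximates the variance-optimal Cov/Var ratio for this toy.

```python
import math
import random
import statistics

def mc_estimate(n_samples, rng, beta=0.0):
    """Monte Carlo estimate of E[exp(U)], U ~ Uniform(0, 1).
    With beta != 0, U itself acts as a control variate: since
    E[U] = 0.5 is known, subtracting beta * (U - 0.5) leaves the
    estimate unbiased while cancelling correlated noise."""
    xs = [rng.random() for _ in range(n_samples)]
    vals = [math.exp(x) - beta * (x - 0.5) for x in xs]
    return sum(vals) / n_samples

rng = random.Random(0)
# Repeat each estimator many times to compare their spread.
naive = [mc_estimate(200, rng) for _ in range(300)]
# beta ~ Cov(exp(U), U) / Var(U) ~ 1.69 for this toy problem.
cv = [mc_estimate(200, rng, beta=1.69) for _ in range(300)]
print(statistics.stdev(naive), statistics.stdev(cv))
```

Both estimators target the same expectation (e - 1), but the control-variate version has a much smaller standard deviation; the paper applies the same principle to gradient estimates of the allocation network, where lower variance translates into faster convergence.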
