Decision-Oriented Learning Using Differentiable Submodular Maximization for Multi-Robot Coordination (2310.01519v2)
Abstract: We present a differentiable, decision-oriented learning framework for cost prediction in a class of multi-robot decision-making problems in which the robots must trade off task performance against the cost of the actions they select. Specifically, we consider cases where the task performance is measured by a known monotone submodular function (e.g., coverage, mutual information) and the cost of actions depends on the context (e.g., wind and terrain conditions). The goal is to learn a function that maps the context to the costs. Classically, this learning problem and the downstream decision-making problem are treated as two decoupled problems: the cost function is first learned without considering the downstream decision-making problem, and the learned function is then used to predict the costs that enter the decision-making problem. However, the loss function used to learn the prediction function may not be aligned with the downstream decision-making. We propose a decision-oriented learning framework that incorporates the downstream task performance in the prediction phase via a differentiable optimization layer. The main computational challenge in such a framework is to make the combinatorial optimization, i.e., non-monotone submodular maximization, differentiable, which it is not naturally. We propose the Differentiable Cost Scaled Greedy algorithm (D-CSG), a continuous and differentiable relaxation of the Cost Scaled Greedy (CSG) algorithm. We demonstrate the efficacy of the proposed framework through numerical simulations. The results show that the proposed framework can achieve better performance than the traditional two-stage approach.
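The contrast between a hard greedy selection and a continuous relaxation can be illustrated with a small sketch. This is not the authors' D-CSG implementation. It assumes a toy coverage objective, a cost scaling factor of 2, and a softmax with a temperature as the relaxation of the argmax; all function names, parameters, and data here are hypothetical.

```python
import math


def coverage(selected, covers):
    """Monotone submodular objective: number of distinct targets covered."""
    covered = set()
    for action in selected:
        covered |= covers[action]
    return len(covered)


def scaled_gains(selected, covers, costs, scale):
    """Marginal coverage gain minus scaled cost for each unselected action."""
    base = coverage(selected, covers)
    return {
        a: coverage(selected + [a], covers) - base - scale * costs[a]
        for a in range(len(covers))
        if a not in selected
    }


def cost_scaled_greedy(covers, costs, k, scale=2.0):
    """Hard greedy: repeatedly add the action with the largest positive
    cost-scaled gain, for at most k steps (scale=2.0 is an assumption)."""
    selected = []
    for _ in range(k):
        gains = scaled_gains(selected, covers, costs, scale)
        if not gains:
            break
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break  # no remaining action is worth its scaled cost
        selected.append(best)
    return selected


def soft_selection_step(selected, covers, costs, temperature=0.1, scale=2.0):
    """Relaxed step: a softmax over the scaled gains replaces the argmax,
    so the selection weights vary smoothly with the predicted costs."""
    gains = scaled_gains(selected, covers, costs, scale)
    top = max(gains.values())  # subtract the max for numerical stability
    weights = {a: math.exp((g - top) / temperature) for a, g in gains.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Hypothetical instance: three actions, five targets, context-dependent costs.
covers = [{0, 1, 2}, {2, 3}, {4}]
costs = [0.5, 0.4, 0.8]

hard = cost_scaled_greedy(covers, costs, k=3)
soft = soft_selection_step([], covers, costs)
print(hard)
print(soft)
```

In the hard version a small change in a predicted cost either leaves the selection unchanged or flips it abruptly, so it provides no useful gradient. The softmax weights change smoothly with the costs, which is what allows a downstream loss to be backpropagated to the cost predictor. Lowering `temperature` moves the relaxed step closer to the hard argmax.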