
MPCGPU: Real-Time Nonlinear Model Predictive Control through Preconditioned Conjugate Gradient on the GPU (2309.08079v3)

Published 15 Sep 2023 in cs.RO and cs.DC

Abstract: Nonlinear Model Predictive Control (NMPC) is a state-of-the-art approach for locomotion and manipulation that runs trajectory optimization at each control step. Although the performance of this approach is bounded by available computation, implementations of direct trajectory optimization that use iterative methods to solve the underlying moderately large, sparse linear systems are a natural fit for parallel hardware acceleration. In this work, we introduce MPCGPU, a GPU-accelerated, real-time NMPC solver built around an accelerated preconditioned conjugate gradient (PCG) linear system solver. We show that MPCGPU improves the scalability and real-time performance of NMPC, solving larger problems at faster rates. In particular, for tracking tasks using the Kuka IIWA manipulator, MPCGPU scales to kilohertz control rates with trajectories as long as 512 knot points. This is driven by a custom PCG solver that outperforms state-of-the-art CPU-based linear system solvers by at least 10x for a majority of solves and 3.6x on average.
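
To make the abstract's central idea concrete, here is a minimal sketch of a preconditioned conjugate gradient iteration for a symmetric positive definite system. It is not the MPCGPU solver: it runs sequentially on the CPU in plain Python, and it assumes a simple Jacobi (diagonal) preconditioner as a stand-in for whatever preconditioner the paper uses. The function name `pcg`, its parameters, and the small tridiagonal test matrix are all hypothetical choices made for illustration.

```python
def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Uses conjugate gradient with a Jacobi (diagonal) preconditioner,
    an assumed stand-in rather than the paper's preconditioner.
    Returns the solution and the number of iterations performed.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner

    x = [0.0] * n
    r = list(b)  # residual b - A x for the initial guess x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]  # preconditioned residual
    p = list(z)  # first search direction
    rz = dot(r, z)

    for k in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, k
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz  # keeps the new direction conjugate to the old
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Hypothetical test problem: a small tridiagonal SPD matrix.
n = 6
A = [[4.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
      for j in range(n)] for i in range(n)]
b = [sum(row) for row in A]  # chosen so the exact solution is all ones
x, iters = pcg(A, b)
```

Each iteration consists of one matrix-vector product, a few dot products, and elementwise vector updates, which is the kind of work that maps naturally onto parallel hardware when the matrix is sparse and structured.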

Authors: Emre Adabag, Miloni Atal, William Gerard, Brian Plancher
Citations (9)
