
PAAM: A Framework for Coordinated and Priority-Driven Accelerator Management in ROS 2 (2404.06452v1)

Published 9 Apr 2024 in cs.RO, cs.SY, and eess.SY

Abstract: This paper proposes a Priority-driven Accelerator Access Management (PAAM) framework for multi-process robotic applications built on top of the Robot Operating System (ROS) 2 middleware platform. The framework addresses the issue of predictable execution of time- and safety-critical callback chains that require hardware accelerators such as GPUs and TPUs. PAAM provides a standalone ROS executor that acts as an accelerator resource server, arbitrating accelerator access requests from all other callbacks at the application layer. This approach enables coordinated and priority-driven accelerator access management in multi-process robotic systems. The framework design is directly applicable to all types of accelerators and enables granular control over how specific chains access accelerators, making it possible to achieve predictable real-time support for accelerators used by safety-critical callback chains without making changes to underlying accelerator device drivers. The paper shows that PAAM also offers a theoretical analysis that can upper bound the worst-case response time of safety-critical callback chains that necessitate accelerator access. This paper also demonstrates that complex robotic systems with extensive accelerator usage that are integrated with PAAM may achieve up to a 91% reduction in end-to-end response time of their critical callback chains.
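
To make the idea of a server that arbitrates accelerator requests by priority more concrete, here is a minimal sketch. It is not the PAAM implementation and uses no ROS 2 API; the class `AcceleratorServer`, its methods, and the chain names and priority values are all hypothetical, and it assumes a single accelerator that serves one request at a time.

```python
import heapq
import itertools


class AcceleratorServer:
    """Hypothetical single-accelerator server (illustration only, not PAAM's code).

    Pending requests are served highest priority first; requests with equal
    priority are served in arrival order.
    """

    def __init__(self):
        self._heap = []                 # min-heap of pending requests
        self._seq = itertools.count()   # arrival counter, used as a FIFO tie-breaker

    def submit(self, chain, priority, work):
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._seq), chain, work))

    def run_pending(self):
        # Drain the queue, running each request's work to completion in turn.
        served = []
        while self._heap:
            _, _, chain, work = heapq.heappop(self._heap)
            served.append((chain, work()))
        return served


server = AcceleratorServer()
server.submit("logging", 1, lambda: "log-done")      # best-effort chain (assumed)
server.submit("braking", 10, lambda: "brake-done")   # safety-critical chain (assumed)
server.submit("mapping", 1, lambda: "map-done")      # best-effort chain (assumed)

order = [chain for chain, _ in server.run_pending()]
print(order)
```

In this sketch the safety-critical request is served before the two best-effort requests even though it arrived second, which is the application-layer arbitration behavior the abstract describes in general terms.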
