ROS 2: Next-Gen Robotics Middleware
- ROS 2 is a modular, distributed robotics middleware that enables real-time, secure, and scalable communication from embedded systems to industrial clusters.
- It integrates layered abstractions with DDS-based QoS policies for deterministic publish/subscribe semantics and dynamic node composition.
- The framework advances robotics with robust execution models, comprehensive security protocols, and optimized resource management for multi-robot and industrial applications.
The Robot Operating System 2 (ROS 2) is a modular, distributed middleware framework for robotics, developed as a ground-up redesign of ROS 1 to meet the performance, reliability, security, and interoperability requirements of modern, production-grade robotic systems. By coupling a client-library abstraction with the OMG Data Distribution Service (DDS) standard, ROS 2 supports robust peer-to-peer discovery and communication across a wide variety of platforms, from embedded MCUs to large-scale industrial clusters. The ecosystem includes composable nodes, real-time scheduling infrastructure, layered security, and pluggable middleware interfaces, and is deployed in domains such as industrial automation, multi-robot fleets, medical robotics, and high-speed autonomous vehicles.
1. Architectural Foundations and Abstraction Layers
ROS 2 implements a multi-tiered architectural stack that spans from user applications down to DDS-based network transport. The principal abstraction layers are:
- rclcpp/rclpy: client APIs for C++ and Python, providing node, publisher, subscriber, executor, and parameter APIs.
- rcl: the language-agnostic, C-based ROS Client Library, offering a uniform façade for middleware interactions.
- rmw: the ROS Middleware Interface, abstracting over multiple DDS vendor plugins.
- DDS/RTPS: Data Distribution Service (DDS) implementations such as Eclipse CycloneDDS, eProsima Fast DDS, and RTI Connext DDS, which perform real-time publish/subscribe messaging and discovery via the standardized RTPS wire protocol.
- Network interface: supports both multicast and unicast, with QoS-driven delivery.
A node in ROS 2 encapsulates publishers, subscribers, services, actions, timers, and parameters. Nodes can be instantiated in dedicated processes or assembled as composable components within a single container. Executors, provided in single-threaded, static, and multi-threaded variants, schedule callbacks attached to communication events, timers, or user-defined triggers. Callback groups provide concurrency control within an executor (Macenski et al., 2022; Casini et al., 22 Dec 2025).
2. Quality of Service, Pub/Sub Semantics, and Real-Time Communication
The core pub/sub semantics in ROS 2 are tightly bound to DDS QoS policies, enabling per-topic or per-node configuration of reliability (RELIABLE or BEST_EFFORT), durability (VOLATILE or TRANSIENT_LOCAL), history (KEEP_LAST, KEEP_ALL, with depth), deadline, lifespan, liveliness, and resource limits. These tunable QoS contracts enable deterministic delivery, deadline monitoring, and resource-adaptive communication in a range of scenarios, from lossy Wi-Fi networks to high-bandwidth industrial links (Macenski et al., 2022; Erős et al., 2019).
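The reliability and durability policies above follow DDS's request-versus-offered matching: a publisher and a subscriber connect only if the publisher offers at least what the subscriber requests. The following is a minimal sketch of that rule in plain Python. The `ToyQos` class and `is_compatible` function are hypothetical names made up for illustration and are not part of rclpy; only the two policies named in the text are modeled.

```python
from dataclasses import dataclass
from enum import IntEnum


class Reliability(IntEnum):
    # Ordered from weakest to strongest guarantee.
    BEST_EFFORT = 0
    RELIABLE = 1


class Durability(IntEnum):
    VOLATILE = 0
    TRANSIENT_LOCAL = 1


@dataclass(frozen=True)
class ToyQos:
    """Hypothetical stand-in for a QoS profile (illustration only)."""
    reliability: Reliability
    durability: Durability


def is_compatible(offered: ToyQos, requested: ToyQos) -> bool:
    """Return True if the publisher offers at least what the subscriber requests."""
    return (offered.reliability >= requested.reliability
            and offered.durability >= requested.durability)


# A best-effort sensor publisher cannot satisfy a subscriber that demands RELIABLE.
sensor_pub = ToyQos(Reliability.BEST_EFFORT, Durability.VOLATILE)
strict_sub = ToyQos(Reliability.RELIABLE, Durability.VOLATILE)
relaxed_sub = ToyQos(Reliability.BEST_EFFORT, Durability.VOLATILE)

print(is_compatible(sensor_pub, strict_sub))   # False
print(is_compatible(sensor_pub, relaxed_sub))  # True
```

In a real system the middleware performs this matching during discovery; incompatible endpoints simply do not exchange data, which is a common source of "silent" topics.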
The DDS Simple Discovery Protocol (SDP) removes the single point of failure that the master node represented in ROS 1, supporting dynamic, peer-to-peer discovery and multi-machine graphs without centralized coordination. This design is critical for scalable, robust operation in collaborative or geographically distributed automation, as demonstrated in industrial and multi-robot case studies (Erős et al., 2019; Testa et al., 2020).
Timing and throughput metrics have been rigorously benchmarked:
- Sub-10 ms 99%-ile latency for 50 Hz state topics in distributed industrial cells (Erős et al., 2019).
- 95%-ile latency λ₉₅(S, pattern) and effective throughput ρ_eff(S) functions published for varying message size and network load (Macenski et al., 2022; Macenski et al., 2023).
- Hardware-accelerated ROS 2 on FPGA achieves <2.5 μs latency, >62× throughput, and >500× energy efficiency improvement over software stacks (Mayoral-Vilches et al., 2024).
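Percentile figures such as the 95%-ile and 99%-ile latencies above are computed from a set of measured per-message latencies. The sketch below shows one common definition (nearest rank); the cited benchmarks may use a different estimator, and the sample values are invented purely for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up latency samples in milliseconds, with one outlier.
latencies_ms = [1.2, 1.4, 1.3, 9.5, 1.1, 1.6, 1.2, 1.5, 1.3, 1.4]

print(percentile(latencies_ms, 50))  # 1.3
print(percentile(latencies_ms, 95))  # 9.5
```

The example shows why tail percentiles are reported alongside means: a single slow message dominates the 95%-ile while barely moving the median.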
3. Node Composition, Modularity, and Multi-Process Models
Node composition in ROS 2 enables multiple nodes (components) to be instantiated and executed within a single OS process, facilitating intra-process zero-copy communication, lower per-node resource overhead, and fine-grained scheduling (Macenski et al., 2023; Scheunemann et al., 2019). Two principal models are supported:
- Manual composition: Components explicitly instantiated in application code; maximal control but fixed at compile time.
- Dynamic composition: Components loaded at run-time into generic containers via ROS 2's component API and launch infrastructure; supports hot-swapping and run-time topological reconfiguration.
Empirical results indicate:
- Multi-process memory grows O(N²) with N nodes (DDS participant cost), while composition caps growth around 11–13 MB for 20 nodes.
- Intra-process IPC composition reduces CPU usage by up to 10× and latency by ~70× for large (≥1 MB) messages.
- Component-based systems maintain full modularity and can be dynamically distributed for fault isolation or hardware mapping (Macenski et al., 2023).
Node composition is foundational for computational graphs in resource-constrained embedded robotics, high-rate sensor pipelines, and runtime-pluggable algorithm architectures (Scheunemann et al., 2019; Macenski et al., 2023).
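The latency and CPU savings reported for large messages come from avoiding serialization and copying when publisher and subscriber share a process. The toy model below illustrates only that difference. The function names are hypothetical, and `pickle` stands in for the middleware's real serialization, so this is a conceptual sketch rather than a description of how ROS 2 moves data.

```python
import pickle


def deliver_intra_process(message, callback):
    """Same process: hand the subscriber the very same object (no copy)."""
    callback(message)


def deliver_inter_process(message, callback):
    """Model a process boundary: serialize, then rebuild a copy on the other side."""
    wire_bytes = pickle.dumps(message)
    callback(pickle.loads(wire_bytes))


payload = bytearray(1024 * 1024)  # 1 MiB buffer, e.g. an image
received = []

deliver_intra_process(payload, received.append)
deliver_inter_process(payload, received.append)

print(received[0] is payload)  # True: the subscriber sees the original buffer
print(received[1] is payload)  # False: the subscriber got a reconstructed copy
```

The cost of the second path grows with message size, which is consistent with the benefit being most visible for messages of 1 MB and above.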
4. Security Infrastructure: DDS-Security, SROS2, and Intrusion Mitigation
ROS 2 integrates security at multiple layers through the DDS-Security 1.1 standard (Kim et al., 2018; Vilches et al., 2022). The primary mechanisms are:
- Mutual authentication: X.509 certificates and PKI for node identity, managed by SROS2 keystore and policy tooling.
- Confidentiality and integrity: All DDS RTPS data protected under AES-GCM with session keys derived via ephemeral ECDH.
- Access control: Fine-grained XML policies (permissions.xml, governance.xml) permitting topic-wise publication/subscription boundaries and explicit domain scoping.
- DevSecOps workflow: SROS2 supports graph introspection, policy generation, deployment, and monitoring, with automated artifact management.
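The access-control idea in the list above (per-node, per-topic publish and subscribe boundaries) can be sketched as a default-deny lookup. The policy dictionary, node names, and `is_allowed` function are hypothetical; the sketch does not reproduce the permissions.xml or governance.xml schemas and performs no cryptography.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory policy: node name -> allowed topic patterns per action.
POLICY = {
    "/camera_driver": {"publish": ["/camera/*"], "subscribe": []},
    "/planner": {"publish": ["/cmd_vel"], "subscribe": ["/camera/*", "/scan"]},
}


def is_allowed(node, action, topic, policy=POLICY):
    """Default-deny check: permit only if one of the node's patterns
    for this action matches the topic name."""
    rules = policy.get(node, {})
    return any(fnmatchcase(topic, pattern) for pattern in rules.get(action, []))


print(is_allowed("/planner", "subscribe", "/camera/image_raw"))  # True
print(is_allowed("/camera_driver", "publish", "/cmd_vel"))       # False
print(is_allowed("/unknown_node", "publish", "/cmd_vel"))        # False
```

Defaulting to deny means a node that is missing from the policy can neither publish nor subscribe, which mirrors the intent of explicit topic-wise boundaries.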
Latency and throughput penalties of DDS-Security are empirically low: ≈11% latency overhead (5.2 ms → 5.8 ms) on ARM SBCs, with negligible CPU cost (Vilches et al., 2022). Static analysis, formal protocol verification (ProVerif), and NIST/FIPS cryptographic audits are applied to ensure compliance (Kim et al., 2018; Soriano-Salvador et al., 2024).
Intrusion prevention systems such as RIPS for ROS 2 exploit graph/traffic introspection and custom DSLs for declarative access rules, supporting rapid (subsecond) mitigation of events such as unauthorized subscribers and malicious payload injection. Detection latency scales with the polling interval Δ (Soriano-Salvador et al., 2024).
Extensions like FogROS2-SGC introduce global, cryptographically-addressed ROS 2 overlays, supporting secure, efficient pub/sub across WANs with per-topic AES/DTLS, 256-bit identifiers, and seamless DDS interoperation. FogROS2-SGC exhibits 19× lower latency than rosbridge in cloud-robotics telemetry (Chen et al., 2023).
5. Real-Time Scheduling, Executors, and Deterministic Execution
ROS 2 exposes real-time properties through its executor models and DDS transport modes. Executors (single-threaded, static, multi-threaded, custom) control callback scheduling, with callback-group semantics for race and exclusivity control. The scheduling policy (default, fixed-priority, deadline-based, EDF) and the periodic or sporadic arrival of messages are central for analysis (Casini et al., 22 Dec 2025).
Key timing metrics:
- Response Time (RT): release-to-completion time for a callback.
- Maximum Reaction Time (MRT): longest delay from an external stimulus to the first callback that reacts to it.
- Maximum Data Age (MDA): longest latency from data origin to sink.
- End-to-end latency: delay accumulated along a chain of callbacks; its worst case is bounded by callback execution time, DDS delivery, and the executor polling window.
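The three contributions named for the worst-case end-to-end latency (callback execution time, DDS delivery, executor polling window) can be combined in a deliberately simple additive bound. The sketch below sums them per hop along a processing chain. This is a pessimistic illustration under assumed numbers, not the analysis used in the cited work; the `Hop` class and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One callback in a processing chain (all times in milliseconds)."""
    name: str
    wcet_ms: float            # worst-case execution time of the callback
    dds_delivery_ms: float    # worst-case transport delay into this hop
    polling_window_ms: float  # longest wait before the executor picks the message up


def end_to_end_bound_ms(chain):
    """Additive upper bound: every hop contributes its three worst-case terms."""
    return sum(h.wcet_ms + h.dds_delivery_ms + h.polling_window_ms for h in chain)


# Assumed example chain: sensor driver -> filter -> controller.
chain = [
    Hop("sensor_driver", wcet_ms=0.5, dds_delivery_ms=0.0, polling_window_ms=0.0),
    Hop("filter", wcet_ms=1.0, dds_delivery_ms=0.25, polling_window_ms=0.5),
    Hop("controller", wcet_ms=2.0, dds_delivery_ms=0.25, polling_window_ms=0.5),
]

print(end_to_end_bound_ms(chain))  # 5.0
```

Tighter analyses account for how callbacks interleave inside an executor, so real bounds can be lower than this sum; the sketch only shows which terms enter the calculation.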
Recent research has enhanced deterministic execution:
- Priority- and chain-aware custom executors (e.g., PiCAS, EDF, RTeX).
- LET-inspired double-server queueing to achieve near-zero jitter and bounded worst-case communication delays (Casini et al., 22 Dec 2025).
- Micro-ROS runners for static real-time scheduling on MCUs.
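The idea behind priority-aware executors can be illustrated with a toy single-threaded executor that always runs the most urgent ready callback first. This is a generic fixed-priority sketch in plain Python, not the PiCAS, EDF, or RTeX implementations and not an rclcpp/rclpy API; the class name and priorities are hypothetical.

```python
import heapq
import itertools


class ToyPriorityExecutor:
    """Runs ready callbacks in priority order (lower number = more urgent)."""

    def __init__(self):
        self._ready = []                 # heap of (priority, sequence, callback)
        self._order = itertools.count()  # tie-breaker: FIFO within one priority

    def add_ready(self, priority, callback):
        heapq.heappush(self._ready, (priority, next(self._order), callback))

    def spin_once(self):
        """Run the most urgent ready callback; return False if none is ready."""
        if not self._ready:
            return False
        _, _, callback = heapq.heappop(self._ready)
        callback()
        return True

    def spin_until_idle(self):
        while self.spin_once():
            pass


trace = []
executor = ToyPriorityExecutor()
executor.add_ready(5, lambda: trace.append("logging"))
executor.add_ready(1, lambda: trace.append("motor_control"))
executor.add_ready(3, lambda: trace.append("planning"))
executor.spin_until_idle()

print(trace)  # ['motor_control', 'planning', 'logging']
```

The callbacks run in priority order regardless of the order in which they became ready, which is the property that makes response times of critical chains easier to bound.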
On specialized hardware, FPGA-resident ROS 2 pipelines implement each stack layer in HDL, achieving single-digit μs latencies and isochronous, statically arbitrated message delivery. The same study compares round-trip mean latencies against a CPU baseline and reports an energy per message of 1.78 μJ versus 518 μJ on CPU (Mayoral-Vilches et al., 2024).
6. Application Ecosystems: Navigation, Multi-Robot, and Industrial Automation
ROS 2 is the backbone of contemporary robotic navigation (Nav2), distributed control, and multi-robot architectures:
- Navigation2 (Nav2): pluginized BT-based orchestrator, modernized global planners (Smac A*, Hybrid-A*, State Lattice), multiple local controllers (DWB, TEB, MPPI, RPP), path smoothers, costmap layers, and extensive benchmarking (Macenski et al., 2023; Bradford et al., 2023).
- Mobile robots: ROS 2 bridges disjoint frameworks and accelerates state-of-the-art, with <5 cm localization RMSE, sub-second path replanning, and high-frequency (4–10 kHz) feedback (Macenski et al., 2022).
- Industrial cells: Used in large-scale collaborative assembly, with stateless command/state topics, multi-machine hubs, sub-10 ms message latencies, zero packet loss, and transparent ROS 1–ROS 2 bridging (Erős et al., 2019).
- Multi-robot distributed control: ROS 2 enables graph-based optimization/consensus, distributed model-predictive control, dynamic task allocation (e.g., ChoiRbot), and automatic namespace partitioning for peer-to-peer discovery (no master node) (Testa et al., 2020).
- Heterogeneous swarms: Modular UAV architectures layer PX4 autopilots with ROS 2 components for control, vision, gimbal actuation; decentralized, QoS-strict communication enables scalable formation flight and coordinated missions (Pommeranz et al., 31 Oct 2025).
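The consensus-style coordination mentioned for multi-robot control can be sketched with discrete-time average consensus: each robot repeatedly nudges its value toward those of its neighbors until all agree on the mean. This is a generic textbook algorithm written in plain Python, not the ChoiRbot API. In a ROS 2 deployment each robot would exchange values over topics instead of sharing a list, and the graph, step size, and initial values here are assumptions.

```python
def consensus_step(values, neighbors, step=0.3):
    """One synchronous round: every robot moves toward its neighbors' values."""
    return [
        x + step * sum(values[j] - x for j in neighbors[i])
        for i, x in enumerate(values)
    ]


# Four robots on a ring; each talks only to its two neighbors.
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
values = [0.0, 4.0, 8.0, 12.0]  # e.g. each robot's local estimate

for _ in range(50):
    values = consensus_step(values, neighbors)

# All robots end up close to the average of the initial values (6.0).
print([round(v, 6) for v in values])
```

No robot ever sees the full set of values, yet all of them converge to the global average, which matches the peer-to-peer, master-free communication model described above.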
Community-driven and commercial stacks leverage ROS 2’s lifecycle nodes, real-time guarantees, security, and portability from microcontrollers to GPUs.
7. Limitations, Open Challenges, and Future Directions
Several research and engineering challenges persist:
- Coverage/completeness of DDS features on hardware-embedded ROS 2 pipelines—current FPGA implementations offer a DDS/RTPS subset lacking, e.g., full discovery and content-filtered topics (Mayoral-Vilches et al., 2024).
- DDS-Security/PKI complexity, life-cycle management, and ephemeral resource discovery remain active points of refinement (Vilches et al., 2022; Soriano-Salvador et al., 2024).
- Predictable real-time scheduling: executor models, callback prioritization, and OS integration are the focus of ongoing profiling, analysis, and mainline refactoring (Casini et al., 22 Dec 2025).
- Multi-robot, large-scale, cross-domain federation depends on tooling for homogeneous QoS policy, topic-namespace harmonization, and secure overlays (Chen et al., 2023).
- Three-dimensional mapping, visual-SLAM scalability, auto-tuning of navigation/planning algorithms, and full-feature 3D planning remain open roadmap items (Macenski et al., 2023).
- Usability barriers remain in security and infrastructure deployment—graphical policy editors, automated certificate rotation, and multi-machine security provisioning are under active development (Vilches et al., 2022).
A plausible implication is that the trajectory of ROS 2 is toward fully hardware-accelerated, dynamically composable, secure, and formally analyzable computation graphs, supporting dynamic orchestration and hard-real-time execution on heterogeneous robotic fleets. The ecosystem’s layered, open, and strictly modular structure continues to support rapid research, industrial adoption, and cross-domain innovation.