Large-Scale Mobile Robots (LSMR)

Updated 9 January 2026
  • LSMRs are autonomous mobile robot systems engineered for expansive operations, combining advanced perception, mapping, and navigation in dynamic and unstructured settings.
  • They integrate heterogeneous sensor arrays, modular computing architectures, and robust software pipelines to support semantic mapping, multi-agent exploration, and real-time SLAM.
  • Applications span industrial assembly, environmental monitoring, and subterranean exploration, highlighting the systems' scalability and resilient performance under uncertainty.

A Large-Scale Mobile Robot (LSMR) is an autonomous or semi-autonomous mobile robotic platform engineered for perception, mapping, manipulation, and navigation tasks in spatial domains that significantly exceed the operational and computational scope of standard mobile robots. LSMRs are characterized by the ability to operate over extended areas (from several thousand to tens of thousands of square meters), often in highly dynamic, unstructured, or multi-agent environments, and frequently incorporate heterogeneous sensor suites, adaptive planning frameworks, and scalable software architectures to support tasks such as semantic mapping, coordinated exploration, industrial assembly, and robust operation under uncertainty.

1. System Architectures and Representative Deployments

LSMR system architectures exhibit modularity at both hardware and software levels to meet scalability and robustness demands:

  • Robotic Platforms: Examples include differential-drive indoor robots equipped with pan-tilt sensor heads (e.g., Fetch robot with Hokuyo/Velodyne 2D LiDAR and RGB-D sensors (Allu et al., 2024)) for semantic mapping; legged-and-wheeled robots for mobile manipulation (e.g., Centauro, combining 4 DoF legs with steerable wheels and 15 DoF anthropomorphic arms (Klamt et al., 2019)); or high-speed outdoor mapping platforms using UGVs loaded onto autonomous vehicles for kilometer-scale data acquisition (Lin et al., 2024).
  • Compute and Sensing: These platforms typically use on-board CPUs for low-latency tasks, offloading high-throughput perception (open-vocabulary detection, real-time segmentation) to GPUs (e.g., NVIDIA RTX 4090), often integrated with precise RTK-GPS/INS and multi-modal arrays (LiDAR, event cameras, stereo vision) (Lin et al., 2024, Rose et al., 17 Jul 2025).
  • Software Pipelines: LSMR software stacks leverage middleware such as ROS, integrating modules for SLAM, semantic perception (e.g., GroundingDINO, MobileSAM), navigation (move_base, model-predictive control), robust data synchronization, and high-frequency safety monitoring (Allu et al., 2024, Shahna et al., 2 Jan 2026).

These architectures are validated in environments ranging from 800 m² indoor spaces (university building corridors (Allu et al., 2024)) and dynamic public areas with dense pedestrian flows (THUD++/THUD datasets (Li et al., 2024, Tang et al., 2024)) to large subterranean facilities (multi-kilometer mines (Chang et al., 2022)) and outdoor agricultural testbeds (Rose et al., 17 Jul 2025).

2. Mapping, Perception, and Semantic Representation

Large-scale operation mandates robust mapping and perception, bridging geometry and semantics:

  • Occupancy Grid Mapping: LSMRs model world geometry using occupancy grids, with each cell $m_i$ treated as a binary random variable (occupied/free), updated via recursive Bayes filtering in log-odds form and projected back to the occupancy probability $p_t(m_i = 1)$ (Allu et al., 2024); a minimal update sketch follows this list. For larger environments, multiresolution (voxel/point cloud) representations and adaptive voxel filtering control memory/compute load (Chang et al., 2022).
  • Semantic Mapping: LSMRs integrate object-instance graphs atop geometric maps; detected objects become nodes $v_i$ in a topological graph $G_T$, annotated by category, $\mathbb{R}^3$ position (map-frame centroid), and detection confidence. Semantics are dynamically updated as the environment changes, with objects added/removed based on spatial proximity and detector confidence (Allu et al., 2024).
  • Data Association: For each object detection, spatial association is performed by minimum centroid distance within the same category; a detection exceeding a threshold $\delta_\text{category}$ adds a new node, otherwise the matched node's parameters are updated (Allu et al., 2024). This rule is also included in the sketch below.
  • Dynamic Mapping: Seamless tracking of non-static objects and occlusion resilience are supported by repeated traversals with field-of-view-based deletion or update of semantic nodes.
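
To make the two update rules above concrete, here is a minimal Python sketch of the log-odds cell update and the category-gated centroid association. The log-odds increments, node structure, and per-category gates are illustrative assumptions, not parameters from the cited systems.

```python
import numpy as np

# Illustrative log-odds increments for the inverse sensor model (assumed values).
L_OCC, L_FREE, L_PRIOR = 0.85, -0.4, 0.0

def update_cell(l_prev, hit):
    """One recursive Bayes step in log-odds form: l_t = l_{t-1} + l_meas - l_prior."""
    return l_prev + (L_OCC if hit else L_FREE) - L_PRIOR

def occupancy_probability(l):
    """Project log-odds back to p_t(m_i = 1)."""
    return 1.0 - 1.0 / (1.0 + np.exp(l))

# Per-category association gates delta_category in metres (assumed values).
DELTA = {"chair": 0.5, "table": 1.0}

def associate(nodes, category, centroid, confidence):
    """Nearest-centroid association within a category; new node past the gate."""
    centroid = np.asarray(centroid, dtype=float)
    same_cat = [n for n in nodes if n["category"] == category]
    if same_cat:
        nearest = min(same_cat,
                      key=lambda n: np.linalg.norm(n["centroid"] - centroid))
        if np.linalg.norm(nearest["centroid"] - centroid) <= DELTA[category]:
            nearest["centroid"] = centroid        # refresh node position
            nearest["confidence"] = max(nearest["confidence"], confidence)
            return nearest
    node = {"category": category, "centroid": centroid, "confidence": confidence}
    nodes.append(node)                            # new graph node v_i
    return node

# Example: a cell observed occupied three times, then two detections of one chair.
l = 0.0
for _ in range(3):
    l = update_cell(l, hit=True)
print(round(occupancy_probability(l), 2))         # -> 0.93
nodes = []
associate(nodes, "chair", (1.0, 2.0, 0.0), 0.9)   # creates the first node
associate(nodes, "chair", (1.2, 2.1, 0.0), 0.95)  # within 0.5 m gate: updates it
print(len(nodes))                                 # -> 1
```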

The hybrid grid–graph map is both memory- and compute-efficient for large, cluttered environments, enabling rapid semantic updates by modifying graph nodes rather than intensive voxel operations (Allu et al., 2024).

3. Navigation, Exploration, and Control

Autonomous maneuvering and decision-making over large and/or dynamic workspaces use advanced strategies:

  • Exploration: Dynamic-window frontier exploration identifies candidate frontiers within local/global search radii. Exploration terminates by threshold or time-out, with subsequent environment traversal planned via waypoint sampling and greedy TSP (a greedy-ordering sketch follows this list), ensuring O(N) planning rather than NP-hard combinatorial complexity (Allu et al., 2024).
  • Multi-Agent Exploration: Formulations based on Dec-POMDPs and Graph Attention Networks (as in MARVEL) offer decentralized, scalable exploration policies for teams with constrained field-of-view sensors. These leverage self-attention over graph-encoded spatial/action spaces and information-driven action pruning to address large action spaces and multi-robot collaboration in 90 × 90 m environments (Chiun et al., 27 Feb 2025).
  • Model Predictive and Adaptive Control: On challenging terrains, state-of-the-art LSMRs use NMPC stacked with robust deep learning-based controllers (RSDNN) for real-time trajectory tracking and slip compensation. Online safety is enforced by logarithmic barrier functions, and the control stack is theoretically shown to be input-to-state stable (ISS) and exponentially convergent (Shahna et al., 2 Jan 2026).
  • Manipulation and Teleoperation: Full-body telepresence suits, semi-autonomous planners, and force-feedback exoskeletons enable LSMRs to perform complex manipulation in cluttered or hazardous environments, integrating hybrid autonomy levels (Klamt et al., 2019).
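
A minimal sketch of the greedy waypoint-ordering step referenced above: each step visits the nearest unvisited waypoint, avoiding an exact (NP-hard) TSP solve. The waypoint list and start pose are assumed inputs.

```python
import numpy as np

def greedy_tour(start, waypoints):
    """Order waypoints by repeatedly visiting the nearest unvisited one."""
    remaining = [np.asarray(w, dtype=float) for w in waypoints]
    pos, tour = np.asarray(start, dtype=float), []
    while remaining:
        i = min(range(len(remaining)),
                key=lambda k: np.linalg.norm(remaining[k] - pos))
        pos = remaining.pop(i)                    # move to the closest waypoint
        tour.append(tuple(pos))
    return tour

print(greedy_tour((0, 0), [(5, 5), (1, 0), (1, 1)]))
# -> [(1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
```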

4. Robust Localization and SLAM in Large-Scale Environments

Scalable localization and mapping pipelines are essential for LSMR deployment over expansive and perceptually challenging domains:

  • Deep Probabilistic Global Localization: LSMRs combine learned global pose estimation (deep kernels over ResNet features, GP regression) with fast Monte Carlo Localization (MCL) seeded by deep priors, achieving median 0.8 s relocalization to 0.75 m precision in 0.5 km² environments (Sun et al., 2020); a seeding sketch follows this list.
  • Factor Graph-based Multi-Robot SLAM: Systems such as LAMP 2.0 modularize front-end scan/odometry ingestion and keyframe selection, perform candidate loop closure via certifiable initialization (TEASER++, SAC-IA) with Generalized-ICP local refinement, and optimize global pose via robustified factor graphs using Graduated Non-Convexity (GNC). This enables <1 m per-robot absolute trajectory error over ~2 km traversals and strong outlier resilience (Chang et al., 2022).
  • Resilient Loop Closure and Prioritization: Adaptive candidate radius rules, graph-based information gain prioritization, and robust outlier rejection enable tractable O(N) scaling rather than O(N²), making multi-robot, kilometer-class SLAM practically deployable in real time (Chang et al., 2022).
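
A hypothetical sketch of the prior-seeded MCL idea: particles are drawn around the pose regressed by the learned global localizer (its mean and covariance are assumed inputs here), then weighted against a sensor model and resampled. `scan_likelihood` is a stand-in for a real scan-matching likelihood, not an API from the cited work.

```python
import numpy as np

rng = np.random.default_rng(0)

def seed_particles(prior_mean, prior_cov, n=500):
    """Draw (x, y, theta) particles from the learned global-pose prior."""
    return rng.multivariate_normal(prior_mean, prior_cov, size=n)

def mcl_step(particles, scan_likelihood):
    """One weight-and-resample step of the particle filter."""
    w = np.array([scan_likelihood(p) for p in particles])
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

# Usage with a dummy likelihood that prefers poses near the map origin.
particles = seed_particles([0.0, 0.0, 0.0], np.diag([1.0, 1.0, 0.1]))
particles = mcl_step(particles, lambda p: np.exp(-p[:2] @ p[:2]))
```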

5. LSMR Datasets, Simulation Environments, and Benchmarking

Benchmark datasets and simulation platforms play a pivotal role in LSMR perception and navigation research:

  • THUD/THUD++: Large-scale, dynamic scene benchmarks containing 90K RGB-D frames, ~20M 2D/3D boxes, and thousands of pedestrian trajectories in both real and synthetic indoor environments. These datasets stress relocalization, semantic segmentation, and socially-aware navigation in dense dynamic scenes, highlighting sharply degraded perception accuracy as crowd density increases and enabling sim-to-real transfer studies through consistent annotation formats and physics-based robot navigation emulation (Li et al., 2024, Tang et al., 2024).
  • High-Speed Mapping Data Collection: Platforms such as the ShanghaiTech Mapping Robot on hydraulic flatbeds enable rapid data acquisition over >10 km, combining multi-modal sensor arrays, robust odometry/IMU synchronization, and ground texture recording for high-bandwidth SLAM benchmarks (Lin et al., 2024).
  • Evaluation Metrics: Task-specific benchmarks include mAP for 3D detection, mIoU for semantic segmentation, translation/rotation error for relocalization, ADE/FDE for trajectory prediction (computed as sketched below), and navigation success/collision rates. Results consistently demonstrate large accuracy drops for dynamic objects and higher-density environments, pointing to significant open research challenges for LSMR systems operating in real-world settings.
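
For reference, the two trajectory-prediction metrics reduce to a few lines; this sketch assumes equal-length predicted and ground-truth paths given as T × 2 arrays.

```python
import numpy as np

def ade_fde(pred, gt):
    """Average and final Euclidean displacement between predicted and true paths."""
    d = np.linalg.norm(np.asarray(pred, float) - np.asarray(gt, float), axis=-1)
    return d.mean(), d[-1]          # (ADE, FDE)

pred = [[0, 0], [1, 0], [2, 0]]
gt   = [[0, 0], [1, 1], [2, 2]]
print(ade_fde(pred, gt))            # -> (1.0, 2.0)
```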

6. Applications, Scalability, and Limitations

LSMRs address application domains requiring both physical and algorithmic scalability:

  • Industrial Assembly: Automated large-scale manufacturing is approached with algorithmic stacks combining global radial layout planning, mixed-integer graph repair, greedy/preemptive task allocation, and distributed reactive policies for multi-robot transport, enabling sub-3-minute planning for >1800-part assemblies and robust 250-robot fleet simulations (Brown et al., 2023).
  • Environmental Monitoring: Platforms such as MoistureMapper integrate adaptive GP-driven sampling policies, direct-push TDR sensing, and UGV autonomy to acquire high-resolution soil property maps in large outdoor fields, achieving up to 30% reduction in travel vs. greedy benchmarks (Rose et al., 17 Jul 2025); a variance-driven sampling sketch follows this list.
  • Limitations: Common challenges include sensitivity to occlusion, perception errors in poor lighting, bandwidth and compute bottlenecks for real-time open-vocabulary detection, and distribution shift for deep-learning-based localization under seasonal or scene changes. Robustness to dynamic environments, lifelong learning, and data association under uncertainty remain open challenges (Allu et al., 2024, Sun et al., 2020, Li et al., 2024).
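
As one way such GP-driven sampling can work, the sketch below fits a GP to the moisture samples collected so far and picks the most uncertain candidate cell as the next waypoint. The RBF kernel, field size, and candidate grid are assumptions for illustration, not MoistureMapper's published configuration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

gp = GaussianProcessRegressor(kernel=RBF(length_scale=5.0))

X_sampled = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])   # visited (x, y)
y_sampled = np.array([0.21, 0.35, 0.28])                       # soil moisture

gp.fit(X_sampled, y_sampled)

# Candidate grid over an assumed 20 m x 20 m field.
xs = np.linspace(0, 20, 21)
cand = np.array([[x, y] for x in xs for y in xs])
_, std = gp.predict(cand, return_std=True)

next_waypoint = cand[np.argmax(std)]   # most uncertain location sampled next
print(next_waypoint)
```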

7. Future Directions and Research Opportunities

Emerging directions highlighted by deployed and benchmarked LSMR systems include:

  • Active Perception: Incorporating view planning and interactive recognition for improved semantic recall under occlusion and lighting variability (Allu et al., 2024).
  • Multi-Agent Decentralization: Graph-based or MARL-driven coordination for large robot teams in exploration, assembly, and navigation tasks (Chiun et al., 27 Feb 2025, Brown et al., 2023).
  • Lifelong and Adaptive Mapping: Online learning or continual GP updates for dynamic or evolving environments (Sun et al., 2020).
  • Benchmark Expansion: Inclusion of additional sensor modalities (3D LiDAR, event cameras), activity/intent labels, and adversarial scene variations in future datasets (Li et al., 2024).
  • Lifelong Safety and Certification: Integration of mathematical safety certificates (barrier functions), cross-module safety monitoring, and resilient fallback in all levels of autonomy (Shahna et al., 2 Jan 2026).

Ongoing efforts in systematic dataset release and scalable software/hardware design collectively support the field's progress toward real-world deployment of LSMRs in industrial, service, and research applications spanning scales from dynamic campuses and subterranean networks to precision agriculture and autonomous manufacturing (Li et al., 2024, Chang et al., 2022, Rose et al., 17 Jul 2025, Brown et al., 2023).
