
Dynamic Object SLAM

Updated 18 October 2025
  • Dynamic Object SLAM refers to a family of SLAM methods that model moving objects, typically by integrating semantic segmentation with geometric techniques, to improve camera and object trajectory estimation.
  • The approach leverages joint optimization frameworks and multi-modal sensor data to accurately reconstruct 4D scenes and manage dynamic elements within the environment.
  • Applications include autonomous driving, mobile robotics, and augmented reality, where real-time dynamic object tracking and robust mapping are critical.

Dynamic object SLAM refers to a class of simultaneous localization and mapping (SLAM) methodologies that explicitly model, detect, track, and/or leverage moving objects in the environment, as opposed to filtering such data as outliers or adhering to a strict static-world assumption. These systems are architected to achieve robust camera (or vehicle) pose estimation and mapping while maintaining accurate representations of dynamic scene elements. Recent advancements in dynamic object SLAM span a diverse set of sensor modalities, architectural assumptions, data association strategies, and optimization frameworks, as evidenced by the literature.

1. Problem Definition and Historical Context

Traditional SLAM algorithms—whether visual, LiDAR, RGB-D, or multi-modal—have generally operated under the static-world assumption, treating moving objects as error sources to be filtered or rejected. This paradigm restricts applicability in domains such as autonomous driving, robotics in populated areas, and augmented reality, where dynamic content is prevalent and often critical to task completion.

Dynamic object SLAM extends the standard framework by modeling the states (pose, shape, motion) of moving objects, enabling simultaneous trajectory estimation for both the sensor platform and dynamic agents. Early works masked out suspected dynamic regions using heuristics or instance segmentation outputs, whereas contemporary strategies employ joint optimization, motion and rigidity constraints, dense volumetric modeling, and explicit motion prediction (Yang et al., 2018, Strecke et al., 2019, Zhang et al., 2020, Qiu et al., 2021).

2. Core Methodologies and System Architectures

Dynamic object SLAM approaches can be categorized along several axes:

2.1 Semantic and Geometric Integration

Many contemporary systems combine deep learning–based instance/semantic segmentation (e.g., with Mask R-CNN, SOLOv2, YolactEdge) with classical geometric techniques (e.g., multi-view triangulation, rigid-body motion segmentation). Semantic modules provide pixel-wise dynamic object masks, while geometric clustering (e.g., HDBSCAN, Euclidean clustering in LiDAR) or motion segmentation (optical/depth flow, planar segmentation) provides redundancy against segmentation imperfections and enables recovery when semantic predictions are ambiguous (Wang et al., 2022, Krishna et al., 2023).
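As an illustration of this fusion, the minimal sketch below (the function name and thresholds are hypothetical, and it assumes an external segmentation network has already produced a per-point dynamic flag) clusters semantically flagged 3D points into object instances, using DBSCAN as a stand-in for HDBSCAN or Euclidean clustering:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_dynamic_points(points_xyz, semantic_mask, eps=0.5, min_samples=20):
    """Group 3D points flagged as dynamic by a semantic mask into object instances.

    points_xyz    : (N, 3) points back-projected from a depth image or LiDAR scan.
    semantic_mask : (N,) boolean array, True where the semantic network (e.g. a
                    Mask R-CNN-style model) labels the point as a movable class.
    Returns a list of index arrays, one per geometric cluster; points rejected as
    noise by the clustering act as a cross-check on the semantic prediction.
    """
    dynamic_idx = np.flatnonzero(semantic_mask)
    if dynamic_idx.size == 0:
        return []
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points_xyz[dynamic_idx])
    return [dynamic_idx[labels == k] for k in range(labels.max() + 1)]
```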

2.2 Data Association and Motion Segmentation

Feature correspondence and the association of observations over time, which are crucial for tracking dynamic objects, are handled via the following (a minimal motion-segmentation sketch appears after Table 1):

  • Optical flow and scene flow for dense short-term point associations (Zhang et al., 2020, Wadud et al., 2022)
  • Multi-model motion segmentation (e.g., labeling feature tracks by residual consistency to parametric ego and object motion models) (Wang et al., 2020)
  • Sliding window data association using historical trajectories and polynomial fitting (Tian et al., 2022)
  • Probabilistic data association using soft assignment likelihoods in an EM framework (Strecke et al., 2019)

Table 1: Representative Data Association Techniques

| Approach | Data Association Method | Modality |
|---|---|---|
| EM-Fusion (Strecke et al., 2019) | EM soft assignment via pixel likelihoods | RGB-D |
| DymSLAM (Wang et al., 2020) | Multi-model geometric residual clustering | Stereo vision |
| DL-SLOT (Tian et al., 2022) | Trajectory prediction + assignment via polynomial fitting | LiDAR |
| VDO-SLAM (Zhang et al., 2020) | Optical flow-based dense feature association | Monocular/RGB-D |
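The sketch below illustrates residual-based multi-model motion segmentation in the spirit of the residual-consistency labeling used by DymSLAM (Wang et al., 2020); the function name, threshold, and the assumption that candidate motion-model predictions are supplied externally are illustrative rather than taken from the paper:

```python
import numpy as np

def segment_tracks_by_motion(obs_xy, predicted_xy_per_model, sigma_px=2.0):
    """Assign each feature track to the motion model that best explains it.

    obs_xy                 : (N, 2) observed pixel positions in the current frame.
    predicted_xy_per_model : list of (N, 2) arrays, positions predicted by each
                             candidate model (index 0 = ego-motion, others = objects).
    Returns an (N,) array of model labels; tracks whose best residual still exceeds
    3*sigma_px are marked -1 and can seed a new object hypothesis.
    """
    residuals = np.stack(
        [np.linalg.norm(obs_xy - pred, axis=1) for pred in predicted_xy_per_model],
        axis=1,
    )  # (N, num_models) reprojection residual per candidate motion model
    labels = residuals.argmin(axis=1)
    labels[residuals.min(axis=1) > 3.0 * sigma_px] = -1
    return labels
```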

2.3 Object Motion and Representation Models

Dynamic objects are represented using various models, including:

  • Parametric bounding primitives such as 3D cuboids with associated motion models (Yang et al., 2018)
  • SE(3) pose trajectories estimated from scene flow or feature tracks (Zhang et al., 2020)
  • Dense volumetric models such as per-object TSDF/SDF volumes (Strecke et al., 2019)
  • Time-varying explicit representations such as dynamic Gaussian splats (Li et al., 15 Mar 2025, Li et al., 6 Jun 2025)
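As an example of the motion-model side, the following sketch propagates an SE(3) object pose under a constant-velocity assumption, the kind of prediction later penalized by the motion-constraint terms in Section 3 (the function name is illustrative):

```python
import numpy as np

def propagate_constant_velocity(T_prev, T_curr):
    """Predict the next SE(3) object pose under a constant-velocity assumption.

    T_prev, T_curr : 4x4 homogeneous object poses at times t-1 and t.
    Returns the predicted 4x4 pose at time t+1 by reapplying the relative motion
    T_curr @ inv(T_prev); a joint backend then penalizes deviation from it.
    """
    delta = T_curr @ np.linalg.inv(T_prev)  # relative object motion over one step
    return delta @ T_curr
```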

3. Joint Optimization and Backend Architectures

Dynamic object SLAM systems generally employ a joint optimization or bundle adjustment backend where the state vector aggregates:

  • Camera (or ego-platform) poses
  • Static 3D feature points and/or background map structures
  • Dynamic object poses (typically SE(3) trajectories), shapes, and associated dynamic point features

Objective functions are constructed as the sum of measurement and regularization terms such as:

$$\min_{C,\,O,\,P} \;\sum_{\text{measurements}} \bigl\|\text{reprojection or data association error}\bigr\|^2 \;+\; \sum_{\text{motion constraints}} \bigl\|\text{deviation from predicted dynamics}\bigr\|^2$$

Specialized loss terms regularize object motions, favor rigidity between parts, encourage feature points to remain inside object boundaries, and constrain the temporal evolution of dynamic elements (Yang et al., 2018, Qiu et al., 2021, Li et al., 6 Jun 2025). When applied to explicit representations like Gaussian splats, color and depth rendering losses are combined and weighted differently for static and dynamic map components to suppress transient interference and occlusion artifacts (Li et al., 6 Jun 2025, Liu et al., 31 Aug 2025).
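A toy illustration of such a joint backend, using scipy.optimize.least_squares on 2D camera and object positions: the measurement term uses relative camera-to-object observations, the motion term is a constant-velocity prior on the object, and a weak anchor on the camera stands in for the static-feature constraints that fix the gauge (all data, weights, and variable names are illustrative, not drawn from any cited system):

```python
import numpy as np
from scipy.optimize import least_squares

T = 6
rng = np.random.default_rng(0)
cam_true = np.stack([np.linspace(0, 5, T), np.zeros(T)], axis=1)       # camera moves along x
obj_true = np.stack([np.linspace(2, 7, T), np.full(T, 3.0)], axis=1)   # object moves in parallel
obs = obj_true - cam_true + 0.05 * rng.standard_normal((T, 2))         # noisy relative observations

def residuals(x):
    cam = x[: 2 * T].reshape(T, 2)
    obj = x[2 * T:].reshape(T, 2)
    r_meas = (obj - cam - obs).ravel()                       # measurement / data association term
    r_motion = (obj[2:] - 2 * obj[1:-1] + obj[:-2]).ravel()  # constant-velocity prior on the object
    r_anchor = 0.1 * (cam - cam_true).ravel()                # weak gauge anchor (toy stand-in for
                                                             # odometry / static-map constraints)
    return np.concatenate([r_meas, 10.0 * r_motion, r_anchor])

sol = least_squares(residuals, np.zeros(4 * T))
cam_est = sol.x[: 2 * T].reshape(T, 2)
obj_est = sol.x[2 * T:].reshape(T, 2)
```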

4. Treatment of Dynamic Features: Robustness Strategies

Handling dynamic observations is central. The literature demonstrates approaches that either:

  • Explicitly track and model dynamics, preserving dynamic points in the optimization by assigning them to object frames and enforcing inter-frame motion consistency (Yang et al., 2018, Zhang et al., 2020, Wadud et al., 2022, Li et al., 15 Mar 2025).
  • Remove or inpaint dynamic regions pre-SLAM, e.g., via deep video inpainting guided by flow-based masks, and then apply static SLAM on cleaned frames (Uppala et al., 2023, Habibpour et al., 2 Oct 2025).
  • Combine adaptive feature extraction, mask refinement via prior information (e.g., through recursive static background models or morphological corrections), and dynamic sampling to maintain optimization constraints despite the exclusion of dynamic regions (Liu et al., 31 Aug 2025).

These strategies yield a spectrum of solutions from full joint modeling to preemptive filtering, depending on task requirements and available computational resources.
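As a sketch of the preemptive-filtering end of this spectrum (a hypothetical helper that assumes a dynamic mask is already available from a segmentation or inpainting module), feature extraction can simply be restricted to the presumed static background before conventional tracking:

```python
import cv2
import numpy as np

def extract_static_features(gray, dynamic_mask, n_features=1000):
    """Detect ORB features only in regions not flagged as dynamic.

    gray         : HxW uint8 grayscale frame.
    dynamic_mask : HxW mask, nonzero where a segmentation or inpainting module
                   flags moving content.
    Returns keypoints and descriptors restricted to the (presumed) static
    background, which can be fed to a conventional static-world SLAM front end.
    """
    static_mask = np.where(dynamic_mask > 0, 0, 255).astype(np.uint8)
    orb = cv2.ORB_create(nfeatures=n_features)
    keypoints, descriptors = orb.detectAndCompute(gray, mask=static_mask)
    return keypoints, descriptors
```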

5. Performance Evaluation and Benchmarks

Evaluation metrics vary according to the scope of dynamic handling, covering both ego-motion (camera trajectory) accuracy and the quality of dynamic object tracking and reconstruction.

Strong empirical results are reported on benchmark datasets such as KITTI, TUM RGB-D, BONN RGB-D, and indoor environments with large dynamic occlusions (Yang et al., 2018, Krishna et al., 2023, Li et al., 6 Jun 2025, Liu et al., 31 Aug 2025). Recent systems demonstrate both robust camera tracking and high-fidelity map reconstruction—dynamic objects are either clearly distinguished from static parts or their motion and structure are estimated for 3D scene understanding and prediction (Li et al., 15 Mar 2025, Li et al., 6 Jun 2025).
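Camera-tracking accuracy on these benchmarks is commonly summarized by absolute trajectory error (ATE); the sketch below computes an ATE RMSE after rigid alignment of the estimate to ground truth (a standard recipe, not specific to any of the cited systems):

```python
import numpy as np

def ate_rmse(traj_est, traj_gt):
    """RMS absolute trajectory error after rigid (rotation + translation, no scale)
    alignment of the estimated trajectory to ground truth.

    traj_est, traj_gt : (N, 3) time-associated position sequences.
    """
    mu_e, mu_g = traj_est.mean(axis=0), traj_gt.mean(axis=0)
    # Kabsch alignment: rotation mapping centered estimate onto centered ground truth
    U, _, Vt = np.linalg.svd((traj_gt - mu_g).T @ (traj_est - mu_e))
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # guard against reflections
    R = U @ S @ Vt
    aligned = (R @ (traj_est - mu_e).T).T + mu_g
    return float(np.sqrt(np.mean(np.sum((aligned - traj_gt) ** 2, axis=1))))
```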

6. Applications, Implications, and Open Directions

Dynamic object SLAM systems have enabled robust ego-motion estimation and mapping in autonomous driving, mobile robotics operating in populated areas, and augmented reality, where moving agents are prevalent and often critical to task completion.

Continued research emphasizes robustness to imperfect segmentation and scene variability, real-time operation under limited computational budgets, richer map representations (e.g., dynamic Gaussian splatting), and tighter coupling of motion prediction with joint optimization.

7. Summary Table of Representative Approaches

| System | Dynamic Object Modeling | Map Representation | Sensing Modality | Joint Optimization | Real-Time |
|---|---|---|---|---|---|
| CubeSLAM (Yang et al., 2018) | Cuboid + motion model | Sparse/cuboid map | Mono camera | Yes | Yes |
| EM-Fusion (Strecke et al., 2019) | TSDF with EM data association | Dense SDF volumes | RGB-D | Yes | No |
| DymSLAM (Wang et al., 2020) | Geometric motion segmentation | Dense stereo + 4D map | Stereo camera | Yes | Yes |
| VDO-SLAM (Zhang et al., 2020) | SE(3) pose for objects via scene flow | Spatiotemporal map | Mono/RGB-D | Yes | Yes |
| DL-SLOT (Tian et al., 2022) | Sliding window graph for all objects | LiDAR pose-graph | LiDAR | Yes | Yes |
| DynaGSLAM (Li et al., 15 Mar 2025) | Time-varying Gaussian splats | Photorealistic 3DGS | RGB-D | Decoupled | Yes |
| Dy3DGS-SLAM (Li et al., 6 Jun 2025) | Mask-fused dynamic suppression | Photorealistic 3DGS | Monocular RGB | Yes | Yes |

These systems collectively demonstrate the trajectory of dynamic object SLAM towards architectures that are robust to scene variability, support rich scene reconstructions, and facilitate advanced robotics and perception applications.
