Continuous-Time Fusion

Updated 20 April 2026

Continuous-Time Fusion is the integration of asynchronous, multi-rate, and heterogeneous sensor data using smooth, continuous trajectory representations such as splines and Gaussian processes.
The trajectory is estimated with sliding-window optimization or factor-graph methods, and its analytic interpolation and derivatives allow each measurement to be evaluated at its true timestamp for robust, real-time sensor fusion.
Applications span SLAM, autonomous vehicles, dense mapping, and neural policy learning, with reported gains in accuracy and efficiency over discrete-time baselines in dynamic environments.

Continuous-time fusion is the methodology of integrating, modeling, and optimizing asynchronous, multi-rate, and heterogeneous sensor observations in a state estimation or control system through continuous-time trajectory representations. This paradigm removes discrete-time alignment constraints: a continuous latent trajectory, often parameterized by splines or Gaussian processes, can be queried efficiently at any instant, so sensor data with arbitrary timestamps can be fused directly. It has been applied to SLAM, state estimation, multi-modal sensor fusion, and end-to-end policy learning in robotics and machine learning.

1. Trajectory Representations for Continuous-Time Fusion

Continuous-time fusion is predicated on representing the evolving system state as a smooth, differentiable function of time rather than as a sequence of discrete “poses.” The most common trajectory parameterizations are:

Cubic B-splines in Euclidean and Lie groups: Translation and rotation are modeled as cubic B-splines. For translations, standard cumulative or uniform cubic B-splines are used, e.g., $p(t) = \sum_{j=0}^{3} \beta_j(u)\, p_{i+j}$. For SO(3) or SE(3), control vectors generate a smooth rotational curve via exponentials of tangent vectors, as in $R(t) = R_{i-1} \prod_{j=0}^{3} \exp(\beta_j(u)\phi_{i+j}^\wedge)$ (Liu et al., 16 Apr 2026, Li et al., 2023, Lv et al., 2023, 1711.01691).
Cumulative B-splines with quaternions: Quaternion cumulative splines generalize cubic splines to SO(3). Quaternion increments are composed recursively, yielding closed-form orientation, angular velocity, and acceleration (Li et al., 2023).
Gaussian-process (GP) priors: GP trajectory representations model the trajectory as a sample from a GP over SE(3), typically with a white-noise-on-acceleration or white-noise-on-jerk prior. Interpolation and query are performed via GP regression with explicitly derived mean and covariance expressions, supporting efficient factor-graph optimization (Zhang et al., 2023, Nguyen et al., 2024).

Closed-form derivatives (velocity, acceleration, jerk) and analytic Jacobians are available for these representations, providing direct access to the continuous-time state and its differentials, which is fundamental for asynchronous, high-rate, or heterogeneous sensor fusion.
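The closed-form position and derivative queries described above can be sketched for the translational case as follows. This is a minimal illustration, assuming a uniform cubic B-spline in matrix form with scalar knot spacing; the function name `eval_spline` and the arguments `ctrl`, `t0`, and `dt` are hypothetical and not taken from any of the cited systems. Rotational (SO(3)/SE(3)) splines need the cumulative form and are not covered here.

```python
import numpy as np

# Uniform cubic B-spline basis in matrix form:
#   p(u) = [1, u, u^2, u^3] @ M @ [p_i, p_{i+1}, p_{i+2}, p_{i+3}]
M = np.array([[ 1.0,  4.0,  1.0, 0.0],
              [-3.0,  0.0,  3.0, 0.0],
              [ 3.0, -6.0,  3.0, 0.0],
              [-1.0,  3.0, -3.0, 1.0]]) / 6.0

def eval_spline(ctrl, t0, dt, t):
    """Position, velocity, and acceleration of a uniform cubic B-spline at time t.

    ctrl : (n, dim) control points, n >= 4 (hypothetical layout for this sketch)
    t0   : time at which the first spline segment starts
    dt   : knot spacing in seconds
    """
    ctrl = np.asarray(ctrl, dtype=float)
    s = (t - t0) / dt                                  # time in units of knots
    i = int(np.clip(np.floor(s), 0, len(ctrl) - 4))    # active segment index
    u = s - i                                          # normalized time in the segment
    coeffs = M @ ctrl[i:i + 4]                         # polynomial coefficients, (4, dim)
    pos = np.array([1.0, u, u**2, u**3]) @ coeffs
    vel = np.array([0.0, 1.0, 2 * u, 3 * u**2]) @ coeffs / dt
    acc = np.array([0.0, 0.0, 2.0, 6 * u]) @ coeffs / dt**2
    return pos, vel, acc

# Control points on a straight line give constant velocity and zero acceleration.
ctrl = np.array([[0.2 * k, -0.1 * k] for k in range(8)])
pos, vel, acc = eval_spline(ctrl, t0=0.0, dt=0.1, t=0.237)
```

Because only four control points are active at any query time, the Jacobian of `pos` with respect to the control points is the 1×4 row of basis weights, which is what keeps each measurement factor local.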

2. Multi-Sensor Fusion Methodologies in Continuous Time

Continuous-time fusion enables measurement models and factors to be formed directly at the true sensor timestamps, which may differ widely across modalities (LiDAR, camera, IMU, UWB, GNSS, radar). Key methodological frameworks include:

Sliding-window or fixed-lag smoothing: The trajectory is optimized over a moving time window of control knots, fusing all LiDAR, IMU, camera, and other observations from their native timestamps, and marginalizing out older states to maintain bounded computational complexity (Lv et al., 2023, Li et al., 2023, Liu et al., 16 Apr 2026).
Factor-graph optimization: Factors are constructed for each observation, with measurement residuals depending on interpolated continuous-time states. This applies to point-to-plane residuals for LiDAR, re-projection errors for cameras, IMU integration terms, or range measurements for UWB/GNSS (Zhang et al., 2023, Liu et al., 16 Apr 2026, Lv et al., 2023).
Gaussian-process factor graphs: Measurements connect to state nodes at arbitrary times via GP interpolation. This enables “true” asynchronous multi-modal fusion, where each measurement factor only involves the local pair of adjacent knots (Nguyen et al., 2024, Zhang et al., 2023).
Online neural latent integration: In ML-based control, continuous-time policies are learned via Neural Controlled Differential Equations (CDEs), which fuse asynchronous high-frequency proprioceptive and low-frequency visual signals into a single evolving latent, driving policy outputs at native control rates (Singh et al., 2022).

These methods share one property: every measurement enters the estimator at its own timestamp, so temporal consistency is maintained regardless of acquisition rate or sensing modality.
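To illustrate why a GP-interpolated measurement factor touches only the two neighboring states, the sketch below implements the posterior-mean interpolation for a white-noise-on-acceleration prior. It is a simplification under stated assumptions: the state lives in a vector space (position and velocity in ℝ^d) rather than on SE(3) as in the cited works, and the names `phi`, `q`, `gp_interpolate`, and `qc` (the power spectral density matrix) are hypothetical.

```python
import numpy as np

def phi(dt, d):
    """Constant-velocity transition matrix for a state [position; velocity]."""
    I = np.eye(d)
    return np.block([[I, dt * I], [np.zeros((d, d)), I]])

def q(dt, qc):
    """Process-noise covariance accumulated over dt for the WNOA prior."""
    return np.block([[dt**3 / 3 * qc, dt**2 / 2 * qc],
                     [dt**2 / 2 * qc, dt * qc]])

def gp_interpolate(x_k, x_k1, t_k, t_k1, tau, qc):
    """Interpolate the state at tau in [t_k, t_k1] from its two neighbors.

    Returns the interpolated state and the matrices (lam, psi) such that
    x(tau) = lam @ x_k + psi @ x_k1. These are also the Jacobians of the
    queried state with respect to the two neighboring states.
    """
    d = qc.shape[0]
    T = t_k1 - t_k
    psi = q(tau - t_k, qc) @ phi(t_k1 - tau, d).T @ np.linalg.inv(q(T, qc))
    lam = phi(tau - t_k, d) - psi @ phi(T, d)
    return lam @ x_k + psi @ x_k1, lam, psi

qc = np.diag([0.5, 2.0])
x_k = np.array([1.0, -2.0, 0.3, 0.7])        # [px, py, vx, vy] at t_k
x_k1 = np.array([1.2, -1.6, 0.5, 0.9])       # state at t_k1
x_tau, lam, psi = gp_interpolate(x_k, x_k1, 1.0, 1.4, 1.15, qc)
```

A measurement with model `h(x(tau))` then has Jacobians `H @ lam` and `H @ psi` with respect to the two bracketing states and zero Jacobian with respect to all others.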

3. Asynchronous and Multi-Rate Sensor Stream Handling

Continuous-time fusion handles asynchronous and multi-rate sensor data systematically, without discarding or resampling measurements, by:

Direct query of state at measurement time: Each observation (LiDAR scan, image, UWB range, GNSS fix) invokes the trajectory representation at its precise timestamp without any need for bucketing or resampling (Lv et al., 2023, Zhang et al., 2023, Nguyen et al., 7 Oct 2025, Liu et al., 16 Apr 2026).
Analytic interpolation/derivatives: Sensor models require states and their derivatives (velocity, angular velocity, acceleration) at arbitrary times. B-spline, quaternion-spline, and GP representations deliver analytic expressions for these quantities, supporting efficient factor computation and optimization (Li et al., 2023, Nguyen et al., 2024, Ng et al., 2022).
Multi-modal temporal alignment: This architecture naturally handles time offset calibration as part of the optimization, enabling estimation or correction for unknown time shifts between camera, IMU, LiDAR, and other clocks (Lv et al., 2023, Liu et al., 16 Apr 2026).

This approach yields improved temporal consistency and estimation accuracy, particularly in settings with significant sensor asynchrony or irregular measurement arrivals.
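The time-offset calibration mentioned above works because the trajectory can be differentiated in time: the derivative of a residual with respect to a clock offset is the measurement Jacobian applied to the trajectory velocity. The sketch below shows this for a range measurement to a fixed anchor. It is illustrative only; `traj` is a hypothetical analytic trajectory standing in for a fitted spline or GP, and the one-parameter Gauss–Newton loop in `estimate_offset` is an assumption, not the procedure of any cited system.

```python
import numpy as np

def traj(t):
    """Hypothetical smooth trajectory (stand-in for a fitted spline/GP).

    Returns position and velocity at time t.
    """
    pos = np.array([np.cos(t), np.sin(t), 0.1 * t])
    vel = np.array([-np.sin(t), np.cos(t), 0.1])
    return pos, vel

def range_residual(t_stamp, t_offset, anchor, measured_range):
    """Range residual at the offset-corrected time and its derivative w.r.t. the offset."""
    pos, vel = traj(t_stamp + t_offset)
    diff = pos - anchor
    dist = np.linalg.norm(diff)
    residual = dist - measured_range
    d_residual_d_offset = diff @ vel / dist     # unit direction dotted with velocity
    return residual, d_residual_d_offset

def estimate_offset(stamps, ranges, anchor, t_offset=0.0, iters=10):
    """Gauss-Newton on the single unknown clock offset."""
    for _ in range(iters):
        pairs = [range_residual(t, t_offset, anchor, d) for t, d in zip(stamps, ranges)]
        r = np.array([p[0] for p in pairs])
        J = np.array([p[1] for p in pairs])
        t_offset -= (J @ r) / (J @ J)
    return t_offset

# Simulate noise-free ranges recorded by a clock that lags by 30 ms.
anchor = np.array([2.0, 1.0, 0.5])
true_offset = 0.03
stamps = np.linspace(0.0, 3.0, 40)
ranges = [np.linalg.norm(traj(t + true_offset)[0] - anchor) for t in stamps]
estimated = estimate_offset(stamps, ranges, anchor)
```

In a full estimator the offset would be one more variable in the same optimization as the trajectory parameters rather than being solved in isolation.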

4. Applications and Empirical Results

Continuous-time fusion frameworks have demonstrated efficacy in diverse domains:

SLAM and outdoor/indoor localization: Continuous-time LiDAR-inertial, LiDAR-IMU-camera, radar-inertial, visual-inertial-UWB, and GNSS-IMU-LiDAR fusion achieve high-frequency pose estimation with reported errors ranging from a few centimeters to a few decimeters depending on modality and scale, and are reported to outperform discrete-time and filter-based methods in real-world deployment (Liu et al., 16 Apr 2026, Lv et al., 2023, Zhang et al., 2023, Ng et al., 2022, Li et al., 2023, Nguyen et al., 7 Oct 2025, 1711.01691).
Automotive and autonomous vehicle odometry: Spline-based radar-inertial fusion achieves lower velocity and attitude errors than discrete-time methods, attaining 2D odometry errors around 1% on extended trajectories and robust performance in challenging environments (Ng et al., 2022).
Dense mapping and lifelong operation: Elastic LiDAR Fusion achieves globally consistent, dense surfel maps with sub-decimeter drift, maintaining real-time performance via map-centric deformation rather than global trajectory optimization (1711.01691).
End-to-end visuomotor policy learning: In continuous-time neural control, the InFuser architecture demonstrates robust policy performance under sparse and dropped visual/force observations, outperforming discrete-time RNN and windowed approaches in highly dynamic manipulation benchmarks (Singh et al., 2022).
Coordinate-consistent multi-session mapping: Continuous-time SLAM–UWB fusion establishes a consistent world frame via sliding-window optimization, maintaining <0.15 m absolute trajectory error across large-scale UAV benchmarks (Nguyen et al., 7 Oct 2025).

A summary table of selected applications and empirical metrics is given below:

Framework	Modalities	Key Result / Metric	Reference
CLIC	LiDAR, IMU, Camera	0.035 m APE, real-time	(Lv et al., 2023)
CT-VIR	Visual-Inertial-UWB	0.082 m ATE (EuRoC), robust NLOS	(Liu et al., 16 Apr 2026)
SFUISE	UWB, IMU	0.09–0.12 m RMSE, robust in NLOS	(Li et al., 2023)
Elastic LiDAR	LiDAR	0.05–0.12 m RMSE, dense surfel mapping	(1711.01691)
GNSS-FGO	GNSS, IMU, LiDAR	0.48 m mean 2D error on 17 km trajectory	(Zhang et al., 2023)
InFuser	Visuomotor	>80% success under sparse images	(Singh et al., 2022)

5. Algorithmic and Computational Considerations

Continuous-time fusion algorithms are typically characterized by:

Sparse block-banded linear systems: Each measurement or prior factor only involves adjacent control points or spline knots, leading to exactly-sparse, efficiently-solvable Hessians in Gauss–Newton or Levenberg–Marquardt optimization (Lv et al., 2023, Li et al., 2023, Nguyen et al., 2024).
Sliding-window/marginalization schemes: To guarantee real-time operation, the oldest states are marginalized after each window slide, preserving prior information via the Schur complement or square-root information approach. The problem size remains constant, allowing real-time inference on commodity CPUs (Lv et al., 2023, Li et al., 2023, Nguyen et al., 7 Oct 2025).
Closed-form analytic Jacobians: Extensive use of analytic Jacobians for all measurements, trajectory evaluation, and GP interpolation yields significant runtime improvement versus automatic differentiation (Nguyen et al., 2024, Lv et al., 2023, Li et al., 2023).
Knot spacing and window sizing: Knot interval selection and window length govern the trade-off between smoothness/accuracy and computational cost. Fixed knot rates (e.g., 10–20 Hz) with windows covering ~4–10 s are typical (Li et al., 2023, Liu et al., 16 Apr 2026).

These algorithmic features collectively deliver high computational efficiency, scalability for long trajectories, and the capacity to fuse thousands of heterogeneous measurements in real time.
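The banded sparsity and Schur-complement marginalization described above can be sketched on a toy problem. The example below is a simplified illustration under stated assumptions: control points are scalars, each synthetic factor touches four consecutive control points (as a cubic B-spline factor would), and the helper names `build_banded_system` and `marginalize` are hypothetical.

```python
import numpy as np

def build_banded_system(n_ctrl, n_meas, rng):
    """Accumulate normal equations H dx = b from factors that each touch
    four consecutive (scalar) control points."""
    H = np.eye(n_ctrl) * 1e-3          # weak prior keeps H invertible
    b = np.zeros(n_ctrl)
    for _ in range(n_meas):
        i = rng.integers(0, n_ctrl - 3)        # first active control point
        J = np.zeros(n_ctrl)
        J[i:i + 4] = rng.normal(size=4)        # synthetic local Jacobian
        r = rng.normal()                       # synthetic residual
        H += np.outer(J, J)
        b += J * r
    return H, b

def marginalize(H, b, m):
    """Marginalize the first m variables via the Schur complement.

    Returns the information matrix and vector over the remaining variables;
    these act as the prior carried into the next window.
    """
    H_mm_inv = np.linalg.inv(H[:m, :m])
    H_rm = H[m:, :m]
    H_prior = H[m:, m:] - H_rm @ H_mm_inv @ H[:m, m:]
    b_prior = b[m:] - H_rm @ H_mm_inv @ b[:m]
    return H_prior, b_prior

rng = np.random.default_rng(0)
H, b = build_banded_system(n_ctrl=12, n_meas=60, rng=rng)
H_prior, b_prior = marginalize(H, b, m=3)

full_solution = np.linalg.solve(H, b)
reduced_solution = np.linalg.solve(H_prior, b_prior)
```

Solving the reduced system reproduces the corresponding entries of the full solution, so no information about the remaining states is lost at the linearization point. Marginalization does introduce fill-in among the states that were connected to the removed ones, which is why the prior block is usually dense even when the original Hessian is banded.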

6. Variants: Probabilistic, Map-Centric, and Learning-Based Fusion

Beyond standard smoothing/graph optimization, several variants leverage continuous-time fusion:

Map-centric deformation: In Elastic LiDAR Fusion, global batch trajectory optimization is replaced by map-centric deformation graphs—correcting the dense surfel map via local affine warps triggered at loop closures, yielding globally consistent yet scalable mapping (1711.01691).
Surfel-based Bayesian fusion: Probabilistic updates of dense map surfels reduce positional uncertainty by several orders of magnitude, exploiting the statistics of many continuous-time interpolated LiDAR returns per surfel.
Neural CDE-based fusion: Learning approaches such as InFuser employ neural controlled differential equations to fuse multi-rate modalities into a dynamically evolving latent representation (Singh et al., 2022).
GP regression and closed-form interpolation: High-order GP priors (e.g., white-noise-on-jerk) on SO(3) × ℝ³ or SE(3) provide closed-form kinematics, unified factor modeling, and improved accuracy, especially in high-speed or maneuvering scenarios (Nguyen et al., 2024).

These variants provide alternative mechanisms for achieving the same fusion objectives (integration of asynchronous multi-rate data, robust mapping, and data-driven control) within the continuous-time paradigm.
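The uncertainty reduction attributed to surfel-based Bayesian fusion follows the general behavior of fusing repeated Gaussian observations. The sketch below uses a generic inverse-variance (information-form) update with isotropic noise; it is an assumption made for illustration and is not the specific update rule of Elastic LiDAR Fusion. The class name `Surfel` and its fields are hypothetical.

```python
import numpy as np

class Surfel:
    """Map element with a position estimate and an isotropic variance."""

    def __init__(self, position, sigma):
        self.mean = np.asarray(position, dtype=float)
        self.var = float(sigma) ** 2

    def fuse(self, z, sigma):
        """Fuse one Gaussian position observation by inverse-variance weighting."""
        w_old = 1.0 / self.var
        w_new = 1.0 / float(sigma) ** 2
        self.mean = (w_old * self.mean + w_new * np.asarray(z, dtype=float)) / (w_old + w_new)
        self.var = 1.0 / (w_old + w_new)

# Initialize from one return, then fuse 99 more with the same noise level.
rng = np.random.default_rng(1)
truth = np.array([1.0, 2.0, 3.0])
sigma = 0.02
surfel = Surfel(truth + rng.normal(scale=sigma, size=3), sigma)
for _ in range(99):
    surfel.fuse(truth + rng.normal(scale=sigma, size=3), sigma)
```

With N equally weighted observations the variance falls to σ²/N, so the size of the reduction depends directly on how many returns land on each surfel.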

7. Current Limitations and Open Challenges

Despite its demonstrated strengths, continuous-time fusion presents ongoing research challenges:

Uniform-knot constraints: Most B-spline or GP-based methods employ fixed, uniformly spaced knots. Adaptive or data-driven knot allocation remains under active exploration (Li et al., 2023).
Computational scaling: While constant-cost sliding window approaches are efficient, very high-rate environments or extremely long-duration operations may push real-time limits, motivating further parallelization, sparsification, or GPU acceleration (Lv et al., 2023, Nguyen et al., 2024).
Hyperparameter sensitivity: GP motion prior covariance selection, knot rates, and smoothing windows impact both estimation fidelity and robustness, especially in dynamic or poorly observed scenarios (Zhang et al., 2023).
Multi-session, multi-system consistency: Achieving globally consistent frames across multiple disconnected sessions (e.g., by anchor calibration) is nontrivial; recent methods leverage batch calibration then sliding-window fusion (Nguyen et al., 7 Oct 2025).
Learning-based method stability: Continuous-time neural implementations such as Neural CDEs are not guaranteed to outperform strong RNN or attention-based baselines absent careful architecture and loss construction (Singh et al., 2022, Karas et al., 2022).

Future research directions involve scalable fusion in high-dimensional state spaces, tighter integration/decomposition of factor graphs, advanced GP/NN-based latent representations, and real-world deployment studies across challenging modalities and environments.

References:

(Singh et al., 2022, Karas et al., 2022, Li et al., 2023, Lv et al., 2023, Zhang et al., 2023, Nguyen et al., 2024, Nguyen et al., 7 Oct 2025, Liu et al., 16 Apr 2026, 1711.01691, Ng et al., 2022)