Differentiable SpaTiaL: Symbolic Learning and Reasoning with Geometric Temporal Logic for Manipulation Tasks

Published 3 Apr 2026 in cs.RO | (2604.02643v2)

Abstract: Executing complex manipulation in cluttered environments requires satisfying coupled geometric and temporal constraints. Although Spatio-Temporal Logic (SpaTiaL) offers a principled specification framework, its use in gradient-based optimization is limited by non-differentiable geometric operations. Existing differentiable temporal logics focus on the robot's internal state and neglect interactive object-environment relations, while spatial logic approaches that capture such interactions rely on discrete geometry engines that break the computational graph and preclude exact gradient propagation. To overcome this limitation, we propose Differentiable SpaTiaL, a fully tensorized toolbox that constructs smooth, autograd-compatible geometric primitives directly over polygonal sets. To the best of our knowledge, this is the first end-to-end differentiable symbolic spatio-temporal logic toolbox. By analytically deriving differentiable relaxations of key spatial predicates--including signed distance, intersection, containment, and directional relations--we enable an end-to-end differentiable mapping from high-level semantic specifications to low-level geometric configurations, without invoking external discrete solvers. This fully differentiable formulation unlocks two core capabilities: (i) massively parallel trajectory optimization under rigorous spatio-temporal constraints, and (ii) direct learning of spatial logic parameters from demonstrations via backpropagation. Experimental results validate the effectiveness and scalability of the proposed framework.

Authors (4)

Summary

The paper proposes a fully differentiable framework that uses smooth relaxations to convert discrete spatial predicates into autograd-compatible operations.
It demonstrates robust trajectory optimization and specification parameter learning by enabling gradient propagation through both geometric and temporal logic.
Experiments report close agreement with a discrete geometry engine, and conservative smoothing properties (stated for most predicates) support use in safety-critical manipulation tasks.

Introduction and Motivation

Robotic manipulation in unstructured environments frequently demands satisfaction of intertwined geometric and temporal constraints. Traditional planning and learning pipelines struggle to bridge the gap between high-level, human-interpretable specifications—formalized using spatio-temporal logic (SpaTiaL)—and low-level, gradient-based optimization, due to non-differentiable geometric operations characteristic of conventional spatial logics. Prior work either focuses on differentiable temporal logics operating over low-dimensional robot states, neglecting explicit spatial relations, or relies on discrete geometry engines and collision checkers, leading to broken computational graphs that preclude gradient propagation.

Differentiable SpaTiaL is presented as, to the authors' knowledge, the first end-to-end differentiable, tensorized toolbox for spatio-temporal logic. It formulates smooth relaxations of key spatial predicates on polygonal objects, which enables symbolic planning, learning, and reasoning for manipulation. Gradients flow analytically from logical specifications to object-level geometric states, which supports scalable gradient-based trajectory optimization and parameter learning over geometric relations.

Figure 1: Overview of Differentiable SpaTiaL, which replaces discrete geometry engines with a fully tensorized architecture for end-to-end trajectory optimization under formal spatio-temporal specifications.

Technical Formulation

The core innovation of Differentiable SpaTiaL lies in reformulating the spatial component of SpaTiaL. The approach analytically derives smooth relaxations for core geometric predicates—such as distance, intersection, penetration depth, containment, and directional relations—by replacing $\min/\max$ and discrete Boolean operations with autograd-compatible tensor operations using LogSumExp (LSE) smoothing and soft spatial boundary sampling.
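To make the LSE replacement of $\min/\max$ concrete, here is a minimal NumPy sketch of a soft minimum and soft maximum together with the closed-form gradient of the soft minimum. The function names are illustrative assumptions, not the toolbox's API, and NumPy stands in for whichever autograd framework the toolbox is built on.

```python
import numpy as np

def soft_max(x, tau):
    """Smooth maximum tau * log(sum(exp(x / tau))), shifted for stability."""
    x = np.asarray(x, dtype=float)
    m = x.max()
    return m + tau * np.log(np.sum(np.exp((x - m) / tau)))

def soft_min(x, tau):
    """Smooth minimum, defined as the negated smooth maximum of -x."""
    return -soft_max(-np.asarray(x, dtype=float), tau)

def soft_min_grad(x, tau):
    """Gradient of soft_min w.r.t. x: softmax of -x / tau (weights sum to 1)."""
    x = np.asarray(x, dtype=float)
    w = np.exp(-(x - x.min()) / tau)
    return w / w.sum()

values = [1.0, 2.0, 5.0]
for tau in (1.0, 0.1, 0.01):
    # Both operators approach the exact min/max as tau shrinks.
    print(tau, soft_min(values, tau), soft_max(values, tau))
print(soft_min_grad(values, 0.1))  # weight concentrates on the smallest entry
```

Unlike the hard minimum, whose gradient is nonzero only for the single smallest entry, the soft minimum gives every entry a nonzero weight, which is what provides dense feedback to an optimizer.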

Spatial predicates are evaluated directly on vertex tensors representing convex polygons, avoiding external discrete solvers. Key constructions include:

Smooth Separating Axis Theorem (SAT) Penetration: Penetration depth between polygons is formulated as the smooth minimum of projection overlaps across separating axes, replacing discrete queries with analytically differentiable operators.
Figure 2: Differentiable penetration depth via Smooth SAT; exact discrete overlap is relaxed into a continuously differentiable repulsion field.
Signed Distance Field via Boundary Sampling: For non-overlapping configurations, the minimum distance between polygons is computed using smooth, soft-minimum aggregation over sampled boundary points.
Figure 3: Geometric interpretation of compositional predicates such as EnclIn and leftOf; all spatial relations are reduced to differentiable tensor operations.
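The Smooth SAT construction listed above can be sketched as follows. This is an illustrative reading of the description, not the paper's implementation: it assumes convex polygons given as counter-clockwise vertex arrays, uses the edge normals of both polygons as candidate axes, and applies LSE smoothing to both the per-axis projection extremes and the final minimum over axes. All function names are hypothetical.

```python
import numpy as np

def soft_max(x, tau, axis=-1):
    """LSE smooth maximum along one axis, shifted for numerical stability."""
    m = np.max(x, axis=axis, keepdims=True)
    lse = np.log(np.sum(np.exp((x - m) / tau), axis=axis))
    return np.squeeze(m, axis=axis) + tau * lse

def soft_min(x, tau, axis=-1):
    """LSE smooth minimum along one axis."""
    return -soft_max(-x, tau, axis=axis)

def edge_normals(poly):
    """Unit outward normals of a counter-clockwise convex polygon, shape (n, 2)."""
    edges = np.roll(poly, -1, axis=0) - poly
    normals = np.stack([edges[:, 1], -edges[:, 0]], axis=1)
    return normals / np.linalg.norm(normals, axis=1, keepdims=True)

def smooth_sat_penetration(a, b, tau=0.01):
    """Smoothed penetration depth; negative when some axis separates a and b."""
    axes = np.concatenate([edge_normals(a), edge_normals(b)])  # (k, 2)
    pa = axes @ a.T  # projections of a's vertices onto each axis, (k, na)
    pb = axes @ b.T  # projections of b's vertices onto each axis, (k, nb)
    # Overlap of the two projected intervals, measured from both sides.
    overlap_ab = soft_max(pa, tau) - soft_min(pb, tau)
    overlap_ba = soft_max(pb, tau) - soft_min(pa, tau)
    # Penetration depth is the smallest overlap over all candidate axes.
    return soft_min(np.concatenate([overlap_ab, overlap_ba]), tau)

square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
print(smooth_sat_penetration(square, square + [0.7, 0.0]))  # overlapping case
print(smooth_sat_penetration(square, square + [1.5, 0.0]))  # separated case
```

Because every step is a matrix product, an exponential, or a sum, the same code runs unchanged over batches of polygon pairs once a leading batch dimension is added.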

This smooth geometric foundation is then composed with temporal logic operators, which are also smoothed using LSE approximations, to produce a fully differentiable robustness metric for any spatio-temporal specification $\phi$ evaluated over a trajectory $\xi$.
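As a rough illustration of how spatial and temporal smoothing compose, the sketch below evaluates a toy specification, "always stay far from an obstacle and eventually get close to a goal", over a short trajectory. It simplifies objects to points rather than polygons, and the specification, thresholds, and function names are assumptions made for this example only.

```python
import numpy as np

def soft_max(x, tau):
    """LSE smooth maximum of a 1-D array."""
    m = x.max()
    return m + tau * np.log(np.sum(np.exp((x - m) / tau)))

def soft_min(x, tau):
    """LSE smooth minimum of a 1-D array."""
    return -soft_max(-x, tau)

def always(r, tau):
    """'Always' takes the (soft) worst case of per-step robustness values."""
    return soft_min(r, tau)

def eventually(r, tau):
    """'Eventually' takes the (soft) best case; LSE can slightly overestimate it."""
    return soft_max(r, tau)

def spec_robustness(traj, obstacle, goal, d_safe, d_goal, tau=0.05):
    """Robustness of: always(farFrom obstacle) AND eventually(closeTo goal)."""
    r_far = np.linalg.norm(traj - obstacle, axis=1) - d_safe
    r_close = d_goal - np.linalg.norm(traj - goal, axis=1)
    # Conjunction is again a (soft) minimum over the two sub-formulas.
    return soft_min(np.array([always(r_far, tau), eventually(r_close, tau)]), tau)

obstacle, goal = np.array([2.0, 0.0]), np.array([4.0, 0.0])
straight = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
detour = np.array([[0.0, 0.0], [1.0, 1.5], [2.0, 2.0], [3.0, 1.5], [4.0, 0.0]])
print(spec_robustness(straight, obstacle, goal, d_safe=1.0, d_goal=0.5))  # violates
print(spec_robustness(detour, obstacle, goal, d_safe=1.0, d_goal=0.5))    # satisfies
```

A positive value indicates that the trajectory satisfies the specification with some margin, and a negative value indicates a violation.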

Applications: Trajectory Optimization and Learning

The differentiable robustness metric makes it possible to backpropagate gradients through both temporal structure and spatial semantics in a unified manner. Two primary capabilities are demonstrated:

Spatio-Temporal Trajectory Optimization

Trajectory synthesis under formal specifications becomes a continuous optimization problem: maximizing the (smoothed) robustness of the specification with respect to trajectory states, under possible system constraints and dynamics. Gradients propagated through differentiable spatial semantics provide dense feedback in both penetrative and separated regimes, overcoming the local minima and stagnation encountered with discrete collision signals.
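A toy version of this optimization loop is sketched below under strong simplifying assumptions: a point robot, a single circular obstacle, an "always farFrom" specification, and a quadratic smoothness penalty standing in for system constraints and dynamics. The gradient of the soft minimum is written out by hand (it is the softmax weighting of the per-step distance gradients) where an autograd framework would compute it automatically. All names and parameter values are hypothetical.

```python
import numpy as np

def soft_min_weights(r, tau):
    """Softmax weights of -r / tau, i.e. the gradient of the LSE soft minimum."""
    e = np.exp(-(r - r.min()) / tau)
    return e / e.sum()

def hard_robustness(traj, obstacle, d_safe):
    """Exact (non-smooth) robustness of 'always farFrom': worst-case clearance."""
    return np.min(np.linalg.norm(traj - obstacle, axis=1) - d_safe)

def optimize(traj, obstacle, d_safe, tau=0.05, lam=0.2, lr=0.05, steps=2000):
    """Gradient ascent on soft robustness minus lam * sum ||p[t+1] - p[t]||^2."""
    traj = traj.copy()
    for _ in range(steps):
        diff = traj - obstacle
        dist = np.linalg.norm(diff, axis=1)
        w = soft_min_weights(dist - d_safe, tau)
        # Chain rule: soft-min weights times the unit vectors away from the obstacle.
        grad = w[:, None] * diff / dist[:, None]
        # Gradient of the smoothness penalty at interior waypoints.
        lap = np.zeros_like(traj)
        lap[1:-1] = 2.0 * traj[1:-1] - traj[:-2] - traj[2:]
        grad -= 2.0 * lam * lap
        grad[0] = grad[-1] = 0.0  # keep start and goal fixed
        traj += lr * grad
    return traj

obstacle, d_safe = np.array([2.0, -0.3]), 0.8
start = np.linspace([0.0, 0.0], [4.0, 0.0], 21)  # straight line through the obstacle
final = optimize(start, obstacle, d_safe)
print(hard_robustness(start, obstacle, d_safe), hard_robustness(final, obstacle, d_safe))
```

The obstacle is placed slightly off the initial path on purpose: if the path passed exactly through its center, the gradient would have no sideways component and the optimizer would have no preferred direction in which to detour.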

Figure 4: Spatio-temporal trajectory optimization; the initial trajectory violates the specification, but through iterative optimization, the final trajectory satisfies geometric and temporal requirements, as quantified by robustness analysis.

Empirically, the optimization process exhibits distinct phases: collision resolution, constraint shaping, and task completion, with robustness improving monotonically as shown by the plotted metrics.

Specification Parameter Learning

Thanks to full differentiability, the system supports joint learning of spatial logic parameters directly from demonstration data via gradient descent. For a given logical structure, learnable parameters (e.g., safety margins in predicates such as $farFrom$ and $leftOf$) are optimized to fit all examples, so that the specification is satisfied on every demonstration while the inferred margins remain tight.

Figure 5: Learning spatial specification parameters from demonstrations through robustness backpropagation, visualized in oblique and planar projections.

This end-to-end framework not only discovers interpretable specifications invariantly satisfied by the data, but also quantitatively estimates the largest feasible margins.
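The idea of fitting a tight margin can be illustrated with a one-parameter example. The sketch below learns the margin of a hypothetical "always farFrom" predicate from three synthetic demonstrations by gradient descent on a loss that rewards a large margin and penalizes violated demonstrations through a softplus term. The loss form, the penalty weight `c`, and the temperatures are assumptions chosen for illustration, not the paper's objective.

```python
import numpy as np

def soft_min(x, tau):
    """LSE smooth minimum of a 1-D array."""
    m = x.min()
    return m - tau * np.log(np.sum(np.exp(-(x - m) / tau)))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def learn_margin(demo_dists, tau=0.02, beta=0.02, c=10.0, lr=0.005, steps=2000):
    """Minimize -d + c * sum_i beta * softplus((d - m_i) / beta) over the margin d.

    m_i is the soft-min clearance of demonstration i, so m_i - d is its
    robustness for 'always farFrom' with margin d.
    """
    m = np.array([soft_min(dists, tau) for dists in demo_dists])
    d = 0.0
    for _ in range(steps):
        # Derivative of the loss w.r.t. d; the softplus derivative is a sigmoid.
        grad = -1.0 + c * np.sum(sigmoid((d - m) / beta))
        d -= lr * grad
    return d

# Synthetic demonstrations: straight passes at lateral offsets 1.0, 1.3, 1.6
# from an obstacle at the origin, so the tightest true clearance is 1.0.
xs = np.linspace(-2.0, 2.0, 21)
demo_dists = [np.hypot(xs, offset) for offset in (1.0, 1.3, 1.6)]
margin = learn_margin(demo_dists)
print(margin)  # settles just below the smallest clearance in the data
```

The learned margin settles slightly under the tightest demonstrated clearance because both the soft minimum and the softplus penalty err on the safe side; shrinking `tau` and `beta` narrows that gap.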

Numerical and Empirical Validation

Geometric Accuracy

Quantitative agreement with classical geometry engines (such as Shapely) is evaluated over randomized convex polygon pairs, measuring accuracy of differentiable unsigned/signed distance, penetration depth, and containment predicates.

Figure 6: Evaluation of geometric accuracy of differentiable predicates against the Shapely engine across randomized geometric configurations.

The results demonstrate strong empirical fidelity, with numerical errors tunable via boundary sampling density and smoothing temperature $\tau$; smaller $\tau$ and denser sampling yield higher accuracy at the expense of gradient smoothness.

Smoothing and Boundary Tradeoffs

A key hyperparameter is the smoothing temperature $\tau$. Its influence on the precision of spatial predicate approximations and gradient informativeness is systematically explored.

Figure 7: Effect of the smoothing temperature $\tau$ and boundary sampling density on the accuracy of signed distance and corresponding gradients.

As $\tau \to 0$ and sampling density increases, the differentiable relaxations asymptotically approach discrete, exact values, although with potentially reduced gradient smoothness.
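The tradeoff can be reproduced on a small hand-checkable case. The sketch below assumes one plausible form of the boundary-sampled distance (a soft minimum over all pairwise distances between evenly spaced boundary samples) and compares a coarse, heavily smoothed setting with a dense, lightly smoothed one. The reference value comes from the geometry of the example rather than from Shapely, and the function names are hypothetical.

```python
import numpy as np

def sample_boundary(poly, n_per_edge):
    """Evenly spaced points on each edge of a polygon (vertices included)."""
    nxt = np.roll(poly, -1, axis=0)
    t = np.linspace(0.0, 1.0, n_per_edge, endpoint=False)[None, :, None]
    pts = poly[:, None, :] + t * (nxt - poly)[:, None, :]
    return pts.reshape(-1, 2)

def soft_distance(a, b, n_per_edge, tau):
    """Soft minimum over all pairwise distances between boundary samples."""
    pa, pb = sample_boundary(a, n_per_edge), sample_boundary(b, n_per_edge)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1).ravel()
    return d.min() - tau * np.log(np.sum(np.exp(-(d - d.min()) / tau)))

# Unit square and a diamond whose left vertex (1.5, 0.37) faces the square's
# right edge x = 1, so the exact separation is 0.5.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
diamond = np.array([[1.5, 0.37], [2.0, -0.13], [2.5, 0.37], [2.0, 0.87]])
exact = 0.5

err_coarse = abs(soft_distance(square, diamond, n_per_edge=4, tau=0.1) - exact)
err_fine = abs(soft_distance(square, diamond, n_per_edge=64, tau=0.005) - exact)
print(err_coarse, err_fine)
```

In this form the pairwise distance matrix grows with the product of the two sample counts, so the sampling density also sets the memory and compute cost of each evaluation.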

Theoretical Guarantees

The framework is mathematically underpinned by conservative smoothing properties: for most predicates, the smoothed robustness never overestimates the exact value, thus maintaining soundness in safety-critical contexts. In particular, the error bound due to smoothing vanishes in the limit of small smoothing temperature $\tau$ and small boundary sampling spacing. Directional and orientation predicates are shown to be exact under the smooth relaxations used.
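For intuition about why LSE smoothing can be conservative, the standard bound on the soft minimum of $n$ values is worth writing out. This is a generic property of LogSumExp rather than the paper's own theorem, and the symbols $x_i$ and $n$ are introduced here only for illustration:

$$
\min_i x_i - \tau \log n \;\le\; -\tau \log \sum_{i=1}^{n} e^{-x_i/\tau} \;\le\; \min_i x_i
$$

The soft minimum therefore never exceeds the exact minimum, and the gap is at most $\tau \log n$, which vanishes as $\tau \to 0$. The mirrored soft maximum overestimates the exact maximum by the same amount, so it is not conservative in the same sense.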

Implications and Future Directions

Differentiable SpaTiaL closes the gap between formal spatio-temporal logic and differentiable robotics, providing an analytic, parallelizable foundation for geometric reasoning and symbolic learning. Its batch-native, autograd-compatible design is well positioned for integration with modern GPU-based pipelines, large-scale learning tasks, and reinforcement learning frameworks operating over high-dimensional geometric states.

Practically, the framework permits simultaneous, data-driven refinement of logical task specifications and geometric parameters, opening avenues for interpretable policy learning, end-to-end demonstration-based learning of symbolic programs, and specification-constrained deep planning. The tensorized architecture also makes Differentiable SpaTiaL a natural candidate for scaling to large numbers of interacting objects while keeping optimization tractable.

Theoretically, this approach suggests new directions for unifying symbolic and differentiable representations of geometric and temporal logic, potentially extending to richer object types (e.g., 3D meshes), more complex spatial calculi, and hybrid neuro-symbolic learning pipelines.

Conclusion

Differentiable SpaTiaL advances the integration of symbolic specification with continuous, gradient-based reasoning in robotic manipulation. By reformulating SpaTiaL predicates as smooth, tensorized computations, the approach enables backpropagation through spatial and temporal logic, supporting both trajectory optimization and data-driven specification parameter learning. Numerical results agree closely with a discrete geometry engine, and conservative smoothing properties, which hold for most predicates, support the soundness of the smoothed specifications. The autograd-compatible toolbox therefore offers a scalable, interpretable, and practical foundation for symbolic learning and reasoning in robotics and beyond (2604.02643).
