- The paper proposes a fully differentiable framework that uses smooth relaxations to convert discrete spatial predicates into autograd-compatible operations.
- It demonstrates robust trajectory optimization and specification parameter learning by enabling gradient propagation through both geometric and temporal logic.
- Experimental results validate high geometric accuracy and conservative smoothing guarantees, ensuring safety-critical performance in manipulation tasks.
Differentiable SpaTiaL: Symbolic Learning and Reasoning with Geometric Temporal Logic for Manipulation Tasks
Introduction and Motivation
Robotic manipulation in unstructured environments frequently demands satisfaction of intertwined geometric and temporal constraints. Traditional planning and learning pipelines struggle to bridge the gap between high-level, human-interpretable specifications, formalized using spatio-temporal logic (SpaTiaL), and low-level, gradient-based optimization, because conventional spatial logics rely on non-differentiable geometric operations. Prior work either focuses on differentiable temporal logics operating over low-dimensional robot states, neglecting explicit spatial relations, or relies on discrete geometry engines and collision checkers, leading to broken computational graphs that preclude gradient propagation.
Differentiable SpaTiaL introduces the first fully end-to-end differentiable, tensorized toolbox for spatio-temporal logic, enabling symbolic planning, learning, and reasoning for manipulation by formulating smooth relaxations of key spatial predicates on polygonal objects. The framework enables direct analytic gradient flow from logical specifications to object-level geometric states, unlocking scalable, gradient-based trajectory optimization and formal parameter learning over geometric relations.
Figure 1: Overview of Differentiable SpaTiaL, which replaces discrete geometry engines with a fully tensorized architecture for end-to-end trajectory optimization under formal spatio-temporal specifications.
The core innovation of Differentiable SpaTiaL lies in reformulating the spatial component of SpaTiaL. The approach analytically derives smooth relaxations for core geometric predicates (distance, intersection, penetration depth, containment, and directional relations) by replacing min/max and discrete Boolean operations with autograd-compatible tensor operations using LogSumExp (LSE) smoothing and soft spatial boundary sampling.
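As a minimal sketch of the LSE smoothing mentioned above, the snippet below implements a temperature-scaled soft maximum and soft minimum in plain Python, together with the closed-form gradient of the soft minimum (a softmax weighting). The function names, the temperature argument `tau`, and the sample values are illustrative assumptions, not the toolbox's API; in an autograd framework the gradient would be obtained automatically rather than written by hand.

```python
import math

def smooth_max(values, tau):
    """Temperature-scaled LogSumExp; never smaller than max(values)."""
    m = max(values)  # subtract the max for numerical stability
    return m + tau * math.log(sum(math.exp((v - m) / tau) for v in values))

def smooth_min(values, tau):
    """Soft minimum via negated LSE; never larger than min(values)."""
    return -smooth_max([-v for v in values], tau)

def smooth_min_grad(values, tau):
    """Gradient of smooth_min w.r.t. each input: softmax over -values / tau."""
    m = min(values)
    weights = [math.exp(-(v - m) / tau) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical per-axis overlaps; the hard minimum is 0.25.
overlaps = [0.30, 0.25, 0.90]
for tau in (0.1, 0.01):
    print(tau, smooth_min(overlaps, tau), smooth_min_grad(overlaps, tau))
```

Unlike a hard minimum, which passes gradient only to the single smallest input, the soft minimum spreads gradient across all near-minimal inputs, and the spread narrows as `tau` shrinks.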
Spatial predicates are evaluated directly on vertex tensors representing convex polygons, avoiding external discrete solvers. Key constructions include:
- Smooth Separating Axis Theorem (SAT) Penetration: Penetration depth between polygons is formulated as the smooth minimum of projection overlaps across separating axes, replacing discrete queries with analytically differentiable operators.
Figure 2: Differentiable penetration depth via Smooth SAT; exact discrete overlap is relaxed into a continuously differentiable repulsion field.
- Signed Distance Field via Boundary Sampling: For non-overlapping configurations, the minimum distance between polygons is computed using smooth, soft-minimum aggregation over sampled boundary points.
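The two constructions above can be sketched in a few lines of plain Python. This is an illustrative reading under stated assumptions, not the paper's implementation: the per-axis projection intervals are kept exact and only the minimum across axes is smoothed, boundary points are sampled uniformly per edge, and all function names and default parameters are hypothetical.

```python
import math

def soft_min(values, tau):
    """Soft minimum via LogSumExp; a lower bound on min(values)."""
    m = min(values)
    return m - tau * math.log(sum(math.exp(-(v - m) / tau) for v in values))

def edges(poly):
    """Yield consecutive vertex pairs of a closed polygon."""
    return [(poly[i], poly[(i + 1) % len(poly)]) for i in range(len(poly))]

def edge_normals(poly):
    """Unit normals of all edges; these are the candidate separating axes."""
    normals = []
    for (x0, y0), (x1, y1) in edges(poly):
        ex, ey = x1 - x0, y1 - y0
        length = math.hypot(ex, ey)
        normals.append((ey / length, -ex / length))
    return normals

def project(poly, axis):
    """Interval covered by the polygon when projected onto the axis."""
    dots = [x * axis[0] + y * axis[1] for x, y in poly]
    return min(dots), max(dots)

def smooth_sat_penetration(poly_a, poly_b, tau=0.01):
    """Soft minimum of projection overlaps; negative when the polygons are separated."""
    overlaps = []
    for axis in edge_normals(poly_a) + edge_normals(poly_b):
        a_lo, a_hi = project(poly_a, axis)
        b_lo, b_hi = project(poly_b, axis)
        overlaps.append(min(a_hi, b_hi) - max(a_lo, b_lo))
    return soft_min(overlaps, tau)

def sample_boundary(poly, per_edge):
    """Uniformly spaced points along every edge of the polygon."""
    return [(x0 + k / per_edge * (x1 - x0), y0 + k / per_edge * (y1 - y0))
            for (x0, y0), (x1, y1) in edges(poly) for k in range(per_edge)]

def soft_min_distance(poly_a, poly_b, per_edge=20, tau=0.01):
    """Soft minimum over all pairwise distances between sampled boundary points."""
    pts_a = sample_boundary(poly_a, per_edge)
    pts_b = sample_boundary(poly_b, per_edge)
    return soft_min([math.hypot(ax - bx, ay - by)
                     for ax, ay in pts_a for bx, by in pts_b], tau)

def square(x_offset):
    return [(x_offset, 0.0), (x_offset + 1.0, 0.0),
            (x_offset + 1.0, 1.0), (x_offset, 1.0)]

print(smooth_sat_penetration(square(0.0), square(0.75)))  # exact depth is 0.25
print(soft_min_distance(square(0.0), square(2.0)))        # exact distance is 1.0
```

Two caveats apply to this sketch. Duplicate axes (parallel edges) enter the soft minimum several times and bias it slightly downward. Sampling overestimates the true distance while the soft minimum underestimates the sampled one, so the combined distance estimate is not conservative by construction.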
Figure 3: Geometric interpretation of compositional predicates such as EnclIn and leftOf; all spatial relations are reduced to differentiable tensor operations.
This smooth geometric foundation is then composed with temporal logic operators, also smoothed using LSE approximations, to produce a fully differentiable robustness metric for any spatio-temporal specification φ evaluated over a trajectory ξ.
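To make the composition concrete, here is a small sketch of a smoothed robustness evaluation for a toy specification: always stay far from an obstacle and eventually get close to a goal. The predicates act on points rather than polygons to keep the example short, and every name, margin, and the specification itself are illustrative assumptions. The sketch uses plain LSE for both operators; since LSE upper-bounds the maximum, how the framework keeps max-type operators conservative is not something this example addresses.

```python
import math

def smooth_min(values, tau):
    m = min(values)
    return m - tau * math.log(sum(math.exp(-(v - m) / tau) for v in values))

def smooth_max(values, tau):
    m = max(values)
    return m + tau * math.log(sum(math.exp((v - m) / tau) for v in values))

def always(rho, tau):
    """'Globally': the predicate should hold at every time step."""
    return smooth_min(rho, tau)

def eventually(rho, tau):
    """'Finally': the predicate should hold at some time step."""
    return smooth_max(rho, tau)

def far_from(p, q, margin):
    """Positive when p is more than `margin` away from q."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) - margin

def close_to(p, q, radius):
    """Positive when p is within `radius` of q."""
    return radius - math.hypot(p[0] - q[0], p[1] - q[1])

def robustness(traj, obstacle, goal, tau=0.05):
    """Smoothed robustness of: always farFrom(obstacle) AND eventually closeTo(goal)."""
    safe = always([far_from(p, obstacle, 0.5) for p in traj], tau)
    reach = eventually([close_to(p, goal, 0.2) for p in traj], tau)
    return smooth_min([safe, reach], tau)  # conjunction as a soft minimum

# Straight-line trajectory from (0, 0) to (2, 0).
traj = [(0.2 * k, 0.0) for k in range(11)]
goal = (2.0, 0.0)
print(robustness(traj, (1.0, 1.0), goal))  # obstacle off the path
print(robustness(traj, (1.0, 0.0), goal))  # obstacle on the path
```

A positive value indicates that the smoothed specification is satisfied; a negative value indicates a violation, and its magnitude reflects how severe the violation is.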
Applications: Trajectory Optimization and Learning
The differentiable robustness metric makes it possible to backpropagate gradients through both temporal structure and spatial semantics in a unified manner. Two primary capabilities are demonstrated:
Spatio-Temporal Trajectory Optimization
Trajectory synthesis under formal specifications becomes a continuous optimization problem: maximizing the (smoothed) robustness of the specification with respect to trajectory states, under possible system constraints and dynamics. Gradients propagated through differentiable spatial semantics provide dense feedback in both penetrative and separated regimes, overcoming the local minima and stagnation encountered with discrete collision signals.
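A stripped-down sketch of this optimization loop is shown below, assuming a single point robot, one circular keep-out region, fixed start and end waypoints, and no dynamics or smoothness terms. The gradient of the soft-minimum robustness is written out by hand (softmax weight times the unit vector pointing away from the obstacle), which is what an autograd framework would compute; all constants and names are hypothetical.

```python
import math

OBSTACLE, MARGIN, TAU = (1.0, 0.1), 0.4, 0.05  # illustrative values

def robustness_and_grad(traj):
    """Soft-min robustness of 'always farFrom(obstacle)' and its gradient per waypoint."""
    dists = [math.hypot(p[0] - OBSTACLE[0], p[1] - OBSTACLE[1]) for p in traj]
    rho = [d - MARGIN for d in dists]
    m = min(rho)
    weights = [math.exp(-(r - m) / TAU) for r in rho]
    total = sum(weights)
    value = m - TAU * math.log(total)
    # Chain rule: softmax weight times the unit vector away from the obstacle.
    grad = [(w / total * (p[0] - OBSTACLE[0]) / d,
             w / total * (p[1] - OBSTACLE[1]) / d)
            for p, w, d in zip(traj, weights, dists)]
    return value, grad

def optimize(traj, steps=200, lr=0.05):
    """Gradient ascent on robustness; the first and last waypoints stay fixed."""
    traj = [tuple(p) for p in traj]
    for _ in range(steps):
        _, grad = robustness_and_grad(traj)
        interior = [(p[0] + lr * g[0], p[1] + lr * g[1])
                    for p, g in zip(traj[1:-1], grad[1:-1])]
        traj = [traj[0]] + interior + [traj[-1]]
    return traj

initial = [(2.0 * k / 7, 0.0) for k in range(8)]  # straight line through the obstacle
final = optimize(initial)
print("initial robustness:", robustness_and_grad(initial)[0])
print("final robustness:", robustness_and_grad(final)[0])
```

Because the soft minimum assigns nonzero weight to every waypoint, waypoints near the obstacle are pushed outward together rather than one at a time, which is the dense feedback the paragraph above refers to.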



Figure 4: Spatio-temporal trajectory optimization; the initial trajectory violates the specification, but through iterative optimization, the final trajectory satisfies geometric and temporal requirements, as quantified by robustness analysis.
Empirically, the optimization process exhibits distinct phases: collision resolution, constraint shaping, and task completion, with robustness improving monotonically as shown by the plotted metrics.
Specification Parameter Learning
Thanks to full differentiability, the system supports joint learning of spatial logic parameters directly from demonstration data via gradient descent. For a given logical structure, learnable parameters (e.g., safety margins in predicates such as farFrom and leftOf) are optimized to fit all examples, simultaneously ensuring the specification is satisfied and the inferred margins are tight.
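The idea can be illustrated with a one-parameter sketch: learning the margin d of a farFrom predicate from the minimum separation observed in each demonstration. The loss used here (squared soft-minimum robustness, which drives the worst-case robustness to zero) is an assumption chosen for brevity, and the demonstration values, learning rate, and function names are hypothetical.

```python
import math

def smooth_min(values, tau):
    m = min(values)
    return m - tau * math.log(sum(math.exp(-(v - m) / tau) for v in values))

def learn_margin(demo_distances, tau=0.02, lr=0.1, steps=300, d0=0.0):
    """Gradient descent on r(d)^2, where r(d) = soft-min over demos of (distance - d)."""
    d = d0
    for _ in range(steps):
        r = smooth_min([dist - d for dist in demo_distances], tau)
        grad = -2.0 * r  # d/dd of r^2, since dr/dd = -1 (softmax weights sum to 1)
        d -= lr * grad
    return d

# Hypothetical minimum object separations, one per demonstration.
demos = [0.62, 0.48, 0.55, 0.71]
margin = learn_margin(demos)
print("learned margin:", margin)
print("all demos satisfied:", all(dist - margin >= 0 for dist in demos))
```

Since the soft minimum lower-bounds the true minimum, the learned margin settles just below the smallest observed separation, so every demonstration satisfies the predicate while the margin stays close to the largest feasible value.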


Figure 5: Learning spatial specification parameters from demonstrations through robustness backpropagation, visualized in oblique and planar projections.
This end-to-end framework yields interpretable specifications that every demonstration satisfies, and also quantitatively estimates the largest feasible margins.
Numerical and Empirical Validation
Geometric Accuracy
Quantitative agreement with classical geometry engines (such as Shapely) is evaluated over randomized convex polygon pairs, measuring accuracy of differentiable unsigned/signed distance, penetration depth, and containment predicates.
Figure 6: Evaluation of geometric accuracy of differentiable predicates against the Shapely engine across randomized geometric configurations.
The results demonstrate strong empirical fidelity, with numerical errors tunable via boundary sampling density and smoothing temperature τ; smaller τ and denser sampling yield higher accuracy at the expense of gradient smoothness.
Smoothing and Boundary Tradeoffs
A key hyperparameter is the smoothing temperature τ. Its influence on the precision of spatial predicate approximations and gradient informativeness is systematically explored.
Figure 7: Effect of the smoothing temperature τ and boundary sampling density on the accuracy of signed distance and corresponding gradients.
As τ → 0 and sampling density increases, the differentiable relaxations asymptotically approach the exact discrete values, although with potentially reduced gradient smoothness.
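The two error sources can be separated in a small numerical experiment. The sketch below measures, for a point and a line segment, the sampling error (hard minimum over sampled points minus the exact distance) and the smoothing gap (hard minimum minus soft minimum) for several sampling densities and temperatures. The geometry, the parameter grid, and the function names are illustrative assumptions and do not reproduce the paper's evaluation.

```python
import math

def soft_min(values, tau):
    m = min(values)
    return m - tau * math.log(sum(math.exp(-(v - m) / tau) for v in values))

def sample_segment(p0, p1, n):
    """n + 1 evenly spaced points on the segment from p0 to p1."""
    return [(p0[0] + k / n * (p1[0] - p0[0]), p0[1] + k / n * (p1[1] - p0[1]))
            for k in range(n + 1)]

def errors(query, p0, p1, exact, n, tau):
    """Return (sampling error, smoothing gap) for n intervals and temperature tau."""
    dists = [math.hypot(query[0] - x, query[1] - y)
             for x, y in sample_segment(p0, p1, n)]
    sampling_error = min(dists) - exact               # from discretizing the boundary
    smoothing_gap = min(dists) - soft_min(dists, tau)  # from the LSE relaxation
    return sampling_error, smoothing_gap

# Query point 0.2 above the segment from (0, 0) to (1, 0); exact distance is 0.2.
QUERY, P0, P1, EXACT = (0.33, 0.2), (0.0, 0.0), (1.0, 0.0), 0.2
for n in (4, 16, 64):
    for tau in (0.05, 0.001):
        print(n, tau, errors(QUERY, P0, P1, EXACT, n, tau))
```

The two errors have opposite signs relative to the exact distance, so they can partially cancel; reporting them separately avoids mistaking that cancellation for accuracy.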
Theoretical Guarantees
The framework rests on conservative smoothing properties: for most predicates, the smoothed robustness never overestimates the exact value, which preserves soundness in safety-critical contexts. In particular, the error bound due to smoothing vanishes as the smoothing temperature τ and the boundary sampling spacing tend to zero. Directional and orientation predicates are shown to be exact under the smooth relaxations used.
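For intuition, the standard LogSumExp sandwich bounds show why a soft minimum is conservative. These are general properties of LSE over n values with temperature τ > 0, stated here as background; the paper's own bounds, which also account for boundary sampling, are not reproduced.

```latex
\[
  \max_i x_i \;\le\; \tau \log \sum_{i=1}^{n} e^{x_i/\tau} \;\le\; \max_i x_i + \tau \log n ,
\]
\[
  \min_i x_i - \tau \log n \;\le\; -\tau \log \sum_{i=1}^{n} e^{-x_i/\tau} \;\le\; \min_i x_i .
\]
```

The second line says a soft minimum never exceeds the true minimum and falls short of it by at most τ log n, so the gap closes as τ → 0. The first line shows that a plain LSE soft maximum errs in the opposite direction.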
Implications and Future Directions
Differentiable SpaTiaL closes the gap between formal spatio-temporal logic and differentiable robotics, providing an analytic, parallelizable foundation for geometric reasoning and symbolic learning. Its batch-native, autograd-compatible design is well positioned for integration with modern GPU-based pipelines, large-scale learning tasks, and reinforcement learning frameworks operating over high-dimensional geometric states.
Practically, the framework permits simultaneous, data-driven refinement of logical task specifications and geometric parameters, unlocking new avenues for interpretable policy learning, end-to-end demonstration-based learning of symbolic programs, and specification-constrained deep planning. The tensorized architecture also makes Differentiable SpaTiaL a natural substrate for scaling to large numbers of interacting objects while maintaining tractable optimization.
Theoretically, this approach suggests new directions for unifying symbolic and differentiable representations of geometric and temporal logic, potentially extending to richer object types (e.g., 3D meshes), more complex spatial calculi, and hybrid neuro-symbolic learning pipelines.
Conclusion
Differentiable SpaTiaL advances the integration of symbolic specification with continuous, gradient-based reasoning in robotic manipulation. By reformulating SpaTiaL predicates as smooth, tensorized computations, the approach enables reliable backpropagation through spatial and temporal logic, supporting both robust trajectory optimization and data-driven specification parameter learning. Numerical accuracy is competitive with discrete geometry engines, and conservative smoothing properties guarantee specification soundness. The autograd-compatible toolbox therefore establishes a scalable, interpretable, and practical foundation for symbolic learning and reasoning in robotics and beyond (2604.02643).