Topology-Constrained 2D-DTW Algorithm

Updated 6 December 2025
  • The paper introduces a topology-constrained 2D-DTW that preserves grid integrity by enforcing a monotonic column mapping via dynamic programming.
  • It employs column-wise 1D-DTW to compute dissimilarity measures, yielding robust correspondences despite perspective distortions and occlusions.
  • Experimental validation shows improved matching accuracy, lower 3D reconstruction error (≈1 mm), and real-time performance on mobile platforms.

Topology-constrained two-dimensional dynamic time warping (2D-DTW) is an algorithmic framework designed for robust matching between a structured, ideal 2D grid and its observed, spatially deformed version—particularly under nontrivial conditions encountered during structured-light terrain sensing. Distinguished by a global monotonic consistency constraint, 2D-DTW preserves the topological integrity of the grid while aligning columns using dynamic programming. The methodology yields accurate correspondences even with perspective distortion and partial occlusion, enabling resource-efficient 3D reconstruction from smartphone-based projection systems (Nobuaki, 29 Nov 2025).

1. Formal Structure and Problem Statement

The central objective is to align two discrete surfaces: $A\in\mathbb{R}^{p\times q}$, representing the ideal grid (e.g., projected by a smartphone), and $B\in\mathbb{R}^{r\times s}$, representing the detected, possibly distorted, grid captured in the camera image. Each column $i$ of $A$ is profiled as $A_i=[A_{1,i},\dots,A_{p,i}]^T$, and each column $j$ of $B$ as $B_j=[B_{1,j},\dots,B_{r,j}]^T$. The mapping seeks to assign columns of $A$ to those of $B$ such that triangulation from these correspondences maintains rectilinear grid connectivity and yields consistent 3D geometry.

This column-centric formulation is justified by the axis-aligned nature of the projected grid (with the UI “north-up”). The procedure thus emphasizes warping along the column dimension within a monotonic mapping framework, preventing nonphysical foldovers or crossings.

2. Cost Function and Dynamic Programming Recurrence

The alignment process comprises two main stages:

Step 1: Column-wise 1D-DTW Computation

Each pair $(i,j)$ is compared via 1D-DTW between the column profiles $C^A_i$ and $C^B_j$ (that is, the columns $A_i$ and $B_j$ defined above), yielding a dissimilarity measure:

$$d(C^A_i, C^B_j) = \min_w \sum_{(k,\ell)\in w} \|A_i[k] - B_j[\ell]\|^2$$

where $w$ denotes a valid warping path subject to boundary, monotonicity, and step-size constraints. These pairwise distances populate the matrix $D\in\mathbb{R}^{q\times s}$.
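Step 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's implementation: it assumes scalar grid entries, the textbook step set (advance in $k$, in $\ell$, or in both), and no windowing. The names `dtw_1d` and `column_distance_matrix` are hypothetical.

```python
import numpy as np

def dtw_1d(a, b):
    """Textbook 1D-DTW cost between sequences a and b (squared differences)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    p, r = len(a), len(b)
    # acc[k, l] = cheapest cost of aligning a[:k] with b[:l]; the extra
    # row and column of +inf enforce the boundary condition.
    acc = np.full((p + 1, r + 1), np.inf)
    acc[0, 0] = 0.0
    for k in range(1, p + 1):
        for l in range(1, r + 1):
            cost = (a[k - 1] - b[l - 1]) ** 2
            acc[k, l] = cost + min(acc[k - 1, l], acc[k, l - 1], acc[k - 1, l - 1])
    return acc[p, r]

def column_distance_matrix(A, B):
    """D[i, j] = DTW cost between column i of A (p x q) and column j of B (r x s)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    q, s = A.shape[1], B.shape[1]
    D = np.empty((q, s))
    for i in range(q):
        for j in range(s):
            D[i, j] = dtw_1d(A[:, i], B[:, j])
    return D
```

Because the warping absorbs repeated or stretched samples, a column of $B$ that is a resampled copy of a column of $A$ receives zero cost even when $p \neq r$.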

Step 2: Extraction of a Globally Consistent Path

Dynamic programming is applied to $D$, accumulating costs in $F\in\mathbb{R}^{q\times s}$:

$$F_{i,j} = D_{i,j} + \min\{F_{i-1,j}, F_{i,j-1}, F_{i-1,j-1}\}$$

Initialization:

$$F_{1,1} = D_{1,1}, \quad F_{i,1}=D_{i,1}+F_{i-1,1}, \quad F_{1,j}=D_{1,j}+F_{1,j-1}$$

The optimal correspondence path $w^*$ is recovered by tracing the minimum-cost path from the last row $F_{q,:}$ backward to the start, subject to allowed moves $(1,0)$, $(0,1)$, $(1,1)$ in the $(i,j)$ space.
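The recurrence and backtracking can be sketched as follows, using 0-based indices. Two points are assumptions rather than details from the source: the path is taken to end at the cheapest cell of the last row of $F$, and ties are broken in favor of the diagonal move. The function names `accumulate` and `backtrack` are hypothetical.

```python
import numpy as np

def accumulate(D):
    """Fill F with the recurrence F[i,j] = D[i,j] + min(up, left, diagonal)."""
    D = np.asarray(D, dtype=float)
    q, s = D.shape
    F = np.empty_like(D)
    F[0, 0] = D[0, 0]
    for i in range(1, q):              # first column: only "down" moves
        F[i, 0] = D[i, 0] + F[i - 1, 0]
    for j in range(1, s):              # first row: only "right" moves
        F[0, j] = D[0, j] + F[0, j - 1]
    for i in range(1, q):
        for j in range(1, s):
            F[i, j] = D[i, j] + min(F[i - 1, j], F[i, j - 1], F[i - 1, j - 1])
    return F

def backtrack(F):
    """Trace the path from the cheapest cell of the last row back to (0, 0)."""
    q, s = F.shape
    i, j = q - 1, int(np.argmin(F[q - 1]))   # assumed end point
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((F[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((F[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((F[i, j - 1], i, j - 1))
        _, i, j = min(candidates)            # cheapest predecessor
        path.append((i, j))
    path.reverse()
    return path
```

For a matrix $D$ with zeros on the diagonal and positive entries elsewhere, the recovered path is the diagonal itself, which corresponds to the identity column mapping.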

3. Enforcement of Topological Consistency

The dynamic programming constraints, which permit only right, down, or down-right steps, ensure monotonic progression in both the display grid index $i$ and the observed grid index $j$. The resulting mapping is order-preserving, though not strictly one-to-one, because pure right or down steps let one column pair with several adjacent columns. It maintains grid connectivity without additional penalty functions, since violations such as crossings or foldbacks are infeasible by construction. A plausible implication is that this preserves the rectangular structure essential for structured-light triangulation.
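The feasibility argument can be made concrete with a small check. The helper below, a hypothetical `is_monotone_path`, accepts a path only if every consecutive pair of cells differs by one of the three allowed steps; any path that crosses or folds back must contain a step outside that set.

```python
def is_monotone_path(path):
    """True if every step along the path is (1,0), (0,1), or (1,1)."""
    allowed = {(1, 0), (0, 1), (1, 1)}
    return all(
        (i2 - i1, j2 - j1) in allowed
        for (i1, j1), (i2, j2) in zip(path, path[1:])
    )
```

For example, `[(0, 0), (1, 1), (1, 2), (2, 2)]` passes, while `[(0, 0), (1, 2), (2, 1)]` fails because the second step decreases $j$, which would make two correspondences cross.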

4. Robustness to Perspective Distortion and Occlusion

Perspective distortion introduces nonuniform row spacing among grid intersections, while partial occlusion can remove detected intersections entirely. The column-profile 1D-DTW calculation is robust to such nonuniformity: warping along the row indices matches salient structural features regardless of scale variation. In cases of occlusion or missing data, DTW “skips” indices (utilizing allowed step transitions), which inflates local costs but requires no custom penalties. The global path extraction then avoids high-cost pairings, naturally circumventing severely occluded regions.

5. Computational Complexity and Resource Efficiency

The runtime for computing the $D$ matrix scales as $O(qs\cdot pr)$, which, for $p\approx r\approx q\approx s\approx N$, yields $O(N^4)$ overall. Dynamic programming over $F$ introduces an additional $O(N^2)$ cost. Memory requirements are bounded by storage of $D$ and $F$ ($O(N^2)$ each); no 4D DP array is constructed, maintaining practical memory consumption for $N$ in the 20–30 range typical for smartphone grids.

A lightweight greedy alternative reduces complexity further by tracing local minima paths in $D$, but this sacrifices global alignment optimality.
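To give a sense of scale, the cell-update counts behind these bounds can be tallied directly. The helper `dtw_cell_updates` is hypothetical, and it counts table updates only; it says nothing about wall-clock time on a given device.

```python
def dtw_cell_updates(p, q, r, s):
    """Return (pairwise DTW cell updates, global DP cell updates)."""
    pairwise = q * s * p * r   # one p-by-r DTW table per column pair (i, j)
    global_dp = q * s          # a single pass over the q-by-s matrix F
    return pairwise, global_dp
```

For $N = 25$ in every dimension, this gives 390,625 updates for the pairwise stage against 625 for the global pass, so the cost of building $D$ dominates.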
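One possible reading of the greedy alternative is sketched below. The source does not specify the rule, so the details are assumptions: the walk starts at the first cell, repeatedly moves to the cheapest of the down, right, and down-right neighbors in $D$, and stops on reaching the last row. The name `greedy_path` is hypothetical.

```python
import numpy as np

def greedy_path(D):
    """Follow locally cheapest allowed moves through D (no global optimality)."""
    D = np.asarray(D, dtype=float)
    q, s = D.shape
    i, j = 0, 0
    path = [(i, j)]
    while i < q - 1:
        moves = [(D[i + 1, j], i + 1, j)]                    # down
        if j < s - 1:
            moves.append((D[i + 1, j + 1], i + 1, j + 1))    # down-right
            moves.append((D[i, j + 1], i, j + 1))            # right
        _, i, j = min(moves)
        path.append((i, j))
    return path
```

The walk never fills $F$, but a single misleading local minimum can pull it off the globally cheapest path, which is the optimality trade-off noted above.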

6. Algorithmic Workflow

The following outlines the full procedure:

  1. Capture camera image and detect grid intersections via LoG filtering, skeletonization, and intersection detection.
  2. Organize detected intersections into column profiles $B_j$; profiles $A_i$ are predetermined by grid geometry.
  3. Compute $d(C^A_i, C^B_j)$ for each $(i,j)$, populating $D$.
  4. Execute dynamic programming over $D$ to fill $F$ and backtrack for the optimal correspondence path $w^*$.
  5. Derive a continuous column mapping $a(i)$, defined as the mean of the $j$-values paired with column $i$ along $w^*$.
  6. Form correspondences for each grid intersection $(i,k)$ to its column-matched observed counterpart.
  7. Apply triangulation to matched pairs to recover the 3D ground-plane position.
  7. Apply triangulation to matched pairs to recover the 3D ground-plane position.
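Step 5 of the workflow can be illustrated with a short sketch. It assumes the path is a list of 0-based `(i, j)` pairs; the name `column_mapping` is hypothetical.

```python
from collections import defaultdict

def column_mapping(path):
    """Map each ideal column i to the mean of the observed columns j paired with it."""
    js = defaultdict(list)
    for i, j in path:
        js[i].append(j)
    return {i: sum(v) / len(v) for i, v in sorted(js.items())}
```

For the path `[(0, 0), (1, 1), (1, 2), (2, 3)]`, ideal column 1 is paired with observed columns 1 and 2, so its mapped position is 1.5; this is what makes the mapping continuous rather than integer-valued.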

7. Experimental Performance Validation

Evaluation across three terrain types—high-texture random-dot, medium-texture tile/wood, and low-texture vinyl—demonstrated the following:

  • Superior intersection matching success rates on low-texture floors compared to ORB+RANSAC stereo and nearest-neighbor matching.
  • Lower height-reconstruction RMSE (≈1 mm) versus feature-based triangulation (≈3 mm) for medium/low-texture scenes.
  • Real-time inference (~50 ms/frame) on Android hardware, in contrast to multi-second runtimes for exhaustive non-topological 2D-DTW and bundle adjustment.

These results confirm that exploiting grid topology and enforcing monotonic constraints in 2D-DTW delivers robustness to perspective distortion and occlusion while retaining computational efficiency suitable for mobile, resource-constrained platforms (Nobuaki, 29 Nov 2025).
