Glyph2Cloud: Affine-Invariant Gesture Recognition

Updated 6 December 2025

The Glyph2Cloud module transforms user-drawn glyphs into fixed-length point clouds for efficient, robust gesture recognition.
It employs the Squiggle algorithm with triangle-based affine alignment to ensure invariance to rotation, scale, skew, and reflection.
The module processes glyphs through steps like path regularization, affine mapping, and error metric evaluation to deliver high accuracy and sub-millisecond performance.

The Glyph2Cloud module provides a systematic approach for transforming user-drawn glyphs into regularized, fixed-length point clouds suitable for robust, affine-invariant gesture recognition. Grounded in the Squiggle algorithm, it supports recognition invariant to rotation, scale, skew, and reflection, and is optimized for real-time feedback with sub-millisecond latency on modern hardware. The key pipeline comprises milestone point extraction, triangle-based affine alignment, candidate selection, affine mapping, error metric computation, and a rigorous recognition loop. Affine transformations for template matching, robust filtering of degenerate cases, and reflection symmetries are precisely handled to enable high accuracy and efficient computation in gesture-based interfaces (Lee, 2011).

1. Milestone Point Cloud Construction

Raw input glyphs, typically sequences of pen or touch points $P = [p_0, p_1, ..., p_k]$ , are first regularized to remove positional jitter and enforce consistent inter-sample distance. The procedure involves:

Path Regularization: Generate a new polyline $R$ from $P$ whose points are spaced approximately $\delta$ pixels apart, interpolating a new point whenever the accumulated segment length exceeds $\delta$. This is a stepwise traversal with point insertion at length thresholds, referred to as path_regularize($P$, $\delta$).
Down-sampling to Milestone Points: From the regularized path $R$, down-sample to exactly $n$ milestone points (typically $n=16$) with uniform arc-length spacing. Sample locations are the arc-length fractions $i/(n-1)$ for $i = 0, \dots, n-1$, with linear interpolation between neighboring points of $R$; this step is referred to as path_interpolate($R$, $n$).
Cloud Representation: The resulting $g = [g_0, g_1, ..., g_{n-1}],\; g_i \in \mathbb{R}^2$ , encodes the glyph for further processing.
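
The two steps above can be sketched as follows. The function names come from the text, but the bodies are an illustrative reconstruction rather than the original implementation; in particular, the handling of the final endpoint and the requirement that the path contain at least two distinct points are assumptions of this sketch.

```python
import math

def path_regularize(points, delta):
    """Resample a polyline so emitted points are about `delta` apart along the path."""
    out = [points[0]]
    prev = points[0]
    acc = 0.0  # arc length travelled since the last emitted point
    for cur in points[1:]:
        seg = math.dist(prev, cur)
        # Emit interpolated points while this segment crosses a delta threshold.
        while seg > 0 and acc + seg >= delta:
            t = (delta - acc) / seg
            prev = (prev[0] + t * (cur[0] - prev[0]),
                    prev[1] + t * (cur[1] - prev[1]))
            out.append(prev)
            seg = math.dist(prev, cur)
            acc = 0.0
        acc += seg
        prev = cur
    return out

def path_interpolate(path, n):
    """Pick n points at arc-length fractions i/(n-1) along `path` (len(path) >= 2)."""
    cum = [0.0]  # cumulative arc length at each vertex
    for p, q in zip(path, path[1:]):
        cum.append(cum[-1] + math.dist(p, q))
    total = cum[-1]
    out, j = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while j < len(path) - 2 and cum[j + 1] < target:
            j += 1
        seg = cum[j + 1] - cum[j]
        t = 0.0 if seg == 0 else min(max((target - cum[j]) / seg, 0.0), 1.0)
        p, q = path[j], path[j + 1]
        out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out
```

A glyph cloud would then be obtained as `g = path_interpolate(path_regularize(P, delta), 16)`, where the value of `delta` is left to the caller because the text does not fix it.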

2. Triangle-Based Affine Alignment Framework

Affine alignment between input and template glyphs leverages triangles defined by ordered indices:

Path Length Calculation: For any cloud $p$ , total path length $\lambda(p) = \sum_{i} \|p_{i+1} - p_i\|$ .
Triangle Edge Matrix: For each index triplet $0 \leq a < b < c < n$ , form $M(p)_{abc} = [p_b - p_a,\; p_c - p_a]$ , a $2 \times 2$ edge matrix.
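Normalized Determinant Matrix: Each triangle's signed, normalized area $D(p)_{abc}$, defined as

$D(p)_{abc} = \frac{4\, \det\left(\begin{bmatrix} p_b.x - p_a.x & p_c.x - p_a.x \\ p_b.y - p_a.y & p_c.y - p_a.y \end{bmatrix}\right)}{\lambda(p)^2} = \frac{4\, \det M(p)_{abc}}{\lambda(p)^2}$

encodes scale-invariant geometric features, since the determinant and $\lambda(p)^2$ both scale with the square of the glyph size. In what follows, $G_{abc}$ and $H_{abc}$ denote $D(g)_{abc}$ for the input glyph $g$ and $D(h)_{abc}$ for a template $h$.

Degeneracy and Glyph Dimensionality: $\max_{abc} |G_{abc}| < \varepsilon_1$ (with $\varepsilon_1 \approx 0.004$) designates the glyph as essentially 1-D (a "line glyph"); larger values indicate 2-D structure.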
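
As a minimal sketch of these definitions, the snippet below computes the path length, the normalized determinant for every index triplet, and the 1-D test. The function names and the dictionary layout are hypothetical choices made for illustration, and a nonzero path length is assumed.

```python
import math
from itertools import combinations

def path_length(p):
    """Total polyline length: sum of consecutive point distances."""
    return sum(math.dist(a, b) for a, b in zip(p, p[1:]))

def normalized_dets(p):
    """Map each triplet (a, b, c), a < b < c, to 4 * det(M) / lambda^2."""
    lam2 = path_length(p) ** 2
    dets = {}
    for a, b, c in combinations(range(len(p)), 3):
        ux, uy = p[b][0] - p[a][0], p[b][1] - p[a][1]  # edge p_b - p_a
        vx, vy = p[c][0] - p[a][0], p[c][1] - p[a][1]  # edge p_c - p_a
        dets[(a, b, c)] = 4.0 * (ux * vy - vx * uy) / lam2
    return dets

def is_line_glyph(dets, eps1=0.004):
    """A glyph is treated as 1-D when no triangle has appreciable area."""
    return max(abs(d) for d in dets.values()) < eps1
```

For the right-angle path `[(0, 0), (1, 0), (1, 1)]` the path length is 2 and the determinant is 1, so the single normalized value is 4 · 1 / 4 = 1; scaling the path by any factor leaves that value unchanged.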

3. Candidate Triangle Selection and Robust Alignment

To optimize alignment quality and computational efficiency:

Triangle Robustness Filtering: Among all possible index triplets, select the $m$ triangles (commonly $m=10$) with the highest $|G_{abc}|$ values. Large normalized areas make the alignment robust and keep it away from degenerate triangles. A pivot-based selection finds the top $m$ without fully sorting all triplets.
Candidate Alignment Generation: Each chosen triangle $[a,b,c]$ serves as a frame for affine transformation estimation between point clouds.
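
The selection step can be illustrated with a heap-based partial selection. This is a stand-in for the pivot-based selection the text mentions, not that algorithm itself; both avoid a full sort. The function name and the dictionary input (triplet to normalized determinant) are assumptions of this sketch.

```python
import heapq

def select_triangles(dets, m=10):
    """Return the m index triplets with the largest |normalized determinant|.

    `dets` maps (a, b, c) to a signed normalized determinant.
    heapq.nlargest keeps only m entries in its heap, so the full
    list of triplets is never sorted.
    """
    return heapq.nlargest(m, dets, key=lambda abc: abs(dets[abc]))
```

The result is ordered from largest to smallest magnitude, so the most robust triangle is tried first.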

4. Affine Map Construction and Metric Evaluation

Affine transformations are constructed from triangle correspondences:

Transformation Matrix Construction: Using homogeneous coordinates, the 3×3 matrix for points $[a,b,c]$ in cloud $p$ is:

$\hat{p}_{abc} = \begin{bmatrix} p_b.x - p_a.x & p_c.x - p_a.x & p_a.x \\ p_b.y - p_a.y & p_c.y - p_a.y & p_a.y \\ 0 & 0 & 1 \end{bmatrix}$

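Map Application: For input $g$ and template $h$, and triangle $[a,b,c]$, the map that carries the template's triangle frame onto the input's is

$T_{abc} = \hat{g}_{abc} \cdot (\hat{h}_{abc})^{-1}$

Then, each template point $h_i$ is projected as $r_i = T_{abc} \cdot [h_i.x;\; h_i.y;\; 1]$: the factor $(\hat{h}_{abc})^{-1}$ first expresses $h_i$ in the template triangle's coordinates, and $\hat{g}_{abc}$ then places it in the input's frame. In particular, $r_a = g_a$, $r_b = g_b$, and $r_c = g_c$.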

Error Metric: The sum-of-squared-error (without square-root) is computed:

$\text{metric}(g, r) = \sum_{i=0}^{n-1} \left[(g_i.x - r_i.x)^2 + (g_i.y - r_i.y)^2\right]$

Omitting the square root preserves the ranking of candidates while saving computation.
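
A minimal sketch of the projection and metric follows. Instead of forming 3×3 matrices, it expresses each template point in the template triangle's frame (a 2×2 solve by Cramer's rule) and re-expresses it in the input triangle's frame, which is what composing the two triangle matrices amounts to. Function names are hypothetical.

```python
def project_template(g, h, tri):
    """Project every template point of h into the frame of g's triangle."""
    a, b, c = tri
    gax, gay = g[a]
    hax, hay = h[a]
    gu = (g[b][0] - gax, g[b][1] - gay)  # input edge g_b - g_a
    gv = (g[c][0] - gax, g[c][1] - gay)  # input edge g_c - g_a
    hu = (h[b][0] - hax, h[b][1] - hay)  # template edge h_b - h_a
    hv = (h[c][0] - hax, h[c][1] - hay)  # template edge h_c - h_a
    det = hu[0] * hv[1] - hv[0] * hu[1]
    if det == 0:
        raise ValueError("degenerate template triangle")
    r = []
    for x, y in h:
        dx, dy = x - hax, y - hay
        # Frame coordinates (u, v) with d = u * hu + v * hv.
        u = (dx * hv[1] - hv[0] * dy) / det
        v = (hu[0] * dy - dx * hu[1]) / det
        r.append((gax + u * gu[0] + v * gv[0],
                  gay + u * gu[1] + v * gv[1]))
    return r

def metric(g, r):
    """Sum of squared point-to-point distances (no square root)."""
    return sum((gi[0] - ri[0]) ** 2 + (gi[1] - ri[1]) ** 2
               for gi, ri in zip(g, r))
```

When the input is an exact affine image of the template, the projected cloud coincides with the input and the metric is zero up to rounding.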

5. Recognition Pipeline and Invariance Properties

The recognition process iterates over template glyphs and candidate triangles, enforcing invariance and optimizing match quality:

Recognition Loop Core: For each candidate triangle and template:
- Dimensionality Consistency: Skip if input and template differ in 1-D/2-D classification.
- Degeneracy Checks: Discard nearly degenerate triangles ($|nd_g| < \varepsilon_2$ or $|nd_h| < \varepsilon_2$), where $nd_g$ and $nd_h$ are the normalized determinants of the candidate triangle in the input and in the template.
- Reflection Control: If $\text{sign}(nd_g \cdot nd_h) < 0$ and the template prohibits mirroring, skip.
- Orientation Constraints: Optionally restrict on orientation similarity using
$\text{triSimilarity}(g, h, [a,b,c]) = \cos\theta_{g_a g_b, h_a h_b} + \cos\theta_{g_b g_c, h_b h_c} + \cos\theta_{g_c g_a, h_c h_a}$
- Affine Mapping, Projection, Metric: Construct the affine map, transform the template, compute the metric, and retain the template yielding the lowest error.
Invariance Guarantees:
- Translation: $p_a$ anchors the transformation.
- Rotation, Scaling, Skew: Encoded in $[p_b-p_a, p_c-p_a]$ .
- Reflection: Determinant sign changes under mirroring, with explicit logic to allow or bar mirrored matches.
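
The loop described in this section might look like the sketch below. Several details are assumptions made for illustration: the template record layout (`name`, `cloud`, `allow_mirror`), the default value of `eps2` (the text gives only $\varepsilon_1$), and the omission of the optional orientation constraint. Line glyphs have no usable triangle, so this sketch returns no match for them; the text does not describe how they are handled.

```python
import math
from itertools import combinations

def path_length(p):
    return sum(math.dist(a, b) for a, b in zip(p, p[1:]))

def norm_det(p, tri, lam2):
    """Signed normalized determinant of one triangle."""
    a, b, c = tri
    ux, uy = p[b][0] - p[a][0], p[b][1] - p[a][1]
    vx, vy = p[c][0] - p[a][0], p[c][1] - p[a][1]
    return 4.0 * (ux * vy - vx * uy) / lam2

def project(g, h, tri):
    """Map each template point through the h-triangle frame into g's frame."""
    a, b, c = tri
    (gax, gay), (hax, hay) = g[a], h[a]
    gu = (g[b][0] - gax, g[b][1] - gay)
    gv = (g[c][0] - gax, g[c][1] - gay)
    hu = (h[b][0] - hax, h[b][1] - hay)
    hv = (h[c][0] - hax, h[c][1] - hay)
    det = hu[0] * hv[1] - hv[0] * hu[1]  # nonzero: caller checks degeneracy
    out = []
    for x, y in h:
        dx, dy = x - hax, y - hay
        u = (dx * hv[1] - hv[0] * dy) / det
        v = (hu[0] * dy - dx * hu[1]) / det
        out.append((gax + u * gu[0] + v * gv[0], gay + u * gu[1] + v * gv[1]))
    return out

def metric(g, r):
    return sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 for p, q in zip(g, r))

def recognize(g, templates, m=10, eps1=0.004, eps2=1e-3):
    """Return (best_error, best_name); eps2 default is an assumed value."""
    triplets = list(combinations(range(len(g)), 3))
    lam_g2 = path_length(g) ** 2
    G = {t: norm_det(g, t, lam_g2) for t in triplets}
    g_is_line = max(abs(v) for v in G.values()) < eps1
    candidates = sorted(G, key=lambda t: abs(G[t]), reverse=True)[:m]
    best_err, best_name = math.inf, None
    for tpl in templates:
        h = tpl["cloud"]
        lam_h2 = path_length(h) ** 2
        h_is_line = max(abs(norm_det(h, t, lam_h2)) for t in triplets) < eps1
        if h_is_line != g_is_line:          # dimensionality consistency
            continue
        for tri in candidates:
            nd_g, nd_h = G[tri], norm_det(h, tri, lam_h2)
            if abs(nd_g) < eps2 or abs(nd_h) < eps2:   # degeneracy check
                continue
            if nd_g * nd_h < 0 and not tpl["allow_mirror"]:  # reflection
                continue
            err = metric(g, project(g, h, tri))
            if err < best_err:
                best_err, best_name = err, tpl["name"]
    return best_err, best_name
```

In practice the template determinants would be precomputed once rather than recomputed on every call; they are recomputed here only to keep the sketch self-contained.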

6. Computational Properties and Empirical Results

The computational and empirical performance of the Glyph2Cloud pipeline is characterized by:

Complexity: For $n = 16$ milestone points, $T = 32$ templates, and $m = 10$ candidate triangles:
- $O(n^3)$ determinant computations ($560$ for $n=16$ ) for $G_{abc}$ .
- Expected $O(n^3)$ for robust triangle selection without full sorting.
- Per-recognition cost: $O(n^3 + T \cdot m \cdot n)$ , or $30\,000$ – $40\,000$ multiply-accumulate operations.
Latency: Achieves sub-millisecond recognition on modern CPUs. JavaScript implementation reports $\sim 0.66$ ms per gesture (aggregate: $3.3$ s for $4\,950$ gestures).
Accuracy (on the \$1 dataset, $4\,950$ gestures, 15 templates):

| Recognizer | Accuracy (%) | Runtime (s, total) |
|--------------------|-------------|--------------------|
| \$1 Recognizer | 95.56 | 2.038 |
| Squiggle | 95.09 | 3.292 |
| Protractor | 92.87 | 0.254 |

Decision Correlation:
- Squiggle vs \$1 Recognizer: $94.85\%$
- Squiggle vs Protractor: $92.77\%$
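
The counts and the per-gesture latency quoted in this section can be checked with a few lines of arithmetic. The sketch assumes that $T$ counts templates and that every template/triangle pair survives the filters, which gives an upper bound on the number of projected points.

```python
import math

n, T, m = 16, 32, 10

# Number of index triplets a < b < c, i.e. determinant evaluations.
triplets = math.comb(n, 3)

# Upper bound on projected template points per recognition (T * m * n).
projections = T * m * n

# Squiggle's total runtime from the table, spread over all gestures.
ms_per_gesture = 3.292 / 4950 * 1000

print(triplets, projections, round(ms_per_gesture, 3))
```

The triplet count matches the 560 determinants stated above, and the per-gesture figure agrees with the reported value of about 0.66 ms.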

7. Practical Usage and Real-Time Feedback

Because the Squiggle-based Glyph2Cloud module operates in screen coordinates and uses the same kind of affine transform that rendering pipelines apply, it naturally supports overlaying visual template "shadows" during drawing. The projected template can be drawn directly over the user's stroke, giving real-time, accurate feedback through this direct geometry-to-visual mapping. Template preprocessing (precomputing and storing each template's $H_{abc}$ values) and modular pipeline composition further simplify deployment in gesture-based UI systems (Lee, 2011).


References

Lee (2011). Squiggle - A Glyph Recognizer for Gesture Input.
