Gradient Adjustment with Phase-guidance (GAP)
- GAP is a family of algorithms that modulate gradient updates using phase or state signals to enhance learning in dynamic environments.
- It integrates phase-aware masking and deterministic phase corrections to balance multimodal inputs in robotics and improve optical coherence.
- Empirical results show significant performance gains, achieving up to 100% success in robotic tasks and nearly ideal SNR in optical combining with minimal overhead.
Gradient Adjustment with Phase-guidance (GAP) encompasses a family of algorithms that leverage dynamically modulated gradient updates by exploiting phase or state information to guide learning or signal processing. The GAP principle has been introduced in both robotic policy learning—specifically, vision-proprioception fusion—and in coherent receiver phase alignment for free-space optical communication. Implementations are problem-specific but share the core principle of adaptively attenuating or steering gradients according to temporal or physical "phase" estimates, in order to improve robustness and achieve optimal integration of heterogeneous signals (Lu et al., 12 Feb 2026, Chen et al., 19 Jun 2026).
1. General Principle and Motivation
Gradient Adjustment with Phase-guidance (GAP) is defined as the strategy of dynamically modulating the magnitude or direction of gradient-based updates (e.g., in SGD or phase-locked loops) using a phase-related or transition-aware signal. The overarching goal is to avoid domination by misleading or overly concise modalities during critical transition windows, or to enhance convergence towards optimal configurations in systems suffering from distributed phase disturbances. GAP unifies two distinct threads: (A) fine-grained balancing of sensory gradient contributions by phase-aware masking (robotics), and (B) deterministic phase-corrective gradient ascent for multi-aperture receiver combining (communications).
2. GAP in Robotic Vision-Proprioception Policy Learning
In vision-proprioception policy fusion, standard joint training is susceptible to proprioceptive signal dominance, especially during robot motion transitions, where visual information is crucial but is learned slowly because its gradients are suppressed. GAP estimates, for each trajectory timestep $t$, the probability of a "motion-transition" phase using a learned LSTM-based soft indicator ($p_t$), trained to predict phase boundaries derived from dynamic programming-based change-point detection (CPD) on proprioceptive state differences. The update for the proprioceptive encoder parameters $\theta_p$ at sample $t$ is then modulated; a form consistent with this description is:
$$\nabla_{\theta_p}\mathcal{L}_{\mathrm{BC}}^{(t)} \;\leftarrow\; (1 - \alpha\, p_t)\,\nabla_{\theta_p}\mathcal{L}_{\mathrm{BC}}^{(t)}$$
where $\alpha$ is a global scaling coefficient and $\mathcal{L}_{\mathrm{BC}}$ is the behavior cloning loss. Suppressing proprioceptive gradients during motion transitions enables robust, generalizable learning of visual cues, as evidenced by substantial improvements over standard baselines and fusion architectures (Lu et al., 12 Feb 2026).
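As a rough illustration of the per-sample modulation described above, the sketch below attenuates a proprioceptive-encoder gradient by a factor that shrinks as the motion-transition probability grows. The multiplicative form `1 - alpha * p_transition`, the clamp at zero, and the names `scale_proprio_gradient` and `batch_scaled_gradient` are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def scale_proprio_gradient(
    grad: Sequence[float], p_transition: float, alpha: float
) -> List[float]:
    """Attenuate one sample's proprioceptive-encoder gradient.

    Assumed form (hypothetical): factor = max(0, 1 - alpha * p_transition),
    so a sample deep inside a motion transition contributes little or no
    proprioceptive gradient, while a sample far from one is left untouched.
    """
    if not 0.0 <= p_transition <= 1.0:
        raise ValueError("p_transition must be a probability in [0, 1]")
    factor = max(0.0, 1.0 - alpha * p_transition)
    return [factor * g for g in grad]


def batch_scaled_gradient(
    per_sample_grads: Sequence[Sequence[float]],
    p_transitions: Sequence[float],
    alpha: float,
) -> List[float]:
    """Scale each sample's gradient, then average over the batch."""
    scaled = [
        scale_proprio_gradient(g, p, alpha)
        for g, p in zip(per_sample_grads, p_transitions)
    ]
    n = len(scaled)
    return [sum(column) / n for column in zip(*scaled)]
```

In an autodiff framework the same scaling would typically be applied inside the backward pass of the proprioceptive encoder, leaving the vision encoder's gradients unchanged.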
3. GAP Algorithmic Structure and Implementation in Robotics
The GAP procedure in the robotic domain consists of: (1) CPD-based phase segmentation to identify motion-consistent segments; (2) LSTM-based probabilistic smoothing to yield soft per-timestep phase-transition probabilities $p_t$; and (3) sample-wise gradient scaling during policy learning according to $p_t$. The approach is modular and integrates into behavior cloning, actor-critic, and vision-language-action (e.g., Octo) frameworks by wrapping the proprioceptive encoder's backward pass. Empirically, the optimal global scaling coefficient $\alpha$ lies within a limited range, and the best results are achieved by applying GAP only in the early (first half) epochs. Ablation tests confirm that the LSTM-based soft indicator outperforms fixed masks and smooth Gaussian alternatives.
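The first of these steps, dynamic-programming CPD on proprioceptive state differences, can be sketched with a small penalized dynamic program. The piecewise-constant cost model, the `penalty` value, the one-dimensional toy signal, and the function names are assumptions chosen for illustration; the LSTM smoothing stage is not reproduced here.

```python
from typing import List, Sequence


def state_differences(states: Sequence[float]) -> List[float]:
    """First differences of a 1-D proprioceptive trace."""
    return [b - a for a, b in zip(states, states[1:])]


def detect_change_points(x: Sequence[float], penalty: float) -> List[int]:
    """Penalized DP segmentation into piecewise-constant segments.

    Returns the sorted indices at which a new segment starts.
    Cost of a segment = sum of squared deviations from its mean;
    each additional segment adds `penalty` (hypothetical model).
    """
    n = len(x)
    prefix, prefix_sq = [0.0], [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def cost(i: int, j: int) -> float:
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    best = [0.0] + [float("inf")] * n  # best[j]: minimal cost of x[:j]
    last = [0] * (n + 1)               # last[j]: start of final segment
    for j in range(1, n + 1):
        for i in range(j):
            extra = penalty if i > 0 else 0.0
            c = best[i] + cost(i, j) + extra
            if c < best[j]:
                best[j], last[j] = c, i

    # Backtrack through the stored segment starts.
    change_points = []
    j = n
    while j > 0:
        i = last[j]
        if i > 0:
            change_points.append(i)
        j = i
    return sorted(change_points)


# Toy trace: hold, move at constant speed, hold again.
states = [0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 5, 5, 5, 5]
diffs = state_differences(states)
boundaries = detect_change_points(diffs, penalty=0.5)
```

The returned boundaries would serve as hard phase-transition labels, which the soft indicator is then trained to predict.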
4. GAP for Blind Phase Alignment in Multi-Aperture Coherent Combining
In coherent digital combining for multi-aperture free-space optical reception, phase misalignment between signal branches significantly degrades combining gain. The BGAPA (Blind Gradient-Ascent Phase Alignment) algorithm is a concrete realization of GAP for this context: it iteratively maximizes the instantaneous combined output power by a closed-form deterministic gradient step on each aperture's phase, requiring no training symbols, pilots, or decision-directed feedback. For dual-polarization QPSK signals from $N$ apertures, the objective is the instantaneous combined power, which can be written in a form consistent with this description as:
$$P[k] = \sum_{q \in \{x, y\}} \left| \sum_{n=1}^{N} r_{n,q}[k]\, e^{-j\hat{\phi}_n[k]} \right|^2$$
where $r_{n,q}[k]$ denotes the received field of aperture $n$ in polarization $q$ at sample $k$. The per-aperture phase correction $\hat{\phi}_n$ is updated by
$$\hat{\phi}_n[k+1] = \hat{\phi}_n[k] + \mu\, \frac{\partial P[k]}{\partial \hat{\phi}_n}$$
where $\mu$ is a step size tuned per operating point. BGAPA's fully blind nature, using only instantaneous received fields, gives it strong robustness to large phase excursions and $O(N)$ per-sample complexity (Chen et al., 19 Jun 2026).
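A minimal sketch of blind gradient-ascent phase alignment on the combined power follows. It assumes a single polarization, static per-aperture phase offsets, no noise, and an objective of the form `|sum_n r_n * exp(-j * phi_n)|**2`; the function names, the step size `mu=0.05`, and the toy offsets are hypothetical choices for illustration, not values from the paper.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def combined_power(samples: Sequence[complex], phases: Sequence[float]) -> float:
    """Power of the phase-corrected sum over all apertures."""
    y = sum(r * cmath.exp(-1j * p) for r, p in zip(samples, phases))
    return abs(y) ** 2


def bgapa_step(
    samples: Sequence[complex], phases: Sequence[float], mu: float
) -> List[float]:
    """One gradient-ascent update of every aperture phase.

    With z_n = r_n * exp(-j * phi_n) and y = sum_n z_n, the closed-form
    gradient is dP/dphi_n = 2 * Im(conj(y) * z_n); cost is linear in N.
    """
    rotated = [r * cmath.exp(-1j * p) for r, p in zip(samples, phases)]
    y = sum(rotated)
    return [
        p + mu * 2.0 * (y.conjugate() * z).imag
        for p, z in zip(phases, rotated)
    ]


def run_alignment(
    offsets: Sequence[float], mu: float = 0.05, n_samples: int = 500
) -> Tuple[List[float], float]:
    """Align apertures that see the same QPSK stream with unknown offsets."""
    qpsk = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
    phases = [0.0] * len(offsets)
    for k in range(n_samples):
        symbol = qpsk[k % 4]  # symbol is never used by the update itself
        samples = [symbol * cmath.exp(1j * o) for o in offsets]
        phases = bgapa_step(samples, phases, mu)
    final = [qpsk[0] * cmath.exp(1j * o) for o in offsets]
    return phases, combined_power(final, phases)


offsets = [0.3, -1.2, 2.0, 0.9]
initial_power = combined_power([cmath.exp(1j * o) for o in offsets], [0.0] * 4)
phases, final_power = run_alignment(offsets)
```

Because the update depends only on the received samples and the current phase estimates, the transmitted symbol never enters the computation, which is what makes the scheme blind. In this noiseless toy case the combined power approaches the coherent limit of $N^2$ for unit-amplitude inputs.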
5. Comparative Performance and Empirical Benchmarks
Robotics Domain: GAP delivers 90–95% success rates on simulated tasks, compared to ~65% for naïve concatenation and 60–75% for masked/auxiliary-loss methods. In real-world one- and dual-arm setups, GAP achieves up to 100% (press) and ~85% (cube), outperforming vision-only and multimodal baselines. GAP consistently improves out-of-distribution (OOD) generalization, closing the gap to vision-only performance or exceeding it by +10% (Lu et al., 12 Feb 2026).
Multi-Aperture Combining: With the number of apertures $N$ quadrupled from 64 to 256, BGAPA achieves post-CPR SNR gains of ~5.7 dB, very close to the ideal 6.02 dB upper bound. Its performance persists up to per-aperture RMS phase fluctuations on the order of 278 rad, substantially outperforming decision-directed LMS, which fails at much smaller phase excursions (Chen et al., 19 Jun 2026).
| Domain | Standard Baseline (Success/Performance) | GAP Result (Success/Performance) |
|---|---|---|
| Simulated robotics | ~65–85% | 90–95% |
| Real dual-arm | ~16/20 | 18–20/20 |
| FSO combining SNR | < ideal (4×N: ≤4.5 dB) | 5.7 dB (ideal: 6.0 dB) |
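The 6.02 dB ideal figure can be checked with a short calculation, assuming that the SNR after ideal coherent combining scales linearly with the number of apertures. The function name `ideal_combining_gain_db` is introduced here for illustration only.

```python
import math


def ideal_combining_gain_db(n_before: int, n_after: int) -> float:
    """Ideal SNR gain in dB when the aperture count grows (linear-in-N model)."""
    return 10.0 * math.log10(n_after / n_before)


ideal = ideal_combining_gain_db(64, 256)  # quadrupling the aperture count
shortfall = ideal - 5.7                   # gap to the reported ~5.7 dB gain
```

Under this assumption, quadrupling the aperture count gives $10\log_{10} 4 \approx 6.02$ dB, so the reported gain falls roughly 0.3 dB short of the ideal.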
6. Complexity, Tuning, and Limitations
GAP's computational overhead is minimal in both contexts: $O(N)$ per sample for phase alignment (no multitap filters) and negligible in large-scale policy learning, as it requires only forward passes through a small LSTM and per-sample gradient scaling. Practical deployment requires hyperparameter selection, especially the gradient scaling coefficient ($\alpha$) and, for BGAPA, the step size ($\mu$), both tuned via offline sweeps at new operating points. Both variants rely on strong phase-awareness assumptions; for BGAPA, only phase disturbance is modeled, excluding amplitude and polarization effects. Global convergence guarantees under stochastic variation are not provided; empirical robustness is demonstrated under controlled conditions.
7. Broader Implications and Extensions
GAP demonstrates that phase/time-aware gradient modulation can effectively prevent overfitting or mode collapse in multimodal learning and can achieve near-ideal gain in distributed array processing. Its principled, modular construction lends itself to future work in other settings where modality dominance or distributed phase error is problematic. A plausible implication is that GAP-like strategies may generalize to broader sensor fusion, adaptive filtering, or distributed control scenarios where domain knowledge can inform dynamic gradient scaling (Lu et al., 12 Feb 2026, Chen et al., 19 Jun 2026).