Joint Temporal Lipschitz-Guided Attacks
- J-TLGA is an adversarial attack that jointly perturbs the router and expert modules in video MoE systems by maximizing temporal Lipschitz constants.
- Experimental results show that J-TLGA can reduce robust accuracy significantly (e.g., from 11.10% to 2.54% on UCF-101) by exploiting module interactions.
- The approach motivates new defenses like Joint Temporal Lipschitz Adversarial Training (J-TLAT) to improve robustness while retaining computational efficiency.
Joint Temporal Lipschitz-Guided Attacks (J-TLGA) constitute a class of adversarial attacks designed to expose and exploit collaborative vulnerabilities inherent in video Mixture-of-Experts (MoE) architectures. Unlike conventional attacks that treat the MoE as a unitary function, J-TLGA targets the router and the expert modules jointly, via perturbations constructed to maximize their temporal Lipschitz constants and thereby amplify adversarial effects through their interaction. This methodology uncovers failure modes unaddressed by prior attacks, providing new insights into the adversarial robustness landscape of temporally structured MoE systems (Wang et al., 1 Feb 2026).
1. Adversarial Weaknesses in Video Mixture-of-Experts
Video MoE models decompose computation into a lightweight router R and a set of expert networks E₁, …, E_M, where the router selects a subset of experts per video clip and the experts produce the final class logits. Traditional gradient-based attacks such as Projected Gradient Descent (PGD) view the MoE as a unified function F, seeking to maximize the cross-entropy loss under an ℓ∞-norm constraint ‖δ‖∞ ≤ ε on the perturbation δ. However, such attacks overlook distinct vulnerabilities: (1) the router's independent fragility, manifesting as "routing collapse" even for small δ, and (2) collaborative weaknesses that emerge from combined router mis-steering and expert-module instability. Empirically, PGD yields limited robust-accuracy reductions (e.g., 54% clean vs. 11% under attack on UCF-101), leaving major weaknesses undetected (Wang et al., 1 Feb 2026).
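For reference, the unified-function PGD baseline described above can be sketched in a few lines. This is a minimal NumPy sketch, not the paper's implementation; `model_grad` is a hypothetical callable that returns the cross-entropy gradient with respect to the input:

```python
import numpy as np

def pgd_attack(model_grad, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard PGD under an l_inf budget, treating the MoE as one
    unified function F. model_grad(x, y) must return dL_CE/dx."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        g = model_grad(x + delta, y)
        # ascend the loss along the gradient sign, then project to the l_inf ball
        delta = np.clip(delta + alpha * np.sign(g), -eps, eps)
    # keep the adversarial clip in the valid pixel range
    return np.clip(x + delta, 0.0, 1.0)
```

Because PGD optimizes only the end-to-end loss, it never distinguishes router errors from expert errors, which is exactly the gap J-TLGA exploits.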
2. Temporal Lipschitz Constant and Its Role in Attack Design
For temporal data, the local Lipschitz constant of a mapping f at input x is estimated by finite differences,

K(f, x; δ) = ‖f(x + δ) − f(x)‖ / ‖δ‖,  ‖δ‖∞ ≤ ε,

and is extended to a temporal version to capture per-frame dynamics:

K_T(f, x; δ) = max_{1 ≤ t ≤ T} ‖f(x + δ)_t − f(x)_t‖ / ‖δ_t‖.
A large K_T(f, x; δ) indicates that minor temporal input perturbations lead to pronounced output swings. This property is exploited as a lever for enhanced attack objectives, as temporal volatility can trigger compounded errors in both routing and expert inference in the MoE.
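The finite-difference temporal Lipschitz estimate described above can be computed directly. A minimal sketch, assuming f maps a (T, …) clip to a (T, …) output so that the frame axis is preserved:

```python
import numpy as np

def temporal_lipschitz_estimate(f, x, delta, eps=1e-8):
    """Finite-difference estimate of the temporal Lipschitz constant of
    mapping f at clip x under perturbation delta.

    x, delta: arrays of shape (T, ...) with frames along axis 0.
    Returns max over frames t of ||f(x+delta)_t - f(x)_t|| / ||delta_t||.
    """
    out_clean = f(x)
    out_pert = f(x + delta)
    ratios = []
    for t in range(x.shape[0]):
        num = np.linalg.norm((out_pert[t] - out_clean[t]).ravel())
        den = np.linalg.norm(delta[t].ravel()) + eps  # guard against zero norm
        ratios.append(num / den)
    return max(ratios)
```

For a linear map f(z) = c·z, the estimate recovers |c| regardless of the perturbation, which is a useful sanity check when wiring this into an attack loop.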
3. Joint Attack Objective and Optimization
J-TLGA is formalized by a joint loss over a single perturbation δ that propagates to both router and experts:

L_joint(δ) = L_CE(F(x + δ), y) + λ_r · K_T(R, x; δ) + λ_e · K_T(E, x; δ),

where L_CE is the cross-entropy loss between the MoE output F(x + δ) and the ground-truth label y, and K_T(E, x; δ) is computed as the maximum temporal Lipschitz estimate over the currently activated (top-k) experts. The attack is designed to implicitly push the router toward the weakest expert (the one with the lowest clean prediction confidence) while simultaneously destabilizing expert outputs temporally. The use of "temporal adaptive step sizes" in the PGD-like update is integral, with per-frame momentum and log-scaling ensuring efficient exploitation of transient vulnerabilities (Wang et al., 1 Feb 2026).
J-TLGA Pseudocode (excerpt):
```
Input: clean clip x, label y, MoE (R, E₁…E_M), ε, steps K, λ_r, λ_e, α_base, μ
Initialize δ ← 0, V ← 0 (shape T × …)
for k in 0 … K−1 do
    # 1) compute joint loss and gradient
    out_MoE = Σ_i R(x+δ)_i · E_i(x+δ)
    L1 = CrossEntropy(out_MoE, y)
    L2 = λ_r · finite_diff_temporal(R, x, δ)
    L3 = λ_e · finite_diff_temporal(E, x, δ)
    Loss = L1 + L2 + L3
    g = ∇_δ Loss
    # 2) temporal momentum & adaptive step sizes
    for t in 1 … T do
        V_t = μ·V_t + ‖g_t‖₂
        α_t = α_base · log(1 + V_t)
    end
    # 3) gradient-sign update with ℓ∞ projection
    δ = Project_∞[δ + sign(g) ⊙ α]
end
return adversarial example x_adv = x + δ
```
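Steps 2 and 3 of the pseudocode (per-frame momentum, log-scaled step sizes, sign update with ℓ∞ projection) can be made concrete. A NumPy sketch under the same notation, with default hyperparameter values chosen for illustration only:

```python
import numpy as np

def temporal_adaptive_update(delta, grad, V, alpha_base=0.01, mu=0.9, eps=8/255):
    """One J-TLGA-style update step (sketch): per-frame momentum
    accumulator V, log-scaled step sizes, sign ascent, l_inf projection.

    delta, grad: (T, ...) arrays; V: (T,) momentum accumulator.
    """
    T = delta.shape[0]
    # per-frame momentum on the gradient magnitude
    for t in range(T):
        V[t] = mu * V[t] + np.linalg.norm(grad[t].ravel())
    # log-scaled adaptive step size per frame
    alpha = alpha_base * np.log1p(V)                       # shape (T,)
    # sign update, broadcasting each frame's step over its spatial dims
    step = np.sign(grad) * alpha.reshape((T,) + (1,) * (grad.ndim - 1))
    # project back into the l_inf ball of radius eps
    return np.clip(delta + step, -eps, eps), V
```

Frames with large accumulated gradient energy receive larger steps, which is what lets the attack concentrate on transient (high-Lipschitz) segments of the clip.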
4. Empirical Results and Attack Effectiveness
Experimental evaluation on action-recognition datasets (UCF-101, HMDB-51) and architectures (3D ResNet-18, TSM, SlowFast, R(2+1)D, with Top-1 MLP routers, M experts, and k experts active per forward pass) demonstrates the severity of the vulnerabilities exposed by J-TLGA. Under the ℓ∞ budget ε, robust-accuracy results on UCF-101 (3D ResNet-18 experts) are summarized below:
| Attack | Robust Accuracy (%) |
|---|---|
| PGD | 11.10 |
| TLA-M | 4.73 |
| J-TLA | 4.73 |
| J-TLGA | 2.54 |
J-TLGA achieves the lowest robust accuracy across all tested setups and is substantially more effective than the PGD or modular (TLA-M) baselines.
Black-box transferability is also enhanced: under J-TLGA, 3D-ResNet models retain only 20.15% robust accuracy versus 66.04% for TT, underlining broad applicability. Ablation studies show that attack potency is maximized at a balanced setting of the weights λ_r and λ_e, with longer input clips (higher T) and larger expert cardinality M yielding marginal resilience but persistently severe drops in robust accuracy.
5. Insights Into MoE Vulnerability Structure
Analysis of experimental outcomes identifies the MoE's Achilles' heel as the coupled fragility of router and experts under temporally structured attacks. Key findings:
- Router vulnerability: Targeted router attacks (TLGA-R) degrade routing-consistency IoU by more than 20% relative to PGD-R.
- Expert sensitivity: Expert-module perturbation (TLA-E) is 10–15% more effective than traditional attacks that target only the experts.
- Joint weakness: J-TLGA leverages cascading failures; minor router mis-steering, combined with expert output disruptions, causes severe misclassifications not captured by component-agnostic adversarial training.
This coordination of vulnerabilities demonstrates that defenses must address both independent and collaborative weaknesses to achieve robustness.
6. Joint Temporal Lipschitz Adversarial Training (J-TLAT)
To mitigate the weaknesses revealed by J-TLGA, Joint Temporal Lipschitz Adversarial Training (J-TLAT) is introduced. J-TLAT hierarchically defends the MoE in three stages per training epoch:
- Router AT: adversarial training of the router alone against router-targeted temporal Lipschitz perturbations.
- Expert AT: adversarial training applied to the weakest identified experts.
- Full MoE AT: joint adversarial training of the complete router–expert pipeline.
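The three-stage schedule can be expressed structurally. The following Python sketch uses hypothetical helper names (`attack_router`, `attack_experts`, `attack_full`, `train_step`, `moe.weakest_experts`) that are illustrative, not the paper's API:

```python
def j_tlat_epoch(batches, moe, attack_router, attack_experts, attack_full, train_step):
    """One J-TLAT training epoch (structural sketch, hypothetical helpers).

    Stage 1 hardens the router, stage 2 the weakest experts, stage 3 the
    full MoE, matching the hierarchical order described in the text.
    """
    for x, y in batches:
        # 1) Router AT: perturbations aimed at the router alone
        x_r = attack_router(moe.router, x, y)
        train_step(moe.router.parameters(), x_r, y)
        # 2) Expert AT: attack and retrain only the weakest experts
        weak = moe.weakest_experts(x, y)
        x_e = attack_experts(weak, x, y)
        train_step([p for e in weak for p in e.parameters()], x_e, y)
        # 3) Full MoE AT: joint adversarial example through router + experts
        x_j = attack_full(moe, x, y)
        train_step(moe.parameters(), x_j, y)
```

Because each stage only updates the parameters it targets, the sparse routing path, and hence the MoE's FLOP savings, is left intact.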
J-TLAT is "plug-and-play" and preserves the MoE's efficiency (over 60% FLOP reduction compared to dense models). Empirically, J-TLAT increases robust accuracy under J-TLGA from 5.17% (AT-MoE) to 21.98% on UCF-101, with further improvements under other attacks. The approach also yields the lowest realized Lipschitz constants (Lips-R = 0.823, Lips-J = 2.343), confirming increased smoothness.
7. Implications and Future Directions
J-TLGA establishes that the primary source of brittleness in video MoE is the combined effect of time-coupled router and expert errors under adversarial perturbations. The methodology reveals previously hidden failure modes and motivates the need for layered, component-sensitive adversarial training. A plausible implication is that similar coupling phenomena likely affect other modular video models or time-sensitive sparse architectures. J-TLAT exemplifies a defense paradigm that leverages Lipschitz conditioning and targeted module-wise adversarial training to harden models against structured attacks while retaining computational efficiency (Wang et al., 1 Feb 2026). The extension of J-TLGA and J-TLAT concepts to other domains with modular temporal inference remains an open and promising direction for robustness research.