
Conflict-Aware Projected Gradient Ascent (CAPGA)

Updated 10 March 2026
  • CAPGA uses projection techniques to resolve conflicts among gradients in multi-task learning, federated unlearning, and multi-agent learning.
  • Its projection operator adjusts gradients to maintain protected objectives, removing detrimental update components to preserve utility and fairness.
  • Empirical evaluations on CIFAR-10, Census-Income, and fairness benchmarks report efficiency and performance gains over alternative methods, complementing the theoretical guarantees.

Conflict-Aware Projected Gradient Ascent (CAPGA) is a class of gradient optimization techniques designed to resolve and manage conflicting objectives in settings such as federated unlearning, multi-agent reinforcement learning, and multi-task learning. CAPGA employs projection-based mechanisms to ensure that parameter updates derived from gradients maintain desirable properties across multiple, and potentially competing, criteria: effective unlearning without utility degradation, fair cooperative agent behavior, or robust multi-task generalization. The core of CAPGA methods is identifying when objectives are in conflict at the level of parametric gradients and then projecting gradient steps to optimize target criteria while enforcing constraints such as utility preservation or fairness.

1. Foundational Problem Formulation

In CAPGA frameworks, optimization is performed over parameter vectors $\theta \in \mathbb{R}^d$ for one or more objectives. Conflicting objectives appear in three main guises:

  • Federated Unlearning (FedCARE): The objective is to erase the influence of a private “forget” set $\mathcal{D}_u$ by maximizing its loss $L_u(\theta)$, while constraining the reference loss $L_{\mathrm{ref}}(\theta)$ (measured on a synthesized pseudo-dataset) to not increase, thereby preserving shared utility (Li et al., 30 Jan 2026).
  • Multi-Task Learning (GradOPS): Task-specific gradients $\nabla_\theta L_i(\theta)$ may conflict, causing trade-off instability; the goal is to produce updates that guarantee non-negative progress for all tasks, driving the solution toward a Pareto stationary point (Zhu et al., 5 Mar 2025).
  • Mixed-Motive Multi-Agent Cooperation: Agents' individual and collective rewards define gradients $g_i = \nabla_\theta J_i(\theta)$ and $g_c = \nabla_\theta J_c(\theta)$; conflicts (i.e., a negative dot product) between these can undermine both social welfare and fairness unless handled by a conflict-aware adjustment (Kim et al., 25 Aug 2025).

A gradient conflict is said to occur between two objective gradients $g_i$ and $g_j$ if $\langle g_i, g_j \rangle < 0$. CAPGA methods generalize this principle to $T > 2$ tasks, establishing a “conflict-free cone” within which no projected update degrades any objective.
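The pairwise conflict test is just a sign check on an inner product. The sketch below scans a list of gradients for conflicting pairs; it uses plain Python lists so it runs without dependencies, and the helper names `dot` and `conflicting_pairs` as well as the toy gradient values are illustrative assumptions rather than anything taken from the cited papers.

```python
from itertools import combinations


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def conflicting_pairs(grads):
    """Return all index pairs (i, j), i < j, with <g_i, g_j> < 0."""
    return [
        (i, j)
        for i, j in combinations(range(len(grads)), 2)
        if dot(grads[i], grads[j]) < 0
    ]


# Three toy gradients in R^2 (illustrative values only).
grads = [[1.0, 0.0], [-1.0, 1.0], [0.0, 1.0]]
pairs = conflicting_pairs(grads)  # only g_0 and g_1 have a negative inner product
```

Orthogonal gradients (inner product exactly zero) are treated as non-conflicting here, matching the strict inequality in the definition.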

2. Conflict Detection and Resolution via Projection

Conflict identification is performed at each optimization step by examining inner products between gradients of competing objectives:

  • Simple Pairwise Case: For two objectives $g_{tar}$ (target, e.g., forgetting) and $g_{ref}$ (reference, e.g., utility), a conflict is present if $\langle g_{tar}, g_{ref} \rangle > 0$ in the unlearning scenario (Li et al., 30 Jan 2026). The sign is reversed relative to the general definition because the update ascends along $g_{tar}$ while the reference loss must not increase, so a positive inner product means the forgetting step would raise the reference loss.
  • Multi-Task Generalization: In multi-task settings, for each task $i$, the task gradient $g_i$ is projected onto the orthogonal complement of the subspace spanned by all other task gradients $\{g_j\}_{j \neq i}$, so that its inner product with every other task gradient is non-negative (here, exactly zero): $g_i^\perp = (I - G_{-i}(G_{-i}^T G_{-i})^{-1} G_{-i}^T)\, g_i$, where $G_{-i}$ stacks all other gradients as columns (Zhu et al., 5 Mar 2025).

The projection operator removes only those components of a gradient that adversely affect protected objectives. For two gradients, conflict resolution reduces to a half-space projection; for $T$ tasks, it becomes an orthogonal projection onto the relevant subspace.
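As a rough illustration of the multi-task projection, the sketch below computes the component of one gradient that is orthogonal to the span of the others. It assumes a Gram-Schmidt pass instead of the explicit inverse $(G_{-i}^T G_{-i})^{-1}$; the two agree when the other gradients are linearly independent, and the Gram-Schmidt form also tolerates linearly dependent gradients. The function name `orthogonal_residual` and the tolerance `tol` are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_residual(g, others, tol=1e-12):
    """Component of g orthogonal to span(others), via Gram-Schmidt."""
    basis = []
    for v in others:
        w = list(v)
        for b in basis:  # strip directions already covered by the basis
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        n2 = dot(w, w)
        if n2 > tol:  # skip vectors that add no new direction
            norm = n2 ** 0.5
            basis.append([wi / norm for wi in w])
    r = list(g)
    for b in basis:  # remove every basis direction from g
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return r


# The other gradients span the x-y plane, so only the z component survives.
g_perp = orthogonal_residual([1.0, 2.0, 3.0], [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
```

Note that when the other gradients already span the whole parameter space, the residual is the zero vector, so this form of projection is only informative when the number of tasks is small relative to $d$.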

3. CAPGA Update Rules and Algorithmic Procedures

CAPGA algorithms follow a structured procedure in gradient-based optimization cycles, which is summarized in the following canonical steps and illustrated in pseudocode.

Single-Constraint Case (FedCARE unlearning) (Li et al., 30 Jan 2026):

For target and reference gradients $g_{tar}$ and $g_{ref}$ at iteration $t$,

  • If $\langle g_{tar}, g_{ref} \rangle \leq 0$, update in the pure forgetting direction: $d = g_{tar}$.
  • If $\langle g_{tar}, g_{ref} \rangle > 0$, project $g_{tar}$ to enforce $\langle g_{ref}, d \rangle \leq 0$:

$$d = g_{tar} - \frac{\max(0, \langle g_{tar}, g_{ref} \rangle)}{\|g_{ref}\|_2^2 + \epsilon}\, g_{ref}$$

  • Parameter update: $\theta^{(t+1)} = \theta^{(t)} + \eta_{ul}\, d$.
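The single-constraint rule can be written as one expression, because the $\max(0, \cdot)$ term makes the projection vanish whenever there is no conflict. The following is a minimal sketch of that rule on plain Python lists; the names `capga_direction` and `capga_step`, the default `eps`, and the example vectors are assumptions made for illustration.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def capga_direction(g_tar, g_ref, eps=1e-12):
    """d = g_tar - max(0, <g_tar, g_ref>) / (||g_ref||^2 + eps) * g_ref."""
    coef = max(0.0, dot(g_tar, g_ref)) / (dot(g_ref, g_ref) + eps)
    return [t - coef * r for t, r in zip(g_tar, g_ref)]


def capga_step(theta, g_tar, g_ref, lr, eps=1e-12):
    """One ascent step: theta <- theta + lr * d."""
    d = capga_direction(g_tar, g_ref, eps)
    return [p + lr * di for p, di in zip(theta, d)]


# Conflict: <g_tar, g_ref> = 1 > 0, so the component along g_ref is removed.
d_conflict = capga_direction([1.0, 1.0], [1.0, 0.0])
# No conflict: <g_tar, g_ref> = -1 <= 0, so g_tar is used unchanged.
d_free = capga_direction([-1.0, 1.0], [1.0, 0.0])
```

Because of the stabilizer `eps`, the projected direction in the conflicting case is orthogonal to `g_ref` only up to a residual of order `eps`.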

Multi-Objective Case (Multi-Task/Agent) (Zhu et al., 5 Mar 2025, Kim et al., 25 Aug 2025):

  • For each task $i$, if $g_i$ conflicts with any $g_j$ ($j \neq i$), project as above; else, keep $g_i$.
  • Aggregate deconflicted gradients (with optional task-specific weighting).
  • Apply aggregated update to $\theta$.
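One possible reading of the three multi-objective steps is sketched below. It assumes that "project" means taking the component orthogonal to the span of all other gradients (the operator from Section 2), and that aggregation is a weighted sum with non-negative weights; both choices, along with the names `orthogonal_residual` and `deconflicted_update`, are assumptions of this sketch rather than the exact procedure of the cited papers. The function returns only the aggregated direction, leaving the sign of the step (descent on losses or ascent on rewards) to the caller.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_residual(g, others, tol=1e-12):
    """Component of g orthogonal to span(others), via Gram-Schmidt."""
    basis = []
    for v in others:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        n2 = dot(w, w)
        if n2 > tol:
            basis.append([wi / n2 ** 0.5 for wi in w])
    r = list(g)
    for b in basis:
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return r


def deconflicted_update(grads, weights=None):
    """Project conflicting gradients, keep the rest, return the weighted sum."""
    weights = weights or [1.0] * len(grads)
    update = [0.0] * len(grads[0])
    for i, g in enumerate(grads):
        others = grads[:i] + grads[i + 1:]
        if any(dot(g, h) < 0 for h in others):  # step 1: detect and project
            g = orthogonal_residual(g, others)
        update = [u + weights[i] * gi for u, gi in zip(update, g)]  # step 2
    return update


# g_0 and g_1 conflict; g_2 is orthogonal to both and is kept as is.
grads = [[1.0, 0.0, 0.0], [-1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
update = deconflicted_update(grads)
```

Under these assumptions, each projected gradient is orthogonal to every other task gradient and each kept gradient has no negative inner products, so the aggregated direction has a non-negative inner product with every original gradient.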

Pseudocode Example (FedCARE) (Li et al., 30 Jan 2026):

Input: θ⁽⁰⁾, forget set 𝒟ᵤ, ref generator 𝒢, steps T, lr η_ul, batch sizes B_u, B_ref, ε>0.
1. Construct 𝒟_ref from synthesized samples for retained classes.
2. for t=0…T−1
    a. Sample Bᵤ^(t)⊂𝒟ᵤ,  B_ref^(t)⊂𝒟_ref.
    b. Compute g_tar, g_ref.
    c. if ⟨g_tar, g_ref⟩ ≤ 0 then d ← g_tar
       else d ← g_tar − [⟨g_tar,g_ref⟩ / (‖g_ref‖²+ε)]·g_ref
    d. θ ← θ + η_ul·d
3. Return final θ.

A generalization for multi-task settings appears in (Zhu et al., 5 Mar 2025), including optional trade-off weighting w.r.t. relative angle and magnitude.
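To see the FedCARE-style loop behave end to end, the sketch below runs it on a two-dimensional toy problem. Everything specific here is an assumption made for illustration: the quadratic stand-in losses $L_u = \tfrac12\|\theta - a\|^2$ and $L_{\mathrm{ref}} = \tfrac12\|\theta - b\|^2$, the exact (full-batch) gradients in place of minibatch sampling and a synthesized reference set, and the names `capga_ascent`, `grad_u`, and `grad_ref`.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def capga_ascent(theta, grad_tar, grad_ref, steps, lr, eps=1e-12):
    """Projected gradient ascent; grad_tar and grad_ref map theta to a gradient."""
    theta = list(theta)
    for _ in range(steps):
        g_tar, g_ref = grad_tar(theta), grad_ref(theta)
        inner = dot(g_tar, g_ref)
        if inner <= 0:  # no conflict: pure forgetting direction
            d = g_tar
        else:  # conflict: remove the component along g_ref
            coef = inner / (dot(g_ref, g_ref) + eps)
            d = [t - coef * r for t, r in zip(g_tar, g_ref)]
        theta = [p + lr * di for p, di in zip(theta, d)]
    return theta


# Toy stand-ins for the forget and reference losses (assumed, not from the paper).
a, b = [0.0, 0.0], [1.0, 0.0]


def loss(theta, center):
    return 0.5 * sum((t - c) ** 2 for t, c in zip(theta, center))


def grad_u(theta):
    return [t - c for t, c in zip(theta, a)]


def grad_ref(theta):
    return [t - c for t, c in zip(theta, b)]


theta0 = [2.0, 1.0]
theta_capga = capga_ascent(theta0, grad_u, grad_ref, steps=100, lr=0.01)
# Baseline: a zero reference gradient never triggers the projection (plain ascent).
theta_plain = capga_ascent(theta0, grad_u, lambda th: [0.0] * len(th), steps=100, lr=0.01)
```

In this toy setting the projected run raises the forget loss while the reference loss drifts only through second-order discretization error, whereas plain ascent raises both losses substantially. Comparing `loss(theta_capga, b)` with `loss(theta_plain, b)` makes the difference visible.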

4. Theoretical Guarantees

CAPGA frameworks establish two levels of formal guarantees:

  • First-Order Utility Protection: By construction, any small step in the projected direction $d$ preserves or improves the protected objective at first order, as $L_{\mathrm{ref}}(\theta+\eta d) \approx L_{\mathrm{ref}}(\theta) + \eta \langle g_{\mathrm{ref}}, d \rangle$, with $\langle g_{\mathrm{ref}}, d \rangle \leq 0$ (Li et al., 30 Jan 2026).
  • Pareto Stationarity and Fairness: In multi-objective settings, repeated CAPGA updates drive all projected gradients toward zero, converging to a Pareto stationary point (no further feasible improvement in any objective) (Zhu et al., 5 Mar 2025). For multi-agent fairness, CAPGA guarantees that both individual and collective metrics are monotonically non-decreasing and ensures gap convergence ($|J_i - J_c| \to 0$) under conditions on step sizes and recurrence of conflicts (Kim et al., 25 Aug 2025).
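The first-order utility claim can be checked by substituting the projected direction from Section 3 into the inner product. The short derivation below covers the conflicting case and uses the shorthand $c = \langle g_{\mathrm{tar}}, g_{\mathrm{ref}} \rangle > 0$, which is introduced here for convenience. It suggests that the inequality $\langle g_{\mathrm{ref}}, d \rangle \leq 0$ holds with equality when $\epsilon = 0$, and otherwise up to a small positive residual of order $\epsilon$.

```latex
\begin{aligned}
d &= g_{\mathrm{tar}} - \frac{c}{\|g_{\mathrm{ref}}\|_2^2 + \epsilon}\, g_{\mathrm{ref}}, \\
\langle g_{\mathrm{ref}}, d \rangle
  &= c - \frac{c\,\|g_{\mathrm{ref}}\|_2^2}{\|g_{\mathrm{ref}}\|_2^2 + \epsilon}
   = \frac{c\,\epsilon}{\|g_{\mathrm{ref}}\|_2^2 + \epsilon}, \\
L_{\mathrm{ref}}(\theta + \eta d)
  &\approx L_{\mathrm{ref}}(\theta) + \eta\, \frac{c\,\epsilon}{\|g_{\mathrm{ref}}\|_2^2 + \epsilon}.
\end{aligned}
```

In the non-conflicting case $d = g_{\mathrm{tar}}$ and $\langle g_{\mathrm{ref}}, d \rangle = c \leq 0$ directly, so the reference loss does not increase at first order.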

5. Key Empirical Findings

Empirical evaluations on diverse domains demonstrate that CAPGA methods consistently achieve their formal objectives with practical efficiency:

| Scenario | Benchmark | Key Metrics | CAPGA Performance |
|---|---|---|---|
| Federated Unlearning | CIFAR-10 (non-IID) | R-Acc, U-Acc, runtime, FLOPs | Matches retrain utility; R-Acc = 81.30%, U-Acc = 78.52%, time = 125.39 s, $1.49\times10^{14}$ FLOPs |
| Multi-Task Learning | Census-Income, NYUv2 | Task AUCs, mIoU, Δₘ, mean rank | Outperforms GradNorm, PCGrad, CAGrad; zero conflicts; Pareto front navigable via $\alpha$ |
| Multi-Agent Fairness | Cleanup, Harvest, Coin | $\alpha$-fairness, Gini, Jain's index | Highest min-agent reward (0.84); superior fairness and collective score over all baselines |

CAPGA methods demonstrate stable performance, strong fairness properties, and efficiency improvements over competing approaches such as PCGrad, CAGrad, and inequity aversion (Zhu et al., 5 Mar 2025, Kim et al., 25 Aug 2025, Li et al., 30 Jan 2026).

6. Hyperparameters, Algorithmic Complexity, and Applicability

Typical CAPGA schemes use only standard learning rates (possibly with Robbins–Monro decay), a stability constant (e.g., $\epsilon$), and, optionally, weights ($\beta$, $\alpha$) that tune the trade-off between objectives. The computational overhead is $O(d)$ per update, dominated by gradient calculations and simple vector projections.

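Applicability spans federated unlearning, multi-task learning, and mixed-motive multi-agent reinforcement learning, the three settings formulated in Section 1.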

A plausible implication is that CAPGA’s generality makes it a preferred principle for new gradient-based algorithms wherever real-time conflict resolution among multiple objectives or stakeholder constraints is critical.

CAPGA subsumes or extends several prominent prior approaches:

  • PCGrad [Yu et al., NeurIPS 2020] performs pairwise mutual projections to reduce but not eliminate conflicts; CAPGA and GradOPS ensure “strong” non-conflicting gradients globally (Zhu et al., 5 Mar 2025).
  • CAGrad, GradNorm, and related multi-objective optimizers are outperformed by CAPGA/GradOPS in both empirical utility and conflict resolution (Zhu et al., 5 Mar 2025).
  • CAPGA’s update operators are analytically tractable, and in the multi-task regime, the orthogonal projection structure (onto spans of all other gradients) is critical for simultaneous progress on all tasks.

Recent works demonstrate that, once conflict-free gradients are constructed, simple weighting strategies (exponent $\alpha$ or parameter $\beta$) allow traversal of the Pareto frontier with stable, superior trade-offs.


CAPGA provides a rigorous, flexible, and efficient paradigm for handling conflicts in multi-objective, multi-agent, and unlearning optimization scenarios, with theoretical and empirical evidence supporting its adoption across federated learning, multi-agent systems, and multi-task models (Zhu et al., 5 Mar 2025, Kim et al., 25 Aug 2025, Li et al., 30 Jan 2026).
