Conflict-Aware Projected Gradient Ascent (CAPGA)

Updated 10 March 2026

The paper presents CAPGA's main contribution: using projection techniques to resolve gradient conflicts in multi-task learning, federated unlearning, and multi-agent learning.
It introduces a projection operator that removes the gradient components detrimental to protected objectives, thereby preserving utility and fairness.
Empirical evaluations on CIFAR-10, Census-Income, and fairness benchmarks report efficiency and performance gains over alternative methods, complementing CAPGA's theoretical guarantees.

Conflict-Aware Projected Gradient Ascent (CAPGA) is a class of gradient optimization techniques designed to resolve and manage conflicting objectives in settings such as federated unlearning, multi-agent reinforcement learning, and multi-task learning. CAPGA employs projection-based mechanisms to ensure that parameter updates derived from gradients maintain desirable properties across multiple, and potentially competing, criteria: effective unlearning without utility degradation, fair cooperative agent behavior, or robust multi-task generalization. The core of CAPGA methods is identifying when objectives are in conflict at the level of parametric gradients and then projecting gradient steps to optimize target criteria while enforcing constraints such as utility preservation or fairness.

1. Foundational Problem Formulation

In CAPGA frameworks, optimization is performed over parameter vectors $\theta \in \mathbb{R}^d$ for one or more objectives. Conflicting objectives appear in three main guises:

Federated Unlearning (FedCARE): The objective is to erase the influence of a private “forget” set $\mathcal{D}_u$ by maximizing its loss $L_u(\theta)$, while constraining the reference loss $L_{\mathrm{ref}}(\theta)$ (measured on a synthesized pseudo-dataset) not to increase, thereby preserving shared utility (Li et al., 30 Jan 2026).
Multi-Task Learning (GradOPS): Task-specific gradients $\nabla_\theta L_i(\theta)$ may conflict, causing trade-off instability; the goal is to produce updates that guarantee non-negative progress for all tasks, driving the solution toward a Pareto stationary point (Zhu et al., 5 Mar 2025).
Mixed-Motive Multi-Agent Cooperation: Agents' individual and collective rewards define gradients $g_i = \nabla_\theta J_i(\theta)$ and $g_c = \nabla_\theta J_c(\theta)$; conflicts between them (i.e., a negative inner product) can undermine both social welfare and fairness unless handled by a conflict-aware adjustment (Kim et al., 25 Aug 2025).

A gradient conflict is said to occur between two objective gradients $g_i$ and $g_j$ if $\langle g_i, g_j \rangle < 0$. CAPGA methods generalize this principle to $T > 2$ tasks, establishing a “conflict-free cone” within which no projected update degrades any objective at first order.
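As a minimal illustration of this definition, the sketch below flags conflicting pairs among a set of gradient vectors using NumPy. The function name `conflicting_pairs` and the toy vectors are illustrative assumptions, not part of any cited implementation.

```python
import numpy as np

def conflicting_pairs(grads):
    """Return index pairs (i, j), i < j, whose gradients have a negative inner product."""
    G = np.asarray(grads, dtype=float)   # shape (T, d): one row per objective
    gram = G @ G.T                       # gram[i, j] = <g_i, g_j>
    T = G.shape[0]
    return [(i, j) for i in range(T) for j in range(i + 1, T) if gram[i, j] < 0]

# Toy example (hypothetical values): only g_0 and g_1 have a negative inner product.
grads = [[1.0, 0.0], [-1.0, 0.5], [0.5, 1.0]]
pairs = conflicting_pairs(grads)
```

Computing the full Gram matrix once keeps the check to a single matrix product, which is convenient when the same inner products are reused by a later projection step.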

2. Conflict Detection and Resolution via Projection

Conflict identification is performed at each optimization step by examining inner products between gradients of competing objectives:

Simple Pairwise Case: For two objectives $g_{tar}$ (target, e.g., forgetting) and $g_{ref}$ (reference, e.g., utility), a conflict is present if $\langle g_{tar}, g_{ref} \rangle > 0$ in the unlearning scenario (Li et al., 30 Jan 2026). The sign is reversed relative to the general definition because the update ascends along $g_{tar}$: a positive inner product with $g_{ref}$ means the step would increase the reference loss at first order.
Multi-Task Generalization: In multi-task settings, for each task $i$, the task gradient $g_i$ is projected onto the orthogonal complement of the subspace spanned by all other task gradients $\{g_j\}_{j \neq i}$, so that the projected gradient has a non-negative inner product with every task gradient: $g_i^\perp = (I - G_{-i}(G_{-i}^T G_{-i})^{-1} G_{-i}^T)g_i$, where $G_{-i}$ stacks all other gradients as columns (Zhu et al., 5 Mar 2025).

The projection operator removes only those components of a gradient that adversely affect protected objectives. For two gradients, conflict resolution reduces to a half-space projection; for $T$ tasks, it becomes an orthogonal projection onto the orthogonal complement of the span of the other task gradients.
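The multi-task projection can be sketched as follows. This is an illustrative NumPy version, not the GradOPS reference code: the helper name `project_out_others` is an assumption, and least squares is used in place of the explicit inverse $(G_{-i}^T G_{-i})^{-1}$, which only exists when the other gradients are linearly independent.

```python
import numpy as np

def project_out_others(grads, i):
    """Project g_i onto the orthogonal complement of span{g_j : j != i}."""
    G = np.asarray(grads, dtype=float)      # shape (T, d): one row per task
    g_i = G[i]
    others = np.delete(G, i, axis=0).T      # G_{-i}: shape (d, T-1), other gradients as columns
    # Least squares yields coefficients of the projection of g_i onto span(G_{-i});
    # it also covers the rank-deficient case, where the explicit inverse is undefined.
    coeffs, *_ = np.linalg.lstsq(others, g_i, rcond=None)
    return g_i - others @ coeffs

# Toy gradients (hypothetical values) in R^3.
grads = [[1.0, 2.0, 3.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
g0_perp = project_out_others(grads, 0)
```

In this toy case the other two gradients span the first two coordinate axes, so only the third component of the first gradient survives the projection.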

3. CAPGA Update Rules and Algorithmic Procedures

CAPGA algorithms follow a structured procedure in gradient-based optimization cycles, which is summarized in the following canonical steps and illustrated in pseudocode.

Single-Constraint Case (FedCARE unlearning) (Li et al., 30 Jan 2026):

For target and reference gradients $g_{tar}$ and $g_{ref}$ at iteration $t$:

If $\langle g_{tar}, g_{ref} \rangle \leq 0$, update in the pure forgetting direction: $d = g_{tar}$.
If $\langle g_{tar}, g_{ref} \rangle > 0$, project $g_{tar}$ so that, up to the stabilizer $\epsilon$, $\langle g_{ref}, d \rangle \leq 0$:

$d = g_{tar} - \frac{\max(0, \langle g_{tar}, g_{ref} \rangle)}{\|g_{ref}\|_2^2 + \epsilon} g_{ref}$

Parameter update: $\theta^{(t+1)} = \theta^{(t)} + \eta_{ul} d$.
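As a short check (a derivation from the formula above, not taken from the cited paper), write $c = \langle g_{tar}, g_{ref} \rangle$ and substitute $d$ into the constraint for the conflict case $c > 0$:

$$\langle g_{ref}, d \rangle = c - \frac{c\,\|g_{ref}\|_2^2}{\|g_{ref}\|_2^2 + \epsilon} = \frac{c\,\epsilon}{\|g_{ref}\|_2^2 + \epsilon}$$

The residual is exactly zero for $\epsilon = 0$ and otherwise a small positive quantity that vanishes as $\epsilon \to 0$, so the stabilizer trades a negligible first-order leakage for numerical safety when $\|g_{ref}\|_2$ is close to zero. In the non-conflict case, $d = g_{tar}$ and $\langle g_{ref}, d \rangle = c \leq 0$ holds directly.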

Multi-Objective Case (Multi-Task/Agent) (Zhu et al., 5 Mar 2025, Kim et al., 25 Aug 2025):

For each task $i$, if $g_i$ conflicts with any $g_j$ ($j \neq i$), project as above; else, keep $g_i$.
Aggregate deconflicted gradients (with optional task-specific weighting).
Apply aggregated update to $\theta$ .
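The three steps above can be sketched in NumPy as follows. This is a hedged reading of the procedure, not the published algorithm: "project as above" is interpreted here as the orthogonal-complement projection of Section 2, and the function name `deconflicted_update` and the non-negative `weights` argument are assumptions introduced for illustration.

```python
import numpy as np

def deconflicted_update(grads, weights=None):
    """Deconflict each task gradient, then return the weighted sum of the results."""
    G = np.asarray(grads, dtype=float)              # shape (T, d)
    T = G.shape[0]
    gram = G @ G.T                                  # pairwise inner products
    out = np.empty_like(G)
    for i in range(T):
        others_idx = [j for j in range(T) if j != i]
        if any(gram[i, j] < 0 for j in others_idx):
            # Conflict: remove the component of g_i lying in the span of the other gradients.
            others = G[others_idx].T                # shape (d, T-1)
            coeffs, *_ = np.linalg.lstsq(others, G[i], rcond=None)
            out[i] = G[i] - others @ coeffs
        else:
            out[i] = G[i]                           # No conflict: keep g_i unchanged.
    w = np.ones(T) if weights is None else np.asarray(weights, dtype=float)
    return w @ out                                  # aggregated direction, shape (d,)

# Toy gradients (hypothetical values): g_0 and g_1 conflict, g_2 conflicts with neither.
grads = [[1.0, 0.0, 0.0], [-1.0, 1.0, 0.0], [0.0, 0.5, 1.0]]
update = deconflicted_update(grads)
```

Under this reading, and with non-negative weights, the aggregated direction has a non-negative inner product with every original task gradient: a projected gradient is orthogonal to all other gradients, and an unprojected one had no negative inner products to begin with. The caller then applies the direction with the sign appropriate to the setting (descent on losses, ascent on rewards).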

Pseudocode Example (FedCARE) (Li et al., 30 Jan 2026):

Input: θ⁽⁰⁾, forget set 𝒟ᵤ, ref generator 𝒢, steps T, lr η_ul, batch sizes B_u, B_ref, ε>0.
1. Construct 𝒟_ref from synthesized samples for retained classes.
2. for t=0…T−1
    a. Sample Bᵤ^(t)⊂𝒟ᵤ,  B_ref^(t)⊂𝒟_ref.
    b. Compute g_tar, g_ref.
    c. if ⟨g_tar, g_ref⟩ ≤ 0 then d ← g_tar
       else d ← g_tar − [⟨g_tar,g_ref⟩ / (‖g_ref‖²+ε)]·g_ref
    d. θ ← θ + η_ul·d
3. Return final θ.
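A runnable sketch of steps 2b to 2d is given below, under stated assumptions. Construction of 𝒟_ref and mini-batch sampling (steps 1 and 2a) are omitted and replaced by caller-supplied gradient functions; the names `capga_direction`, `capga_unlearn`, `grad_forget`, and `grad_ref`, as well as the quadratic toy losses, are hypothetical stand-ins rather than the FedCARE implementation.

```python
import numpy as np

def capga_direction(g_tar, g_ref, eps=1e-12):
    """Step c: keep g_tar unless it would increase the reference loss at first order."""
    inner = float(g_tar @ g_ref)
    if inner <= 0.0:
        return g_tar
    return g_tar - inner / (float(g_ref @ g_ref) + eps) * g_ref

def capga_unlearn(theta, grad_forget, grad_ref, steps, lr, eps=1e-12):
    """Steps 2b-2d, with gradient oracles standing in for mini-batch gradients."""
    theta = np.array(theta, dtype=float)
    for _ in range(steps):
        g_tar = grad_forget(theta)   # gradient of the forget loss L_u
        g_ref = grad_ref(theta)      # gradient of the reference loss L_ref
        theta = theta + lr * capga_direction(g_tar, g_ref, eps)  # ascent on L_u
    return theta

# Toy stand-ins (assumptions): quadratic losses with minimizers a (reference) and b (forget).
a, b = np.array([0.0, 0.0]), np.array([-1.0, 0.5])
L_u = lambda th: 0.5 * float(np.sum((th - b) ** 2))
L_ref = lambda th: 0.5 * float(np.sum((th - a) ** 2))
theta0 = np.array([1.0, 1.0])
theta_T = capga_unlearn(theta0, lambda th: th - b, lambda th: th - a, steps=20, lr=0.05)
```

On this toy problem the forget loss rises while the reference loss drifts only through second-order terms, which is the qualitative behavior the pseudocode is designed to produce. In a real model the two oracles would be replaced by backpropagated mini-batch gradients flattened into vectors.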

A generalization for multi-task settings appears in (Zhu et al., 5 Mar 2025), including optional trade-off weighting w.r.t. relative angle and magnitude.

4. Theoretical Guarantees

CAPGA frameworks establish two levels of formal guarantees:

First-Order Utility Protection: By construction, any sufficiently small step in the projected direction $d$ preserves or improves the protected objective at first order, since $L_{\mathrm{ref}}(\theta+\eta d) \approx L_{\mathrm{ref}}(\theta) + \eta \langle g_{\mathrm{ref}}, d \rangle$ with $\langle g_{\mathrm{ref}}, d \rangle \leq 0$ (Li et al., 30 Jan 2026).
Pareto Stationarity and Fairness: In multi-objective settings, repeated CAPGA updates drive all projected gradients toward zero, converging to a Pareto stationary point, i.e., a point at which no direction improves every objective simultaneously at first order (Zhu et al., 5 Mar 2025). For multi-agent fairness, CAPGA guarantees that both individual and collective metrics are monotonically non-decreasing and ensures gap convergence ($|J_i - J_c| \to 0$) under conditions on step sizes and recurrence of conflicts (Kim et al., 25 Aug 2025).
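The first-order protection claim can be checked numerically on a toy problem. The sketch below assumes quadratic losses and $\epsilon = 0$ (both illustrative choices, not from the cited papers) and compares the actual change in $L_{\mathrm{ref}}$ at two step sizes; `projected_direction` is a hypothetical helper name.

```python
import numpy as np

def projected_direction(g_tar, g_ref, eps=0.0):
    """Single-constraint CAPGA direction; eps=0 gives the exact projection."""
    inner = float(g_tar @ g_ref)
    if inner <= 0.0:
        return g_tar
    return g_tar - inner / (float(g_ref @ g_ref) + eps) * g_ref

theta = np.array([1.0, 1.0])
g_ref = theta - np.array([0.0, 0.0])     # gradient of L_ref = 0.5 * ||theta||^2
g_tar = theta - np.array([-1.0, 0.5])    # gradient of L_u = 0.5 * ||theta - b||^2
d = projected_direction(g_tar, g_ref)

L_ref = lambda th: 0.5 * float(th @ th)
first_order = float(g_ref @ d)                              # first-order coefficient of the change
delta = lambda eta: L_ref(theta + eta * d) - L_ref(theta)   # actual change in L_ref
ratio = delta(0.1) / delta(0.05)                            # scaling when the step is halved
```

Here the first-order coefficient vanishes, and halving the step size reduces the change in $L_{\mathrm{ref}}$ by a factor of about four. The remaining drift is therefore a second-order effect, which is why the guarantee is stated for small steps only.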

5. Key Empirical Findings

Empirical evaluations on diverse domains demonstrate that CAPGA methods consistently achieve their formal objectives with practical efficiency:

Scenario	Benchmark	Key Metrics	CAPGA Performance
Federated Unlearning	CIFAR-10 (non-IID)	R-Acc, U-Acc, runtime, FLOPs	Matches retrain utility; R-Acc=81.30%, U-Acc=78.52%, Time=125.39s, $1.49\times10^{14}$ FLOPs.
Multi-Task Learning	Census-Income, NYUv2	Task AUCs, mIoU, Δₘ, mean rank	Outperforms GradNorm, PCGrad, CAGrad; zero conflicts; Pareto front navigable via $\alpha$.
Multi-Agent Fairness	Cleanup, Harvest, Coin	$\alpha$-fairness, Gini, Jain's index	Highest min-agent reward (0.84); superior fairness and collective score over all baselines.

CAPGA methods demonstrate stable performance, strong fairness properties, and efficiency improvements over competing approaches such as PCGrad, CAGrad, and inequity aversion (Zhu et al., 5 Mar 2025, Kim et al., 25 Aug 2025, Li et al., 30 Jan 2026).

6. Hyperparameters, Algorithmic Complexity, and Applicability

Typical CAPGA schemes use only standard learning rates (possibly with Robbins–Monro decay), a stability constant (e.g., $\epsilon$), and, optionally, weights ($\beta$, $\alpha$) that tune the trade-off between objectives. Beyond the gradient computations themselves, the projection overhead is $O(d)$ per update in the single-constraint case, since it requires only a few inner products and one vector update; with $T$ objectives, forming all pairwise inner products costs $O(T^2 d)$.
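A Robbins–Monro style schedule can be sketched as below. The source only names the decay family; the polynomial form and the parameter names `lr0` and `power` are assumptions chosen for illustration.

```python
def robbins_monro_lr(t, lr0=0.1, power=0.75):
    """Step size lr0 / (t + 1)**power.

    For 0.5 < power <= 1 the step sizes sum to infinity while their squares
    have a finite sum, which are the Robbins-Monro conditions.
    """
    if not 0.5 < power <= 1.0:
        raise ValueError("power must lie in (0.5, 1] to satisfy the Robbins-Monro conditions")
    return lr0 / (t + 1) ** power

# First few step sizes of the (hypothetical) schedule.
schedule = [robbins_monro_lr(t) for t in range(5)]
```

Such a schedule would replace the constant $\eta_{ul}$ in the update rule when decaying steps are required by a convergence argument.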

Applicability spans:

Client-, instance-, and class-level unlearning in federated learning with explicit utility-constrained updates (Li et al., 30 Jan 2026).
Multi-task learning with tunable trade-offs among tasks and explicit elimination of inter-task gradient conflicts (Zhu et al., 5 Mar 2025).
Mixed-motive reinforcement learning for provably fair cooperation and social welfare maximization (Kim et al., 25 Aug 2025).

A plausible implication is that CAPGA’s generality makes it a preferred principle for new gradient-based algorithms wherever real-time conflict resolution among multiple objectives or stakeholder constraints is critical.

CAPGA subsumes or extends several prominent prior approaches:

PCGrad [Yu et al., NeurIPS 2020] performs pairwise mutual projections to reduce but not eliminate conflicts; CAPGA and GradOPS ensure “strong” non-conflicting gradients globally (Zhu et al., 5 Mar 2025).
CAGrad, GradNorm, and related multi-objective optimizers are outperformed by CAPGA/GradOPS in both empirical utility and conflict resolution (Zhu et al., 5 Mar 2025).
CAPGA’s update operators are analytically tractable, and in the multi-task regime, the orthogonal projection structure (onto spans of all other gradients) is critical for simultaneous progress on all tasks.

Recent works demonstrate that, once conflict-free gradients are constructed, simple weighting strategies (exponent $\alpha$ or parameter $\beta$) allow traversal of the Pareto frontier with stable, superior trade-offs.

CAPGA provides a rigorous, flexible, and efficient paradigm for handling conflicts in multi-objective, multi-agent, and unlearning optimization scenarios, with clear theoretical and empirical evidence supporting its adoption across federated learning, multi-agent systems, and multi-tasking models (Zhu et al., 5 Mar 2025, Kim et al., 25 Aug 2025, Li et al., 30 Jan 2026).