Convex Adversarial Priors

Updated 25 May 2026

Convex adversarial priors are regularization methods that incorporate convex constraints into training, ensuring uniqueness and stability in inverse problems and robust classification.
They employ techniques like input-convex neural networks, convex relaxations, and SDP formulations to derive provable guarantees against adversarial perturbations.
These methodologies deliver actionable benefits such as certified robustness, quantifiable convergence rates, and improved performance in image reconstruction and adversarial defenses.

Convex adversarial priors are a class of regularization and training methodologies in machine learning and inverse problems that leverage convexity to ensure analytical guarantees, stability, and enhanced robustness to adversarial perturbations. They originate from the intersection of convex optimization, adversarial machine learning, robust statistics, and variational methods, and are instantiated in contexts such as adversarial convex regularization, convex outer adversarial polytopes, semidefinite programming (SDP) relaxations for robust training, and convex adversarial attack frameworks. These approaches introduce convex constraints, convex or input-convex neural architectures, or convex relaxations explicitly as a means to combine the flexibility of data-driven learning with the theoretical strengths of convex analysis, notably uniqueness, stability, and certifiability.

1. Core Principles of Convex Adversarial Priors

The defining feature of convex adversarial priors is the imposition of convex constraints or convex regularization on the modeling or training framework, so that robustness comes with provable guarantees. These priors operate in two principal domains:

Inverse Problems: Convex priors are learned—often with input-convex neural networks (ICNNs)—in a variational framework for image reconstruction, with adversarial training driving the regularizer to distinguish between ground-truth and artifact-laden or unregularized reconstructions (Mukherjee et al., 2020, Mukherjee et al., 2021).
Robust Learning: Network weights, activations, or input perturbations are restricted to convex sets or relaxations (e.g., outer polytopes, spectrahedra), enabling robust min–max training with certificates on performance and adversarial robustness (Wong et al., 2017, Kuelbs et al., 2024, Bai et al., 2021).

Convexity, strengthened to strong convexity where needed, guarantees uniqueness of solutions and stability under data perturbations, and makes it possible to derive provable convergence rates and error bounds. Adversariality is encoded either explicitly (via minimax or worst-case formulations) or implicitly through adversarial training loops.

2. Methodological Frameworks

Convex adversarial priors arise in several concrete frameworks, unified by their use of convex constraints, convex parameterizations, or convex relaxations.

2.1 Convex Regularization in Inverse Problems

Adversarial convex regularization (ACR) employs input-convex neural networks to parameterize the regularizer $R_\theta$ in a variational formulation:

$\hat x = \arg\min_{x\in X} \tfrac12 \|A x - y\|_2^2 + \lambda R_\theta(x),$

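where $A$ is a known (typically linear) forward operator, $y$ is the measured data, $\lambda > 0$ is the regularization weight, and $R_\theta$ is convex in $x$ by construction. The ICNN architecture enforces convexity by constraining the weight matrices that act on hidden activations to be entrywise nonnegative and by using convex, nondecreasing activation functions. The adversarial training objective is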

$\theta^* = \arg\min_\theta \mathbb{E}_{x\sim\pi_X}[R_\theta(x)] - \mathbb{E}_{x_u \sim \pi_{\text{unreg}}}[R_\theta(x_u)],$

subject to 1-Lipschitzness, with Lipschitz constraints enforced by gradient penalties (Mukherjee et al., 2020). This framework admits efficient subgradient-based reconstruction algorithms with guaranteed monotonic error decay, uniqueness of minimizers, and continuous dependence of solutions on the data (Mukherjee et al., 2020, Mukherjee et al., 2021).
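The convexity-by-construction claim can be illustrated with a minimal sketch. It assumes a small fully connected ICNN with softplus activations and skip connections from the input; the helper names (`icnn`, `clip_nonnegative`, `random_params`, `count_midpoint_violations`) and all sizes are hypothetical and are not taken from the cited papers. The sketch only checks midpoint convexity numerically on random pairs and does not implement the Lipschitz penalty or the adversarial loss.

```python
import math
import random


def softplus(t):
    # Convex, nondecreasing activation, written in a numerically stable form.
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def matvec(W, v):
    return [sum(w * vi for w, vi in zip(row, v)) for row in W]


def clip_nonnegative(W):
    # Zero-clipping: project hidden-to-hidden weights onto the nonnegative orthant.
    return [[max(w, 0.0) for w in row] for row in W]


def icnn(x, params):
    """Evaluate R(x) with z_1 = s(V_0 x + b_0), z_{k+1} = s(W_k z_k + V_k x + b_k).

    R(x) = sum(z_last) is convex in x when every W_k is entrywise nonnegative.
    """
    V0, b0, layers = params
    z = [softplus(a + b) for a, b in zip(matvec(V0, x), b0)]
    for W, V, b in layers:
        pre = [p + q + c for p, q, c in zip(matvec(W, z), matvec(V, x), b)]
        z = [softplus(t) for t in pre]
    return sum(z)


def random_params(dim, width, depth, rng):
    def rand_matrix(rows, cols):
        return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    def rand_vector(n):
        return [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Only the hidden-to-hidden matrices are clipped; input skip weights stay free.
    layers = [
        (clip_nonnegative(rand_matrix(width, width)), rand_matrix(width, dim), rand_vector(width))
        for _ in range(depth)
    ]
    return rand_matrix(width, dim), rand_vector(width), layers


def count_midpoint_violations(params, dim, trials, rng):
    # Convexity implies R((x + y) / 2) <= (R(x) + R(y)) / 2 for every pair.
    violations = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 2.0) for _ in range(dim)]
        y = [rng.gauss(0.0, 2.0) for _ in range(dim)]
        mid = [(a + b) / 2 for a, b in zip(x, y)]
        if icnn(mid, params) > 0.5 * (icnn(x, params) + icnn(y, params)) + 1e-9:
            violations += 1
    return violations


rng = random.Random(0)
params = random_params(dim=3, width=4, depth=2, rng=rng)
violations = count_midpoint_violations(params, dim=3, trials=200, rng=rng)
print("midpoint convexity violations:", violations)
```

Leaving the input skip weights unconstrained keeps the network expressive, since affine functions of $x$ preserve convexity regardless of sign; only the weights applied to already-convex hidden features need to be nonnegative.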

2.2 Convex Relaxations for Adversarial Robustness

In adversarially robust classification, convex adversarial priors often take the form of convex relaxations of otherwise nonconvex feasible sets. Notably, for ReLU networks, the convex outer adversarial polytope approach constructs a linear (polyhedral) outer relaxation $\tilde{\mathcal{Z}}_\epsilon(x)$ of the set of activations reachable under norm-bounded adversarial inputs:

$\mathcal{Z}_\epsilon(x) = \{f_\theta(x+\delta) : \|\delta\|_p \leq \epsilon\} \subseteq \tilde{\mathcal{Z}}_\epsilon(x).$

Robust training is then performed by minimizing the worst-case loss over this convex outer region—a linear program whose dual is representable as a backward pass through a "dual network" (Wong et al., 2017). This enables provably robust training with certified upper bounds on adversarial risk.
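To make the relaxation concrete, consider a single ReLU unit $z = \max(0, \hat z)$ whose pre-activation is known to satisfy $l \leq \hat z \leq u$ with $l < 0 < u$ (the symbols $\hat z$, $l$, and $u$ are notation introduced here for illustration). The standard per-unit relaxation replaces the nonconvex graph of the ReLU by its convex hull, which is described by three linear inequalities:

$z \geq 0, \quad z \geq \hat z, \quad (u - l)\, z \leq u\,(\hat z - l).$

Units with $u \leq 0$ or $l \geq 0$ are exactly linear ($z = 0$ or $z = \hat z$, respectively) and need no relaxation. Applying these constraints to every unit yields a polyhedral outer set, which is why the worst-case loss over it is a linear program.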

Semidefinite program (SDP) relaxations for two-layer (polynomial or ReLU) networks lift the nonconvex training problem to an SDP over positive semidefinite matrices encoding adversarial margin constraints. The resulting spectrahedral feasible set imposes a convex adversarial prior over network weights, enforcing flatness and low-rank structure via the SDP's nuclear norm term (Kuelbs et al., 2024).

2.3 Convex Programming for Adversarial Attacks and Defenses

Convex adversarial priors are also leveraged to design adversarial attacks via perturbation analysis:

The generic first-order adversarial subproblem, which minimizes a linearized "fooling loss" under convex constraints on the perturbation $\delta$, admits closed-form solutions for common norm balls and extends directly to structured or group-sparse priors (Balda et al., 2018).
Exploiting this, robust training is extended to cover arbitrary convex priors, supplying both tractable generation of adversarial examples and the option for certified lower bounds on adversarial loss.
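The closed-form solutions mentioned above can be sketched for the three most common norm balls. The sketch assumes the linearized subproblem has the form $\min_\delta g^\top \delta$ subject to $\|\delta\|_p \leq \epsilon$, where $g$ is the gradient of the fooling loss; the function names and the example gradient are hypothetical and are not taken from Balda et al. In each case the optimal value equals $-\epsilon$ times the dual norm of $g$.

```python
import math


def linf_step(g, eps):
    # l_inf ball: move every coordinate by eps against the sign of the gradient.
    return [-eps * ((gi > 0) - (gi < 0)) for gi in g]


def l2_step(g, eps):
    # l_2 ball: move a distance eps along the negative normalized gradient.
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return [0.0] * len(g)
    return [-eps * gi / norm for gi in g]


def l1_step(g, eps):
    # l_1 ball: spend the whole budget on the coordinate with the largest |g_i|.
    j = max(range(len(g)), key=lambda i: abs(g[i]))
    delta = [0.0] * len(g)
    delta[j] = -eps * math.copysign(1.0, g[j])
    return delta


def linear_value(g, delta):
    return sum(gi * di for gi, di in zip(g, delta))


g = [3.0, -4.0, 0.0]  # hypothetical gradient of the linearized fooling loss
eps = 0.5
for name, step in [("l_inf", linf_step), ("l_2", l2_step), ("l_1", l1_step)]:
    delta = step(g, eps)
    print(name, delta, linear_value(g, delta))
```

Structured priors such as group sparsity follow the same pattern: the minimizer is obtained from the dual norm of the constraint set, so swapping the prior only swaps the step function.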

3. Theoretical Guarantees and Properties

A foundational motivation for convex adversarial priors is the suite of analytical guarantees they enable:

Unique Minimizers and Stability: Strong convexity of the regularizer (possibly with additional quadratic terms) yields unique minimizers, continuous dependence of reconstruction on noisy inputs, and convergence of the solution to the true value as noise diminishes and the penalty parameter is decayed appropriately (Mukherjee et al., 2020, Mukherjee et al., 2021).
Provable Robustness and Certificates: In convex relaxations of adversarial training problems, dual representations yield guaranteed upper bounds on the worst-case adversarial loss (equivalently, lower bounds on the worst-case classification margin). For each test example, an explicit certificate can be constructed, e.g., by checking whether the dual lower bound on the margin exceeds zero for a given perturbation size, which certifies the prediction as robust (Wong et al., 2017, Kuelbs et al., 2024).
Quantitative Convergence: Under a variational source condition (SC), convergence rates for reconstruction are derived in Bregman distance, where $\delta$ here denotes the noise level in the data (not an adversarial perturbation) and $\tilde w$ is the source element:

$D_{R_\theta}(x_\lambda, \tilde x) \leq \tfrac12 \lambda \|\tilde w\|^2 + \frac{\delta^2}{2\lambda}, \quad \text{with} \ \lambda \sim \delta.$

Thus, $D_{R_\theta}(x_\lambda, \tilde x) = \mathcal{O}(\delta)$, and norm-error rates are derivable under additional assumptions (Mukherjee et al., 2021).
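The step from the bound to the $\mathcal{O}(\delta)$ rate can be written out. Taking $\lambda = c\,\delta$ for a constant $c > 0$ (an illustrative parameter choice, not one prescribed by the cited work) and substituting into the bound gives

$D_{R_\theta}(x_\lambda, \tilde x) \leq \tfrac12 c\,\delta\,\|\tilde w\|^2 + \frac{\delta^2}{2 c\,\delta} = \left(\tfrac{c}{2}\|\tilde w\|^2 + \tfrac{1}{2c}\right)\delta.$

The prefactor is minimized at $c = 1/\|\tilde w\|$, for which the right-hand side becomes $\|\tilde w\|\,\delta$. Any fixed $c$ yields the same linear rate in $\delta$; only the constant changes.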

4. Algorithms and Practical Implementation

Across domains, convex adversarial priors are implemented via principled algorithms:

Adversarial Training of Convex Regularizers: Alternating stochastic optimization steps are deployed to minimize adversarial losses subject to convexity constraints—using Adam or similar, with zero-clipping of negative weights in ICNNs (Mukherjee et al., 2020).
Robust Optimization via Convex Programs: Convex relaxations (LP, SOCP, SDP) are solved exactly at moderate scale or approximately via stochastic pattern sampling in large problems (Bai et al., 2021). Randomization mitigates the exponential cost of enumerating all activation patterns in ReLU networks.
Provable Defenses: Mini-batch-based schemes compute layerwise pre-activation bounds and backpropagate dual variables to assemble robust loss functions integrable with standard stochastic gradient descent frameworks (Wong et al., 2017).
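The layerwise bounding and certification step can be sketched with plain interval arithmetic. This is a simpler and generally looser stand-in for the dual-network bounds of Wong et al., shown only to illustrate the idea of propagating pre-activation bounds and checking a margin; the two-layer network, its weights, and the function names are hypothetical.

```python
def affine_bounds(W, b, lo, hi):
    """Propagate elementwise bounds lo <= v <= hi through v -> W v + b."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        lower = upper = bias
        for w, a, c in zip(row, lo, hi):
            # A positive weight takes its minimum at the lower input bound,
            # a negative weight at the upper input bound (and vice versa).
            lower += w * a if w >= 0 else w * c
            upper += w * c if w >= 0 else w * a
        out_lo.append(lower)
        out_hi.append(upper)
    return out_lo, out_hi


def certified_margin(W1, b1, W2, b2, x, label, eps):
    """Lower-bound min_j (logit_label - logit_j) over the l_inf ball of radius eps."""
    lo, hi = [xi - eps for xi in x], [xi + eps for xi in x]
    pre_lo, pre_hi = affine_bounds(W1, b1, lo, hi)
    # ReLU is monotone, so it can be applied to the bounds directly.
    act_lo = [max(v, 0.0) for v in pre_lo]
    act_hi = [max(v, 0.0) for v in pre_hi]
    worst = float("inf")
    for j in range(len(W2)):
        if j == label:
            continue
        diff_w = [[wy - wj for wy, wj in zip(W2[label], W2[j])]]
        diff_b = [b2[label] - b2[j]]
        margin_lo, _ = affine_bounds(diff_w, diff_b, act_lo, act_hi)
        worst = min(worst, margin_lo[0])
    return worst


# Hypothetical two-layer ReLU classifier and input.
W1, b1 = [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]
W2, b2 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
x, label = [1.0, 0.0], 0

for eps in (0.1, 0.6):
    bound = certified_margin(W1, b1, W2, b2, x, label, eps)
    print(f"eps={eps}: margin lower bound {bound:.3f}, certified={bound > 0}")
```

A positive lower bound certifies that no perturbation in the ball changes the prediction; a nonpositive bound is inconclusive in general, because the relaxation may be loose. In training, a loss built from such bounds replaces the standard loss, so the certificate is optimized directly with stochastic gradient descent.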

The following table summarizes typical convex adversarial frameworks and their properties:

Domain	Convex Prior Type	Key Theoretical Guarantee
Inverse Problems	ICNN-based regularizer	Uniqueness, stability, $\mathcal{O}(\delta)$ convergence rates
Robust Classification	Polytope/SDP relaxations	Certified robustness under norm-bounded attacks
Adversarial Attack Design	Convex constraint on $\delta$	Closed-form and efficient solution, extensibility

5. Empirical Results and Applications

Empirical studies validate the effectiveness of convex adversarial priors across distinct modalities:

In limited-angle CT, adversarially learned convex regularizers yield PSNR ≈ 27.0 dB, outperforming total variation (≈25.7 dB) and non-convex alternatives (≈23.6 dB), and successfully suppress artifact amplification (Mukherjee et al., 2020).
In MNIST denoising, regularizers learned with source condition penalties (ACR-SC) achieve PSNR = 22.72 dB via gradient descent and 20.29 dB via Bregman iteration, substantially outperforming naive baselines and matching ACR, but with provable convergence rates (Mukherjee et al., 2021).
In robust neural network training, the convex SDP adversarial formulation preserves clean accuracy while substantially improving adversarial accuracy (e.g., ≈76% under strong attacks versus ≈15% for nonrobust baselines) (Kuelbs et al., 2024).
On real datasets (CIFAR-10, UCI tasks), convex methods for one-hidden-layer networks outperform standard and state-of-the-art robust training (FGSM/PGD), especially in adversarial transfer regimes (Bai et al., 2021).

6. Variants and Extensions

Convex adversarial priors are extensible to a variety of threat models and domains:

Generalized Priors: The flexibility to introduce arbitrary convex priors on perturbations or weights allows custom tailoring, including box constraints, total variation, group sparsity, and structured spectral constraints (Balda et al., 2018).
Source-Condition Regularized Learning: Explicit penalties enforcing the variational source condition during adversarial training enable quantifiable convergence rates in inverse tasks (Mukherjee et al., 2021).
Scalability: Approaches such as stochastic pattern-sampling reduce the computational overhead from exponential to linear in problem dimension, with high-confidence approximation guarantees (Bai et al., 2021).
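The pattern-sampling idea can be sketched as follows. The sketch assumes that activation patterns are drawn by projecting the data onto random Gaussian directions and recording which samples land on the nonnegative side; the function name, the toy data matrix, and the sample count are hypothetical, and the convex program that would consume the sampled patterns is omitted.

```python
import random


def sample_activation_patterns(X, num_samples, rng):
    """Collect distinct ReLU activation patterns 1[X u >= 0] for random Gaussian u."""
    dim = len(X[0])
    patterns = set()
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        pattern = tuple(
            int(sum(xi * ui for xi, ui in zip(row, u)) >= 0.0) for row in X
        )
        patterns.add(pattern)
    return sorted(patterns)


# Hypothetical data: six samples in two dimensions.
X = [[1.0, 0.2], [0.5, -1.0], [-0.3, 0.8], [-1.0, -0.4], [0.7, 0.9], [0.1, -0.6]]
patterns = sample_activation_patterns(X, num_samples=500, rng=random.Random(0))
print(len(patterns), "distinct patterns out of", 2 ** len(X), "sign vectors")
```

For $n$ samples in two dimensions at most $2n$ patterns are realizable, far fewer than the $2^n$ possible sign vectors, which illustrates why working with a sampled subset of patterns can keep the convex program tractable.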

A plausible implication is that as convex relaxations, input-convex neural architectures, and spectrahedral constraints become more computationally tractable, convex adversarial priors will broaden the scope of methodologies combining data adaptivity, provable robustness, and rigorous performance guarantees across high-dimensional tasks.