
Universal Approximation of Partial Convex Functions

Updated 27 May 2026
  • Universal Approximation for Partial Convex Functions defines a framework in which neural architectures approximate continuous, structured, and partially convex functions on compact domains.
  • Parameterized methods such as PMA networks, GroupMax architectures, and ICNNs enforce convexity through finite supports, supporting hyperplanes, and lifting techniques.
  • Practical applications span optimization in auction design and computational mechanics, where theoretical guarantees translate into accurate function and gradient estimates.

A universal approximation theorem for partial convex functions addresses the ability of specific parameterizations, often realized via neural network architectures, to approximate any function in a prescribed class of partially convex (or “generalized convex”) functions to any specified accuracy with respect to the uniform norm. These theorems have been formulated in a variety of mathematical contexts, including generalized convexity relative to coupling functions, parameterized convexity in optimization and machine learning, as well as polyconvexity in the calculus of variations and computational mechanics. Recent research provides multiple constructive approaches and universal approximation guarantees for partial (i.e., coordinate-wise or structured) convexity.

1. Definitions and Characterizations of Partial/Generalized Convexity

Let $X\subset\mathbb{R}^n$ and $Y\subset\mathbb{R}^k$ be compact sets. A surplus (or coupling) function $\varphi : X \times Y \to \mathbb{R}$ is a continuous (typically locally Lipschitz) function. Generalized convexity (also termed $Y$-convexity or partial convexity in $x$) is defined through a transform framework:

  • $X$-transform ($f^x$):

$$f^x(y) = \sup_{x\in X} \left[ \varphi(x,y) - f(x) \right]$$

  • $Y$-transform ($g^y$):

$$g^y(x) = \sup_{y\in Y} \left[ \varphi(x,y) - g(y) \right]$$

A function $f : X \to \mathbb{R}$ is $Y$-convex if there exists $g : Y \to \mathbb{R}$ such that $f(x) = g^y(x)$ for all $x \in X$. When $\varphi(x,y) = \langle x, y \rangle$, classical convexity is recovered. Partial convexity refers to functions of several groups of variables that are convex in only some of them, or to relative convexity notions such as polyconvexity (convexity after a nonlinear lifting, as in elasticity) (Nehzati, 30 Aug 2025, Kim et al., 2022, Geuken et al., 12 Feb 2025).
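The transform definitions can be checked numerically on grids. The following is a minimal sketch, not code from the cited works: the grids `xs` and `ys`, the bilinear coupling, and the two test functions are illustrative assumptions. With the bilinear coupling, applying the $X$-transform and then the $Y$-transform reproduces a convex function and returns a strictly smaller function for a non-convex one.

```python
import numpy as np

def x_transform(f_vals, phi):
    """Grid version of f^x(y_j) = max_i [phi[i, j] - f_vals[i]]."""
    return np.max(phi - f_vals[:, None], axis=0)

def y_transform(g_vals, phi):
    """Grid version of g^y(x_i) = max_j [phi[i, j] - g_vals[j]]."""
    return np.max(phi - g_vals[None, :], axis=1)

# Hypothetical 1-D grids standing in for the compact sets X and Y.
xs = np.linspace(-1.0, 1.0, 201)
ys = np.linspace(-2.0, 2.0, 401)

# Bilinear coupling phi(x, y) = x * y (assumption for this demo):
# the transforms reduce to Legendre-Fenchel conjugation.
phi = xs[:, None] * ys[None, :]

f_convex = xs ** 2               # convex; its slopes lie in [-2, 2], covered by ys
f_nonconvex = np.cos(3.0 * xs)   # not convex on [-1, 1]

# Double transform (f^x)^y: equals f exactly when f is Y-convex.
double_convex = y_transform(x_transform(f_convex, phi), phi)
double_nonconvex = y_transform(x_transform(f_nonconvex, phi), phi)

err_convex = float(np.max(np.abs(double_convex - f_convex)))
gap_nonconvex = float(np.max(f_nonconvex - double_nonconvex))
print(f"convex case error: {err_convex:.2e}")
print(f"non-convex case gap: {gap_nonconvex:.2f}")
```

The double transform never exceeds the original function, so a positive gap signals that the function is not $Y$-convex for the chosen coupling.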

2. Universal Approximation Theorems for Partial Convexity

Universal approximation theorems for partial convexity rigorously assert the existence of specific parameterized function classes that are dense in the set of all continuous partial convex functions (under the uniform norm).

Define the set of all $Y$-convex functions:

$$\mathcal{F}_Y = \{ f : X \to \mathbb{R} \;:\; f = g^y \text{ for some } g : Y \to \mathbb{R} \}$$

A finitely $Y$-convex function (i.e., a maximum over a finite support) is

$$f_{S,w}(x) = \max_{y \in S} \left[ \varphi(x,y) - w_y \right]$$

for $S \subset Y$ finite, $w \in \mathbb{R}^{S}$.

The main result establishes that:

  • The class of finitely $Y$-convex functions $f_{S,w}$ is dense in $\mathcal{F}_Y$.
  • For every $f \in \mathcal{F}_Y$ and every $\varepsilon > 0$, there exists $f_{S,w}$ such that

$$\sup_{x \in X} \left| f(x) - f_{S,w}(x) \right| < \varepsilon$$
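The density statement can be illustrated with a small experiment. This is a sketch under stated assumptions rather than the construction used in the cited paper: the coupling `phi`, the function `g`, and the grids are hypothetical choices, and a dense grid serves as a proxy for the full supremum over $Y$. The finite supports are nested, so the sup-norm error cannot increase as the support grows.

```python
import numpy as np

def phi(x, y):
    """Illustrative coupling (an assumption): phi(x, y) = -(x - y)^2."""
    return -(x - y) ** 2

def finitely_convex(x, support, weights):
    """Evaluate max over y in the finite support of phi(x, y) - w_y for each x."""
    return np.max(phi(x[:, None], support[None, :]) - weights[None, :], axis=1)

def g(y):
    """Hypothetical function on Y whose Y-transform is the target."""
    return 0.5 * np.sin(4.0 * y)

xs = np.linspace(0.0, 1.0, 301)
ys_fine = np.linspace(0.0, 1.0, 1025)

# Dense-grid proxy for the target f(x) = sup_y [phi(x, y) - g(y)].
f_target = finitely_convex(xs, ys_fine, g(ys_fine))

errors = []
for stride in (256, 64, 16):          # nested supports with 5, 17, 65 points
    support = ys_fine[::stride]
    f_hat = finitely_convex(xs, support, g(support))
    errors.append(float(np.max(f_target - f_hat)))

print(errors)
```

On $[0,1]$ both `phi` (in $y$) and `g` have Lipschitz constant at most 2, so a support with spacing $1/64$ gives an error of at most $4 \cdot (1/128)$ in this example.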

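For a continuous function $f : X \times U \to \mathbb{R}$, with $x \in X$ ("condition") and $u \in U$ ("decision"), parameterized convexity means that

$$f(x, \cdot) \text{ is convex on } U \text{ for every } x \in X.$$

The theorem states that parameterized max-affine (PMA) or log-sum-exp (PLSE) networks of the form

$$f^{\mathrm{PMA}}(x,u) = \max_{1 \le i \le I} \left[ a_i(x)^\top u + b_i(x) \right], \qquad f^{\mathrm{PLSE}}(x,u) = T \log \sum_{i=1}^{I} \exp\!\left( \frac{a_i(x)^\top u + b_i(x)}{T} \right)$$

can approximate any continuous parameterized convex function to arbitrary precision on compact $X \times U$. Here, $a_i$ and $b_i$ are continuous functions of $x$, and $T > 0$ is a temperature parameter.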
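A minimal NumPy sketch of the two network forms follows. It is not the reference implementation of the cited work: the random tanh network that produces the slopes and offsets, the dimensions, and the temperature `T` are all illustrative assumptions. The sketch checks midpoint convexity in the decision variable and the standard bound that the log-sum-exp value exceeds the max-affine value by at most `T * log(I)`.

```python
import numpy as np

rng = np.random.default_rng(0)
I, dx, du, H = 8, 2, 3, 16   # pieces, condition dim, decision dim, hidden width (illustrative)

# A small tanh network maps the condition x to slopes a_i(x) and offsets b_i(x).
W1 = rng.normal(size=(H, dx))
c1 = rng.normal(size=H)
W2 = rng.normal(size=(I * (du + 1), H))
c2 = rng.normal(size=I * (du + 1))

def affine_pieces(x):
    """Return slopes a of shape (I, du) and offsets b of shape (I,), continuous in x."""
    h = np.tanh(W1 @ x + c1)
    out = (W2 @ h + c2).reshape(I, du + 1)
    return out[:, :du], out[:, du]

def pma(x, u):
    """Parameterized max-affine value: max_i [a_i(x) . u + b_i(x)]."""
    a, b = affine_pieces(x)
    return np.max(a @ u + b)

def plse(x, u, T=0.1):
    """Parameterized log-sum-exp value, computed with a max shift for stability."""
    a, b = affine_pieces(x)
    z = (a @ u + b) / T
    m = np.max(z)
    return T * (m + np.log(np.sum(np.exp(z - m))))

x = rng.normal(size=dx)
u1, u2 = rng.normal(size=du), rng.normal(size=du)
mid = 0.5 * (u1 + u2)

# Midpoint convexity in u for a fixed condition x (both gaps should be >= 0).
convex_gap_pma = 0.5 * (pma(x, u1) + pma(x, u2)) - pma(x, mid)
convex_gap_plse = 0.5 * (plse(x, u1) + plse(x, u2)) - plse(x, mid)

# Smoothing gap: 0 <= PLSE - PMA <= T * log(I).
smoothing_gap = plse(x, mid) - pma(x, mid)
print(convex_gap_pma, convex_gap_plse, smoothing_gap)
```

Convexity in $u$ holds for any choice of the weights because, for fixed $x$, both forms are a maximum or a log-sum-exp of affine functions of $u$; no sign constraints are needed on the network that maps $x$ to the pieces.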

The GroupMax architecture provides a different construction using grouped max operators and ReLU layers. For a function $f(x,y)$, convex in $y$, the extended GroupMax network is able to approximate $f$ uniformly on compact sets while preserving convexity in $y$. The universal approximation theorem guarantees that, for any $\varepsilon > 0$, there exists a GroupMax network $\hat{f}$ such that

$$\left| f(x,y) - \hat{f}(x,y) \right| \le \varepsilon$$

for all $(x,y)$ in the prescribed domain, and $\hat{f}(x,\cdot)$ is convex for all $x$.

In applications such as isotropic hyperelasticity, polyconvex energy densities $W(F)$ factor as convex functions $\Psi$ of the seven basic monomials of the signed singular values of the deformation gradient $F$. Input Convex Neural Networks (ICNNs), symmetrized to enforce isotropy and frame-indifference, can approximate any such energy to arbitrary accuracy on compact sets.
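The lifting idea can be sketched with a two-layer ICNN acting on the monomials. This is a simplified illustration, not the architecture of the cited paper: the listed monomials (the three values, their pairwise products, and the triple product) are an assumed reading of "seven basic monomials", the weights are random, and the symmetrization averages over permutations only, omitting any sign-related symmetry of the signed singular values.

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(1)

def lift(nu):
    """Assumed lifting: seven monomials of the signed singular values."""
    n1, n2, n3 = nu
    return np.array([n1, n2, n3, n1 * n2, n1 * n3, n2 * n3, n1 * n2 * n3])

d, H = 7, 12
W0 = rng.normal(size=(H, d))
b0 = rng.normal(size=H)
Wz = np.abs(rng.normal(size=(H, H)))   # nonnegative hidden-to-hidden weights
Wx = rng.normal(size=(H, d))           # unconstrained skip connection from the input
b1 = rng.normal(size=H)
wz = np.abs(rng.normal(size=H))        # nonnegative output weights
wx = rng.normal(size=d)

def softplus(t):
    """Convex, nondecreasing activation."""
    return np.logaddexp(0.0, t)

def icnn(m):
    """Two-layer ICNN: convex in the lifted input m."""
    z1 = softplus(W0 @ m + b0)
    z2 = softplus(Wz @ z1 + Wx @ m + b1)
    return wz @ z2 + wx @ m

def energy(nu):
    """Average the ICNN over all permutations of the singular values."""
    nu = np.asarray(nu)
    return np.mean([icnn(lift(nu[list(p)])) for p in permutations(range(3))])

# Midpoint convexity of the ICNN in the lifted variables.
m1, m2 = rng.normal(size=d), rng.normal(size=d)
convex_gap = 0.5 * (icnn(m1) + icnn(m2)) - icnn(0.5 * (m1 + m2))

# Permutation invariance of the symmetrized energy.
nu = np.array([0.9, 1.1, 1.3])
perm_gap = abs(energy(nu) - energy(nu[[2, 0, 1]]))
print(convex_gap, perm_gap)
```

Permuting the singular values only permutes coordinates of the lifted vector, so each term in the average remains convex in the lifted variables and the averaged energy inherits that convexity.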

3. Parameterization Strategies That Achieve Universality

A variety of parameterizations implement the universal approximation property:

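| Parameterization | Key Structure | Convexity Guarantee |
|---|---|---|
| Max-over-finite-support | $\max_{y \in S} [\varphi(x,y) - w_y]$, $S \subset Y$ finite | Generalized convex in $x$ (or $y$) |
| PMA/PLSE Networks | $\max_i [a_i(x)^\top u + b_i(x)]$, $T \log \sum_i \exp([a_i(x)^\top u + b_i(x)]/T)$ | Convex in $u$ for each $x$ |
| GroupMax Architecture | Layered grouped-max operator | Parameter flexibility, convexity |
| ICNN on lifted variables and symmetrized | Convex in lifted variables | Polyconvexity, isotropy, frame-indifference |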

These parameterizations enable explicit control of the convexity structure by restricting the optimization space to dual weights, coefficients, or neural net layers that are composed or symmetrized to enforce partial convexity constraints (Nehzati, 30 Aug 2025, Kim et al., 2022, Geuken et al., 12 Feb 2025, Warin, 2022).

4. Proof Strategies and Key Theoretical Mechanisms

Most universal approximation results are constructive, relying on the following elements:

  • Support Reduction: Arbitrary suprema (or integrals) over possibly infinite supports are approximated by maximization over finite $\varepsilon$-nets (balls of radius $\varepsilon$), using uniform continuity and Lipschitz properties of the surplus/coupling function (Nehzati, 30 Aug 2025).
  • Supporting Hyperplane Construction: For each fixed value of the non-convex variable, the convex slice is approximated by a finite collection of supporting hyperplanes (majorants/minorants) (Kim et al., 2022, Warin, 2022).
  • Continuous Parameter Selection: When the parameters of affine slices depend on a “conditioning” variable, continuous selection and standard neural network UAT results for vector-valued functions are applied (Kim et al., 2022, Warin, 2022).
  • Lifting/Invariance: Lifting variables (e.g., signed singular values and monomials in polyconvexity) reduce the structure to classical convexity, after which standard ICNN UATs and group-averaging recover invariance properties (Geuken et al., 12 Feb 2025).

Proofs universally require that the target function is continuous, the relevant subdomains are compact, and the surplus/lifting functions satisfy regularity conditions (usually local Lipschitz, convexity, or invariance).
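The support-reduction step admits a short worked bound. The derivation below is a sketch under an added assumption, namely that $\varphi(x,\cdot)$ is $L$-Lipschitz on $Y$ uniformly in $x$; it is not claimed to be the argument of the cited papers. Here $f$ is $Y$-convex, $g = f^x$ (so that $f = g^y$), $S \subset Y$ is a finite $\varepsilon$-net, and $f_S$ denotes the maximum restricted to $S$.

```latex
% Step 1: g = f^x inherits the Lipschitz constant of \varphi in y.
g(y) - g(y') \;\le\; \sup_{x \in X} \bigl[ \varphi(x,y) - \varphi(x,y') \bigr] \;\le\; L \, \lVert y - y' \rVert .

% Step 2: restricting the supremum to S can only decrease the value.
f_S(x) \;:=\; \max_{s \in S} \bigl[ \varphi(x,s) - g(s) \bigr] \;\le\; \sup_{y \in Y} \bigl[ \varphi(x,y) - g(y) \bigr] \;=\; f(x) .

% Step 3: for any y \in Y choose s \in S with \lVert y - s \rVert \le \varepsilon.
\varphi(x,y) - g(y) \;\le\; \bigl[ \varphi(x,s) + L\varepsilon \bigr] - \bigl[ g(s) - L\varepsilon \bigr] \;\le\; f_S(x) + 2L\varepsilon .

% Conclusion: taking the supremum over y in Step 3 and combining with Step 2,
0 \;\le\; f(x) - f_S(x) \;\le\; 2L\varepsilon \qquad \text{for all } x \in X .
```

Since a compact $Y$ admits a finite $\varepsilon$-net for every $\varepsilon > 0$, the bound gives uniform approximation by finitely supported maxima, although the size of the net, and hence the parameter count, grows as $\varepsilon$ shrinks.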

5. Approximation of Gradients

Beyond function approximation, several theorems extend universality to gradients, which is essential in optimization and control:

  • For generalized convex functions (GCFs), if $\varphi$ is semiconvex in $x$, then all $Y$-convex functions $f$ are semiconvex, and uniform approximation in the sup-norm implies uniform convergence of gradients wherever gradients exist. The finitely supported parameterized family is dense in the class of $Y$-convex functions (Nehzati, 30 Aug 2025).
  • For parameterized convex approximators, smoothness properties (e.g., in PLSE) further facilitate efficient and accurate gradient-based optimization (Kim et al., 2022).

A plausible implication is that these architectures allow for not just function-value but derivative-level fidelity, making them directly relevant for bilevel optimization and mechanism design tasks where derivative information is operationally leveraged.
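For the log-sum-exp form, the gradient with respect to the decision variable has a closed form: it is the softmax-weighted average of the slopes. The sketch below uses small hand-picked slopes `a`, offsets `b`, and a point `u` (all hypothetical) for a fixed condition, compares the closed form with central finite differences, and checks that at a low temperature the gradient approaches the slope of the active max-affine piece.

```python
import numpy as np

def plse_value_and_grad(a, b, u, T):
    """Value and u-gradient of T * log(sum_i exp((a_i . u + b_i) / T))."""
    s = (a @ u + b) / T
    m = np.max(s)
    e = np.exp(s - m)                 # shifted exponentials for numerical stability
    value = T * (m + np.log(e.sum()))
    grad = (e / e.sum()) @ a          # softmax-weighted average of the slopes a_i
    return value, grad

# Hypothetical slopes and offsets for one fixed condition x.
a = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([0.0, 0.5, -0.2])
u = np.array([0.3, -0.4])             # affine values here: 0.3, 0.1, -0.1

# Central finite differences at a moderate temperature.
T, h = 0.5, 1e-6
_, grad = plse_value_and_grad(a, b, u, T)
fd = np.array([
    (plse_value_and_grad(a, b, u + h * e_k, T)[0]
     - plse_value_and_grad(a, b, u - h * e_k, T)[0]) / (2 * h)
    for e_k in np.eye(2)
])
fd_error = float(np.max(np.abs(grad - fd)))

# At low temperature the gradient concentrates on the active piece.
_, grad_cold = plse_value_and_grad(a, b, u, 0.01)
active = int(np.argmax(a @ u + b))
cold_error = float(np.max(np.abs(grad_cold - a[active])))
print(fd_error, cold_error)
```

The closed-form gradient is what makes the smooth variant convenient for gradient-based optimization over the decision variable, whereas the max-affine form is only piecewise differentiable.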

6. Practical Implementations and Empirical Results

Numerical experiments in recent works validate the theoretical findings:

  • In multi-item auction revenue maximization, the finitely-supported max-parametrization enables mechanisms that match optimal revenue benchmarks to numerical precision in the multi-good settings reported, as implemented in the Python package gconvex (Nehzati, 30 Aug 2025).
  • For parameterized convex approximation, PLSE networks outperform classic feedforward neural networks and other convex-preserving architectures in both minimizer error and value error across multiple high-dimensional settings (Kim et al., 2022).
  • GroupMax architectures demonstrate competitive mean squared error on prototypical partially convex test functions, with increasing depth and number of grouped cuts improving approximation accuracy (Warin, 2022).
  • In computational mechanics, the symmetrized ICNN class is dense in the set of all frame-indifferent, isotropic polyconvex energies on compact sets of admissible deformation gradients (Geuken et al., 12 Feb 2025).

7. Limitations, Assumptions, and Scope of Universality

All universal approximation theorems for partial convexity share the following assumptions and limits of scope:

  • Compactness of domains and continuity of the target function.
  • Convexity structure enforced in the prescribed argument group (or after lifting).
  • For polyconvexity, invariance and structure must match the symmetrization/lifting scheme.
  • Growth conditions outside the approximation set or local approximation behavior are not generally covered.
  • In numerical implementations, growth constraints or boundary conditions may require explicit architectural augmentation.

These theorems do not explicitly guarantee convergence rates as a function of network width or depth; rather, they guarantee denseness in the functional topology. Empirical findings indicate practical convergence with reasonable architectural choices, but no general complexity bounds are provided (Nehzati, 30 Aug 2025, Kim et al., 2022, Geuken et al., 12 Feb 2025, Warin, 2022).
