Universal Approximation of Partial Convex Functions
- Universal Approximation for Partial Convex Functions defines a framework in which neural architectures approximate continuous, structured, and partially convex functions on compact domains.
- Parameterized methods such as PMA networks, GroupMax architectures, and ICNNs enforce convexity through finite supports, supporting hyperplanes, and lifting techniques.
- Practical applications span optimization in auction design and computational mechanics, where theoretical guarantees translate into accurate function and gradient estimates.
A universal approximation theorem for partial convex functions addresses the ability of specific parameterizations, often realized via neural network architectures, to approximate any function in a prescribed class of partially convex (or “generalized convex”) functions to any specified accuracy with respect to the uniform norm. These theorems have been formulated in a variety of mathematical contexts, including generalized convexity relative to coupling functions, parameterized convexity in optimization and machine learning, as well as polyconvexity in the calculus of variations and computational mechanics. Recent research provides multiple constructive approaches and universal approximation guarantees for partial (i.e., coordinate-wise or structured) convexity.
1. Definitions and Characterizations of Partial/Generalized Convexity
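Let $X$ and $Y$ be compact sets. A surplus (or coupling) function $\Phi: X \times Y \to \mathbb{R}$ is a continuous (typically locally Lipschitz) function. Generalized convexity (also termed $\Phi$-convexity or partial convexity in $x$) is defined through a transform framework:
- $\Phi$-transform of $f: X \to \mathbb{R}$ ($f^{\Phi}: Y \to \mathbb{R}$): $f^{\Phi}(y) = \sup_{x \in X} \{\Phi(x,y) - f(x)\}$
- $\Phi$-transform of $g: Y \to \mathbb{R}$ ($g^{\Phi}: X \to \mathbb{R}$): $g^{\Phi}(x) = \sup_{y \in Y} \{\Phi(x,y) - g(y)\}$
A function $f: X \to \mathbb{R}$ is $\Phi$-convex if there exists $g: Y \to \mathbb{R}$ such that $f(x) = \sup_{y \in Y} \{\Phi(x,y) - g(y)\}$ for all $x \in X$. When $\Phi(x,y) = \langle x, y \rangle$, classical convexity is recovered. Partial convexity refers to functions of several groups of variables that are convex in only some of them, or relative convexity notions such as polyconvexity (convexity after a nonlinear lifting, as in elasticity) (Nehzati, 30 Aug 2025, Kim et al., 2022, Geuken et al., 12 Feb 2025).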
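As a quick sanity check of the transform definition, the following worked example assumes the bilinear coupling $\Phi(x,y) = xy$ on $X = [-1,1]$ and $Y = [-2,2]$. The symbols and the choice of target are illustrative assumptions, not taken from the cited works.

```latex
f(x) = x^2, \qquad
f^{\Phi}(y) = \sup_{x \in [-1,1]} \{ xy - x^2 \} = \frac{y^2}{4} \quad (|y| \le 2),
\qquad
\sup_{y \in [-2,2]} \Big\{ xy - \frac{y^2}{4} \Big\} = x^2 \quad (|x| \le 1).
```

The inner supremum is attained at $x = y/2$ and the outer one at $y = 2x$, so $f$ is recovered from $g = f^{\Phi}$ and is therefore $\Phi$-convex in the sense above.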
2. Universal Approximation Theorems for Partial Convexity
Universal approximation theorems for partial convexity rigorously assert the existence of specific parameterized function classes that are dense in the set of all continuous partial convex functions (under the uniform norm).
2.1 Generalized Convex Functions (Nehzati, 30 Aug 2025)
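Define the set of all $\Phi$-convex functions for a surplus function $\Phi$ on $X \times Y$:
$$\mathcal{C}_\Phi(X) = \Big\{ f: X \to \mathbb{R} \;:\; f(x) = \sup_{y \in Y} \{\Phi(x,y) - g(y)\} \text{ for some } g: Y \to \mathbb{R} \Big\}$$
A finitely $\Phi$-convex function (i.e., max over finite support) is
$$f_{S,w}(x) = \max_{y \in S} \{\Phi(x,y) - w_y\}$$
for $S \subset Y$ finite, $w \in \mathbb{R}^{S}$.
The main result establishes that:
- The class of finitely $\Phi$-convex functions $\{f_{S,w}\}$ is dense in $\mathcal{C}_\Phi(X)$.
- For every $f \in \mathcal{C}_\Phi(X)$ and every $\varepsilon > 0$, there exist a finite $S \subset Y$ and weights $w \in \mathbb{R}^{S}$ such that
$$\sup_{x \in X} |f(x) - f_{S,w}(x)| < \varepsilon$$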
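The density statement can be illustrated numerically. The sketch below is not the cited paper's implementation: it assumes a bilinear coupling, the target $x^2$ on $[-1,1]$, and support points on a uniform grid in $[-2,2]$ with weights taken from the closed-form transform. All function names are hypothetical.

```python
# Minimal sketch: finitely supported max representation of a convex target.
# Assumptions (not from the source): bilinear coupling phi(x, y) = x * y,
# target x**2 on X = [-1, 1], support points on a uniform grid in Y = [-2, 2].

def phi(x, y):
    return x * y

def target(x):
    return x * x

def weight(y):
    # Closed-form transform of the target: sup_x {x*y - x**2} = y**2 / 4 for |y| <= 2.
    return y * y / 4.0

def linspace(lo, hi, n):
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

def finite_max(x, support, weights):
    # Max over the finite support of phi(x, y) - w_y.
    return max(phi(x, y) - w for y, w in zip(support, weights))

def sup_error(n_support, n_eval=401):
    # Largest gap between the target and its finite max on an evaluation grid.
    support = linspace(-2.0, 2.0, n_support)
    weights = [weight(y) for y in support]
    xs = linspace(-1.0, 1.0, n_eval)
    return max(abs(target(x) - finite_max(x, support, weights)) for x in xs)

errors = {n: sup_error(n) for n in (5, 9, 17, 33)}
for n, err in errors.items():
    print(f"support size {n:2d}: sup error ~ {err:.5f}")
```

In this toy setting each support point contributes a tangent line, so refining the support shrinks the uniform error. The general result replaces the explicit weights with learned ones.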
2.2 Parameterized Partial Convexity (Kim et al., 2022)
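For a continuous function $f: X \times U \to \mathbb{R}$ with condition variable $x \in X$ and decision variable $u \in U$, parameterized convexity means that
$$f(x, \cdot) \text{ is convex on } U \text{ for every } x \in X.$$
The theorem states that parameterized max-affine (PMA) or parameterized log-sum-exp (PLSE) networks of the form
$$f^{\mathrm{PMA}}(x,u) = \max_{1 \le i \le I} \{\langle a_i(x), u \rangle + b_i(x)\}, \qquad f^{\mathrm{PLSE}}(x,u) = T \log \sum_{i=1}^{I} \exp\!\Big(\frac{\langle a_i(x), u \rangle + b_i(x)}{T}\Big), \quad T > 0,$$
can approximate any continuous parameterized convex function to arbitrary precision on compact $X \times U$. Here, $a_i$ and $b_i$ are continuous functions of $x$.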
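To make the two network forms concrete, here is a minimal sketch with a scalar decision variable. The maps in `affine_params` are hypothetical stand-ins for the continuous condition-dependent slopes and offsets (a neural network in practice), and the temperature parameter in `plse` is an assumption of this sketch.

```python
import math

def affine_params(x):
    # Hypothetical continuous maps from the condition x to slopes and offsets.
    slopes = [-1.0 - x, 0.5 * x, 1.0 + x * x]
    offsets = [math.sin(x), 0.0, -math.cos(x)]
    return slopes, offsets

def pma(x, u):
    # Parameterized max-affine: max of affine functions of u, convex in u.
    slopes, offsets = affine_params(x)
    return max(a * u + b for a, b in zip(slopes, offsets))

def plse(x, u, temperature=0.1):
    # Parameterized log-sum-exp: smooth upper bound on pma, also convex in u.
    slopes, offsets = affine_params(x)
    z = [(a * u + b) / temperature for a, b in zip(slopes, offsets)]
    z_max = max(z)  # subtract the max for numerical stability
    return temperature * (z_max + math.log(sum(math.exp(v - z_max) for v in z)))

for x in (-1.0, 0.0, 0.7):
    print(x, pma(x, 0.3), plse(x, 0.3))
```

For a fixed condition, both functions are convex in `u` whatever `affine_params` returns, and the smooth variant exceeds the max-affine one by at most the temperature times the logarithm of the number of pieces.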
2.3 GroupMax and Other Network Architectures (Warin, 2022)
The GroupMax architecture provides a different construction using grouped max operators and ReLU layers. For a function $f(x, u)$, convex in $u$, the extended GroupMax network is able to approximate $f$ uniformly on compact sets while preserving convexity in $u$. The universal approximation theorem guarantees that, for any $\varepsilon > 0$, there exists a GroupMax network $\hat{f}$ such that
$$|f(x,u) - \hat{f}(x,u)| < \varepsilon$$
for all $(x,u)$ in the prescribed domain, and $\hat{f}(x, \cdot)$ is convex for all $x$.
2.4 Polyconvexity and Invariant Architectures (Geuken et al., 12 Feb 2025)
In applications such as isotropic hyperelasticity, polyconvex energy densities $W(F)$ factor as convex functions $\psi$ of the seven basic monomials of the signed singular values $\nu_1, \nu_2, \nu_3$ of the deformation gradient $F$, namely $\nu_1, \nu_2, \nu_3, \nu_1\nu_2, \nu_1\nu_3, \nu_2\nu_3, \nu_1\nu_2\nu_3$. Input Convex Neural Networks (ICNNs), symmetrized to enforce isotropy and frame-indifference, can approximate any such energy to arbitrary accuracy on compact sets.
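The lifting-plus-symmetrization idea can be sketched as follows. This is an illustration under stated assumptions, not the cited architecture: a small max-affine function `psi` stands in for an ICNN, and the symmetry group is assumed to consist of permutations of the signed singular values combined with an even number of sign flips (24 elements). All names and coefficients are hypothetical.

```python
from itertools import permutations, product

def lift(nu):
    # Seven monomials of the signed singular values (the lifted variables).
    n1, n2, n3 = nu
    return (n1, n2, n3, n1 * n2, n1 * n3, n2 * n3, n1 * n2 * n3)

# Hypothetical convex function of the lifted variables: a max of affine pieces.
PIECES = [
    ((1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 1.0), 0.0),
    ((0.0, -1.0, 0.0, 0.0, 0.3, 0.0, -0.5), 0.2),
    ((0.0, 0.0, 0.7, 0.0, 0.0, -0.4, 0.1), -0.1),
]

def psi(m):
    return max(sum(c * v for c, v in zip(coef, m)) + off for coef, off in PIECES)

def group_elements():
    # Assumed symmetry group: permutations with an even number of sign flips.
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            if signs[0] * signs[1] * signs[2] == 1:
                yield perm, signs

GROUP = list(group_elements())

def energy(nu):
    # Group averaging: each group element acts linearly on the lifted variables,
    # so the average of convex functions stays convex in the lifted variables.
    total = 0.0
    for perm, signs in GROUP:
        transformed = tuple(signs[i] * nu[perm[i]] for i in range(3))
        total += psi(lift(transformed))
    return total / len(GROUP)

print(energy((1.2, 0.8, 0.5)), energy((0.8, 0.5, 1.2)))
```

Averaging over the group makes `energy` invariant under reordering and paired sign changes of the inputs while keeping convexity in the lifted variables, which is the property the universality argument relies on.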
3. Parameterization Strategies That Achieve Universality
A variety of parameterizations implement the universal approximation property:
| Parameterization | Key Structure | Convexity Guarantee |
|---|---|---|
| Max-over-finite-support | $\max_{y \in S} \{\Phi(x,y) - w_y\}$ | $\Phi$-convex in $x$ (or $y$) |
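| PMA/PLSE Networks | $\max_i \{\langle a_i(x), u \rangle + b_i(x)\}$, $T \log \sum_i \exp((\langle a_i(x), u \rangle + b_i(x))/T)$ | Convex in $u$ for each $x$ |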
| GroupMax Architecture | Layered grouped-max operator | Parameter flexibility, convexity |
| ICNN on lifted variables and symmetrized | Convex in lifted variables | Polyconvexity, isotropy, frame-indifference |
These parameterizations give explicit control over the convexity structure by restricting the optimization space to dual weights, coefficients, or network layers that are composed or symmetrized to enforce partial convexity constraints (Nehzati, 30 Aug 2025, Kim et al., 2022, Geuken et al., 12 Feb 2025, Warin, 2022).
4. Proof Strategies and Key Theoretical Mechanisms
Most universal approximation results are constructive, relying on the following elements:
- Support Reduction: Arbitrary suprema (or integrals) over possibly infinite supports are approximated by maximization over finite $\delta$-nets (balls of radius $\delta$), using uniform continuity and Lipschitz properties of the surplus/coupling function (Nehzati, 30 Aug 2025).
- Supporting Hyperplane Construction: For each fixed value of the non-convex variable, the convex slice is approximated by the pointwise maximum of a finite collection of supporting hyperplanes (affine minorants) (Kim et al., 2022, Warin, 2022).
- Continuous Parameter Selection: When the parameters of affine slices depend on a “conditioning” variable, continuous selection and standard neural network UAT results for vector-valued functions are applied (Kim et al., 2022, Warin, 2022).
- Lifting/Invariance: Lifting variables (e.g., signed singular values and monomials in polyconvexity) reduce the structure to classical convexity, after which standard ICNN UATs and group-averaging recover invariance properties (Geuken et al., 12 Feb 2025).
Proofs universally require that the target function is continuous, the relevant subdomains are compact, and the surplus/lifting functions satisfy regularity conditions (usually local Lipschitz, convexity, or invariance).
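The supporting-hyperplane and continuous-selection steps can be sketched together. The example assumes a hypothetical target $e^{xu} + u^2$, which is convex in $u$ for each fixed $x$, and builds tangent cuts at a fixed set of anchor points. The slopes and offsets returned by `cut_params` depend continuously on `x`, which is the property that lets a standard network approximate them. None of the names come from the cited works.

```python
import math

def f(x, u):
    # Hypothetical target, convex in u for each fixed x.
    return math.exp(x * u) + u * u

def df_du(x, u):
    # Derivative of the target with respect to the convex variable u.
    return x * math.exp(x * u) + 2.0 * u

def linspace(lo, hi, n):
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

def cut_params(x, anchors):
    # Slope and offset of the tangent (supporting) line at each anchor;
    # both vary continuously with the conditioning variable x.
    slopes = [df_du(x, uj) for uj in anchors]
    offsets = [f(x, uj) - a * uj for a, uj in zip(slopes, anchors)]
    return slopes, offsets

def max_affine(x, u, anchors):
    slopes, offsets = cut_params(x, anchors)
    return max(a * u + b for a, b in zip(slopes, offsets))

def sup_error(n_anchors, n_eval=41):
    # Tangent cuts lie below the convex slice, so the gap is non-negative.
    anchors = linspace(-1.0, 1.0, n_anchors)
    grid = linspace(-1.0, 1.0, n_eval)
    return max(f(x, u) - max_affine(x, u, anchors) for x in grid for u in grid)

errors = {n: sup_error(n) for n in (5, 9, 17, 33)}
for n, err in errors.items():
    print(f"{n:2d} cuts: sup error ~ {err:.5f}")
```

Adding anchors tightens the approximation uniformly over both variables on the compact box, while every slice of the approximant remains convex by construction.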
5. Approximation of Gradients
Beyond function approximation, several theorems extend universality to gradients, which is essential in optimization and control:
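- For generalized convex functions (GCFs), if the surplus function $\Phi$ is semiconvex in $x$, then all $\Phi$-convex functions are semiconvex, and uniform approximation in the sup-norm implies uniform convergence of gradients wherever gradients exist. The family of gradients of finitely $\Phi$-convex functions is dense in the set of gradients of $\Phi$-convex functions (Nehzati, 30 Aug 2025).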
- For parameterized convex approximators, smoothness properties (e.g., in PLSE) further facilitate efficient and accurate gradient-based optimization (Kim et al., 2022).
A plausible implication is that these architectures offer derivative-level as well as function-value fidelity, which makes them directly relevant to bilevel optimization and mechanism design tasks that consume derivative information.
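Gradient convergence can be checked in a toy setting. The sketch below assumes a bilinear coupling and the target $x^2$ on $[-1,1]$, so the gradient of the finite max is simply the active support point (where the max is differentiable). It is an illustration, not the cited construction, and the names are hypothetical.

```python
def linspace(lo, hi, n):
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

def active_support_point(x, support):
    # Support point attaining max_y {x*y - y**2/4}; the weights y**2/4 are the
    # closed-form transform of the assumed target x**2.
    return max(support, key=lambda y: x * y - y * y / 4.0)

def grad_finite_max(x, support):
    # With a bilinear coupling, d/dx [x*y - w_y] = y at the active support point.
    return active_support_point(x, support)

def grad_sup_error(n_support, n_eval=401):
    # Compare with the true gradient 2*x of the target on an evaluation grid.
    support = linspace(-2.0, 2.0, n_support)
    xs = linspace(-1.0, 1.0, n_eval)
    return max(abs(2.0 * x - grad_finite_max(x, support)) for x in xs)

grad_errors = {n: grad_sup_error(n) for n in (5, 9, 17, 33)}
for n, err in grad_errors.items():
    print(f"support size {n:2d}: gradient sup error ~ {err:.4f}")
```

Here the gradient error is bounded by half the spacing of the support grid, so it shrinks as the support is refined, in line with the qualitative statement above.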
6. Practical Implementations and Empirical Results
Numerical experiments in recent works validate the theoretical findings:
- In multi-item auction revenue maximization, the finitely supported max parameterization enables mechanisms that match optimal revenue benchmarks to numerical precision in the reported multi-good settings, as implemented in the Python package gconvex (Nehzati, 30 Aug 2025).
- For parameterized convex approximation, PLSE networks outperform classic feedforward neural networks and other convex-preserving architectures in both minimizer error and value error across multiple high-dimensional settings (Kim et al., 2022).
- GroupMax architectures demonstrate competitive mean squared error on prototypical partially convex test functions, with increasing depth and number of grouped cuts improving approximation accuracy (Warin, 2022).
- In computational mechanics, the symmetrized ICNN-based class is dense in the set of all frame-indifferent, isotropic polyconvex energies on compact sets of admissible deformation gradients (Geuken et al., 12 Feb 2025).
7. Limitations, Assumptions, and Scope of Universality
All universal approximation theorems for partial convexity share the following assumptions and limitations:
- Compactness of domains and continuity of the target function.
- Convexity structure enforced in the prescribed argument group (or after lifting).
- For polyconvexity, invariance and structure must match the symmetrization/lifting scheme.
- Growth conditions outside the approximation set or local approximation behavior are not generally covered.
- In numerical implementations, growth constraints or boundary conditions may require explicit architectural augmentation.
These theorems do not guarantee convergence rates as a function of network width or depth; they guarantee only density in the uniform topology. Empirical findings indicate practical convergence with reasonable architectural choices, but no general complexity bounds are provided (Nehzati, 30 Aug 2025, Kim et al., 2022, Geuken et al., 12 Feb 2025, Warin, 2022).