Universal Approximation for Complex-Valued NNs
- The Universal Approximation Theorem for complex-valued neural networks demonstrates that CVNNs can approximate any continuous complex function on compact sets, provided the underlying algebra is non-degenerate and the activation function is suitably chosen.
- The framework details critical architectural requirements, emphasizing the need for non-polyharmonic (e.g., split) activations in shallow networks and a broader class of admissible non-holomorphic activations in deeper models.
- Quantitative results give explicit approximation rates in terms of depth, width, and parameter growth, highlighting advantages of CVNNs over traditional real-valued networks.
A universal approximation theorem (UAT) for complex-valued neural networks (CVNNs) establishes that — under appropriate architectural and activation-function conditions — such networks can approximate any continuous complex-valued function on compact subsets of $\mathbb{C}^n$ to arbitrary precision. The complex-valued setting demands fundamentally distinct algebraic, analytic, and constructive considerations compared to the real case, most notably in the structure of admissible activations, layer width and depth requirements, and the algebraic richness needed for dense approximation.
1. Algebraic Foundations and the Role of Non-degenerate Structure
CVNNs operate over the field $\mathbb{C}$ of complex numbers, which forms a two-dimensional real algebra with basis $\{1, i\}$ and multiplication determined by $i^2 = -1$. Many results on the UAT for CVNNs extend to broader families of hypercomplex algebras, with universality contingent on algebraic non-degeneracy. Specifically, a real algebra $\mathbb{A}$ with basis $\{e_1, \dots, e_n\}$ and multiplication defined by bilinear forms $\mathcal{B}_1, \dots, \mathcal{B}_n$ (so that $xy = \sum_{k=1}^{n} \mathcal{B}_k(x, y)\, e_k$) is non-degenerate if all associated matrices $B_1, \dots, B_n$ are invertible, implying that the bilinear product structure admits no nontrivial null space (Vital et al., 2022, Valle et al., 2024). For $\mathbb{C}$ with basis $\{1, i\}$, the matrices $B_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and $B_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ (encoding the real and imaginary parts of the product) are both invertible, confirming that $\mathbb{C}$ is non-degenerate and thus suitable for UAT constructions in the complex domain.
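To make the non-degeneracy condition concrete, the following minimal sketch (assuming the convention above, in which the product in the basis $\{1, i\}$ expands as $xy = \mathcal{B}_1(x, y) + \mathcal{B}_2(x, y)\, i$) numerically verifies that the two matrices encoding complex multiplication are invertible and reproduce the native product:

```python
# Minimal check (illustrative, not from the cited papers): the bilinear
# forms of complex multiplication and their non-degeneracy.
import numpy as np

# (a + bi)(c + di) = (ac - bd) + (ad + bc)i, so with x = (a, b), y = (c, d):
B1 = np.array([[1.0, 0.0], [0.0, -1.0]])  # real part:  x^T B1 y = ac - bd
B2 = np.array([[0.0, 1.0], [1.0, 0.0]])   # imag part:  x^T B2 y = ad + bc

for k, B in enumerate((B1, B2), start=1):
    print(f"det(B{k}) = {np.linalg.det(B):+.0f}  -> invertible")

# Sanity check against native complex arithmetic on random samples.
rng = np.random.default_rng(0)
for _ in range(100):
    x, y = rng.standard_normal(2), rng.standard_normal(2)
    z = complex(*x) * complex(*y)
    assert np.isclose(x @ B1 @ y, z.real) and np.isclose(x @ B2 @ y, z.imag)
print("bilinear forms reproduce complex multiplication")
```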
2. Statement of the Universal Approximation Theorem for CVNNs
The standard form of the UAT for a single-hidden-layer CVNN, specialized to non-degenerate algebras and to $\mathbb{C}$ in particular, is as follows (Vital et al., 2022, Valle et al., 2024, Voigtlaender, 2020):
- Network structure: $\hat f(z) = \sum_{j=1}^{N} c_j\, \sigma(w_j^\top z + b_j)$ for $z \in \mathbb{C}^n$, with $w_j \in \mathbb{C}^n$, $b_j, c_j \in \mathbb{C}$, and activation $\sigma: \mathbb{C} \to \mathbb{C}$.
- Activation: $\sigma$ is a "split" version of a real continuous function: $\sigma(z) = \phi(\Re z) + i\, \phi(\Im z)$, where $\phi: \mathbb{R} \to \mathbb{R}$ is continuous, non-constant, and non-polynomial (e.g., ReLU, logistic sigmoid).
- Domain and metric: On any compact subset $K \subset \mathbb{C}^n$, with the uniform (supremum) norm.
- Theorem: For every continuous $f: K \to \mathbb{C}$ and every $\varepsilon > 0$, there exist $N \in \mathbb{N}$ and weights $w_j \in \mathbb{C}^n$, $b_j, c_j \in \mathbb{C}$ such that $\sup_{z \in K} \big| f(z) - \hat f(z) \big| < \varepsilon$.
This existential result shows that such CVNNs can uniformly approximate any continuous complex function on compacts, providing a foundational theoretical guarantee for their use in function approximation, regression, and classification in complex domains.
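The network class in the theorem is compact enough to write down directly. Below is a minimal NumPy sketch of a forward pass (all parameter values are illustrative; `split_relu` instantiates the split activation with $\phi = \mathrm{ReLU}$):

```python
# Single-hidden-layer CVNN with split activation, matching the form
# f(z) = sum_j c_j * sigma(w_j^T z + b_j) stated above. Illustrative only.
import numpy as np

def split_relu(z: np.ndarray) -> np.ndarray:
    """Split activation sigma(z) = phi(Re z) + i * phi(Im z), phi = ReLU."""
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)

def shallow_cvnn(z, W, b, c):
    """W: (N, n) complex weights, b: (N,) biases, c: (N,) output weights."""
    return split_relu(W @ z + b) @ c

rng = np.random.default_rng(1)
n, N = 3, 8                                # n complex inputs, N hidden neurons
W = rng.standard_normal((N, n)) + 1j * rng.standard_normal((N, n))
b = rng.standard_normal(N) + 1j * rng.standard_normal(N)
c = rng.standard_normal(N) + 1j * rng.standard_normal(N)

z = rng.standard_normal(n) + 1j * rng.standard_normal(n)
print(shallow_cvnn(z, W, b, c))            # a single complex output
```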
3. Characterization of Universal Activation Functions
A hallmark feature of the UAT for CVNNs is the nuanced dependence on activation-function analyticity and polyharmonicity. Unlike the real case, where any continuous non-polynomial activation suffices, CVNNs encounter subtler obstructions (Voigtlaender, 2020, Geuchen et al., 2023):
- Shallow (1-hidden-layer) networks: Universality holds if and only if $\sigma$ is not almost polyharmonic; that is, neither the real nor the imaginary part of $\sigma$ agrees (almost everywhere) with a polyharmonic function of finite order. For instance, the split ReLU $\sigma(z) = \mathrm{ReLU}(\Re z) + i\, \mathrm{ReLU}(\Im z)$ is admissible, but holomorphic activations (e.g., $e^z$, $\sin z$) or real-affine functions are not.
- Deep CVNNs (depth $\geq 2$): Universality is broader, requiring only that $\sigma$ is not holomorphic, antiholomorphic, or (real-)affine (Voigtlaender, 2020, Geuchen et al., 2023). This inclusion notably admits non-holomorphic activations such as split ReLU, modReLU, or phase-magnitude nonlinearities (see the sketch at the end of this section).
- Width and depth requirements: For deep networks of minimal width, universality demands width scaling with both the input and output dimensions ($n$ complex inputs, $m$ complex outputs); a width of $2n + 2m + 5$ is always sufficient, with improved bounds for specific non-polyharmonic activations (Geuchen et al., 2023).
These results fundamentally distinguish CVNNs from their real-valued counterparts and from the special case of holomorphic neural networks, for which universality fails on the full space $C(K, \mathbb{C})$ of continuous functions.
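For concreteness, the sketch below implements two of the activations named above (following their commonly used definitions; conventions vary slightly across papers) and probes non-holomorphy via a finite-difference Wirtinger derivative $\partial f / \partial \bar{z}$, which vanishes exactly where the Cauchy–Riemann equations hold:

```python
# Hedged sketch: common non-holomorphic activations and a numerical
# Cauchy-Riemann check. A nonzero d/d(z-bar) at some point certifies
# non-holomorphy, the property deep-CVNN universality requires.
import numpy as np

def split_relu(z):
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)

def mod_relu(z, b=-0.5):
    m = np.abs(z)
    return np.where(m + b > 0, (m + b) * z / np.where(m > 0, m, 1.0), 0.0)

def d_dzbar(f, z, h=1e-6):
    # Wirtinger derivative ((d/dx) + i (d/dy)) f / 2 via central differences.
    return 0.5 * ((f(z + h) - f(z - h)) / (2 * h)
                  + 1j * (f(z + 1j * h) - f(z - 1j * h)) / (2 * h))

# Probe points matter: split ReLU equals the identity on the open first
# quadrant (locally holomorphic there), so test it where Im z < 0.
print(abs(d_dzbar(split_relu, 0.5 - 0.5j)))  # 0.5   -> non-holomorphic
print(abs(d_dzbar(mod_relu, 0.7 + 0.3j)))    # ~0.33 -> non-holomorphic
print(abs(d_dzbar(np.exp, 0.7 + 0.3j)))      # ~0    -> holomorphic, excluded
```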
4. Proof Techniques and Density Arguments
UAT for CVNNs relies on complexified versions of the Stone–Weierstrass theorem and sophisticated algebraic constructions (Vital et al., 2022, Valle et al., 2024, Voigtlaender, 2020). The key proof steps are:
- Decomposition: Any continuous $f: K \to \mathbb{C}$ is written as $f = u + i v$, with $u, v: K \to \mathbb{R}$ continuous.
- Real UAT Application: The classical Cybenko/Hornik theorem provides real MLPs that independently approximate $u$ and $v$ arbitrarily well on $K$.
- Complex Synthesis via Split Activation: The real approximants are "assembled" into a complex-valued network using the split activation, ensuring that each complex neuron independently approximates the real and imaginary parts (see the toy sketch after this list).
- Alternative Direct Approach: For deep CVNNs, explicit construction of polynomial generators (e.g., the monomials $z$ and $\bar{z}$) and of complex conjugation via networks with non-(anti)holomorphic activations; the complex Stone–Weierstrass theorem, which requires closure under conjugation, then guarantees uniform density.
- Algebraic Non-degeneracy: Ensures sufficient richness in the set of linear functionals to match real and imaginary parts through projections, critical for both representational completeness and proof closure.
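The first three steps can be mimicked numerically. In the toy sketch below (not the papers' construction: random ReLU features stand in for the real MLPs supplied by the classical UAT, and the target $f(z) = \bar{z}^2$ is an arbitrary illustrative choice), $u = \Re f$ and $v = \Im f$ are fitted separately and assembled into a complex approximant, whose error obeys the pointwise triangle bound $|f - \hat f| \le |u - \hat u| + |v - \hat v|$:

```python
# Toy decomposition-and-synthesis demo (illustrative stand-in for the proof).
import numpy as np

rng = np.random.default_rng(2)
f = lambda z: np.conj(z) ** 2                      # continuous, non-holomorphic
z = rng.uniform(-1, 1, 2000) + 1j * rng.uniform(-1, 1, 2000)  # samples in K
X = np.column_stack([z.real, z.imag])              # identify C with R^2

# Real random-feature ReLU models: phi(XW + b) with a least-squares readout.
W = rng.standard_normal((2, 300))
b = rng.standard_normal(300)
Phi = np.maximum(X @ W + b, 0.0)
a_u = np.linalg.lstsq(Phi, f(z).real, rcond=None)[0]   # fits u = Re f
a_v = np.linalg.lstsq(Phi, f(z).imag, rcond=None)[0]   # fits v = Im f

f_hat = Phi @ a_u + 1j * (Phi @ a_v)               # complex synthesis
err = np.abs(f(z) - f_hat)
eu = np.abs(f(z).real - Phi @ a_u)
ev = np.abs(f(z).imag - Phi @ a_v)
assert np.all(err <= eu + ev + 1e-12)              # pointwise triangle bound
print(f"max|f - f_hat| = {err.max():.4f} on the sampled points")
```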
5. Quantitative Approximation Rates and Efficiency
Beyond existential density, quantitative error estimates have been obtained for CVNNs with specific activations. For instance, with the modReLU activation, any $C^k$-regular function on a compact set $K \subset \mathbb{C}^n$ can be approximated to uniform error $\varepsilon$ using networks of depth $O(\log(1/\varepsilon))$ and size $O(\varepsilon^{-2n/k})$ up to logarithmic factors, with modest coefficient growth (Caragea et al., 2021). The approximation rate matches, up to log factors, the best-known rates for real-valued ReLU networks over $\mathbb{R}^{2n}$, reflecting the effective doubling of input dimension.
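For orientation, a deep modReLU forward pass is sketched below (architecture and parameters chosen arbitrarily for illustration; this is not the construction of Caragea et al., 2021):

```python
# Illustrative deep CVNN with modReLU between complex affine layers.
import numpy as np

def mod_relu(z, b=-0.1):
    m = np.abs(z)
    return np.where(m + b > 0, (m + b) * z / np.where(m > 0, m, 1.0), 0.0)

def deep_cvnn(z, layers):
    """layers: list of (W, b) complex pairs; modReLU on all but the last."""
    for W, b in layers[:-1]:
        z = mod_relu(W @ z + b)
    W, b = layers[-1]
    return W @ z + b                                 # affine output layer

rng = np.random.default_rng(3)
widths = [3, 16, 16, 16, 1]                          # depth 4, width 16
layers = [(rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n)),
           rng.standard_normal(m) + 1j * rng.standard_normal(m))
          for n, m in zip(widths[:-1], widths[1:])]
print(deep_cvnn(rng.standard_normal(3) + 0j, layers))
```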
For certain target classes, notably radial functions $f(z) = g(\lVert z \rVert)$ depending only on the modulus of the input, complex-reaction networks with the zReLU activation achieve polynomial parameter counts (width polynomial in the input dimension), whereas real networks require exponential parameter growth to match the same accuracy (Zhang et al., 2021). This separation is enabled by the phase-magnitude coupling in zReLU together with the complex-linear parameterization.
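As a small illustration of the objects in this separation (hedged: zReLU is taken here in its common quadrant-gated form), the snippet below defines zReLU and a scalar radial target $f(z) = g(|z|)$, whose values are invariant under phase rotations of the input:

```python
# zReLU and a radial target (illustrative definitions).
import numpy as np

def z_relu(z):
    """Pass z through iff Re z >= 0 and Im z >= 0; zero otherwise."""
    return np.where((z.real >= 0) & (z.imag >= 0), z, 0.0)

g = lambda r: np.sin(4 * r)                # any real profile
f_radial = lambda z: g(np.abs(z))          # radial: depends only on |z|

theta = np.linspace(0, 2 * np.pi, 8, endpoint=False)
circle = 0.8 * np.exp(1j * theta)          # points on the circle |z| = 0.8
print(np.allclose(f_radial(circle), f_radial(circle)[0]))  # True: phase-invariant
print(z_relu(circle))                      # nonzero only in the first quadrant
```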
6. Architectural Implications and Landscape Properties
Several critical architectural and optimization-theoretic distinctions arise:
- Activation choice: Admissible activations must not be holomorphic, antiholomorphic, or real-affine; split ReLU, modReLU, and zReLU are theoretically justified, while standard holomorphic choices collapse the functional closure (Voigtlaender, 2020, Geuchen et al., 2023).
- Width and depth: A width of $2n+2m+5$ always suffices for universality in deep CVNNs for all input/output dimensions; this can be reduced further for activations whose Wirtinger derivatives are nonvanishing at a suitable point (Geuchen et al., 2023).
- Optimization landscape: The critical set for empirical risk in complex-reaction networks is strictly smaller than for real-valued analogues, due to additional holomorphic constraints restricting degeneracy directions. All global minima of the complex model are stationary for the real model, but the converse fails, implying enhanced "sharpness" in the complex landscape (Zhang et al., 2021).
7. Comparison with Real-valued Networks and Holomorphic Models
The UAT for CVNNs extends and departs from the classical real case in several core respects:
| Setting | Admissible Activations | Key Obstructions | Input Domain |
|---|---|---|---|
| Real, shallow | Continuous, non-polynomial | Polynomial activations | $\mathbb{R}^n$ |
| Complex, shallow | Not almost polyharmonic | Polyharmonicity | $\mathbb{C}^n$ |
| Complex, deep (depth $\geq 2$) | Not (anti)holomorphic, not real-affine | (Anti)holomorphy, real-affinity | $\mathbb{C}^n$ |
| Holomorphic NNs | Holomorphic | Only holomorphic targets reachable | $\mathbb{C}^n$ |
A salient distinction is the essential role of non-holomorphicity for universality in the complex-valued setting (Voigtlaender, 2020). Furthermore, certain function classes (notably those with high radial Fourier content) can be learned efficiently by CVNNs but not by real networks, evidencing expressivity advantages in the presence of complex-structured data (Zhang et al., 2021).
References
- "Extending the Universal Approximation Theorem for a Broad Class of Hypercomplex-Valued Neural Networks" (Vital et al., 2022)
- "Universal Approximation Theorem for Vector- and Hypercomplex-Valued Neural Networks" (Valle et al., 2024)
- "The universal approximation theorem for complex-valued neural networks" (Voigtlaender, 2020)
- "Universal approximation with complex-valued deep narrow neural networks" (Geuchen et al., 2023)
- "Quantitative approximation results for complex-valued neural networks" (Caragea et al., 2021)
- "Towards Understanding Theoretical Advantages of Complex-Reaction Networks" (Zhang et al., 2021)