Complex-Valued Activation Functions
- Complex-valued activation functions are the nonlinear mappings in CVNNs that act on both amplitude and phase, which is crucial for processing complex signals.
- Their design must navigate holomorphicity constraints, leading either to unbounded analytic functions or to bounded non-holomorphic constructions such as modReLU and the cardioid.
- Advanced implementations like kernel-based and rational-polynomial activations offer universal approximation and enhanced training stability in practical applications.
Complex-valued activation functions (CVAFs) are the nonlinear mappings essential to Complex-Valued Neural Networks (CVNNs), mediating learned representations between layers in domains where both amplitude and phase convey information. Unlike real-valued activations, the design and analysis of CVAFs are fundamentally shaped by the interplay between holomorphicity, boundedness, and phase preservation: by Liouville’s theorem, no non-constant function can be both entire (holomorphic everywhere) and bounded, while phase–magnitude symmetry is often critical in physical and signal processing applications. Rigorous classification, expressive power, universality, computational properties, and engineering trade-offs are now well characterized, and a rapidly expanding repertoire of fixed and nonparametric CVAF families enables deep, high-performing CVNNs across scientific domains.
1. Mathematical Foundations and Classification
The theoretical space of complex-valued activation functions is delimited by fundamental constraints from complex analysis:
- Holomorphicity and the Cauchy–Riemann Conditions: A function $f(z) = u(x,y) + i\,v(x,y)$, $z = x + iy$, is holomorphic if it is complex-differentiable everywhere, i.e., it satisfies the Cauchy–Riemann equations $\partial u/\partial x = \partial v/\partial y$ and $\partial u/\partial y = -\partial v/\partial x$. Classic activations such as the complex logistic sigmoid and hyperbolic tangent are holomorphic except at isolated poles, but must be unbounded (Hammad, 27 Jul 2024, Abdalla, 2023, Bassey et al., 2021, Corte et al., 2014).
- Non-Holomorphic (“Split” or “Componentwise”) CVAFs: These apply real activation functions independently to the Cartesian components, $\sigma(z) = f(\Re z) + i\, f(\Im z)$ (“rectangular split”), or to the polar form, $\sigma(z) = f(|z|)\, e^{i g(\arg z)}$ (“polar split”) (Abdalla, 2023, Barrachina et al., 2023). Most widely used CVAFs, including modReLU, zReLU, cardioid, and various amplitude-phase functions, are non-holomorphic but bounded, or saturate at large $|z|$.
- Phase-Amplitude Preserving Activations: These modulate only $|z|$ and leave $\arg z$ intact (e.g., $\sigma(z) = g(|z|)\, e^{i \arg z}$; cardioid, modReLU, APTF/APSF, CAP-family). They are non-holomorphic but retain essential physical structure in signal domains (Hammad, 27 Jul 2024, Virtue et al., 2017).
- Parametric and Non-Parametric Families: Recent work proposes trainable CVAFs by kernel expansion (KAFs, WL–KAFs) or hybrid parameterized forms generalizing ReLU, Tanh, Softplus, etc., for efficient and expressive phase-sensitive modelling (Young et al., 4 Apr 2025, Scardapane et al., 2018, Scardapane et al., 2019).
The coexistence of and trade-offs among these types are dictated by the need for universal approximation, training stability, and domain-specific constraints; a minimal code sketch of one representative per class follows.
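The classification above translates directly into code. Below is a minimal NumPy sketch of one representative from each class; the bias default, the `eps` guard against division by zero, and the test inputs are illustrative choices, not values prescribed by the cited works.

```python
import numpy as np

def complex_tanh(z):
    """Fully complex (holomorphic) activation: unbounded, with poles
    at z = i*pi*(k + 1/2)."""
    return np.tanh(z)

def split_tanh(z):
    """Rectangular-split activation: real tanh applied to Re and Im
    independently. Bounded but non-holomorphic; does not preserve phase."""
    return np.tanh(z.real) + 1j * np.tanh(z.imag)

def modrelu(z, b=-0.5, eps=1e-9):
    """modReLU: shifts the magnitude by a (learnable) bias b and keeps the
    phase. Vanishes on the disc |z| <= -b; non-holomorphic, phase-preserving."""
    return np.maximum(np.abs(z) + b, 0.0) * z / (np.abs(z) + eps)

def cardioid(z):
    """Complex cardioid: attenuates by phase, passing z unchanged on the
    positive real axis and suppressing it on the negative real axis;
    non-holomorphic, phase-preserving."""
    return 0.5 * (1.0 + np.cos(np.angle(z))) * z

z = np.array([1 + 1j, -2 + 0.5j, 0.1 - 0.1j])
for f in (complex_tanh, split_tanh, modrelu, cardioid):
    print(f.__name__, np.round(f(z), 3))
```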
2. Expressivity, Universality, and Approximation Theory
The universal approximation property and quantitative error rates critically depend on the analytic structure of CVAFs:
- Universal Approximation—Characterization: For CVNNs with two or more hidden layers, universality is achieved if the activation is neither a polynomial in $z$ and $\bar z$, nor holomorphic, nor antiholomorphic (i.e., not analytic in $z$ or $\bar z$ alone); this excludes “degenerate” (e.g., phase-only or modulus-only) activations and captures virtually all non-trivial phase–amplitude nonlinearities (modReLU, cardioid, split activations, etc.) (Geuchen et al., 2023, Voigtlaender, 2020).
- Sharp Complexity Bounds: For smooth, non-polyharmonic CVAFs (generalizing “non-polynomial” in the real case), the best achievable approximation error for $C^k$-regular targets by a CVNN with $m$ hidden neurons is of order $m^{-k/(2n)}$, where $n$ is the complex input dimension (Geuchen et al., 2023). This matches the classical real-valued optimal rates up to constants and logarithmic factors (Caragea et al., 2021); a worked numerical instance follows this list.
- Width and Depth Results: Deep CVNNs achieve universality with width $2n+2m+5$ for input dimension $n$ and output dimension $m$, with a smaller sufficient width for certain CVAFs (those with only one vanishing Wirtinger derivative at some point) (Geuchen et al., 2023).
- Shallow CVNNs: Universality for one-hidden-layer CVNNs holds if and only if the CVAF is not almost polyharmonic (i.e., there exists no $m \in \mathbb{N}$ with $\Delta^m \sigma = 0$ almost everywhere) (Voigtlaender, 2020).
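To make the complexity bound concrete, consider the reconstructed rate under an illustrative choice of dimensions: for $n = 2$ complex inputs (four real dimensions) and a target of smoothness $k = 2$, the bound gives an error of order $m^{-k/(2n)} = m^{-2/4} = m^{-1/2}$, so halving the approximation error requires roughly quadrupling the number of hidden neurons, the same scaling a real-valued network faces on $\mathbb{R}^{2n}$.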
3. Principal Families and Examples of Complex-Valued Activation Functions
A wide variety of CVAFs is currently in active use. Representative families and their properties:
| Name / Notation | Formula / Main Property | Analyticity / Boundedness |
|---|---|---|
| Complex sigmoid | $\sigma(z) = 1/(1+e^{-z})$ | Holomorphic except poles, unbounded |
| Tanh | $\tanh(z)$ | Holomorphic except poles, unbounded |
| modReLU | $\mathrm{ReLU}(|z|+b)\, e^{i\arg z}$ | Non-holomorphic, phase-preserving; vanishes on $|z| \le -b$ |
| Cardioid | $\tfrac{1}{2}(1+\cos(\arg z))\, z$ | Non-holomorphic, phase-preserving, magnitude-contractive ($|\sigma(z)| \le |z|$) |
| zReLU | $z$ if $\arg z \in [0, \pi/2]$, $0$ else | Non-holomorphic, quadrant gating |
| Split-Sigmoid/Tanh | $f(\Re z) + i\, f(\Im z)$ with $f$ sigmoid / tanh | Non-holomorphic, bounded |
| Amplitude–Phase (APTF, APSF) | $\tanh(|z|)\, e^{i\arg z}$ / $\mathrm{sigmoid}(|z|)\, e^{i\arg z}$ | Non-holomorphic, bounded |
| CAP-family (proposed) | Capped or saturating amplitude, phase untouched; see (Hammad, 27 Jul 2024) | Non-holomorphic, saturating, phase-preserving |
| Kernel Activation (KAF, WL–KAF) | $\sum_j \alpha_j\, \kappa(z, d_j)$ (+ conjugate term in WL–KAF) | Universal, non-holomorphic |
| Rational-polynomial | Ratios of low-order polynomials in $z$, $\bar z$ (Young et al., 4 Apr 2025) | Non-holomorphic, bounded by denominator |
Notably, parametric and kernel-based activation families enable nonparametric adaptation and richer expressivity than either split or analytic scalar functions alone (Scardapane et al., 2018, Scardapane et al., 2019). Novel families generalizing reciprocal polynomials and smooth switches offer high parameter efficiency and continuous phase modulation (Young et al., 4 Apr 2025). A sketch of the kernel construction follows.
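The kernel construction admits a compact implementation. The following sketch loosely follows the KAF idea of Scardapane et al., expressing the activation as a trainable expansion over a fixed dictionary; the grid dictionary, Gaussian kernel, bandwidth, and initialization scale are assumptions for illustration, not the authors' exact configuration.

```python
import numpy as np

class ComplexKAF:
    """Sketch of a kernel activation sigma(z) = sum_j alpha_j * k(z, d_j)
    with a fixed complex dictionary and trainable coefficients alpha_j."""

    def __init__(self, side=4, gamma=1.0, span=3.0, rng=None):
        rng = rng if rng is not None else np.random.default_rng(0)
        # Fixed dictionary: a side x side grid of anchors in the complex plane.
        ticks = np.linspace(-span, span, side)
        re, im = np.meshgrid(ticks, ticks)
        self.anchors = (re + 1j * im).ravel()
        # Trainable complex mixing coefficients, one per anchor.
        D = self.anchors.size
        self.alpha = 0.1 * (rng.standard_normal(D) + 1j * rng.standard_normal(D))
        self.gamma = gamma

    def __call__(self, z):
        z = np.asarray(z)
        # Gaussian kernel of the complex distance, k(z, d_j) = exp(-gamma|z - d_j|^2);
        # it depends on |.|, so the resulting activation is non-holomorphic.
        k = np.exp(-self.gamma * np.abs(z[..., None] - self.anchors) ** 2)
        return k @ self.alpha

kaf = ComplexKAF()
print(kaf(np.array([0.5 + 0.5j, -1.0 + 2.0j])))
```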
4. Backpropagation and Computational Implications
- Gradient Formalisms: Holomorphic functions admit classic complex gradients (a single Wirtinger derivative, $\partial/\partial z$), but most practical CVAFs (especially split-type) require full Wirtinger calculus, involving both $\partial/\partial z$ and $\partial/\partial \bar z$ (Abdalla, 2023, Corte et al., 2014). Typical frameworks (TensorFlow, PyTorch) now support complex automatic differentiation, transparently handling the necessary partials for both types (Barrachina et al., 2023); a minimal autograd example follows this list.
- Branch-Cut and Discontinuity Handling: CVAFs such as zReLU and quadrant-gated switches (e.g., cReImLU, cRecipMax) can exhibit zero derivative regions or discontinuities. Smooth variants (e.g., reciprocal-polynomial, soft-plus parametric forms) are often preferred for training stability (Young et al., 4 Apr 2025).
- Phase Preservation and Numerical Stability: Activations that maintain $\arg z$ (modReLU, cardioid, APTF, CAP-PLS) are vital for signal processing tasks; others (CReLU, split-Cartesian) destroy phase information and may cause unphysical feature maps in such domains (Virtue et al., 2017, Abdalla, 2023). Boundedness or smooth saturation is essential to avoid exploding gradients in deep architectures (Hammad, 27 Jul 2024).
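As a concrete illustration of the Wirtinger handling mentioned in the first bullet, the PyTorch snippet below differentiates a real-valued loss of a modReLU output with respect to a complex tensor; the bias value and `eps` guard are illustrative. For a real loss $L$, PyTorch stores $\partial L/\partial \Re(z) + i\,\partial L/\partial \Im(z) = 2\,\partial L/\partial \bar z$ in `z.grad`.

```python
import torch

# Complex leaf tensor; PyTorch autograd handles non-holomorphic
# functions via Wirtinger calculus.
z = torch.tensor([1.0 + 1.0j, -0.5 + 2.0j], requires_grad=True)
b = -0.5                                     # modReLU bias (illustrative)

mag = torch.relu(z.abs() + b)                # shifted magnitude (real-valued)
out = mag * z / (z.abs() + 1e-9)             # modReLU: phase of z preserved
loss = (out.abs() ** 2).sum()                # real-valued objective

loss.backward()
print(z.grad)  # dL/dRe(z) + i*dL/dIm(z), i.e. 2 * dL/d(conj(z))
```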
5. Empirical Evidence and Application Domains
Empirical studies consistently validate theoretical expectations:
- Performance Gains: modReLU and cardioid activations improve accuracy and stability in MRI fingerprinting, audio, and communications tasks. For example, in MRI reconstruction, the complex-cardioid network halved RMSE relative to both real ReLU and split-separable activations, and modReLU yielded state-of-the-art RNN stability (Virtue et al., 2017, Bassey et al., 2021, Caragea et al., 2021).
- Kernel/Nonparametric Activations: Both split-KAF and fully-complex KAF outperform fixed-shape and real baselines on channel identification, wind prediction, and complex-MNIST (e.g., split-KAF achieves higher test accuracy on complex MNIST than modReLU) (Scardapane et al., 2018, Scardapane et al., 2019).
- Hybrid Real–Complex Architectures: Introduction of generalized reciprocal- and switch-type CVAFs in hybrid neural networks yields lower parameter count and test loss compared to pure real-valued models on tasks with inherently complex structure (Young et al., 4 Apr 2025).
6. Advanced Design Methodologies and Best Practices
- Activation Selection Criteria: For depth-narrow universality, a CVAF must not be holomorphic, antiholomorphic, or $\mathbb{R}$-affine, and should have a non-vanishing Wirtinger derivative at some point (Geuchen et al., 2023). Explicit phase preservation is recommended for physical signals; if analytic coupling is required (for mathematical convenience in backpropagation), fully complex holomorphic functions may be justified (with care to avoid poles) (Corte et al., 2014, Hammad, 27 Jul 2024).
- Weight Initialization and Regularization: For bounded CVAFs (modReLU, cardioid, CAP-PLS), standard complex Glorot/He initialization, with the variance criterion adapted to the complex case, is recommended. Entire functions (tanh, sigmoid) require scaling inputs/weights to avoid amplification near poles (Barrachina et al., 2023). An initializer sketch follows this list.
- Parametric/Nonparametric Approaches: Kernel-based and widely linear KAFs offer universal approximation, adaptability and faster convergence, particularly where real–imaginary coupling is important (Scardapane et al., 2019). Rational-polynomial and switch-type activations can be tuned by architecture search (Young et al., 4 Apr 2025).
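A minimal sketch of a complex Glorot-style initializer consistent with the recommendation above; the Rayleigh-magnitude/uniform-phase construction follows Trabelsi et al.'s deep complex networks and is an assumption here, not a prescription of the cited references.

```python
import numpy as np

def complex_glorot(fan_in, fan_out, rng=None):
    """Draw complex weights with E[|W|^2] = 2 / (fan_in + fan_out):
    Rayleigh-distributed magnitudes, uniform phases, zero mean."""
    rng = rng if rng is not None else np.random.default_rng(0)
    # For a Rayleigh(sigma) magnitude with uniform phase, E[|W|^2] = 2*sigma^2,
    # so sigma = 1/sqrt(fan_in + fan_out) meets the Glorot criterion.
    sigma = 1.0 / np.sqrt(fan_in + fan_out)
    magnitude = rng.rayleigh(scale=sigma, size=(fan_in, fan_out))
    phase = rng.uniform(-np.pi, np.pi, size=(fan_in, fan_out))
    return magnitude * np.exp(1j * phase)

W = complex_glorot(128, 64)
print(W.dtype, np.var(W.real) + np.var(W.imag))  # approx. 2 / (128 + 64)
```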
7. Open Problems and Future Directions
- Bounded Holomorphic CVAFs: By Liouville’s theorem, no non-constant CVAF can be both entire and bounded. Progress relies on functions analytic except at engineered isolated singularities, but controlling gradient explosion near these poles remains challenging (Abdalla, 2023, Hammad, 27 Jul 2024).
- Phase–Magnitude Tradeoffs: The design of adaptive CVAFs that can learn when to suppress/retain phase or amplitude remains an open engineering and modelling problem, especially as deep CVNNs are scaled to vision and speech domains (Bassey et al., 2021).
- Interpretability and Modularization: Understanding how CVAF choice encodes task-specific features (e.g., phase-coded information in electromagnetic or audio signals), and building interpretable CVNN toolkits in major frameworks is an ongoing concern (Bassey et al., 2021, Barrachina et al., 2023).
- Scaling and Architectures: Extension to large-scale hybrid real–complex and attention-based architectures will require further innovations in numerically stable, expressive, and computationally efficient CVAFs (Young et al., 4 Apr 2025).
Complex-valued activation functions now constitute a rich, theoretically grounded, and empirically validated foundation for neural architectures in domains with complex structure. Rigorous universality results, optimal approximation rates, phase dynamics, and flexible kernel-based nonlinearities underpin their adoption in modern CVNN frameworks (Geuchen et al., 2023, Caragea et al., 2021, Scardapane et al., 2018, Young et al., 4 Apr 2025, Abdalla, 2023, Hammad, 27 Jul 2024).