Complex-Valued Activation Functions
- Complex-valued activation functions (CVAFs) extend real-valued activations to the complex domain, balancing analyticity, boundedness, and phase dynamics.
- They are categorized into holomorphic, split (Cartesian), and magnitude–phase families, each offering unique trade-offs between numerical stability and phase preservation.
- CVAFs empower complex-valued neural networks in applications like signal processing, communications, medical imaging, and quantum information by enhancing model expressivity and performance.
Complex-valued activation functions (CVAFs) generalize the fundamental concept of nonlinearities from real-valued neural networks to architectures operating on complex numbers. They are central to the design and performance of complex-valued neural networks (CVNNs), which arise naturally in domains such as signal processing, communications, medical imaging, and quantum information, where the input, weights, or target functions are intrinsically complex. The extension from real to complex-valued activations presents unique mathematical and engineering challenges, particularly regarding analyticity, boundedness, and phase handling, compelling a distinct taxonomy of CVAFs and rigorous investigation of their theoretical properties and practical efficacy.
1. Mathematical Foundations and Taxonomy
The principal distinction in designing CVAFs is the interplay between analyticity (holomorphicity), boundedness, and the behavior with respect to phase and magnitude. Liouville's theorem prohibits any nonconstant, bounded, entire (holomorphic) nonlinearity on $\mathbb{C}$, necessitating trade-offs for nontrivial activation design (Abdalla, 2023, Hammad, 27 Jul 2024). This has led to three major classes of CVAFs:
- Holomorphic (Fully Complex) Activations: Functions analytic in $\mathbb{C}$, such as $\tanh(z)$, $\sin(z)$, polynomials, exponentials, and other entire transcendental functions. They admit gradients without Wirtinger cross-terms, facilitating analytic backpropagation. However, they are necessarily unbounded or feature poles/branch cuts, complicating numerical stability (Corte et al., 2014, Abdalla, 2023, Voigtlaender, 2020, Hammad, 27 Jul 2024).
- Split Real–Imaginary Activations (Cartesian or Type-A): Extend real nonlinearities separately to the real and imaginary parts, e.g.,
$$f(z) = \sigma(\Re z) + i\,\sigma(\Im z)$$
for some real activation $\sigma$. These are bounded (if $\sigma$ is) and directly compatible with real-valued frameworks but fail the Cauchy–Riemann conditions, breaking analyticity (Abdalla, 2023, Bassey et al., 2021, Hammad, 27 Jul 2024, Barrachina et al., 2023). They are simple to implement and robust, but phase information is not preserved, and coupling between the real and imaginary components is neglected.
- Magnitude–Phase (Polar or Type-B) Activations: Functions depending on the modulus and potentially modulating the phase, often of the form
$$f(z) = g(|z|)\, e^{i\,h(\arg z)}$$
with $g, h$ real functions. This includes modReLU, cardioid, amplitude–phase (AP) families, and related constructs. Phase can be preserved, allowing for strong inductive biases in wave and frequency-domain problems, at the cost of analyticity (Abdalla, 2023, Virtue et al., 2017, Bassey et al., 2021, Hammad, 27 Jul 2024). A minimal code sketch of all three families follows below.
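To make the taxonomy concrete, here is a minimal NumPy sketch of one representative from each family (the modReLU bias value is an arbitrary illustration):

```python
import numpy as np

z = np.array([1.0 + 2.0j, -0.5 + 0.1j])

# Holomorphic: entire in z, analytic gradients, but unbounded (Liouville).
holo = np.tanh(z)

# Split (Cartesian): a real nonlinearity per channel; bounded, nonholomorphic.
split = np.tanh(z.real) + 1j * np.tanh(z.imag)

# Magnitude-phase: gate the modulus, keep the phase exactly (modReLU, b = -0.5).
b = -0.5
modrelu = np.maximum(np.abs(z) + b, 0.0) * np.exp(1j * np.angle(z))
```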
The following table organizes representative CVAFs by family, illustrating key distinguishing features:
| Family | Example Formula | Analyticity |
|---|---|---|
| Holomorphic | $\tanh(z)$, $e^z$ | holomorphic |
| Split (Cartesian) | $\sigma(\Re z) + i\,\sigma(\Im z)$ | nonholomorphic |
| Magnitude–Phase | $\mathrm{ReLU}(|z|+b)\,e^{i\arg z}$ (modReLU), $\tfrac{1}{2}(1+\cos(\arg z))\,z$ (cardioid) | nonholomorphic |
2. Classical and Novel CVAF Constructions
Holomorphic Families
Classic holomorphic activations include $\tanh(z)$, $e^z$, and polynomials $\sum_k a_k z^k$. These retain all Cauchy–Riemann structure and are critical when true complex-differentiable behavior is needed, e.g., for Newton-type backpropagation, but their unboundedness or poles result in numerical instability and potential blow-up for large $|z|$ (Corte et al., 2014, Hammad, 27 Jul 2024). Polynomial Taylor truncations have been shown to accelerate Newton methods by improving Hessian conditioning and avoiding poles (Corte et al., 2014); a sketch of such a truncation follows below. Entire functions, such as exponentials and sigmoids, are used sparingly because of their global behavior and instability at large moduli (Abdalla, 2023, Hammad, 27 Jul 2024).
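As an illustration of the truncation idea, the following sketch builds an entire polynomial activation from the standard Maclaurin coefficients of tanh; the degree is an arbitrary choice, not the configuration of Corte et al.:

```python
import numpy as np

def tanh_taylor(z, degree=5):
    """Odd Maclaurin truncation: tanh(z) = z - z**3/3 + 2*z**5/15 - ...
    An entire polynomial with no poles, so it stays finite for all finite z,
    at the cost of diverging from tanh at large |z|."""
    coeffs = {1: 1.0, 3: -1.0 / 3.0, 5: 2.0 / 15.0}
    return sum(c * z**k for k, c in coeffs.items() if k <= degree)

z = np.array([0.4 - 0.3j, 0.1 + 0.2j])
print(np.abs(tanh_taylor(z) - np.tanh(z)))  # small for moderate |z|
```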
Split and Cartesian Functions
Split functions apply classical real nonlinearities to each channel:
- Split-ReLU: $f(z) = \mathrm{ReLU}(\Re z) + i\,\mathrm{ReLU}(\Im z)$
- Split-Tanh: $f(z) = \tanh(\Re z) + i\,\tanh(\Im z)$
Empirical studies confirm the practicality and stability of split-ReLU/tanh for generic tasks and show that, due to the lack of rotational equivariance, they may underperform on phase-sensitive tasks (Barrachina et al., 2023, Bassey et al., 2021, Abdalla, 2023, Hammad, 27 Jul 2024). Best practices suggest employing split-tanh or split-ELU for generic signal processing and image applications where phase invariance is not critical (Hammad, 27 Jul 2024).
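A split activation is straightforward to express as a wrapper over any real nonlinearity; below is a minimal PyTorch sketch (the module name is ours, not from a library):

```python
import torch
import torch.nn as nn

class SplitActivation(nn.Module):
    """Type-A (Cartesian) CVAF: apply a real activation independently to the
    real and imaginary channels. Nonholomorphic, but bounded whenever the
    wrapped real activation is bounded."""
    def __init__(self, real_act=torch.tanh):
        super().__init__()
        self.real_act = real_act

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return torch.complex(self.real_act(z.real), self.real_act(z.imag))

z = torch.randn(8, dtype=torch.cfloat)
split_relu = SplitActivation(torch.relu)   # split-ReLU
out = split_relu(z)
```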
Magnitude–Phase and Phase-Preserving Constructions
A broad range of nonholomorphic, phase-aware CVAFs dominate recent literature due to their favorable inductive bias for complex signal domains:
- modReLU: $f(z) = \mathrm{ReLU}(|z| + b)\, e^{i \arg z}$ with learnable bias $b$ (phase-preserving, magnitude-gating, widely used in unitary RNNs and stable for deep nets) (Caragea et al., 2021, Abdalla, 2023, Bassey et al., 2021). It is continuous, piecewise smooth, 1-Lipschitz, but non-differentiable on the circle $|z| = -b$, and not holomorphic.
- Cardioid: $f(z) = \tfrac{1}{2}\bigl(1 + \cos(\arg z)\bigr)\, z$; phase gating by $\arg z$, recovers ReLU on the real axis, smooth phase interpolation, shown to outperform split and magnitude gating on MRI fingerprinting (Virtue et al., 2017, Abdalla, 2023, Bassey et al., 2021).
- zReLU: $f(z) = z$ if $\Re z \ge 0$ and $\Im z \ge 0$, $0$ otherwise; quadrant gating, non-smooth (Trabelsi et al.).
- Reciprocal–polynomial and polynomial–reciprocal families: rational real-valued gains in $|z|$ applied to the unit phasor $z/|z|$, enabling parameter-efficient, phase-aware nonlinear shaping; phase is preserved because the gain is real (Young et al., 4 Apr 2025). PyTorch sketches of several of these gates follow this list.
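The following PyTorch sketches implement three of the gates above as commonly described in the literature; the learnable-bias packaging of modReLU and the epsilon offset are our implementation choices, not prescribed by the original papers:

```python
import torch
import torch.nn as nn

class ModReLU(nn.Module):
    """modReLU with a learnable per-feature bias b: gates |z|, keeps arg z."""
    def __init__(self, features: int):
        super().__init__()
        self.b = nn.Parameter(torch.full((features,), -0.1))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        mag = torch.abs(z)
        # The small offset keeps the unit phasor z/|z| finite at the origin.
        return torch.relu(mag + self.b) * (z / (mag + 1e-9))

def cardioid(z: torch.Tensor) -> torch.Tensor:
    """f(z) = 0.5 * (1 + cos(arg z)) * z; reduces to ReLU on the real axis."""
    return 0.5 * (1 + torch.cos(torch.angle(z))) * z

def zrelu(z: torch.Tensor) -> torch.Tensor:
    """Quadrant gate: pass z only where Re z >= 0 and Im z >= 0."""
    mask = (z.real >= 0) & (z.imag >= 0)
    return z * mask.to(z.dtype)
```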
Amplitude–phase and phase-sensitive activations are now diverse, including parametric forms such as modSoftplus, modSwish, CAP–type piecewise/composite gates, with explicit parameterization for sharpness and shape (Hammad, 27 Jul 2024, Young et al., 4 Apr 2025). These forms enhance expressive power and parameter efficiency, evidenced by significant improvements on benchmarks such as AudioMNIST with hybrid real–complex architectures (65% loss reduction, 54% parameter savings) (Young et al., 4 Apr 2025).
3. Nonparametric and Adaptive CVAFs
Kernel Activation Functions (KAFs) extend CVAF expressivity beyond fixed formulas, implementing nonparametric neuron-wise complex function expansions (Scardapane et al., 2018, Scardapane et al., 2019).
- KAF: $f(z) = \sum_{n=1}^{D} \alpha_n\, \kappa(z, d_n)$ for a complex kernel $\kappa$, learnable coefficients $\alpha_n \in \mathbb{C}$, and a fixed dictionary $\{d_n\}_{n=1}^{D}$.
- Widely Linear KAFs (WL-KAFs): augment the expansion with a conjugate term, e.g. $f(z) = \sum_{n=1}^{D} \alpha_n\, \kappa(z, d_n) + \beta_n\, \kappa(\bar{z}, d_n)$, capturing the full widely linear (affine in $z$ and $\bar{z}$) complex function closure.
KAFs, via positive-definite kernels in , approximate arbitrary smooth nonlinearities for each neuron and can outperform fixed activations (e.g., split-KAF: 97.2% accuracy vs. modReLU: 95.9% on complex-MNIST) (Scardapane et al., 2018). Widely linear extensions double the representational capacity without increasing parameter count, providing superior accuracy and faster convergence (Scardapane et al., 2019).
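A minimal sketch of the KAF idea in PyTorch follows, using a real Gaussian kernel over a fixed complex grid with learnable complex mixing coefficients; the kernel, grid span, and bandwidth here are illustrative hyperparameters, not the exact configuration of Scardapane et al.:

```python
import torch
import torch.nn as nn

class ComplexKAF(nn.Module):
    """Per-neuron nonparametric activation: a kernel expansion over a fixed
    complex dictionary grid with learnable complex mixing coefficients."""
    def __init__(self, grid: int = 5, span: float = 2.0, gamma: float = 1.0):
        super().__init__()
        ticks = torch.linspace(-span, span, grid)
        re, im = torch.meshgrid(ticks, ticks, indexing="ij")
        self.register_buffer("dic", torch.complex(re, im).flatten())  # fixed
        self.alpha = nn.Parameter(
            0.1 * torch.randn(grid * grid, dtype=torch.cfloat))       # learned
        self.gamma = gamma

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Real Gaussian kernel on C ~ R^2: k(z, d) = exp(-gamma * |z - d|^2).
        k = torch.exp(-self.gamma * torch.abs(z.unsqueeze(-1) - self.dic) ** 2)
        return (k * self.alpha).sum(dim=-1)  # float * complex -> complex
```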
4. Universal Approximation and Theoretical Guarantees
Universal approximation theory in the complex domain diverges qualitatively from the real case due to analyticity, polyharmonicity, and the richer function structure on $\mathbb{C}$ (Voigtlaender, 2020, Geuchen et al., 2023). Key results include:
- Deep CVNNs (depth $\ge 2$) are universal if the activation is neither a polynomial in $z$ and $\bar{z}$, nor holomorphic, nor antiholomorphic. This admits a vastly broader class than in the real case, allowing almost any non-holomorphic, non-affine, nonlinear CVAF (Voigtlaender, 2020, Geuchen et al., 2023).
- Shallow CVNNs (depth 1): universality requires that the real or imaginary part of the activation not be (almost everywhere equal to) a polyharmonic function. For non-polyharmonic, smooth activations, shallow networks densely approximate all continuous functions on compacta.
- modReLU and magnitude–phase activations satisfy universality for deep networks, with approximation rates matching the optimal real ReLU rates up to a doubling of the dimension (a target on a domain in $\mathbb{C}^n$ is treated as a function on $\mathbb{R}^{2n}$) (Caragea et al., 2021).
- Kernel-based and adaptive neuron-wise activations also admit universal approximation via RKHS theory for suitable kernels (Scardapane et al., 2018).
Thus, almost all practical nonholomorphic, non-polynomial CVAFs support universal approximation in deep architectures; analytic activations are insufficient for full expressivity.
5. Complex Backpropagation and Implementation
The choice of CVAF is central to the training dynamics and gradient flow through Wirtinger calculus. For a nonholomorphic $f$, the relevant derivatives are the Wirtinger pair
$$\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\,\frac{\partial f}{\partial y}\right), \qquad \frac{\partial f}{\partial \bar{z}} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\,\frac{\partial f}{\partial y}\right),$$
where $z = x + iy$; holomorphic activations collapse the $\partial f / \partial \bar{z}$ term to zero (Abdalla, 2023, Hammad, 27 Jul 2024). Networks are trained via CR-calculus-based updates, with different backpropagation schemes for holomorphic and generic CVAFs (Abdalla, 2023, Barrachina et al., 2023).
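A quick finite-difference check of the Wirtinger pair illustrates the collapse for holomorphic activations; this sketch is purely illustrative:

```python
import numpy as np

def wirtinger(f, z, h=1e-6):
    """Finite-difference estimate of (df/dz, df/dzbar) at z, built from the
    partial derivatives along the real (x) and imaginary (y) axes."""
    df_dx = (f(z + h) - f(z - h)) / (2 * h)
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (df_dx - 1j * df_dy), 0.5 * (df_dx + 1j * df_dy)

z0 = 0.3 + 0.2j
print(wirtinger(np.tanh, z0))        # holomorphic: df/dzbar ~ 0

split_tanh = lambda z: np.tanh(z.real) + 1j * np.tanh(z.imag)
print(wirtinger(split_tanh, z0))     # nonholomorphic: df/dzbar != 0
```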
Complex-valued libraries (e.g., the cvnn toolbox, and native complex autograd in TensorFlow and PyTorch 1.6+) now provide automatic Wirtinger differentiation and proper weight initialization (complex Xavier, assigning half the total variance to each of the real and imaginary parts) (Barrachina et al., 2023).
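A sketch of the complex Xavier scheme just described (assuming writable `.real`/`.imag` views, available in recent PyTorch; the helper name is ours):

```python
import torch

def complex_xavier_(w: torch.Tensor) -> torch.Tensor:
    """Complex Glorot/Xavier init: total variance 2/(fan_in + fan_out),
    half assigned to the real part and half to the imaginary part."""
    fan_out, fan_in = w.shape
    std = (1.0 / (fan_in + fan_out)) ** 0.5   # per-component std
    with torch.no_grad():
        w.real.normal_(0.0, std)              # in-place on the real view
        w.imag.normal_(0.0, std)
    return w

w = complex_xavier_(torch.empty(64, 32, dtype=torch.cfloat))
```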
6. Empirical Benchmarks and Applications
Empirical evaluations consistently demonstrate the superiority of phase-preserving and parameter-adaptive CVAFs on tasks with complex, oscillatory, or wave-type signals:
- MRI fingerprinting: Cardioid and magnitude–phase CVAFs yield distinctly lower NRMSE for tissue property recovery, outperforming both real-valued and naive complex activations (Virtue et al., 2017).
- Channel equalization, FFT-MNIST, and wind prediction: KAF and its widely linear variant achieve lower MSE and improved classification accuracy relative to fixed nonlinearities (Scardapane et al., 2018, Scardapane et al., 2019).
- Speech recognition (AudioMNIST) and hybrid architectures: Parameter-efficient polynomial–reciprocal phase-gated CVAFs provide up to 65% reduction in cross-entropy loss, especially in low-SNR regimes (Young et al., 4 Apr 2025).
- Function approximation under data scarcity: holomorphic activations (e.g., CauchyNet with its Cauchy-type holomorphic activation) deliver compact models for rational/oscillatory signals, halving MAE relative to a ReLU-MLP with a minimized parameter count (Zhang et al., 11 Oct 2025).
Where phase information is task-critical, such as in communications, medical imaging, or any frequency-domain modeling, phase-preserving and magnitude–phase CVAFs are empirically favored. Split and holomorphic activations remain competitive in generic domains or where stability, simplicity, or analytic gradients dictate.
7. Open Directions, Limitations, and Practical Recommendations
Several challenges persist:
- Bounded holomorphic CVAFs are impossible per Liouville's theorem; practical implementations of analytic activations require radius control or domain restriction (Corte et al., 2014, Hammad, 27 Jul 2024).
- Nonholomorphic but smooth, phase-aware CVAFs—magnitude–phase gates, polynomial–reciprocal, CAP-family—balance expressivity, stability, and universal approximation, and are optimal for signal and waveform-related domains (Young et al., 4 Apr 2025, Abdalla, 2023, Hammad, 27 Jul 2024).
- Nonparametric and adaptive neuron-wise CVAFs (KAF/WL-KAF) offer maximal flexibility at the cost of per-neuron parameter scaling; optimal kernel choice and dictionary grid remain active research problems (Scardapane et al., 2018, Scardapane et al., 2019).
- Gradient flow and vanishing/exploding regimes can arise at zero magnitude (where phase factors divide by $|z|$) or at poles of entire/holomorphic activations; practical implementations must regularize, offset denominators, or penalize imaginary output components as needed (Zhang et al., 11 Oct 2025, Barrachina et al., 2023); see the stabilization sketch after this list.
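Two of the stabilization tricks above admit one-line sketches; both helpers are illustrative, with arbitrary default constants:

```python
import torch

def safe_phase(z: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    """Offset the modulus so the phasor z/|z| stays finite at the origin."""
    return z / (torch.abs(z) + eps)

def imag_penalty(outputs: torch.Tensor, lam: float = 1e-2) -> torch.Tensor:
    """Optional regularizer discouraging spurious imaginary components when
    the downstream target is real-valued."""
    return lam * outputs.imag.pow(2).mean()
```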
Best practices are now well established:
- For generic deep CVNNs: phase-aware magnitude–phase (modReLU, cardioid, parameterized phase-gates) as the default (Virtue et al., 2017, Caragea et al., 2021, Abdalla, 2023).
- For analytic modeling or when analytic gradients are critical: complex polynomials or bounded-region holomorphic activations with constrained weights (Corte et al., 2014, Zhang et al., 11 Oct 2025).
- For maximal expressive power per neuron: KAF or WL-KAF with carefully tuned dictionary and kernel (Scardapane et al., 2018, Scardapane et al., 2019).
- For rapid prototyping and cross-domain tasks: split-type (cartesian) activations for compatibility with existing real-valued frameworks (Barrachina et al., 2023, Hammad, 27 Jul 2024).
Emerging research investigates locally-bounded holomorphic activations, adaptive phase–amplitude transforms, and extensions into hypercomplex (quaternionic) neural architectures (Hammad, 27 Jul 2024).
References:
- (Caragea et al., 2021) Quantitative approximation results for complex-valued neural networks
- (Scardapane et al., 2018) Complex-valued Neural Networks with Non-parametric Activation Functions
- (Corte et al., 2014) Newton's Method Backpropagation for Complex-Valued Holomorphic Multilayer Perceptrons
- (Young et al., 4 Apr 2025) Hybrid Real- and Complex-valued Neural Network Architecture
- (Geuchen et al., 2023) Universal approximation with complex-valued deep narrow neural networks
- (Zhang et al., 11 Oct 2025) CauchyNet: Compact and Data-Efficient Learning using Holomorphic Activation Functions
- (Abdalla, 2023) Complex-valued Neural Networks -- Theory and Analysis
- (Scardapane et al., 2019) Widely Linear Kernels for Complex-Valued Kernel Activation Functions
- (Voigtlaender, 2020) The universal approximation theorem for complex-valued neural networks
- (Virtue et al., 2017) Better than Real: Complex-valued Neural Nets for MRI Fingerprinting
- (Barrachina et al., 2023) Theory and Implementation of Complex-Valued Neural Networks
- (Bassey et al., 2021) A Survey of Complex-Valued Neural Networks
- (Hammad, 27 Jul 2024) Comprehensive Survey of Complex-Valued Neural Networks: Insights into Backpropagation and Activation Functions