Complex-Valued Neural Networks
- CVNNs are neural networks with weights, biases, and activations expressed as complex numbers, enabling natural modeling of amplitude and phase.
- They rely on specialized machinery, such as Wirtinger calculus for gradient computation and purpose-built complex activation functions, to adapt classical NN architectures to complex-valued data.
- CVNNs excel in applications such as signal processing, imaging, and wireless communications by leveraging joint magnitude-phase effects for improved accuracy.
A complex-valued neural network (CVNN) is an artificial neural network in which the weights, biases, inputs, activations, and outputs are represented as complex numbers. This extension of classical real-valued neural networks is motivated by the natural occurrence of complex-valued data and transformations in domains such as signal processing, communications, image analysis, radar, and biomedical imaging. CVNNs offer representational fidelity for phenomena involving amplitude and phase, enable operations naturally invariant under rotations in the complex plane, and facilitate modeling of joint magnitude-phase effects fundamental to many scientific and engineering disciplines (Bassey et al., 2021, Hammad, 27 Jul 2024).
1. Mathematical Foundations and Core Design Principles
A CVNN generalizes classical neural network architecture by replacing real objects with complex analogues and adapting learning algorithms for the complex field. The forward operation of a CVNN layer is expressed as
$$\mathbf{z}^{(l)} = \sigma\!\left(\mathbf{W}^{(l)} \mathbf{z}^{(l-1)} + \mathbf{b}^{(l)}\right),$$
where $\mathbf{W}^{(l)} \in \mathbb{C}^{m \times n}$, $\mathbf{b}^{(l)} \in \mathbb{C}^{m}$, and $\sigma : \mathbb{C} \to \mathbb{C}$ is a complex-valued activation function. In convolutional architectures, both filters and feature maps are complex (Bruna et al., 2015, Bassey et al., 2021).
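As a minimal sketch, PyTorch's native complex tensors (`torch.cfloat`) suffice to implement such a layer directly; the class name and the simple $1/\sqrt{n}$ initialization below are illustrative only (principled complex initialization is discussed in Section 3):

```python
import torch
import torch.nn as nn

class ComplexLinear(nn.Module):
    """Fully connected layer with complex weights and bias (illustrative sketch)."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Simple 1/sqrt(n) scaling as a placeholder; see Section 3 for
        # principled complex initialization schemes.
        scale = in_features ** -0.5
        self.weight = nn.Parameter(
            scale * torch.randn(out_features, in_features, dtype=torch.cfloat))
        self.bias = nn.Parameter(torch.zeros(out_features, dtype=torch.cfloat))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z^(l) = W z^(l-1) + b, all complex; the nonlinearity sigma is applied outside.
        return z @ self.weight.T + self.bias

layer = ComplexLinear(4, 3)
z = torch.randn(2, 4, dtype=torch.cfloat)  # batch of complex inputs
print(layer(z).dtype)                      # torch.complex64
```

Note that `weight.T` is a plain transpose, not a conjugate transpose, matching the layer equation above.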
Gradient-based learning in CVNNs employs the Wirtinger derivatives to handle non-holomorphic functions:
$$\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\,\frac{\partial f}{\partial y}\right), \qquad \frac{\partial f}{\partial \overline{z}} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\,\frac{\partial f}{\partial y}\right),$$
where $z = x + iy$. For a real-valued loss function $L$, only the derivative with respect to $\overline{z}$ is needed during backpropagation (Sarroff et al., 2015, Abdalla, 2023).
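As a toy illustration, PyTorch's autograd already applies a Wirtinger-style convention to complex tensors, so a real-valued loss of a complex parameter can be minimized with an ordinary descent step (per the PyTorch autograd notes, the returned gradient is suitable for direct gradient descent); the values below are arbitrary:

```python
import torch

# A complex parameter and a real-valued loss L(z) = sum |z - t|^2.
z = torch.tensor([1.0 + 2.0j, -0.5 + 0.5j], requires_grad=True)
t = torch.tensor([0.0 + 1.0j, 1.0 + 0.0j])

for _ in range(50):
    loss = ((z - t).abs() ** 2).sum()  # real-valued loss of a complex variable
    loss.backward()                    # Wirtinger-style gradient lands in z.grad
    with torch.no_grad():
        z -= 0.1 * z.grad              # plain descent step decreases the loss
    z.grad = None

print(z)  # approaches t
```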
2. Complex-Valued Activation Functions
The selection of suitable complex-valued activation functions (CVAFs) is central to CVNN design because the requirements of boundedness and analyticity (holomorphicity) cannot simultaneously be achieved for nontrivial functions over $\mathbb{C}$, per Liouville's theorem (Hammad, 27 Jul 2024). Two major classes of CVAFs are used:
- Split Activation Functions: Apply a real-valued function separately to the real and imaginary parts, e.g., $\sigma(z) = f(\Re(z)) + i\, f(\Im(z))$. This class includes split-ReLU, split-Tanh, split-ELU, and their variants (Mönning et al., 2018, Hammad, 27 Jul 2024).
- Fully Complex Activation Functions: Operate on the complex variable as a whole. Examples include the complex sigmoid, modReLU $\sigma(z) = \mathrm{ReLU}(|z| + b)\, e^{i \arg z}$, the cardioid, amplitude-phase saturating nonlinearities, and newer forms such as Fully Complex Swish and Mish (Hammad, 27 Jul 2024, Scardapane et al., 2018, Scardapane et al., 2019). Illustrative sketches of one function from each class follow this list.
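A hedged PyTorch sketch of split-ReLU and modReLU (here the modReLU bias `b` is a fixed constant for brevity; in practice it is a learned per-neuron parameter):

```python
import torch
import torch.nn.functional as F

def split_relu(z: torch.Tensor) -> torch.Tensor:
    # Split activation: apply ReLU to real and imaginary parts independently.
    return torch.complex(F.relu(z.real), F.relu(z.imag))

def mod_relu(z: torch.Tensor, b: float = -0.5) -> torch.Tensor:
    # modReLU: shrink the magnitude by a (learnable) offset b, preserve the phase:
    # f(z) = ReLU(|z| + b) * z / |z|.
    mag = z.abs()
    eps = 1e-8  # avoid division by zero at the origin
    return F.relu(mag + b) * (z / (mag + eps))

z = torch.randn(5, dtype=torch.cfloat)
print(split_relu(z))
print(mod_relu(z))
```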
Non-parametric activation function families, notably kernel activation functions (KAFs), have been constructed directly in the complex domain, allowing data-driven adaptation of the nonlinearities (Scardapane et al., 2018, Scardapane et al., 2019). The widely linear extension (WL-KAF) increases expressive power by incorporating pseudo-kernel terms, without increasing parameter count (Scardapane et al., 2019).
3. Learning and Optimization Techniques
CVNN optimization closely parallels that of RVNNs but requires complex-aware adaptations:
- Wirtinger Calculus: Used to compute gradients of non-holomorphic functions, facilitating standard backpropagation even for non-analytic CVAFs (Sarroff et al., 2015, Abdalla, 2023).
- Complex Backpropagation: Three main variants are reported (Hammad, 27 Jul 2024):
- Complex Derivative Approach: Requires analytic activations; rarely practical for bounded nonlinearities.
- Partial Derivative (Split) Approach: Differentiates with respect to real and imaginary parts independently.
- Wirtinger-Based with Cauchy-Riemann Equations: Enforces or exploits analytic structure when present.
Regularization and Initialization: Initialization schemes (e.g., complex Glorot) must ensure the desired variance over both real and imaginary components (Barrachina et al., 2023, Abdalla, 2023). Online regularization methods ($\ell_1$, $\ell_{2,1}$) enable adaptive sparsification and model selection in nonstationary environments, such as wireless channel prediction (Ding et al., 2019).
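A sketch of the polar (Rayleigh) initialization recipe, assuming the variance target $\mathrm{Var}(W) = 2/(\text{fan\_in} + \text{fan\_out})$ used in deep complex network implementations; the function name and in-place convention are illustrative:

```python
import math
import torch

def complex_glorot_(weight: torch.Tensor) -> torch.Tensor:
    """Glorot-style polar initialization for a complex weight matrix (sketch)."""
    fan_out, fan_in = weight.shape
    # Rayleigh scale sigma gives E|W|^2 = 2*sigma^2; with uniform phase the mean
    # is zero, so sigma = 1/sqrt(fan_in + fan_out) hits Var(W) = 2/(fan_in+fan_out).
    sigma = 1.0 / math.sqrt(fan_in + fan_out)
    u = torch.rand(weight.shape).clamp_min(1e-12)
    magnitude = sigma * torch.sqrt(-2.0 * torch.log(u))  # inverse-CDF Rayleigh sample
    phase = torch.empty(weight.shape).uniform_(-math.pi, math.pi)
    with torch.no_grad():
        weight.copy_(torch.polar(magnitude, phase))  # magnitude * exp(i * phase)
    return weight

w = torch.empty(64, 128, dtype=torch.cfloat)
complex_glorot_(w)
print(w.abs().pow(2).mean())  # ≈ 2 / (64 + 128)
```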
Optimization methods must carefully manage magnitude and phase information to preserve signal properties; convergence can be hampered by poorly chosen initializations or inappropriate nonlinearities (Barrachina et al., 2023, Mönning et al., 2018).
4. Expressivity, Approximation, and Generalization
Both theoretical and empirical studies demonstrate that CVNNs are universal approximators for complex-valued functions under non-degenerate activation choices. For deep CVNNs (more than one hidden layer), universality holds unless the activation is almost everywhere holomorphic, antiholomorphic, or a polynomial in $z$ and $\overline{z}$ (Voigtlaender, 2020).
Quantitative approximation results show that CVNNs with activations such as modReLU or the cardioid achieve optimal error rates for approximating smooth functions on compact subsets of $\mathbb{C}^d$, with the approximation error decaying at the optimal rate in the number of parameters and matching the scaling of real-valued networks once the doubled real dimension is accounted for (Geuchen et al., 2023, Caragea et al., 2021). In terms of network depth, CVNNs may require fewer layers than their real-valued counterparts to achieve the same error when approximating continuous complex-valued mappings (Leng et al., 16 Feb 2025).
Generalization bounds for CVNNs scale with the spectral complexity—the product of weight matrix spectral norms—demonstrating strong correlation between spectral complexity and out-of-sample error (Chen et al., 2021).
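As a rough sketch of the quantity involved (following Bartlett-style spectral bounds; the precise statement in Chen et al., 2021 carries additional norm-ratio and logarithmic factors):

```latex
% Spectral complexity of an L-layer network with weights W^{(1)}, ..., W^{(L)}:
R \;=\; \prod_{l=1}^{L} \big\| W^{(l)} \big\|_{\sigma},
\qquad
\text{generalization gap} \;\lesssim\; \frac{R}{\sqrt{n}} \cdot \text{(lower-order terms)},
```

where $\|\cdot\|_\sigma$ denotes the spectral norm and $n$ the sample size.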
5. Architectural Diversity and Implementational Considerations
CVNNs encompass fully connected, convolutional, recurrent, and transformer-based architectures. Recent work has established the practical implementation of complex-valued transformers, with custom adaptations of attention mechanisms (e.g., using the real part of the complex inner product for softmax scoring) to preserve phase information (Leng et al., 16 Feb 2025); a minimal sketch of this scoring appears below. Deep complex-valued RBF networks (C-RBF) with specialized parameter initialization schemes achieve robust convergence in high-noise environments on complex-valued signal processing tasks (Soares et al., 15 Aug 2024).
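A hedged sketch of that attention variant, assuming the $\Re\langle q, k\rangle$ scoring described above (tensor shapes and names are illustrative):

```python
import math
import torch

def complex_attention(Q: torch.Tensor, K: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    """Scaled dot-product attention for complex tensors (sketch).

    Scores use the real part of the complex inner product q k^H, so the
    softmax operates on real numbers while values remain complex.
    """
    d = Q.shape[-1]
    # Real part of complex inner products between queries and conjugated keys.
    scores = (Q @ K.conj().transpose(-2, -1)).real / math.sqrt(d)
    attn = torch.softmax(scores, dim=-1)  # real-valued attention weights
    return attn.to(V.dtype) @ V           # complex-weighted sum of values

Q = torch.randn(2, 5, 8, dtype=torch.cfloat)  # (batch, seq, dim)
K = torch.randn(2, 5, 8, dtype=torch.cfloat)
V = torch.randn(2, 5, 8, dtype=torch.cfloat)
print(complex_attention(Q, K, V).shape)       # torch.Size([2, 5, 8])
```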
Specialized modules include:
- Complex Batch Normalization: Jointly whitens the real and imaginary channels; the covariance and mean are computed over the complex vector (often modeled as a 2D real vector) (Abdalla, 2023). A minimal sketch follows this list.
- Complex Pooling/Dropout: Operate on the magnitude or preserve the phase; both require modification relative to their real-valued counterparts (Barrachina et al., 2023).
- Complex Weight Initialization: Ensures appropriate scaling of real/imaginary parts, using Rayleigh (polar) or normal (rectangular) approaches (Abdalla, 2023, Barrachina et al., 2023).
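A whitening-style batch-norm sketch in the spirit of the deep complex networks formulation, treating each complex feature as a 2D real vector; the learnable scale/shift parameters are omitted and the closed-form 2x2 inverse square root is an implementation choice:

```python
import torch

def complex_batch_norm(z: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Whitening-style complex batch norm over the batch dimension (sketch)."""
    z = z - z.mean(dim=0, keepdim=True)
    x, y = z.real, z.imag
    # Entries of the 2x2 covariance of (Re, Im), per feature.
    vxx = (x * x).mean(dim=0) + eps
    vyy = (y * y).mean(dim=0) + eps
    vxy = (x * y).mean(dim=0)
    # Closed-form inverse square root of [[vxx, vxy], [vxy, vyy]].
    s = torch.sqrt(vxx * vyy - vxy * vxy)          # sqrt of determinant
    t = torch.sqrt(vxx + vyy + 2.0 * s)            # trace of the matrix sqrt
    inv = 1.0 / (s * t)
    wxx, wyy, wxy = (vyy + s) * inv, (vxx + s) * inv, -vxy * inv
    return torch.complex(wxx * x + wxy * y, wxy * x + wyy * y)

z = torch.randn(256, 16, dtype=torch.cfloat) * (2 + 1j) + (3 - 1j)
out = complex_batch_norm(z)
print(out.real.var(dim=0).mean(), out.imag.var(dim=0).mean())  # each ≈ 1
```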
Efficient implementation is supported in frameworks such as TensorFlow and PyTorch, which now include complex data types and automatic differentiation over complex tensors (Abdalla, 2023, Barrachina et al., 2023). Open-source libraries have accelerated adoption and reproducibility (Barrachina et al., 2020, Barrachina et al., 2023).
6. Application Domains
The application of CVNNs is especially prevalent where data is naturally complex-valued or becomes complex under common transformations (e.g., the Fourier transform):
- Signal Processing: Speech enhancement, music analysis, radar and sonar, channel estimation, device identification, and equalization tasks. CVNNs explicitly leverage the I/Q (in-phase/quadrature) representation of RF signals, leading to higher accuracy in device fingerprinting and channel state monitoring (Chen et al., 2022, Soares et al., 15 Aug 2024).
- Imaging: MRI fingerprinting, optics (wavefront modulation), holography, phase retrieval, and image reconstruction/denoising in complex Fourier domains (Caragea et al., 2021, Geuchen et al., 2023).
- Wireless Communications: Channel prediction, multi-user detection, beamforming, and joint pilot-precoder-quantization design. CVNN-based transformers achieve superior mean squared error, detection accuracy, and sum-rate optimization in 5G MIMO contexts, often with reduced computational complexity and fewer parameters compared to RVNN alternatives (Leng et al., 16 Feb 2025, Ding et al., 2019).
- Classification Tasks: Non-circular complex input datasets (i.e., where real and imaginary parts are correlated or have different variances) benefit significantly from CVNNs, which outperform RVNNs and exhibit reduced overfitting (Barrachina et al., 2020).
7. Interpretability, Calibration, and Open Challenges
A recognized challenge in CVNN research is interpretability of the learned decision surfaces and reliable calibration of probabilistic outputs. Recent work adapts Newton–Puiseux theory to fit local polynomial surrogates to CVNN decision boundaries, decomposing them into fractional-power (Puiseux) series. Dominant Puiseux coefficients serve as phase-aligned curvature descriptors, enabling closed-form estimates of robustness and more precise temperature scaling for calibration. This approach yields calibration error reductions compared to conventional temperature scaling and reveals the intrinsic multi-sheeted, phase-sensitive structure of CVNN boundaries (Migus, 27 Apr 2025).
Several open questions and unresolved challenges remain:
- Bounded, Nonlinear, and Holomorphic Activation Functions: The development of CVAFs that balance analytic restrictions, numerical stability, and expressivity remains an active area of research (Hammad, 27 Jul 2024, Abdalla, 2023).
- Training Stability and Initialization: Sensitivity to weight initialization, architecture depth, and learning rates can hamper convergence; advanced or adaptive initialization and optimization approaches are called for (Barrachina et al., 2023, Mönning et al., 2018).
- Computational Complexity: Deep CVNNs are costlier than comparable real-valued networks, since each complex multiplication expands into four real multiplications ($(A + iB)(x + iy) = (Ax - By) + i(Bx + Ay)$ in fully-connected layers), necessitating architectural and algorithmic refinements for deployment in low-power or real-time applications (Mayer et al., 2023).
- Library Support: Although modern libraries offer basic support for complex tensors and operations, development of robust, feature-complete tools tailored to CVNNs is ongoing (Abdalla, 2023, Barrachina et al., 2023).
- Calibration and Robustness: Analytic frameworks capable of quantifying and improving CVNN reliability in safety-critical or high-uncertainty domains are in early stages (Migus, 27 Apr 2025).
Conclusion
Complex-Valued Neural Networks provide a mathematically principled and practically potent extension of classical neural models, enabling holistic modeling of signals where phase and amplitude are essential. Advances in activation function design, optimization theory, expressivity analysis, architectural innovation, and interpretability have accelerated adoption in several scientific and engineering fields. Ongoing research aims to resolve foundational challenges in nonlinearity, training dynamics, implementation, and theoretical understanding, ensuring CVNNs continue to evolve as indispensable tools for modeling and processing complex-valued phenomena.