
Complex-Valued Neural Networks

Updated 30 July 2025
  • Complex-valued neural networks are models that process data in the complex domain, jointly capturing amplitude and phase information.
  • They extend conventional architectures by incorporating complex weights, specialized activation functions, and optimization via Wirtinger calculus.
  • CVNNs offer practical advantages in signal processing, communications, and medical imaging, delivering improved interpretability and efficiency.

A complex-valued neural network (CVNN) is a parametric learning model in which activations, weights, and signal transformations are defined in the complex domain. In contrast to conventional real-valued neural networks, CVNNs process data as elements of ℂ, enabling the joint modeling of amplitude and phase. This feature is especially advantageous for problems in signal processing, communications, medical imaging, and other domains where complex representations arise naturally (for example, via the Fourier transform or analytic signals). CVNNs distinguish themselves through their unique theoretical foundations, architectural choices, optimization strategies, and their ability to represent richer patterns compared to real-valued networks.

1. Theoretical Foundations: Differentiability and Network Composition

Central to the formulation of CVNNs is the distinction between holomorphic (complex-differentiable) and non-holomorphic functions. A complex function $f(z) = u(x, y) + i\,v(x, y)$, with $z = x + iy$, is holomorphic if it satisfies the Cauchy–Riemann equations,

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \quad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}.$$

Holomorphic functions permit a chain rule analogous to the real case and support theoretically elegant learning algorithms. However, Liouville’s theorem imposes a key restriction: any bounded entire (holomorphic everywhere) function is necessarily constant. This necessitates either sacrificing analyticity (holomorphicity) to achieve useful nonlinearities, or adopting unbounded activations with possible singularities (as in the complex tanh or complex cardioid (Virtue et al., 2017)).
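
As a quick numerical illustration of this tension, the snippet below (a minimal NumPy sketch, not drawn from the cited works) evaluates the complex tanh near one of its poles at $i\pi/2$, where its magnitude diverges:

```python
import numpy as np

# Liouville's theorem rules out bounded, everywhere-holomorphic nonlinearities,
# so analytic activations such as the complex tanh are unbounded with poles.
# tanh has poles at z = i*(pi/2 + k*pi); approaching one shows the blow-up.
for eps in (1e-1, 1e-3, 1e-6):
    z = 1j * (np.pi / 2 - eps)   # a point just below the pole at i*pi/2
    print(f"eps={eps:g}: |tanh(z)| = {abs(np.tanh(z)):.3e}")  # roughly 1/eps
```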

To compute gradients for complex parameters in non-holomorphic settings, CVNNs employ Wirtinger calculus (also known as $\mathbb{CR}$-calculus), which defines two generalized partial derivatives:

$$\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i \frac{\partial f}{\partial y}\right), \quad \frac{\partial f}{\partial \bar{z}} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i \frac{\partial f}{\partial y}\right).$$

This framework permits gradient descent on real-valued loss functions even when the network is not everywhere analytic, and it underpins most modern CVNN optimization (Sarroff et al., 2015, Barrachina et al., 2023, Hammad, 27 Jul 2024).
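
As a concrete check of these definitions, the sketch below (illustrative NumPy code, not from the cited works) estimates both Wirtinger derivatives by finite differences for the non-holomorphic map $f(z) = |z|^2$, for which $\partial f / \partial z = \bar{z}$ and $\partial f / \partial \bar{z} = z$:

```python
import numpy as np

def f(z):
    # Non-holomorphic real-valued function f(z) = |z|^2 = z * conj(z)
    return (z * np.conj(z)).real

def wirtinger(f, z, h=1e-6):
    # Central differences along the real and imaginary axes, combined as
    # d/dz = (d/dx - i d/dy)/2 and d/dzbar = (d/dx + i d/dy)/2
    df_dx = (f(z + h) - f(z - h)) / (2 * h)
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (df_dx - 1j * df_dy), 0.5 * (df_dx + 1j * df_dy)

z0 = 1.0 + 2.0j
d_dz, d_dzbar = wirtinger(f, z0)
print(d_dz, np.conj(z0))    # both approximately 1 - 2j
print(d_dzbar, z0)          # both approximately 1 + 2j
```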

2. Architectures, Activation Functions, and Building Blocks

2.1. Architectural Structures

CVNNs extend the standard multilayer perceptron (MLP), convolutional, and recurrent architectures by promoting all weights, biases, and activations to the complex field. For convolutional or dense layers, a typical transformation is:

$$z^{(l+1)} = W^{(l)} z^{(l)} + b^{(l)}, \quad W^{(l)} \in \mathbb{C}^{m \times n},$$

where $z^{(l)}$ is the vector of activations at layer $l$ (Sarroff et al., 2015, Dramsch et al., 2019, Chatterjee et al., 2023).
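
For illustration, such a layer can be realized with purely real arithmetic by splitting $W = A + iB$ and $z = x + iy$, since $Wz = (Ax - By) + i(Ay + Bx)$; the sketch below (hypothetical NumPy code, not a reference implementation) checks this against a native complex matrix product:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 3
A = rng.standard_normal((m, n))   # Re(W)
B = rng.standard_normal((m, n))   # Im(W)
b = rng.standard_normal(m) + 1j * rng.standard_normal(m)

def complex_dense(z):
    # (A + iB)(x + iy) + b = (Ax - By) + i(Ay + Bx) + b
    x, y = z.real, z.imag
    return (A @ x - B @ y) + 1j * (A @ y + B @ x) + b

z = rng.standard_normal(n) + 1j * rng.standard_normal(n)
assert np.allclose(complex_dense(z), (A + 1j * B) @ z + b)
```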

Hybrid models can combine real- and complex-valued blocks, transferring information between domains via conversion functions (e.g., forming $z = x_1 + i x_2$, or mapping $x \to e^{i\pi x}$). These architectures, such as the Hybrid Neural Network (HNN), can be optimized using neural architecture search tailored to this enlarged solution space (Young et al., 4 Apr 2025).

2.2. Activation Functions

The core challenge is constructing activation functions (AFs) that are holomorphic, bounded, or otherwise possess desirable analytic properties. The most ubiquitous classes are:

  • Split-type ("Type A"): Apply a real-valued AF $g$ separately to the real and imaginary parts: $h(z) = g(\Re(z)) + i\,g(\Im(z))$.
  • Amplitude-phase ("Type B"): Apply a nonlinearity to $|z|$ or $\arg(z)$, e.g. $h(z) = g(|z|)\,e^{i\arg(z)}$.
  • Fully complex analytic: Examples include the complex tanh, cardioid, and modReLU:

$$\text{modReLU:}\quad \sigma(z) = \text{ReLU}(|z| - b)\,\frac{z}{|z|}$$

These satisfy phase-homogeneity and are Lipschitz, enabling mathematically tractable approximation guarantees (Caragea et al., 2021, Geuchen et al., 2023).

  • Kernel Activation Functions (KAFs): Non-parametric activations based on trainable expansions of kernel functions over the complex plane, allowing each neuron to learn a highly flexible nonlinearity (Scardapane et al., 2018).

Amplitudes and phases may also be nonlinearly mixed in specially crafted functions, such as the cardioid activation:

$$f(z) = \frac{1}{2}\bigl(1+\cos\theta\bigr)\,z, \qquad \theta = \arg(z),$$

which zeroes the negative real axis, analogous to ReLU but adapted for phase-invariant attenuation (Virtue et al., 2017).
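
The sketch below (an illustrative NumPy reading of the two formulas above, with the offset $b$ fixed and the zero-input case handled explicitly) implements modReLU and the cardioid:

```python
import numpy as np

def modrelu(z, b):
    # sigma(z) = ReLU(|z| - b) * z / |z|: shrink the modulus by b, keep the phase
    mag = np.abs(z)
    scale = np.where(mag > 0, np.maximum(mag - b, 0.0) / np.maximum(mag, 1e-12), 0.0)
    return scale * z

def cardioid(z):
    # f(z) = (1 + cos(arg z)) / 2 * z: full pass-through on the positive real
    # axis, complete attenuation on the negative real axis
    return 0.5 * (1.0 + np.cos(np.angle(z))) * z

z = np.array([2 + 2j, -1 + 0j, 0.1j])
print(modrelu(z, b=0.5))
print(cardioid(z))
```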

3. Optimization and Backpropagation in the Complex Domain

Training CVNNs requires adapting gradient-based optimization, because loss functions are typically real-valued and the underlying mappings may be non-analytic. The primary approaches are:

  • Wirtinger-calculus backpropagation: Propagate gradients with the generalized chain rule

$$\frac{\partial}{\partial z}(h \circ g) = \frac{\partial h}{\partial g}\frac{\partial g}{\partial z} + \frac{\partial h}{\partial \bar{g}}\frac{\partial \bar{g}}{\partial z}.$$

When the function is holomorphic, all conjugate terms vanish and the rule reduces to the ordinary chain rule.

  • Split derivatives approach: Compute separate partials for the real and imaginary parts, applying the chain rule in $\mathbb{R}^{2n}$ (Hammad, 27 Jul 2024); a minimal sketch follows this list.
  • Cauchy–Riemann constraint: When the network is analytic, enforce $\frac{\partial f}{\partial \bar{z}} = 0$ and use standard holomorphic gradients.
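
A minimal sketch of the split-derivatives approach (illustrative code with hypothetical data; the loss is $L(w) = \operatorname{mean}|wx - y|^2$ for a single complex weight $w$):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(64) + 1j * rng.standard_normal(64)
w_true = 0.7 - 1.3j
y = w_true * x

w, lr = 0.0 + 0.0j, 0.1
for _ in range(200):
    r = w * x - y                                # residuals
    # Partials of L with respect to Re(w) and Im(w), from L = mean(r * conj(r))
    dL_dre = 2 * np.mean((r * np.conj(x)).real)
    dL_dim = 2 * np.mean((r * np.conj(x)).imag)
    # Recombined as dL_dre + i*dL_dim, this equals twice the conjugate Wirtinger
    # derivative dL/dzbar, the steepest-descent direction for a real-valued loss.
    w -= lr * (dL_dre + 1j * dL_dim)
print(w, w_true)                                 # w converges to w_true
```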

Weight initialization extends schemes such as Glorot/Xavier by controlling the variance of complex weights, typically drawing the phase uniformly and the modulus from a Rayleigh or other suitable real distribution (Abdalla, 2023, Barrachina et al., 2023).
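
One possible realization is sketched below; the Rayleigh scale $\sigma = 1/\sqrt{\mathrm{fan}_{in} + \mathrm{fan}_{out}}$ is an assumed choice that mirrors the real-valued Glorot variance, not a prescription from the cited works:

```python
import numpy as np

def complex_glorot(fan_in, fan_out, rng=None):
    # Modulus ~ Rayleigh(sigma), phase ~ Uniform[-pi, pi); with this sigma,
    # E[|w|^2] = 2*sigma^2 = 2 / (fan_in + fan_out).
    rng = np.random.default_rng() if rng is None else rng
    sigma = 1.0 / np.sqrt(fan_in + fan_out)
    modulus = rng.rayleigh(scale=sigma, size=(fan_out, fan_in))
    phase = rng.uniform(-np.pi, np.pi, size=(fan_out, fan_in))
    return modulus * np.exp(1j * phase)

W = complex_glorot(128, 64)
print(W.dtype, np.mean(np.abs(W) ** 2))   # complex128, roughly 2/(128+64)
```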

4. Empirical Performance, Applications, and Interpretability

CVNNs have shown competitive or superior performance over their real-valued counterparts in several settings:

| Application | CVNN Advantage | Notable Results |
| --- | --- | --- |
| MRI fingerprinting | Preserves phase, greater accuracy | Cardioid CVNNs outperform real nets (T1/T2 NRMSE) (Virtue et al., 2017, Caragea et al., 2021, Geuchen et al., 2023) |
| Electroencephalography | Reduces parameter count, boosts accuracy | Complex/real hybrids yield higher accuracy and fewer parameters (Du et al., 2022) |
| Seismic/electrical data | Encodes phase for non-stationary data | Parameter reduction; improved signal interpretation (Dramsch et al., 2019) |
| Image patch matching | Phase/amplitude descriptors | Lower FPR95 rates than SIFT/real nets (Jiang et al., 2018) |
| Medical image segmentation | Enhanced feature capture | Consistent outperformance vs. real nets of equal size (Chatterjee et al., 2023) |

Complex-valued filters have been found to exhibit superior spectral selectivity and interpretability in their learned representations; for example, filter magnitude responses in CVNNs often match the domain structure (e.g., harmonics in sawtooth/analytic datasets), whereas real-valued counterparts show noisier responses (Sarroff et al., 2015).

5. Practical Considerations and Computational Aspects

CVNNs are fundamentally more expressive per parameter, as a single complex multiplication encodes both scaling and rotation. This can translate into lower parameter counts for equivalent functions; e.g., a single complex neuron can implement the XOR function, whereas a real-valued network (RVNN) requires at least a multilayer setup (Mayer et al., 2023).

However, each complex parameter entails two real degrees of freedom, which roughly doubles the raw weight count per neuron. Computational complexity analyses show that a shallow feedforward CVNN with $N$ hidden neurons, $P$ input features, and $R$ output features requires $4N(P+R)$ real multiplications per inference, with further scaling considerations for deeper or RBF-type architectures (Mayer et al., 2023). For low-power or resource-constrained deployment, architectures that minimize multiplications (such as the C-RBF) are recommended (Mayer et al., 2023).
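
As an illustrative accounting of this count (reading $P$ as the number of inputs and $R$ as the number of outputs): one complex product costs four real multiplications,

$$(a+ib)(c+id) = (ac - bd) + i(ad + bc),$$

so a hidden layer mapping $P$ complex inputs to $N$ neurons uses $4NP$ real multiplications, the output layer mapping $N$ neurons to $R$ outputs uses $4NR$, and the total is $4N(P+R)$.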

Optimization challenges include:

  • Hyperparameter sensitivity, especially due to activation singularities and the need for specialized regularization to prevent overfitting.
  • The necessity for adapted batch normalization and pooling schemes respecting complex-valued statistics (Abdalla, 2023).
  • Compatibility and support in mainstream deep learning frameworks, with ongoing development of native complex tensors and automatic differentiation in PyTorch and TensorFlow (Smith, 2023, Abdalla, 2023); a minimal example follows this list.
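
As a small illustration of this native support (assuming a recent PyTorch version), the sketch below builds complex tensors, applies a complex matrix product, and backpropagates a real-valued loss:

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 3, dtype=torch.cfloat, requires_grad=True)
z = torch.randn(3, dtype=torch.cfloat)

out = W @ z                            # complex matrix-vector product
loss = (out * out.conj()).real.sum()   # real-valued loss: sum of |out_i|^2
loss.backward()                        # complex autograd (Wirtinger-style)
print(W.grad.dtype, W.grad.shape)      # torch.complex64, torch.Size([4, 3])
```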

6. Advances, Hybrid Models, and Future Outlook

Hybrid architectures, combining real- and complex-valued processing (via parallel paths and domain conversion), achieve improved cross-entropy loss and parameter efficiency in tasks such as noisy audio classification, compared to real-only networks (Young et al., 4 Apr 2025). Domain conversion functions and new parameterized complex activation functions allow such hybrids to adapt representation dynamically to task requirements.

Recent methodological innovations include:

  • Flexible nonparametric activation functions based on complex-valued kernel methods, enhancing model expressivity (Scardapane et al., 2018).
  • Multi-view learning approaches, such as Steinmetz Neural Networks, where separate real and imaginary subnetworks with Hilbert consistency penalties yield interpretable and robust representations with provably tighter generalization gaps (Venkatasubramanian et al., 16 Sep 2024).
  • Universal approximation and optimal error rates for complex-valued networks employing activations such as modReLU and the cardioid, demonstrated to match real-valued network rates modulo the increased input dimension associated with $\mathbb{C}^d \simeq \mathbb{R}^{2d}$ (Caragea et al., 2021, Geuchen et al., 2023).
  • Closed-form generalization bounds for CVNNs that scale with spectral complexity (the product of layer spectral norms), guiding architectural design and regularization (Chen et al., 2021).

Ongoing research focuses on developing bounded, smooth, and analytic complex-valued activations, improved methods for weight initialization and normalization in the complex domain, and native support in deep learning platforms. The interplay between amplitude and phase, phase-aware regularization, and hybrid domain architectures remains an active area for both theoretical investigation and empirical study.


In sum, complex-valued neural networks provide a principled, expressive framework for learning with signals where phase and amplitude jointly encode critical information. By leveraging complex arithmetic, dedicated activation functions, and customized optimization methods based on Wirtinger derivatives, CVNNs offer both theoretical depth and empirical utility across diverse domains involving complex data. Performance gains, improved parameter efficiency, and more interpretable representations underlie their adoption in natural and engineering signal domains, with continued advances in theory, applications, and software expected to further solidify their role in scientific machine learning.