
modReLU Activation Functions

Updated 12 October 2025
  • modReLU activation functions are a set of generalized ReLU functions that incorporate smoothness, phase preservation, and adaptive parameters for real and complex inputs.
  • They extend the canonical ReLU by introducing smooth CDFs, parametric cubic enhancements, matrix-valued generalizations, and capped forms to improve gradient flow and robustness.
  • Their versatile formulations lead to improved training stability, higher classification accuracy, and optimal approximation properties in complex-valued neural network applications.

The modReLU activation function is a family of modified or generalized rectified linear unit functions central to both real- and complex-valued neural networks, with forms and parameterizations that enhance smoothness, adaptability, expressivity, and robustness. modReLU functions appear as smooth ReLU variants, phase-preserving complex-valued nonlinearities, parametric cubic enhancements, normalized and matrix-valued extensions, and capped activations for adversarial robustness.

1. Mathematical Definitions and Variants

modReLU refers to a collection of activation functions that build upon the canonical ReLU, $f(x) = \max(0, x)$, by introducing parameters and structure for extended behavior. The central forms include (a minimal code sketch of several variants follows this list):

  • Smooth modReLU: Expressed as $f_\alpha(x) = x \cdot \Delta_\alpha(x)$, where $\Delta_\alpha(x)$ is a smooth cumulative distribution function (CDF) such as $1 - e^{-\alpha x}$ for $x > 0$ and $\alpha > 0$. In the limit $\alpha \to \infty$, the function recovers the hard-step ReLU (Farhadi et al., 2019).
  • Complex-valued modReLU: For $z \in \mathbb{C}$, $\sigma(z) = \mathrm{ReLU}(|z| + b) \cdot \mathrm{sgn}(z)$, where $b$ is a bias and $\mathrm{sgn}(z) = z/|z|$ for $z \ne 0$ (Parhi et al., 2019, Caragea et al., 2021). This formulation thresholds the magnitude but preserves the phase, an essential property for applications with complex data.
  • Parametric modReLU: Enhanced by higher-order terms, $f^{(l)}(x; c_1^{(l)}, c_2^{(l)}) = \theta(x)\bigl[c_1^{(l)} x + y\, c_2^{(l)} x^3\bigr]$, where $\theta(x)$ is the unit step function, $c_1^{(l)}$ and $c_2^{(l)}$ are layer-dependent parameters, and $y$ is a global scale (Yevick, 29 Mar 2024).
  • Capped modReLU: $a(z, C) = \max(0, \min(z, C))$, introducing an upper bound $C$ to limit activation output and impede adversarial amplification (Sooksatra et al., 6 May 2024).
  • Matrix-valued modReLU: Generalizes ReLU to matrix-operator activation, where each output can depend on trainable piecewise constant functions of the input, leading to richer cross-neuron adaptivity (Liu et al., 2021).
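
As a concrete reference for the definitions above, a minimal NumPy sketch of three of these variants is given below. The function names and default values ($\alpha = 5$, $b = -0.5$, $C = 6$) are illustrative choices, not parameters prescribed by the cited papers.

```python
import numpy as np

def smooth_modrelu(x, alpha=5.0):
    """Smooth modReLU: x * (1 - exp(-alpha * x)) on the positive part, 0 otherwise.
    As alpha -> infinity this approaches the hard ReLU."""
    xp = np.maximum(x, 0.0)
    return xp * (1.0 - np.exp(-alpha * xp))

def complex_modrelu(z, b=-0.5, eps=1e-12):
    """Complex-valued modReLU: ReLU(|z| + b) * z / |z|.
    Thresholds the magnitude while leaving the phase of z untouched."""
    mag = np.abs(z)
    return np.maximum(mag + b, 0.0) * z / (mag + eps)  # eps guards division by zero

def capped_modrelu(z, cap=6.0):
    """Capped modReLU: max(0, min(z, C)) bounds the output at C."""
    return np.clip(z, 0.0, cap)

print(smooth_modrelu(np.linspace(-2.0, 2.0, 5)))
print(complex_modrelu(np.array([1 + 1j, -0.2 + 0.1j, 3 - 4j])))  # small magnitudes are zeroed
print(capped_modrelu(np.array([-1.0, 3.0, 10.0])))               # [0. 3. 6.]
```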

2. Smoothness, Adaptivity, and Normalization

Smooth modReLU variants replace the non-differentiable Heaviside step with a smooth CDF, such as the exponential or logistic CDF, leading to continuous derivatives and enhanced gradient flow in deep networks (Farhadi et al., 2019). Adaptive versions further employ trainable smoothness or shape parameters (e.g., $\alpha$) for each neuron, which are updated during training:

  • Back-propagation update for the smoothing parameter:

$\alpha \leftarrow \alpha - \gamma\, \dfrac{\partial \mathrm{loss}}{\partial \alpha}$

with possible reparameterization to enforce positivity.
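
A minimal PyTorch sketch of a smooth modReLU layer with a trainable per-neuron smoothness parameter is shown below. The softplus reparameterization used to keep $\alpha$ positive is one possible choice, not the specific scheme used in the cited work.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmoothModReLU(nn.Module):
    """Smooth modReLU with a trainable per-neuron smoothness parameter alpha."""

    def __init__(self, num_features: int, init_alpha: float = 5.0):
        super().__init__()
        # store an unconstrained parameter; softplus(raw_alpha) == init_alpha at start
        raw = torch.log(torch.expm1(torch.tensor(init_alpha)))
        self.raw_alpha = nn.Parameter(raw * torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        alpha = F.softplus(self.raw_alpha)          # reparameterization keeps alpha > 0
        xp = torch.clamp(x, min=0.0)                # zero out the negative part
        return xp * (1.0 - torch.exp(-alpha * xp))  # smooth gate; hard ReLU as alpha grows

# alpha receives gradients through ordinary back-propagation:
act = SmoothModReLU(num_features=8)
loss = act(torch.randn(4, 8)).pow(2).mean()
loss.backward()
print(act.raw_alpha.grad.shape)  # torch.Size([8])
```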

Static activation normalization, as with the "tilted ReLU" ($f_h(x) = |x| - \sqrt{2/\pi}$), ensures that activation outputs possess zero mean and unit variance under Gaussian input, preserving dynamical isometry and supporting robust convergence, notably permitting deeper architectures to be reliably trained (Richemond et al., 2019).

3. modReLU in Complex-Valued Neural Networks

modReLU is the activation of choice in complex-valued neural networks (CVNNs) due to its phase equivariance, defined by $\sigma(e^{i\theta}z) = e^{i\theta}\sigma(z)$, ensuring rotation compatibility in the complex plane (Caragea et al., 2021). This property is central in domains where phase carries semantic information (e.g., MRI fingerprinting).
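
The equivariance can be verified numerically. The sketch below reuses a hypothetical complex_modrelu helper matching the definition in Section 1 (the bias value is illustrative):

```python
import numpy as np

def complex_modrelu(z, b=-0.5, eps=1e-12):
    mag = np.abs(z)
    return np.maximum(mag + b, 0.0) * z / (mag + eps)

rng = np.random.default_rng(0)
z = rng.normal(size=16) + 1j * rng.normal(size=16)
theta = 0.73  # arbitrary rotation angle

lhs = complex_modrelu(np.exp(1j * theta) * z)   # rotate the input, then activate
rhs = np.exp(1j * theta) * complex_modrelu(z)   # activate, then rotate the output
print(np.allclose(lhs, rhs))                    # True: the phase is treated equivariantly
```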

Theoretical analysis confirms that modReLU-equipped CVNNs approximate any $C^n$-regular function on compact subsets of $\mathbb{C}^d$ with optimal rates (up to logarithmic factors):

$\text{Approximation error} \leq C\,\varepsilon \quad \text{with network size} \sim O(\varepsilon^{-2d/n})$

for error tolerance $\varepsilon$ and input dimension $d$. The doubling of the exponent ($2d/n$ rather than $d/n$) compared to real networks follows from the identification $\mathbb{C}^d \cong \mathbb{R}^{2d}$ (Caragea et al., 2021, Geuchen et al., 2023). The optimality of these rates depends on the activations being non-polyharmonic and sufficiently smooth (Geuchen et al., 2023).

4. Adaptation, Parametric, and Matrix Extensions

Evolutionary search and gradient descent can be used to optimize modReLU parameters (e.g., the bias $b$ or scaling $\alpha$) across architectures or tasks, yielding robust, adaptive activations and improved test accuracy over standard fixed functions (Bingham et al., 2020). In applications, modReLU is commonly implemented as a parametric module:

$f(z; \theta) = \mathrm{ReLU}(\alpha |z| + b) \cdot (z/|z|)$

This parameterization supports architectural and dataset-specific tuning, enhancing performance; a trainable sketch of this form follows the list below. Matrix-valued modReLU generalizes fixed activations by encoding the activation as a trainable operator (possibly non-diagonal) over preactivations (Liu et al., 2021):

  • Diagonal, tri-diagonal, or general piecewise constant matrix forms
  • Parameters trained jointly with network weights and biases
  • Empirical accuracy often exceeds that of standard ReLU networks in classification and function approximation tasks
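
Returning to the parametric form $f(z;\theta) = \mathrm{ReLU}(\alpha|z| + b)\cdot z/|z|$ above, a minimal trainable sketch is shown below (assuming PyTorch; the initialization values are illustrative, and $\alpha$ and $b$ are learned jointly with the network weights):

```python
import torch
import torch.nn as nn

class ParametricModReLU(nn.Module):
    """Parametric complex modReLU: f(z) = ReLU(alpha * |z| + b) * z / |z|."""

    def __init__(self, num_features: int, init_alpha: float = 1.0, init_b: float = -0.1):
        super().__init__()
        self.alpha = nn.Parameter(torch.full((num_features,), init_alpha))
        self.b = nn.Parameter(torch.full((num_features,), init_b))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z is a complex tensor of shape (..., num_features)
        mag = z.abs()
        gate = torch.relu(self.alpha * mag + self.b)
        return gate * z / (mag + 1e-12)   # small constant avoids division by zero at z = 0

act = ParametricModReLU(num_features=8)
out = act(torch.randn(4, 8, dtype=torch.cfloat))
print(out.shape, out.dtype)  # torch.Size([4, 8]) torch.complex64
```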

5. Regularization, Expressivity, and Spline Theory

Theoretical frameworks connect modReLU to linear spline representations in Banach spaces, with regularization terms such as the $\ell^1$ path-norm:

$R(\theta) = \sum_k |v_k|\, |w_k|$

and quadratic weight-decay equivalents. modReLU, similar to ReLU and leaky ReLU (both (0,1,2)-power activations), induces optimal spline fits, controlling function class complexity via underlying operator smoothness (Parhi et al., 2019). Complex modReLU fulfills generalized admissibility and scaling properties, preserving theoretical guarantees.
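
A minimal sketch of this penalty for a single-hidden-layer network with scalar input and output is given below (assuming PyTorch; the helper name and layer sizes are illustrative):

```python
import torch
import torch.nn as nn

def l1_path_norm(hidden: nn.Linear, output: nn.Linear) -> torch.Tensor:
    """R(theta) = sum_k |v_k| |w_k| for x -> sum_k v_k * act(w_k * x + b_k)."""
    w = hidden.weight.squeeze(1)   # input weights w_k, shape (K,)
    v = output.weight.squeeze(0)   # output weights v_k, shape (K,)
    return (w.abs() * v.abs()).sum()

hidden = nn.Linear(1, 32)          # scalar input, K = 32 hidden units
output = nn.Linear(32, 1)          # scalar output
penalty = l1_path_norm(hidden, output)
# training objective: data-fitting loss + lambda * penalty
print(penalty.item())
```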

Skip connections—implemented as low-degree polynomials or residual bias terms—carry over from the ReLU activation landscape to modReLU settings, supporting the learning of low-frequency (or affine) components essential for stable and expressive representation (Parhi et al., 2019).
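
A minimal sketch of such a block, pairing a complex modReLU branch with a linear skip path, is shown below (assuming PyTorch; the layer sizes, initialization scale, and bias value are illustrative):

```python
import torch
import torch.nn as nn

class ModReLUBlockWithSkip(nn.Module):
    """Nonlinear modReLU branch plus a linear skip path that carries the
    low-frequency / affine component of the representation directly."""

    def __init__(self, dim: int, hidden: int, b: float = -0.1):
        super().__init__()
        scale = dim ** -0.5
        self.w1 = nn.Parameter(scale * torch.randn(hidden, dim, dtype=torch.cfloat))
        self.w2 = nn.Parameter(scale * torch.randn(dim, hidden, dtype=torch.cfloat))
        self.skip = nn.Parameter(scale * torch.randn(dim, dim, dtype=torch.cfloat))  # skip path
        self.b = b

    def modrelu(self, z: torch.Tensor) -> torch.Tensor:
        mag = z.abs()
        return torch.relu(mag + self.b) * z / (mag + 1e-12)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        branch = self.modrelu(z @ self.w1.T) @ self.w2.T
        return branch + z @ self.skip.T   # linear skip carries the affine component

block = ModReLUBlockWithSkip(dim=8, hidden=32)
print(block(torch.randn(4, 8, dtype=torch.cfloat)).shape)  # torch.Size([4, 8])
```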

6. Practical Impact: Accuracy, Robustness, and Applications

modReLU variants have demonstrated empirical gains in accuracy, representation richness, and training stability:

  • Adaptive cubic extensions yield improved MNIST test accuracy (0.982–0.986), exceeding standard ReLU and swish, with tradeoffs in convergence stability across the parameter space (Yevick, 29 Mar 2024).
  • Smooth modReLU mitigates "dead neuron" effects and enhances learning in early layers via greater variability and curvature adaptivity (Farhadi et al., 2019).
  • Matrix-valued modReLU achieves lower approximation errors for oscillatory functions and higher classification accuracy on benchmarks such as CIFAR-10 relative to canonical ReLU (Liu et al., 2021).
  • Capped modReLU, $a(z, C) = \max(0, \min(z, C))$, restricts adversarial perturbation amplification, yielding substantial improvements in adversarial robustness; training with adversarial examples further enhances this effect without major loss of standard accuracy. Sensitivity maps confirm reduced vulnerability at lower activation caps (Sooksatra et al., 6 May 2024). A small numerical illustration follows this list.
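
The amplification-limiting effect can be illustrated directly: for any pair of inputs, the output gap of the capped activation never exceeds the ReLU gap and is itself bounded by the cap $C$. A small NumPy sketch (weights, perturbation size, and cap are arbitrary illustrative values):

```python
import numpy as np

def capped_modrelu(z, cap=6.0):
    return np.clip(z, 0.0, cap)

rng = np.random.default_rng(1)
W = 5.0 * rng.normal(size=(64, 32))   # deliberately large weights amplify perturbations
x = rng.normal(size=32)
delta = 0.05 * rng.normal(size=32)    # small input perturbation

pre_clean, pre_adv = W @ x, W @ (x + delta)
relu_gap = np.abs(np.maximum(pre_adv, 0.0) - np.maximum(pre_clean, 0.0)).max()
capped_gap = np.abs(capped_modrelu(pre_adv) - capped_modrelu(pre_clean)).max()
print(relu_gap, capped_gap)  # the capped gap never exceeds the ReLU gap, nor the cap
```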

CVNNs using modReLU activation have proven effective for applications with natural complex-valued data such as MRI, offering phase-equivariant learning and optimal approximation rates (Caragea et al., 2021, Geuchen et al., 2023).

7. Limitations, Tradeoffs, and Future Directions

modReLU's key limitations center on parameter tuning, tradeoffs between accuracy and convergence (especially with strong nonlinear augmentations), and the curse of dimensionality in high-dimensional approximation. While the introduction of smoothness, adaptive parameters, and caps mitigates issues of non-differentiability and adversarial vulnerability, excessively tight constraints can lead to underfitting or vanishing gradients (Sooksatra et al., 6 May 2024). The optimality of modReLU for CVNN expressivity may be compromised if smoothness or non-polyharmonicity properties are lost (Geuchen et al., 2023).

Matrix and parametric generalizations introduce new layers of trainable parameters, increasing model complexity and computational requirements, although empirical results consistently suggest favorable efficiency/accuracy tradeoffs. Future work is expected to explore further extensions to activation function architecture, especially for complex-valued and adversarially robust models.


modReLU activation functions comprise a versatile family instrumental for advancing the expressivity, robustness, and adaptability of both real- and complex-valued neural architectures. Key developments include smooth and normalized variants for stable, deep training; parametric cubic enhancements for increased accuracy; phase-preserving forms for complex data; matrix-valued generalizations for learnable nonlinearity; and capped forms for adversarial defense. The theoretical and empirical body confirms modReLU’s central role in the modern landscape of adaptive activation design, with ongoing research directed at resolving tradeoffs inherent in parameterization, convergence, and approximation complexity.
