
α-ReLU: Power Activation in Neural Networks

Updated 12 June 2026
  • α-ReLU is a family of power activation functions defined as [max{0,x}]^α, with α controlling smoothness, homogeneity, and growth behavior.
  • It underpins theoretical analysis and practical design in neural networks, benefiting function approximation, PDE solvers, and control barrier synthesis.
  • Variants such as two-slope Leaky α-ReLU and sparsifying transformations optimize computational tractability while achieving minimax-optimal rates in regression tasks.

The α-ReLU, or power ReLU, refers to a parametric family of activation functions of the form $\varphi_\alpha(x) = [\max\{0, x\}]^\alpha$, where $\alpha > 0$ is the exponent parameter. This class generalizes the standard ReLU ($\alpha = 1$) and underpins recent advances in the theoretical analysis and practical design of neural architectures for function approximation, PDE solvers, control barrier functions, and sparse regularized learning. Its properties (homogeneity, smoothness, and approximation-theoretic behavior) are sensitive to the value of $\alpha$, which enables fine-grained control over regularity and functional capacity.

1. Formal Definition and Mathematical Properties

The α-ReLU activation is defined as

$$\varphi_\alpha(x) = [\max\{0, x\}]^\alpha, \quad \alpha > 0.$$

Key mathematical properties include:

  • Homogeneity: For any $\lambda \ge 0$,

$$\varphi_\alpha(\lambda x) = \lambda^\alpha \varphi_\alpha(x).$$

  • Smoothness: If $\alpha = k+\gamma$ with $k \in \mathbb{N}_0$, $\gamma \in (0,1)$, then $\varphi_\alpha \in C^{k,\gamma}$ (i.e., $k$-times differentiable, $k$-th derivative $\gamma$-Hölder). For integer $\alpha = k$, $\varphi_\alpha \in C^{k-1,1}$ but not $C^k$.
  • Growth at Infinity: $\varphi_\alpha(x) \sim x^\alpha$ as $x \to \infty$, growing sublinearly if $\alpha < 1$.
  • Special Cases:
    • Standard ReLU: $\alpha = 1$, $\varphi_1(x) = \max\{0, x\}$.
    • Higher-order: integer $\alpha \ge 2$ yields piecewise polynomials.

For variants, such as the two-slope Leaky α-ReLU used in control settings (Samanipour et al., 16 Mar 2026), the function is piecewise linear, with one positive slope on each side of the origin.
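The definition and the homogeneity property can be checked numerically. The following is a minimal sketch in plain Python; the function names `alpha_relu` and `alpha_relu_grad` are illustrative choices, not names taken from the cited papers, and the value returned for the derivative at the kink is a convention chosen here.

```python
import math

def alpha_relu(x: float, alpha: float) -> float:
    """Power ReLU: [max(0, x)]**alpha, for alpha > 0."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return max(0.0, x) ** alpha

def alpha_relu_grad(x: float, alpha: float) -> float:
    """Derivative alpha * x**(alpha - 1) for x > 0, and 0 for x < 0.

    At x = 0 this sketch returns 0 by convention; for alpha <= 1 the
    one-sided derivative from the right does not vanish there.
    """
    if x <= 0:
        return 0.0
    return alpha * x ** (alpha - 1)

# Homogeneity: phi(lam * x) == lam**alpha * phi(x) for lam >= 0.
x, lam, alpha = 0.7, 3.0, 2.5
lhs = alpha_relu(lam * x, alpha)
rhs = lam ** alpha * alpha_relu(x, alpha)
assert math.isclose(lhs, rhs, rel_tol=1e-12)
```

Setting `alpha = 1.0` recovers the standard ReLU, and integer `alpha` gives a piecewise polynomial.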

2. Approximation and Regularity in Shallow α-ReLU Networks

Shallow α-ReLU networks are central to the analysis of PDE solution operators and function approximation in Sobolev/Hölder and Barron-type norms. For the Dirichlet-Laplace (Poisson) problem on half-spaces, solution regularity and approximation rates depend sensitively on $\alpha$ (Vaishampayan et al., 2024, Li et al., 18 May 2026):

  • Fractional $\alpha = k+\gamma$: Network solutions realize fractional Hölder regularity ($C^{k,\gamma}$); the associated Barron norm is compatible with the solution's smoothness.
  • Integer $\alpha = k$: One obtains $C^{k-1,1}$ regularity, i.e. Lipschitz continuity of the $(k-1)$-th derivative, which is one derivative short of $C^k$.
  • Approximation Guarantees: Given boundary data with controlled Barron norm, the solution in the domain and its Monte-Carlo approximation satisfy an explicit error estimate under technical conditions on the problem parameters. For integer $\alpha$, logarithmic penalties in the Barron norm arise due to log-divergences at the boundary (Vaishampayan et al., 2024).

Approximation of general target functions in Barron-type or Sobolev spaces by shallow α-ReLU networks yields rates that depend polynomially or log-polynomially on the network width, the exponent $\alpha$, and the spatial dimension (Li et al., 18 May 2026).
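The idea of a Monte-Carlo approximation by a shallow network can be illustrated on a toy one-dimensional example. The sketch below is not the construction of the cited papers: it assumes a hypothetical target written as an expectation of α-ReLU ridge functions over uniformly distributed biases, for which the integral has a closed form, and replaces the expectation by an equal-weight average over sampled neurons.

```python
import random

def alpha_relu(z: float, alpha: float) -> float:
    return max(0.0, z) ** alpha

def exact_target(x: float, alpha: float) -> float:
    # For 0 <= x <= 1 and b ~ Uniform[0, 1]:
    # E[max(0, x - b)**alpha] = x**(alpha + 1) / (alpha + 1).
    return x ** (alpha + 1) / (alpha + 1)

def monte_carlo_net(x: float, biases: list, alpha: float) -> float:
    # Equal-weight shallow network with len(biases) neurons.
    return sum(alpha_relu(x - b, alpha) for b in biases) / len(biases)

rng = random.Random(0)  # fixed seed for reproducibility
alpha, x = 1.5, 0.8
for m in (100, 10_000):
    biases = [rng.random() for _ in range(m)]
    err = abs(monte_carlo_net(x, biases, alpha) - exact_target(x, alpha))
    print(f"width m={m}: |error| = {err:.5f}")
```

For such an average of independent samples the error typically shrinks like the inverse square root of the width, which is the standard Monte-Carlo rate; the rates in the cited works depend on further assumptions not reproduced here.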

3. Barron, Sobolev, and Spectral Characterizations

The choice of $\alpha$ in α-ReLU directly links the network's functional capacity to analytic regularity scales:

  • Barron Norms: The Barron norm, defined as the infimum of expected weighted coefficients over representations by α-ReLU ridge superpositions, governs the approximation error for PDE boundary data (Vaishampayan et al., 2024).
  • Sobolev Embedding: The associated functional class embeds into one of two regularity classes, depending on whether $\alpha$ is fractional or an integer.
  • Path-Norm Regularization: For finite-width networks, i.e. finite weighted sums of α-ReLU ridge functions, a path-norm of the weights serves as the complexity measure. Under path-norm control, minimax-optimal generalization rates are achieved for regression over Barron and Sobolev (fractional) targets, with exponents determined by the parameters of the target class (Li et al., 18 May 2026).
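A finite-width shallow network and a path-norm can be sketched as follows. The specific forms used here are assumptions, chosen as one common convention and not necessarily the definitions of Li et al.: the network is $f(x) = \sum_i a_i \varphi_\alpha(w_i \cdot x + b_i)$ and the path-norm is $\sum_i |a_i| (\|w_i\|_1 + |b_i|)^\alpha$. With this choice, the homogeneity of $\varphi_\alpha$ makes both quantities invariant under rescaling a neuron's inner weights by $c > 0$ and its outer weight by $c^{-\alpha}$.

```python
def alpha_relu(z: float, alpha: float) -> float:
    return max(0.0, z) ** alpha

def shallow_net(x, a, W, b, alpha):
    """f(x) = sum_i a_i * phi_alpha(w_i . x + b_i) (assumed form)."""
    total = 0.0
    for a_i, w_i, b_i in zip(a, W, b):
        pre = sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) + b_i
        total += a_i * alpha_relu(pre, alpha)
    return total

def path_norm(a, W, b, alpha):
    """sum_i |a_i| * (||w_i||_1 + |b_i|)**alpha (assumed convention)."""
    return sum(
        abs(a_i) * (sum(abs(w) for w in w_i) + abs(b_i)) ** alpha
        for a_i, w_i, b_i in zip(a, W, b)
    )

# Two neurons in two input dimensions, alpha = 2.
a = [1.0, -2.0]
W = [[1.0, 0.0], [0.5, -0.5]]
b = [0.0, 1.0]
alpha = 2.0
x = [1.0, 2.0]

# Rescale inner weights by c and outer weights by c**(-alpha).
c = 2.0
a_s = [a_i / c ** alpha for a_i in a]
W_s = [[c * w for w in w_i] for w_i in W]
b_s = [c * b_i for b_i in b]

print(shallow_net(x, a, W, b, alpha), shallow_net(x, a_s, W_s, b_s, alpha))
print(path_norm(a, W, b, alpha), path_norm(a_s, W_s, b_s, alpha))
```

Both printed pairs agree, which is why a path-norm of this type measures the function represented rather than a particular parameterization of it.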

The critical regularity transition occurs when $\alpha$ crosses an integer: fractional powers yield $C^{k,\gamma}$ for $\alpha = k+\gamma$, while integer exponents only ensure Lipschitz continuity of the $(\alpha-1)$-th derivative.
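This transition can be verified directly by differentiating the power function. The short derivation below uses only the definition of $\varphi_\alpha$; the shorthand $x_+ = \max\{0, x\}$ is introduced here for brevity and is not notation from the cited papers.

```latex
% Fractional case: \alpha = k + \gamma with k \in \mathbb{N}_0, \gamma \in (0,1).
% Differentiating k times leaves the exponent \gamma:
\[
\varphi_\alpha^{(k)}(x) = \Big(\prod_{j=0}^{k-1} (\alpha - j)\Big)\, x_+^{\gamma},
\qquad
\bigl| x_+^{\gamma} - y_+^{\gamma} \bigr| \le |x - y|^{\gamma},
\]
% so the k-th derivative is \gamma-Hölder and \varphi_\alpha \in C^{k,\gamma}.

% Integer case: \alpha = k \ge 1.
\[
\varphi_k^{(k-1)}(x) = k!\, x_+ ,
\qquad
\varphi_k^{(k)}(x) = k!\, \mathbf{1}_{\{x > 0\}} \quad (x \neq 0),
\]
% so the (k-1)-th derivative is Lipschitz with constant k!, while the
% k-th derivative jumps at x = 0: \varphi_k \in C^{k-1,1} but not C^k.
```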

4. α-ReLU in Control Barrier Function Synthesis

For control systems with safe set invariance under polytopic input constraints, α-ReLU functions are used as surrogates for extended class-$\mathcal{K}$ barrier functions (Samanipour et al., 16 Mar 2026):

  • Two-Slope α-ReLU: Parameterized by two positive slopes, one on the nonnegative half-line and one on the negative half-line, ensuring continuity, piecewise differentiability, radial unboundedness, and strict monotonicity.
  • Convexity in Synthesis: The two-slope α-ReLU maintains the linearity of control barrier function (CBF) constraints in linear programming synthesis, facilitating tractable certification of safety properties.
  • Conservatism and UIS Construction: The union of invariant sets (UIS), obtained by max-composing solutions for different slopes, never reduces the certified safe set below that of the optimal linear α; in most cases, it expands it (Samanipour et al., 16 Mar 2026).

This surrogate captures the strength of general class-$\mathcal{K}$ nonlinearities without introducing additional nonconvexity or substantial conservatism in stability certification.
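A two-slope function of this kind, together with a barrier-condition check, might be sketched as follows. The names `two_slope`, `slope_pos`, and `slope_neg` are hypothetical, and the inequality used is the standard control barrier condition $\dot h + \alpha(h) \ge 0$, assumed here as a stand-in for the exact constraint of the cited paper.

```python
def two_slope(h: float, slope_pos: float, slope_neg: float) -> float:
    """Piecewise-linear function with positive slopes on h >= 0 and h < 0."""
    if slope_pos <= 0 or slope_neg <= 0:
        raise ValueError("both slopes must be positive")
    return slope_pos * h if h >= 0 else slope_neg * h

def cbf_condition_holds(h: float, h_dot: float,
                        slope_pos: float, slope_neg: float) -> bool:
    """Check the assumed barrier inequality h_dot + alpha(h) >= 0."""
    return h_dot + two_slope(h, slope_pos, slope_neg) >= 0

# Inside the safe set (h > 0) the barrier value may decrease slowly.
print(cbf_condition_holds(h=2.0, h_dot=-0.5, slope_pos=0.5, slope_neg=3.0))
# Outside (h < 0) a steep negative-side slope demands fast recovery.
print(cbf_condition_holds(h=-1.0, h_dot=2.0, slope_pos=0.5, slope_neg=3.0))
```

For fixed slopes, the inequality is linear in `h_dot` on each branch, which is the property that keeps the synthesis constraints linear.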

5. α-ReLU Variants and Modified Network Architectures

Beyond pointwise power functions, the literature includes sparsifying α–ReLU transforms acting on weights, notably in nonparametric regression (Beknazaryan et al., 2022):

  • Sparsifying α: A piecewise constant-linear function applied entrywise to network weight matrices prior to multiplication and activation, thereby imposing structured sparsity (Beknazaryan et al., 2022).

  • Statistical Rates: With $\ell_1$- or $\ell_2$-penalized empirical risk minimization, sparsified α-ReLU networks achieve, up to log factors, minimax-optimal prediction rates for $\beta$-Hölder regression under sub-Gaussian noise (Beknazaryan et al., 2022).

This approach yields scale-invariance of penalty complexity, bypassing the suboptimal covering behavior of conventional penalized ReLU networks.
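The mechanism of transforming weights entrywise before the matrix product can be sketched as below. The transform used here is a hypothetical stand-in (a soft threshold that is zero on $[-\tau, \tau]$ and linear outside), not the specific function of Beknazaryan and Sang; the names `sparsify`, `tau`, and `modified_layer` are likewise illustrative.

```python
def sparsify(w: float, tau: float) -> float:
    """Hypothetical stand-in transform: 0 on [-tau, tau], linear outside."""
    if w > tau:
        return w - tau
    if w < -tau:
        return w + tau
    return 0.0

def relu(z: float) -> float:
    return max(0.0, z)

def modified_layer(x, W, b, tau):
    """One ReLU layer whose weights are transformed entrywise first."""
    W_mod = [[sparsify(w, tau) for w in row] for row in W]
    return [
        relu(sum(w * x_j for w, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(W_mod, b)
    ]

W = [[0.5, 0.1], [-1.0, 0.2]]
b = [0.0, 0.5]
print(modified_layer([1.0, 2.0], W, b, tau=0.25))
```

In this example the small entries 0.1 and 0.2 are mapped to zero before the product, so the effective weight matrix is sparse even though the stored parameters are not.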

6. Practical Considerations and Trade-offs

Selection of $\alpha$ in ReLU$^\alpha$ activations is a design choice balancing analytic regularity, approximation power, and computational tractability:

  • Regularity Requirements: Applications like PINNs or strong-form PDE solvers may require classical differentiability up to a given order, which motivates a sufficiently large integer $\alpha$.
  • Computational Cost: For non-integer and especially irrational $\alpha$, the cost of evaluating the power $[\max\{0, x\}]^\alpha$ can be significant.
  • Barron-Norm Growth: For integer values, an unavoidable logarithmic penalty emerges in the Barron norm near boundaries, impacting the efficiency of representation (Vaishampayan et al., 2024).
  • Parameter Interpretability: In two-slope Leaky α-ReLU barrier constructions, tuning the two slopes modulates the aggressiveness of barrier enforcement for positive and negative violations (Samanipour et al., 16 Mar 2026).

A plausible implication is that the α-ReLU family enables granular matching of network expressivity to analytic and application-driven demands, but judicious tuning is required to balance all competing considerations.

7. Comparative Summary

| Variant | Mathematical Formulation | Main Use/Result |
|---|---|---|
| Standard Power α-ReLU | $\varphi_\alpha(x) = [\max\{0, x\}]^\alpha$ | PDE solvers, approximation with controlled regularity (Vaishampayan et al., 2024, Li et al., 18 May 2026) |
| Two-slope Leaky α-ReLU | Piecewise linear with two positive slopes | Barrier certification under control saturation (Samanipour et al., 16 Mar 2026) |
| Sparsifying α-ReLU | Piecewise constant-linear on weights | Sparse nonparametric regression at minimax rates (Beknazaryan et al., 2022) |

References

  • Vaishampayan and Wojtowytsch, "Solving the Poisson Equation with Dirichlet data by shallow ReLU$^\alpha$-networks" (Vaishampayan et al., 2024).
  • Beknazaryan and Sang, "Nonparametric regression with modified ReLU networks" (Beknazaryan et al., 2022).
  • Li, Liu, and Shi, "Shallow ReLU$^\alpha$ Networks in […]-Type and Sobolev Spaces: Approximation and Path-Norm Controlled Generalization" (Li et al., 18 May 2026).
  • Samanipour et al., "ReLU Barrier Functions for Nonlinear Systems with Constrained Control: A Union of Invariant Sets Approach" (Samanipour et al., 16 Mar 2026).
