
Elementary Universal Activation Function

Updated 23 September 2025
  • EUAF is a class of elementary activation functions that guarantee universal approximation by meeting strict non-polynomial and continuity criteria.
  • It leverages harmonic analysis, ridgelet transforms, and parametrized adaptivity to enable dense function approximation on both compact and non-compact domains.
  • EUAF methodologies promote architectural parsimony and practical adaptability, impacting tasks from vision to dynamic system modeling.

An Elementary Universal Activation Function (EUAF) is a class of activation functions for neural networks that, despite their elementary structure, confer the universal approximation property on the networks that use them, in both theory and practice. The universality of EUAFs refers to the capacity of a neural network employing such a function (or a parametrized version thereof) to approximate any target function from broad classes (e.g., continuous, $L^p$, Sobolev) on compact or even non-compact domains, arbitrarily well, given sufficient expressivity in terms of weights, biases, or composition depth. The study and characterization of EUAFs spans harmonic analysis, approximation theory, dynamical systems, and practical machine learning, centering on the algebraic, analytic, and geometric properties that guarantee density in function spaces and enable efficient signal propagation.

1. Mathematical Foundations and Universality Criteria

The foundational criterion for universality in activation functions derives from results such as the Cybenko theorem and its many generalizations (Sonoda et al., 2015; Neufeld et al., 18 Oct 2024; Shin et al., 10 Apr 2025). Specifically, an activation function $\sigma:\mathbb{R}\to\mathbb{R}$ is universal if:

  • It is non-polynomial and continuous (classical scalar case).
  • Its derivatives (possibly up to a certain order) are integrable and bounded, forming a neural network approximate identity (nAI) (Bui-Thanh, 2021).
  • It satisfies certain admissibility conditions in a generalized harmonic-analysis framework, such as reconstructivity in the ridgelet transform construction, i.e., for admissible $\psi$, $\eta$:

$$K_{(\psi,\eta)} = (2\pi)^{m-1} \int_{\mathbb{R}\setminus\{0\}} \frac{\overline{\hat\psi(\zeta)}\,\hat\eta(\zeta)}{|\zeta|^m}\, d\zeta \neq 0$$

  • For unbounded activations (e.g., ReLU), universality is maintained under Lizorkin distribution regularity, provided the admissibility condition above holds (Sonoda et al., 2015).

The EUAF framework encompasses both fixed and parametrized functions, including smooth, piecewise analytic, periodically oscillatory, and even certain non-smooth constructs—subject to these criteria.
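
To make the admissibility condition concrete, the sketch below numerically evaluates $K_{(\psi,\eta)}$ for the self-dual pair $\psi = \eta =$ first derivative of a Gaussian, in the scalar case $m = 1$. The choice of pair and the Fourier transform convention are illustrative assumptions, not taken from the cited papers.

```python
import numpy as np

# Numerical check of the ridgelet admissibility constant K_{(psi, eta)} for the
# self-dual pair psi = eta = first derivative of a Gaussian, scalar case m = 1.
# Assumed Fourier convention: psi_hat(z) = \int psi(t) e^{-izt} dt, under which
# psi_hat(z) = i * z * sqrt(2*pi) * exp(-z**2 / 2).

m = 1
z = np.linspace(1e-6, 40.0, 400_000)              # positive half-line; integrand is even
psi_hat_sq = 2.0 * np.pi * z**2 * np.exp(-z**2)   # |psi_hat(z)|^2 = conj(psi_hat) * eta_hat
K = (2.0 * np.pi) ** (m - 1) * 2.0 * np.trapz(psi_hat_sq / np.abs(z) ** m, z)

print(f"K = {K:.6f} (analytic value: 2*pi = {2 * np.pi:.6f})")  # nonzero => admissible
```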

2. Squashable and Superexpressive Activation Functions

The concept of "squashable" activation functions is pivotal; a function $\sigma$ is squashable if, via composition with affine mappings, it can approximate both the identity and step functions on any compact set (Shin et al., 10 Apr 2025). Explicitly (a numerical illustration follows the table below):

  • Condition 1 (Identity): Continuous differentiability and nonzero derivative at some $z\in\mathbb{R}$.
  • Condition 2 (Step): Existence of a width-1 $\sigma$-network approximating the binary threshold outside any small region.

Table: Classes of Activation Functions Satisfying Squashability

| Class | Criteria Satisfied | Example Functions |
| --- | --- | --- |
| Non-affine analytic | Both identity and step function (via composition) | Sigmoid, tanh, sine, exp |
| Piecewise, with nonzero derivatives | Step via kink, identity via local invertibility | Leaky-ReLU, h-swish |
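
A minimal NumPy sketch of the two conditions, using the logistic sigmoid as one concrete squashable function from the table; the scale parameters `eps` and `k` are illustrative choices:

```python
import numpy as np
from scipy.special import expit as sigmoid  # numerically stable logistic function

# Condition 1 (identity): sigmoid is C^1 with sigmoid'(0) = 1/4 != 0, so the
# affine rescaling (sigmoid(eps*x) - sigmoid(0)) / (eps/4) converges to x.
eps = 1e-3
x = np.linspace(-1.0, 1.0, 1001)
approx_id = (sigmoid(eps * x) - 0.5) / (0.25 * eps)
print("identity error:", np.max(np.abs(approx_id - x)))  # -> 0 as eps -> 0

# Condition 2 (step): sigmoid(k*x) converges to the binary threshold as
# k -> infinity, uniformly outside any small region (-delta, delta).
k, delta = 1e4, 0.01
xs = x[np.abs(x) >= delta]
print("step error:", np.max(np.abs(sigmoid(k * xs) - (xs > 0))))
```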

Superexpressive families (e.g., $\{\sin, \arcsin\}$) provide a fixed-width architecture, independent of target-function complexity, capable of universal approximation (Yarotsky, 2021; Wang et al., 12 Jul 2024). Periodic and inverse trigonometric functions facilitate dense coding via irrational-winding arguments, in contrast to Pfaffian or piecewise-polynomial activations, which lack sufficient oscillatory complexity for such properties.

3. Construction via Harmonic Analysis: Ridgelet, Radon, and Backprojection

For unbounded or Lizorkin activation functions, universality is established constructively through ridgelet transform theory (Sonoda et al., 2015):

  • The ridgelet transform $R_\psi f(a, b)$ analyzes $f$ along hyperplane slices.
  • Its inversion employs admissible pairs $(\psi, \eta)$ and reconstructs via the dual ridgelet transform:

$$f(x) = R_\eta R_\psi f(x)$$

  • The backprojection filter (Radon inversion) is interpreted as what the network "learns" after backpropagation; Parseval's relation establishes energy conservation:

$$\langle R_\psi f, R_\eta g \rangle = \langle f, g \rangle$$

Activation function choice determines the functional behavior in both analysis (ridgelet) and synthesis (dual transform) steps—admissibility is then a precise regularity and frequency condition.
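
The role of the admissibility constant can be made explicit. In the notation above (with $R_\eta$ denoting the dual/synthesis transform), the standard reconstruction argument shows that analysis followed by synthesis reproduces the target up to the constant factor:

$$R_\eta R_\psi f = K_{(\psi,\eta)}\, f,$$

so the exact reconstruction $f = R_\eta R_\psi f$ quoted above holds precisely when the admissible pair $(\psi, \eta)$ is normalized so that $K_{(\psi,\eta)} = 1$.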

4. Parametric and Adaptive EUAFs

Modern practice introduces parametrized activation functions (e.g., Universal Activation Function (UAF), Parametric Elementary Universal Activation Function (PEUAF)) (Yuen et al., 2020, Wang et al., 12 Jul 2024):

  • UAF: $f_\mathrm{UAF}(x) = \ln\left(1 + e^{A(x+B)+C x^2}\right) - \ln\left(1 + e^{D(x-B)}\right) + E$
  • PEUAF (triangle-wave + analytic tail):

$$\mathrm{PEUAF}(x) = \begin{cases} \left| wx - 2\left\lfloor (wx + 1)/2 \right\rfloor \right|, & x \geq 0 \\ \dfrac{x}{1+|x|}, & x < 0 \end{cases}$$

  • These functions morph among known forms (identity, sigmoid, ReLU, Mish, etc.) as their parameters adapt during gradient-based training, so the nonlinearity itself is optimized for the task at hand.

In practical tasks (CIFAR-10, gas quantification, reinforcement learning), networks using UAFs/PEUAFs evolve their parameters to match nearly optimal fixed activations or discover new ones, demonstrating empirical universality and adaptability.
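
The two formulas transcribe directly into NumPy. The sketch below is illustrative (parameter names follow the equations above); the softplus reduction checked at the end follows from the UAF formula itself with $A=1$, $B=C=D=0$, $E=\ln 2$.

```python
import numpy as np

def uaf(x, A=1.0, B=0.0, C=0.0, D=0.0, E=0.0):
    """UAF as defined above; np.logaddexp(0, t) computes ln(1 + e^t) stably."""
    return (np.logaddexp(0.0, A * (x + B) + C * x**2)
            - np.logaddexp(0.0, D * (x - B)) + E)

def peuaf(x, w=1.0):
    """PEUAF: triangle wave of period 2/w for x >= 0, softsign tail for x < 0."""
    tri = np.abs(w * x - 2.0 * np.floor((w * x + 1.0) / 2.0))
    return np.where(x >= 0.0, tri, x / (1.0 + np.abs(x)))

x = np.linspace(-3.0, 3.0, 601)
# With A=1, B=C=D=0, E=ln 2 the UAF collapses to softplus:
# ln(1+e^x) - ln(1+e^0) + ln 2 = ln(1+e^x).
print(np.allclose(uaf(x, E=np.log(2.0)), np.logaddexp(0.0, x)))  # True
```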

5. Minimum Width and Architectural Parsimony

The theoretical bound for minimum width in networks using EUAFs is established as $w = \max\{d_x, d_y, 2\}$ for input dimension $d_x$ and output dimension $d_y$ (Shin et al., 10 Apr 2025). For monotone activation functions and $d_x = d_y = 1$, $w = 2$ is both necessary and sufficient.

Table: Minimum Width for Universal Approximation with EUAF

| $d_x$ | $d_y$ | Monotone $\sigma$ | Min. Width $w$ |
| --- | --- | --- | --- |
| 1 | 1 | Yes | 2 |
| ≥ 2 | Any | Yes/No | $\max\{d_x, d_y, 2\}$ |
| Any | ≥ 2 | Yes/No | $\max\{d_x, d_y, 2\}$ |

This establishes the parsimonious architectural regimes in which EUAF-based networks maintain universality.
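
As an empirical illustration of the $d_x = d_y = 1$ regime, a deep width-2 network with a monotone squashable activation (tanh) can be fit to a one-dimensional target. This PyTorch sketch is illustrative only; it is not the constructive proof from Shin et al. (10 Apr 2025), and the depth and step counts are arbitrary choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Width w = max{d_x, d_y, 2} = 2 for a scalar-to-scalar map; depth supplies
# the remaining expressivity. tanh is monotone and squashable.
depth, width = 8, 2
layers = [nn.Linear(1, width), nn.Tanh()]
for _ in range(depth - 1):
    layers += [nn.Linear(width, width), nn.Tanh()]
layers += [nn.Linear(width, 1)]
net = nn.Sequential(*layers)

x = torch.linspace(-3.0, 3.0, 256).unsqueeze(1)
y = torch.sin(x)  # one-dimensional target to approximate
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(3000):
    opt.zero_grad()
    loss = ((net(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
print(f"final MSE: {loss.item():.2e}")
```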

6. Extensions: Refinability, Structural Manipulations, and Universal Domains

Beyond standard universality:

  • Refinable activation functions (e.g., spline-based) allow splitting neurons and inserting layers without changing the network's output, leveraging subdivision theory via two-scale or refinement equations (López-Ureña, 16 Oct 2024); see the sketch after this list.
  • EUAFs ensure universal approximation over non-compact domains (weighted $C^k$ or Sobolev spaces) as long as the activation is non-polynomial and meets certain growth and regularity constraints (Neufeld et al., 18 Oct 2024).
  • In ODENet and ResNet architectures, a single non-polynomial, Lipschitz-continuous EUAF suffices for universal approximation of continuous dynamical mappings, with the function class and discretization error controlled robustly (Kimura et al., 22 Oct 2024).
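
As a concrete instance of a refinement equation, the linear B-spline ("hat") function satisfies a two-scale relation that can be checked numerically. The sketch below is illustrative, with the hat function assumed as the refinable activation; López-Ureña (16 Oct 2024) treats general refinable functions.

```python
import numpy as np

# Two-scale (refinement) relation for the linear B-spline ("hat") function
#   B(x) = 0.5*B(2x) + B(2x - 1) + 0.5*B(2x - 2),
# with B(x) = max(0, 1 - |x - 1|) supported on [0, 2]. An identity of this
# form is what allows a neuron computing B(x) to be replaced by finer-scale
# neurons (or an inserted layer) without changing the network's output.

def hat(x):
    return np.maximum(0.0, 1.0 - np.abs(x - 1.0))

x = np.linspace(-1.0, 3.0, 4001)
residual = hat(x) - (0.5 * hat(2 * x) + hat(2 * x - 1) + 0.5 * hat(2 * x - 2))
print("max |LHS - RHS|:", np.max(np.abs(residual)))  # 0 up to float rounding
```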

7. Practical Implications, Optimization, and Future Directions

The EUAF paradigm provides:

  • A general framework for activation function search, including entropy-based optimization (EAFO), with explicit correction schemes to reduce information entropy and improve robustness (e.g., CRReLU) (Sun et al., 19 May 2024).
  • Constructive guidance for architecture design, including explicit formulas for neuron number, scaling, weights, and non-asymptotic error rates (Bui-Thanh, 2021).
  • Adaptation to equivariant and domain-specific architectures (e.g., unitary equivariant networks) using generalized invariant scalar functionals fused with any standard EUAF (Ma, 17 Nov 2024).
  • Opportunities for further exploration of superexpressiveness, parametric adaptivity, and composite or hybrid activations, as indicated by the empirical success and theoretical density results obtained for periodic, spline-based, and analytic forms.

Summary

EUAF research demonstrates that elementary, yet fundamentally robust and mathematically sound, activation functions support the full expressive power of neural network architectures. This has yielded both deeper theoretical insight and improved practical adaptability, marking EUAFs as central objects in the study and design of advanced neural systems.
