Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 71 tok/s
Gemini 2.5 Pro 48 tok/s Pro
GPT-5 Medium 12 tok/s Pro
GPT-5 High 21 tok/s Pro
GPT-4o 81 tok/s Pro
Kimi K2 231 tok/s Pro
GPT OSS 120B 435 tok/s Pro
Claude Sonnet 4 33 tok/s Pro
2000 character limit reached

Minimum width for universal approximation using squashable activation functions (2504.07371v1)

Published 10 Apr 2025 in cs.LG

Abstract: The exact minimum width that allows for universal approximation of unbounded-depth networks is known only for ReLU and its variants. In this work, we study the minimum width of networks using general activation functions. Specifically, we focus on squashable functions that can approximate the identity function and binary step function by alternatively composing with affine transformations. We show that for networks using a squashable activation function to universally approximate $Lp$ functions from $[0,1]{d_x}$ to $\mathbb R{d_y}$, the minimum width is $\max{d_x,d_y,2}$ unless $d_x=d_y=1$; the same bound holds for $d_x=d_y=1$ if the activation function is monotone. We then provide sufficient conditions for squashability and show that all non-affine analytic functions and a class of piecewise functions are squashable, i.e., our minimum width result holds for those general classes of activation functions.

Summary

We haven't generated a summary for this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.