On Expressivity of Height in Neural Networks (2305.07037v2)
Abstract: In this work, beyond width and depth, we augment neural networks with a new dimension called height by intra-linking neurons within the same layer to create an intra-layer hierarchy. We call a neural network characterized by width, depth, and height a 3D network. To put a 3D network in perspective, we theoretically and empirically investigate the expressivity of height. We show via bound estimation and explicit construction that, given the same number of neurons and parameters, a 3D ReLU network of width $W$, depth $K$, and height $H$ has greater expressive power than a 2D network of width $H\times W$ and depth $K$, \textit{i.e.}, $\mathcal{O}(((2H-1)W)^K)$ vs $\mathcal{O}((HW)^K)$, in terms of generating more pieces in a piecewise linear function. Next, through approximation rate analysis, we show that by introducing intra-layer links into networks, a ReLU network of width $\mathcal{O}(W)$ and depth $\mathcal{O}(K)$ can approximate polynomials on $[0,1]^d$ with error $\mathcal{O}\left(2^{-2WK}\right)$, which improves on the rates $\mathcal{O}\left(W^{-K}\right)$ and $\mathcal{O}\left(2^{-K}\right)$ for fixed-width networks. Lastly, numerical experiments on 5 synthetic datasets, 15 tabular datasets, and 3 image benchmarks verify that 3D networks can deliver competitive regression and classification performance.
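The piece-counting claim can be illustrated numerically. The sketch below is not the paper's construction: it assumes only that intra-layer links let a neuron read earlier neurons in the same layer, and uses the classic triangle map $T(z) = 2\,\mathrm{relu}(z) - 4\,\mathrm{relu}(z - 0.5)$, which an intra-linked layer can compose with itself within a single layer. Six intra-linked neurons (three chained triangle stages) then produce $2^3 = 8$ linear pieces on $[0,1]$, whereas a plain one-hidden-layer ReLU network with six neurons produces at most $6 + 1 = 7$ pieces.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def intra_linked_layer(x, stages):
    # One hidden layer of 2*stages ReLU neurons with intra-layer links:
    # each pair of neurons reads a linear combination of the previous pair,
    # realizing one more application of the triangle map
    # T(z) = 2*relu(z) - 4*relu(z - 0.5), valid for inputs in [0, 1].
    z = x
    for _ in range(stages):
        a, b = relu(z), relu(z - 0.5)
        z = 2.0 * a - 4.0 * b
    return z

def count_linear_pieces(xs, ys, tol=1e-4):
    # Count linear pieces of a piecewise linear function sampled on a grid
    # by counting slope changes between consecutive intervals.
    slopes = np.diff(ys) / np.diff(xs)
    return 1 + int(np.sum(np.abs(np.diff(slopes)) > tol))

# 2**12 + 1 dyadic grid points so every breakpoint k/8 lands exactly on the grid.
xs = np.linspace(0.0, 1.0, 4097)
pieces = count_linear_pieces(xs, intra_linked_layer(xs, 3))
print(pieces)  # -> 8 pieces from 6 intra-linked neurons (a plain layer of 6 gives at most 7)
```

Each extra chained stage doubles the piece count while adding only two neurons, matching the qualitative gap between the $\mathcal{O}(((2H-1)W)^K)$ and $\mathcal{O}((HW)^K)$ bounds quoted in the abstract.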