
Unimodality of minimal width vectors for filling architectures

Determine whether, for fixed $L$, $d_0$, $d_L$ and activation degree $r$, every width vector $d$ that is minimal (with respect to the coordinatewise order) among those for which the architecture $\mathcal{V}_{d,r} = (\mathrm{Sym}_{r^{L-1}}(\mathbb{R}^{d_0}))^{d_L}$ is filling is unimodal, i.e. weakly increasing up to some index and then weakly decreasing.
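The unimodality condition in the statement is purely combinatorial and easy to test directly. The sketch below (the function name `is_unimodal` is illustrative, not from the source) checks whether a width vector $(d_0,\dots,d_L)$ is weakly increasing up to some index $i$ and weakly decreasing afterwards:

```python
def is_unimodal(d):
    """Return True if d = (d_0, ..., d_L) is weakly increasing up to
    some index i and weakly decreasing from i onwards."""
    L = len(d) - 1
    for i in range(L + 1):
        rises = all(d[j] <= d[j + 1] for j in range(i))
        falls = all(d[j] >= d[j + 1] for j in range(i, L))
        if rises and falls:
            return True
    return False
```

For example, the "expand then contract" profile `(2, 5, 7, 4, 1)` is unimodal, while `(3, 1, 4, 1)` is not; monotone vectors are unimodal as a degenerate case (take $i = 0$ or $i = L$).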


Background

The authors discuss architectural design and propose a conjecture—originating from prior work—that minimal width profiles achieving maximal expressivity (filling neurovarieties) are structurally unimodal. This aligns with common machine learning heuristics that networks should expand then contract to capture and refine features.

They bring this conjecture into the present framework of polynomial neural networks, relating architectural patterns to expressivity characterized by the dimension and filling property of neurovarieties.

References

The following conjecture suggests that a unimodal distribution of layer widths within a neural network is efficient. Fix $L, d_0, d_L$ and $r$; any minimal (with respect to $\preccurlyeq$) vector of widths $d=(d_0,d_1,\dots,d_L)$ such that the architecture $\mathcal{V}_{d,r}$ is filling, is unimodal, i.e.\ there exists $i\in\{0,1,\dots,L\}$ such that $(d_0,\dots,d_i)$ is weakly increasing and $(d_i,\dots,d_L)$ is weakly decreasing.

Geometry of Polynomial Neural Networks (2402.00949 - Kubjas et al., 1 Feb 2024) in Section 5.1 (A plethora of conjectures)