Rectified Linear Complexity in ReLU Networks
- Rectified Linear Complexity is a metric that quantifies how the interplay of depth and width in ReLU networks governs the creation of affine (piecewise linear) regions.
- The analysis reveals that increased depth exponentially multiplies affine segments, thereby enhancing expressivity while imposing computational challenges.
- Theoretical results, including depth–size gap theorems and zonotope-based lower bounds, underscore the need for deep architectures to efficiently approximate complex functions.
Rectified Linear Complexity denotes the interplay among depth, width, and the number of affine (piecewise linear) regions in functions realized by deep neural networks with rectified linear units (ReLU-DNNs). It quantifies the expressivity of ReLU networks by measuring how their architecture governs the partitioning of input space into regions where the computed function is affine. The notion synthesizes structural and functional complexity of ReLU-DNNs and establishes formal lower bounds relating network architecture to function representation and training complexity (Arora et al., 2016).
1. Function Class and Complexity Measures
A ReLU-DNN with input dimension $n$, output dimension $m$, and $k$ hidden layers of widths $w_1, \dots, w_k$ implements functions:
$$
f = A_{k+1} \circ \sigma \circ A_k \circ \cdots \circ \sigma \circ A_1,
$$
where $A_i : \mathbb{R}^{w_{i-1}} \to \mathbb{R}^{w_i}$ (with $w_0 = n$, $w_{k+1} = m$) is affine for $i = 1, \dots, k$, $A_{k+1}$ is linear, and $\sigma$ applies coordinate-wise as $\sigma(t) = \max(0, t)$.
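As a concrete sketch, the composition above is a few lines of NumPy; the weight shapes and values below are arbitrary illustrations, not from the source:

```python
import numpy as np

def relu_dnn(x, weights, biases):
    """Evaluate f = A_{k+1} ∘ σ ∘ A_k ∘ ... ∘ σ ∘ A_1 at x.

    weights/biases define the affine maps A_i; σ = max(0, ·) is applied
    coordinate-wise after every layer except the last.
    """
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, W @ h + b)    # hidden layer: affine map, then ReLU
    W, b = weights[-1], biases[-1]
    return W @ h + b                      # output layer: affine only

# A depth-3 net (k = 2 hidden layers) from R^2 to R^1 with widths (3, 3).
rng = np.random.default_rng(0)
weights = [rng.standard_normal(s) for s in [(3, 2), (3, 3), (1, 3)]]
biases = [rng.standard_normal(s) for s in [3, 3, 1]]
y = relu_dnn(np.array([0.5, -1.0]), weights, biases)
```

The function computed is continuous and piecewise linear in the input, as the next paragraph makes precise.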
Key structural and functional measures:
| Term | Definition | Notation |
|---|---|---|
| Depth | Total number of layers, including output | $k+1$ |
| Width | Maximum hidden-layer width | $w = \max_{1 \le i \le k} w_i$ |
| Size | Total number of hidden units across layers | $s = w_1 + \cdots + w_k$ |
| Affine Pieces | Maximal connected regions on which $f$ is affine | $p(f)$ (number of PWL regions) |
Any ReLU-DNN computes a continuous piecewise linear (PWL) function. Conversely, every PWL function $\mathbb{R}^n \to \mathbb{R}$ can be represented by a ReLU-DNN of depth at most $\lceil \log_2(n+1) \rceil + 1$. The number of affine regions $p(f)$, i.e., the cardinality of maximal connected input regions mapped affinely, serves as a fundamental complexity metric.
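For 1D networks, $p(f)$ can be estimated empirically by sampling on a fine grid and counting slope changes; a minimal sketch (the helper `count_pieces` is our own device, not from the source):

```python
import numpy as np

def count_pieces(f, lo=-2.0, hi=2.0, n=4001, tol=1e-3):
    """Estimate the number of affine pieces of a PWL function R -> R by
    locating slope changes on a fine grid.  A breakpoint falling strictly
    inside a grid cell flags two consecutive cells, so runs of flagged
    positions are counted rather than individual flags."""
    xs = np.linspace(lo, hi, n)
    ys = f(xs)
    slopes = np.diff(ys) / np.diff(xs)
    flags = np.abs(np.diff(slopes)) > tol
    starts = flags & ~np.r_[False, flags[:-1]]   # start of each flagged run
    return int(np.sum(starts)) + 1

# |x| as a depth-2 ReLU net of size 2: |x| = max(0, x) + max(0, -x).
abs_net = lambda x: np.maximum(0, x) + np.maximum(0, -x)
print(count_pieces(abs_net))   # 2 affine pieces
```

Here $|x|$, a PWL function with two pieces, is written exactly as a shallow ReLU net, illustrating the representation direction of the equivalence.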
2. Global Optimization for One Hidden Layer
Empirical risk minimization over ReLU networks with one hidden layer of width $w$ and convex loss $\ell$ can be globally optimized as:
$$
\min_{\{a_i, b_i, c_i\}} \; \frac{1}{D} \sum_{j=1}^{D} \ell\!\left( \sum_{i=1}^{w} c_i \max(0,\, a_i \cdot x_j + b_i),\; y_j \right)
$$
over a sample $\{(x_j, y_j)\}_{j=1}^{D} \subset \mathbb{R}^d \times \mathbb{R}$.
A globally optimal algorithm proceeds via:
- Writing each hidden unit's contribution as $c_i \max(0, a_i \cdot x + b_i)$, making the output sign $s_i = \operatorname{sign}(c_i) \in \{-1, +1\}$ explicit.
- Partitioning the data by the sign of $a_i \cdot x_j + b_i$ for all $j = 1, \dots, D$.
- Enumerating all $2^w$ sign choices and all hyperplane partitions of the $D$ data points in $\mathbb{R}^d$, with possible count $O(2^w D^{dw})$.
- For each choice, solving the induced convex program in the parameters $(a_i, b_i, c_i)$.
Total runtime:
$$
O\!\left( 2^w D^{dw} \,\mathrm{poly}(D, d, w) \right).
$$
This is polynomial in the sample size $D$ for fixed $d$ and $w$, but exponential in $d$ and $w$, matching known computational hardness bounds.
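The enumeration scheme can be made concrete in the smallest case $d = 1$, $w = 1$ with squared loss, where hyperplane partitions of sorted 1D data are just prefixes and suffixes. A sketch under those assumptions (the function name is ours, and feasibility of each unconstrained least-squares solution is checked with a tolerance rather than solving the constrained program):

```python
import numpy as np
from itertools import product

def fit_relu_unit(xs, ys, tol=1e-9):
    """Globally minimize sum_j (c*max(0, a*x_j + b) - y_j)^2 for a single
    ReLU unit (w = 1) in one dimension (d = 1).  Enumerates the 2^w output
    signs and the O(D) hyperplane partitions (prefixes/suffixes of the
    sorted data), then solves least squares on each active side --
    2 * O(D) patterns in total, matching the O(2^w D^{dw}) count."""
    order = np.argsort(xs)
    xs, ys = xs[order], ys[order]
    n = len(xs)
    best = (np.sum(ys ** 2), 0.0, 0.0)         # all-inactive pattern: f == 0
    actives = [np.arange(i, n) for i in range(n)] + \
              [np.arange(0, i + 1) for i in range(n)]
    for S, s in product(actives, (+1.0, -1.0)):
        A = np.stack([xs[S], np.ones(len(S))], axis=1)
        (u, v), *_ = np.linalg.lstsq(A, ys[S], rcond=None)
        mask = np.zeros(n, dtype=bool)
        mask[S] = True
        g = u * xs + v                         # g(x) = c * (a*x + b)
        # Consistency: the sign of a*x_j + b must match the partition.
        if np.all(s * g[mask] >= -tol) and np.all(s * g[~mask] <= tol):
            loss = np.sum((g[mask] - ys[mask]) ** 2) + np.sum(ys[~mask] ** 2)
            if loss < best[0]:
                best = (loss, u, v)
    return best

xs = np.linspace(-1.0, 1.0, 21)
ys = np.maximum(0.0, xs - 0.45)                # realizable target
loss, u, v = fit_relu_unit(xs, ys)
```

For this realizable target the search recovers the generating unit ($u \approx 1$, $v \approx -0.45$) with zero loss; the same enumerate-then-solve idea yields the $O(2^w D^{dw})$ runtime for general $d$ and $w$.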
3. Depth–Size Gap Theorems
The expressivity of ReLU-DNNs grows rapidly with increased depth compared to width or overall size. For integers $k \ge 1$, $w \ge 2$, there exists a 1D function $f : \mathbb{R} \to \mathbb{R}$ such that:
- A $(k+1)$-layer ReLU net of width $w$ (size $wk$) represents $f$, with $w^k$ affine pieces.
- Any representation of $f$ by a shallower $(k'+1)$-layer net with $k' \le k$ incurs a lower bound on required size:
$$
s \;\ge\; \frac{1}{2}\, k' \, w^{k/k'} - 1.
$$
Furthermore, for every $k \ge 1$, there exists a member of a smoothly-parameterized family of hard functions $H_a$ (parameterized by $a = (a_1, \dots, a_k)$ with each $a_i \in (0,1)$, where $H_a = h_{a_k} \circ \cdots \circ h_{a_1}$ composes 2-piece tent maps $h_{a_i}$):
- $H_a$ is realized by a depth-$(k+1)$ net of size $2k$.
- Any depth-$(k'+1)$ ReLU net computing $H_a$ with $k' \le k$ requires size at least:
$$
\frac{1}{2}\, k' \, 2^{k/k'} - 1.
$$
The construction uses composed sawtooth functions: each composition multiplies the number of affine segments, so the count grows exponentially in depth.
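This amplification is easy to verify numerically: composing a 2-piece tent map (two ReLUs per layer) $k$ times produces $2^k$ affine pieces. A sketch, with a grid-based piece counter of our own:

```python
import numpy as np

def tent(x):
    """2-piece tent map on [0, 1], built from two ReLUs:
    tent(x) = 2*max(0, x) - 4*max(0, x - 0.5)."""
    return 2 * np.maximum(0, x) - 4 * np.maximum(0, x - 0.5)

def count_pieces(xs, ys, tol=1e-3):
    """Count affine pieces as 1 + the number of runs of slope changes."""
    slopes = np.diff(ys) / np.diff(xs)
    flags = np.abs(np.diff(slopes)) > tol
    starts = flags & ~np.r_[False, flags[:-1]]
    return int(np.sum(starts)) + 1

k = 6
xs = np.linspace(0.0, 1.0, 2 ** 16 + 1)   # dyadic grid: breakpoints land on it
ys = xs
for _ in range(k):                         # depth-(k+1) net of size 2k
    ys = tent(ys)
print(count_pieces(xs, ys))                # 2**6 = 64 pieces
```

Each composition doubles the piece count, while the network only grows by two units per layer.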
4. Lower Bounds via Zonotope Constructions
A new lower bound for affine region count in ReLU-DNNs is established via the theory of zonotopes. For vectors $b_1, \dots, b_m \in \mathbb{R}^n$, the zonotope is:
$$
Z(b_1, \dots, b_m) = \left\{ \sum_{i=1}^{m} \lambda_i b_i \;:\; \lambda_i \in [-1, 1] \right\},
$$
and its support function:
$$
\gamma_Z(x) = \max_{y \in Z} \langle x, y \rangle = \sum_{i=1}^{m} \left| \langle b_i, x \rangle \right|.
$$
For $b_1, \dots, b_m$ in general position, $\gamma_Z$ has
$$
2 \sum_{i=0}^{n-1} \binom{m-1}{i}
$$
distinct affine pieces, one per vertex of $Z$. $\gamma_Z$ can be implemented by a two-layer ReLU net of size $2m$, since $|\langle b_i, x\rangle| = \max(0, \langle b_i, x\rangle) + \max(0, -\langle b_i, x\rangle)$.
Composition with a $k$-fold sawtooth map of width $w$ yields a ReLU net of depth $k+2$, size $2m + wk$, and number of affine segments at least:
$$
\left( 2 \sum_{i=0}^{n-1} \binom{m-1}{i} \right) w^{k}.
$$
Asymptotically, for fixed input dimension $n$, $\sum_{i=0}^{n-1} \binom{m-1}{i} = \Theta(m^{n-1})$, so the piece count grows as $\Theta\!\left(m^{n-1} w^{k}\right)$. For depth $k'+1$ with $k' \le k$, matching this piece count requires size at least on the order of $k' \left( m^{n-1} w^{k} \right)^{1/k'}$.
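The vertex-count formula can be checked directly in $\mathbb{R}^2$, where the affine pieces of $\gamma_Z$ are cones distinguished by the sign pattern of $\langle b_i, x\rangle$. A sketch (the specific generator angles are our own illustrative choice):

```python
import numpy as np
from itertools import product
from math import comb

# m = 5 generators in general position in R^2 (fixed angles for the sketch).
angles = np.array([0.1, 0.5, 0.9, 1.7, 2.3])
B = np.stack([np.cos(angles), np.sin(angles)], axis=1)
m, n = B.shape

def gamma(x):
    """Support function of Z(b_1,...,b_m): sum_i |<b_i, x>|, realizable as
    a two-layer ReLU net of size 2m via |t| = max(0, t) + max(0, -t)."""
    z = B @ x
    return np.sum(np.maximum(0, z) + np.maximum(0, -z))

# Support-function identity: gamma(x) equals the max of <x, v> over the
# 2^m points v = sum_i lambda_i b_i with lambda_i in {-1, +1}.
x = np.array([0.7, -0.2])
best_vertex = max(np.array(lam) @ B @ x for lam in product((-1, 1), repeat=m))

# gamma is linear on each cone of constant sign pattern of <b_i, x>;
# count distinct patterns over sampled directions on the circle.
thetas = np.linspace(0, 2 * np.pi, 100000, endpoint=False)
dirs = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
patterns = {tuple(row) for row in np.sign(dirs @ B.T)}
predicted = 2 * sum(comb(m - 1, i) for i in range(n))   # = 2m = 10 for n = 2
```

For $n = 2$ the formula reduces to $2m$ pieces, matching the number of sign-pattern cones found by sampling.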
5. Synthesis and Implications of Rectified Linear Complexity
The composition of depth and width exponentially increases the count of affine regions:
- Depth acts as the exponential composition resource; each layer can multiply the region count.
- Width (or size) determines the parallel granularity per layer.
- The total number of affine pieces grows as $w^k$ in 1D, or as $m^{n-1} w^k$ in $\mathbb{R}^n$ via the zonotope construction.
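Plugging numbers into these bounds illustrates the gap; the arithmetic below uses the depth–size lower bound of the form $\tfrac{1}{2} k' w^{k/k'} - 1$ from Section 3:

```python
# Deep net: depth k+1, width w -> size w*k hidden units, w**k affine pieces.
w, k = 2, 10
deep_size = w * k
pieces = w ** k

# A one-hidden-layer net (k' = 1) matching the same piece count needs size
# at least (k'/2) * w**(k/k') - 1 by the Section 3 bound.
k_prime = 1
shallow_lower_bound = (k_prime / 2) * w ** (k / k_prime) - 1

print(deep_size, pieces, shallow_lower_bound)   # 20 1024 511.0
```

Twenty units at depth 11 produce 1024 pieces, while a depth-2 net needs at least 511 units to match them: the exponential depth advantage in miniature.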
The triplet (depth, size, number of affine pieces) forms a natural complexity measure—Rectified Linear Complexity—which encapsulates the expressive power of ReLU networks. Deeper networks attain exponential region growth with moderate width, while shallow networks require super-polynomial size to represent the same functions.
A plausible implication is that for function classes requiring exponentially many affine pieces, depth is indispensable for architectural efficiency. Furthermore, computational hardness in training aligns with the representational barriers: even for one hidden layer, the exponential increase in complexity with input dimensionality indicates that training algorithms are fundamentally limited by both representational and computational regimes (Arora et al., 2016).