
Piecewise Linear Representation Theory

Updated 13 January 2026
  • Piecewise Linear Representation Theory is a unified framework for representing functions using finitely or countably many affine-linear regions, vital in optimization, neural networks, and computational geometry.
  • It employs canonical representations like partition-and-affine and graph-union, with insights into normal forms, minimal arity bounds, and homological methods for rigorous function analysis.
  • The theory underpins efficient neural network implementations, stable basis constructions for numerical analysis, and diverse applications in symbolic dynamics and control allocation.

Piecewise Linear Representation Theory provides a rigorous unified framework for expressing, analyzing, and algorithmically manipulating functions and operators defined via finitely or countably many affine-linear regimes. This theory interlinks combinatorial, algebraic, geometric, and computational perspectives, and underpins much of modern modeling in optimization, computational geometry, neural network theory, and symbolic dynamics. Central concepts include fundamental normal forms, complexity bounds, basis stability, homological constructions, encoding by shallow and deep ReLU neural networks, and multidimensional partition structures.

1. Canonical Definitions and Normal Forms

A function $f:\mathbb{R}^d \to \mathbb{R}$ (or more generally, between normed vector spaces) is piecewise linear (PL) if there exists a finite collection of polyhedral sets $P_i \subset \mathbb{R}^d$ covering the domain, and affine maps $A_i(x) = a_i^T x + b_i$, such that $f|_{P_i} = A_i$ for all $i$ (Tao et al., 2022, Zheng et al., 2020). This notion generalizes to piecewise affine (PA) and piecewise multilinear (PMLR) function classes in higher-arity or tensor settings (Rajput et al., 2022).

Two main representations are standard:

  • Partition-and-affine (PA): $X = \cup_{i=1}^m P_i$ with $f(x) = T_i(x) + b_i$ for $x \in P_i$.
  • Graph-union (PL$^1$): $\mathrm{graph}(f) = \cup_{i=1}^m A_i$ with each $A_i$ a polyhedron in $X \times Y$ (Zheng et al., 2020).

In finite-dimensional spaces, these notions coincide.

A fundamental structural result (Theorem 3.2, (Zheng et al., 2020)) states: any PL map $f:X\to Y$ splits as $f(x_1 + x_2) = T(x_1) + g(x_2)$ with $X = X_1 \oplus X_2$, $\dim X_2 < \infty$, $T:X_1 \to Y$ linear, and $g$ a fully PL map on $X_2$.

In the combinatorial setting of $\mathbb{R}^n \to \mathbb{R}$, every CPWL function can be written as a sum of $\max$'s of affine forms; the minimal arity (the number of arguments required in each $\max$) is characterized by the maximal number of polyhedral regions meeting at a vertex in the cell complex induced by $f$ (Koutschan et al., 2024).
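
Such max-of-affine representations are far from unique: one CPWL function can trade the number of $\max$-terms against their arity. The sketch below is a small numerical check (plain NumPy, my own toy example rather than code from the cited papers) comparing two encodings of $f(x,y) = |x| + |y|$.

```python
# Toy check: |x| + |y| written as a sum of two arity-2 maxes and as a single
# arity-4 max of affine forms; both agree with the direct evaluation.
import numpy as np

rng = np.random.default_rng(0)
pts = rng.uniform(-5.0, 5.0, size=(1000, 2))
x, y = pts[:, 0], pts[:, 1]

f_direct = np.abs(x) + np.abs(y)
f_sum_of_maxes = np.maximum(x, -x) + np.maximum(y, -y)                    # two maxes, arity 2
f_single_max = np.max(np.stack([x + y, x - y, -x + y, -x - y]), axis=0)   # one max, arity 4

assert np.allclose(f_direct, f_sum_of_maxes)
assert np.allclose(f_direct, f_single_max)
print("both max-of-affine representations agree with |x| + |y|")
```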

2. Homological and Simplicial Methods for PL Functions

Recent developments exploit triangulation and homology to construct succinct PL representations. Given a function $f \in \mathrm{PL}(d,n)$ supported on a polyhedron $P \subset \mathbb{R}^d$, its graph in $\mathbb{R}^{d+1}$ together with $P$ yields a polyhedral $d$-sphere $S$. The key topological construction produces a $(d+1)$-chain $c = \sum_{i=1}^N \epsilon_i \Delta_i$ (with $N \le 2n$ and each $\Delta_i$ a nondegenerate $(d+1)$-simplex) bounding $S$ in relative homology (Calegari, 11 May 2025). The associated decomposition

$$f(x) = \sum_{i=1}^N \epsilon_i\, \tau(\Delta_i)(x)$$

expresses $f$ as a sum of simplex functions $\tau(\Delta)$, where $\tau(\Delta)(x)$ is the vertical thickness of $\Delta$ above $x$.

Each $\tau(\Delta)$ is itself a PL function with $O(d^2)$ linear pieces, and explicit ReLU (max-min) normal forms can be constructed for each simplex function. This triangulation-theoretic approach underpins efficient universal neural network representations (Calegari, 11 May 2025).
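
To make the simplex-function idea concrete in the simplest case $d = 1$, the following sketch (elementary geometry only, not the cited construction) computes the vertical thickness of a triangle $\Delta \subset \mathbb{R}^2$ above a point $x$; the resulting $\tau(\Delta)$ is itself a small CPWL "tent".

```python
# tau(Delta)(x): length of the intersection of the vertical line {x} x R with
# a nondegenerate triangle Delta in R^2 (the d = 1 case of a simplex function).
import numpy as np

def vertical_thickness(triangle, x):
    ys = []
    verts = np.asarray(triangle, dtype=float)
    for i in range(3):
        (x0, y0), (x1, y1) = verts[i], verts[(i + 1) % 3]
        if x0 == x1:                                  # vertical edge
            if x == x0:
                ys.extend([y0, y1])
        elif min(x0, x1) <= x <= max(x0, x1):
            t = (x - x0) / (x1 - x0)
            ys.append(y0 + t * (y1 - y0))             # edge crosses the vertical line
    return max(ys) - min(ys) if ys else 0.0

tri = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
xs = np.linspace(-0.5, 2.5, 7)
print([round(vertical_thickness(tri, x), 3) for x in xs])
# -> [0.0, 0.0, 1.5, 3.0, 1.5, 0.0, 0.0]: a piecewise linear tent supported on [0, 2]
```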

3. Explicit Representations: Max-Min, Abs-Normal, and PMLR Theory

Max-Min and Minimal Arity Representations

Every CPWL function $F:\mathbb{R}^n \to \mathbb{R}$ can be written as a linear combination of $\max$'s of at most $n+1$ affine-linear functions. This upper bound is tight; for instance, $F(x)=\max\{0, x_1, \ldots, x_n\}$ achieves the minimal arity $n+1$ (Koutschan et al., 2024). The precise minimal arity is determined by the combinatorics of the cell decomposition induced by $F$: it is the maximal number of regions meeting at a vertex.

The lattice polynomial (max-min) representation of a CPWL function provides a constructive encoding, and the structure of gradient jumps along flags in the decomposition underlies the tightness of minimal arity bounds (Koutschan et al., 2024).
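
As a small illustration of a lattice (max-min) representation (a toy example of my own, not a construction from the cited paper), the one-dimensional "tent" function can be written as a lattice polynomial in three affine forms, with each $\max$/$\min$ taking only two arguments:

```python
# The tent h(x) = max(0, min(x, 2 - x)) as a lattice polynomial in the affine
# forms {0, x, 2 - x}; verified against a direct piecewise evaluation.
import numpy as np

xs = np.linspace(-1.0, 3.0, 401)
tent_direct = np.where(xs < 0, 0.0,
               np.where(xs < 1, xs,
               np.where(xs < 2, 2 - xs, 0.0)))
tent_lattice = np.maximum(0.0, np.minimum(xs, 2.0 - xs))

assert np.allclose(tent_direct, tent_lattice)
print("max(0, min(x, 2 - x)) reproduces the tent function")
```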

Abs-Normal Form

The universal abs-normal form is central in algorithmic and optimization contexts (Griewank et al., 2017):

$$\begin{aligned} z &= c + Zx + L|z| \\ y &= b + Jx + Y|z| \end{aligned}$$

with $L$ strictly lower-triangular. The switching variables $z$ encode the kink structure; the domain is partitioned into polyhedral cells classified by the signs of $z$. The generalized Jacobian on each cell is explicitly computable, as are the KKT-like conditions (relating to AVE and LCP) and Schur-complement reductions.

This formalism enables Newton and fixed-point methods for solving PL systems, with convergence governed by spectral properties of the Schur complement $S = L - ZJ^{-1}Y$.
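
A hedged sketch of how the form is evaluated in practice (toy matrices of my own choosing, structured as above): because $L$ is strictly lower-triangular, the switching vector $z$ at a given $x$ follows from forward substitution, after which $y$ is a single affine-plus-$|z|$ evaluation.

```python
# Evaluate an abs-normal form z = c + Zx + L|z|, y = b + Jx + Y|z| at a point x.
import numpy as np

Z = np.array([[1.0, 0.0],
              [0.0, 1.0]])
L = np.array([[0.0, 0.0],
              [0.5, 0.0]])           # strictly lower triangular
c = np.array([-1.0, 0.5])
J = np.array([[2.0, -1.0]])
Y = np.array([[1.0, 1.0]])
b = np.array([0.3])

def eval_abs_normal(x):
    z = np.zeros(len(c))
    for i in range(len(c)):          # forward substitution: z_i needs only |z_1|, ..., |z_{i-1}|
        z[i] = c[i] + Z[i] @ x + L[i, :i] @ np.abs(z[:i])
    y = b + J @ x + Y @ np.abs(z)
    return z, y

z, y = eval_abs_normal(np.array([0.7, -0.2]))
print("z =", z, " sign pattern =", np.sign(z).astype(int), " y =", y)
```

The sign pattern of $z$ identifies the polyhedral cell containing $x$, on which the map is affine.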

Piecewise Multi-Linear Representation

For tensor-product grids, PMLR theory constructs multidimensional PL representations via nested block-Kronecker products and separation of variables. Given breakpoints in each coordinate, one encodes $g:\mathbb{R}^k\to\mathbb{R}^m$ as

$$g(z) = \Gamma\,(\hat z_1 \otimes \cdots \otimes \hat z_k)$$

with $\hat z_j = [1,\ z_j,\ |z_j-\mu_j(2)|,\ \ldots]^T$. The coefficient matrix $\Gamma$ is determined directly from data fitting (exact match at the grid points) (Rajput et al., 2022).
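
The sketch below (my own toy code following this recipe; the breakpoints, target function, and names are illustrative assumptions) builds the per-axis basis vectors, forms their Kronecker product, and solves for $\Gamma$ so that a scalar-valued $g$ is matched exactly at every node of a small 2-D grid.

```python
# PMLR-style interpolation on a 2-D rectilinear grid: per-axis basis
# [1, z, |z - mu(2)|, ..., |z - mu(n-1)|], Kronecker product, exact fit of Gamma.
import numpy as np

mu1 = np.array([0.0, 0.5, 1.0])            # breakpoints along z_1
mu2 = np.array([-1.0, 0.0, 1.0, 2.0])      # breakpoints along z_2

def axis_basis(z, mu):
    return np.concatenate(([1.0, z], np.abs(z - mu[1:-1])))   # length == len(mu)

def kron_basis(z):
    return np.kron(axis_basis(z[0], mu1), axis_basis(z[1], mu2))

target = lambda z: np.sin(z[0]) * np.cos(z[1])   # function to be represented

nodes = [np.array([a, b]) for a in mu1 for b in mu2]
B = np.stack([kron_basis(z) for z in nodes])      # 12 x 12 collocation matrix
Gamma = np.linalg.solve(B, np.array([target(z) for z in nodes]))

assert np.allclose(B @ Gamma, [target(z) for z in nodes])
print("PMLR surrogate matches the target at all", len(nodes), "grid nodes")
```

For vector-valued $g:\mathbb{R}^k\to\mathbb{R}^m$, $\Gamma$ becomes a matrix, fitted one output component at a time in the same way.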

4. Stability, Basis Structure, and Function Representation

PL representation theory distinguishes between local and nonlocal parameterizations.

  • Local (Hat Basis): Every CPWL function can be represented, given a triangulation, as $\sum_v c_v \beta_v(x)$, where $\beta_v$ is a nodal hat function at vertex $v$. The hat basis is stable, forming a Riesz basis in $L_2$, provided the "star" volumes around each vertex are uniformly bounded above and below (Goujon et al., 2022). For uniform grids, the optimal $\ell_2\to L_2$ condition number is $\sqrt{d+2}$, independent of grid spacing (a numerical check for $d=1$ appears below).
  • Nonlocal (ReLU/Max-like Expansions): Representations such as sums of shifted ReLUs or max-affine expansions have poor $L_2$ conditioning: small parameter changes can induce large function changes, leading to condition numbers growing superlinearly in the number of pieces.

Hat basis methods are thus widely favored in numerical analysis and mesh-based modeling, while ReLU and max-min normal forms dominate in explicit neural and combinatorial representations.
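
The stability claim can be probed numerically. The following sketch (my own check, assuming a uniform 1-D grid with interior hat functions; boundary effects make the value approach, rather than equal, the limit) computes the $\ell_2\to L_2$ condition number as $\sqrt{\lambda_{\max}/\lambda_{\min}}$ of the hat-function Gram (mass) matrix, which tends to $\sqrt{d+2} = \sqrt{3}$ for $d = 1$.

```python
# l2 -> L2 conditioning of the 1-D hat basis on a uniform grid of spacing h:
# the Gram matrix is tridiagonal with 2h/3 on the diagonal and h/6 off it.
import numpy as np

def hat_mass_matrix(n, h):
    M = np.zeros((n, n))
    np.fill_diagonal(M, 2.0 * h / 3.0)
    idx = np.arange(n - 1)
    M[idx, idx + 1] = M[idx + 1, idx] = h / 6.0
    return M

for n in (10, 100, 1000):
    eigs = np.linalg.eigvalsh(hat_mass_matrix(n, h=1.0 / (n + 1)))
    print(n, np.sqrt(eigs.max() / eigs.min()))   # tends to sqrt(3) ~ 1.732, independent of h
```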

5. Neural Network Realizations and Complexity Bounds

A central insight of PL representation theory is that (deep or shallow) ReLU neural networks universally realize all CPWL functions. Recent advances have greatly sharpened estimates of the neural complexity required (Chen et al., 2022, Zanotti, 17 Mar 2025, Calegari, 11 May 2025):

  • Any CPWL function with $q$ pieces can be realized by a ReLU network with $O(q^2)$ hidden neurons, independent of the ambient dimension (Chen et al., 2022); a one-dimensional toy version of such a hinge-based construction is sketched at the end of this section.
  • If only $k$ distinct affine components occur, the bound improves to $O(kq)$.
  • For continuous piecewise-affine functions $f:\mathbb{R}^2 \to \mathbb{R}$ with $p$ polygonal pieces (possibly nonconvex), $f$ can be represented exactly by a two-hidden-layer ReLU network with $O(p)$ hidden neurons; the construction decomposes $f$ into nested max-min of affines supported on triangle-like wedges, and parallelizes their computation (Zanotti, 17 Mar 2025).
  • The "infinite width, finite cost" setting—where a function is represented as an integral over ReLU units against a finite signed measure—collapses precisely to the finite-width shallow ReLU setting for CPWL functions (McCarty, 2023).

These results show that, contrary to earlier expectations of exponential cost, generic CPWL functions admit ReLU representations of size polynomial in the number of pieces, though factorial behavior can occur in worst-case scenarios with many distinct linear components.
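
A minimal one-dimensional illustration of the general principle (a toy construction of my own, not the schemes of the cited papers): a CPWL function of one variable given by its knots and values is reproduced exactly by a one-hidden-layer ReLU network with one hidden unit per interior knot.

```python
# Exact shallow ReLU realization of a 1-D CPWL function: affine part plus one
# ReLU hinge per interior knot, with weights given by the slope changes.
import numpy as np

knots = np.array([-3.0, -1.0, 0.0, 1.5, 3.0])
vals  = np.array([ 1.0,  0.0, 2.0, -1.0, 0.5])
f = lambda x: np.interp(x, knots, vals)        # target CPWL function on [-3, 3]

slopes = np.diff(vals) / np.diff(knots)        # slope of each of the 4 pieces
a0 = slopes[0]
b0 = vals[0] - a0 * knots[0]                   # affine part matching the first piece
jumps = np.diff(slopes)                        # slope change at each interior knot

def relu_net(x):
    hidden = np.maximum(x[:, None] - knots[1:-1], 0.0)    # 3 ReLU units
    return a0 * x + b0 + hidden @ jumps                   # linear output layer

xs = np.linspace(-3.0, 3.0, 601)
assert np.allclose(f(xs), relu_net(xs))
print("a 3-neuron shallow ReLU network reproduces the 4-piece CPWL function on [-3, 3]")
```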

6. Multi-Dimensional and Group-Theoretic Extensions

PL representation theory extends to several multidimensional and structurally enriched contexts:

  • Piecewise Multi-Linear and Control Allocation: In flight control and other engineering domains, multidimensional, tensor-product PL representations enable exact, continuous, and closed-form Jacobian computation for control maps on rectilinear grids (Rajput et al., 2022).
  • Equivariant Representation Theory: The interplay between symmetry, group actions, and PL neural networks leads to a refined decomposition: layers decompose into isotypic components, PL activations (such as ReLU) yield an intricate poset of equivariant map types (the "PL-Schur Lemma"), and iterated ReLU applications correspond to a filtration generalizing classical Fourier series (Gibson et al., 2024).

7. Symbolic Dynamics, Markov Maps, and Diophantine Structure

PL representation applies in symbolic dynamics and arithmetic coding via countably piecewise linear Markov maps: such maps partition a (possibly non-compact) domain into intervals, with each branch an affine map over integer-indexed partitions and Markov transition structure (Kalocsai, 23 Aug 2025). Symbolic codings via Markov shifts, shadowing lemmas (robustness of orbits under noise), and Cantor-series expansions for orbits of rational points are central results. The "bottleneck criterion" characterizes when all rational orbits are eventually periodic, linking arithmetic, ergodic theory, and infinite-state symbolic dynamics.
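
As a concrete toy instance of such a map (the classical Lüroth map, used here purely for illustration; it is not claimed to be the map studied in the cited paper), each branch is affine on $(1/(n+1), 1/n]$ and the branch index $n$ supplies the symbolic coding of an orbit:

```python
# Lueroth map T(x) = n(n+1)x - n on the branch (1/(n+1), 1/n]: a countably
# piecewise linear Markov map.  Exact rational arithmetic via Fraction.
from fractions import Fraction
from math import floor

def luroth_itinerary(x, steps):
    digits = []
    for _ in range(steps):
        if x == 0:
            break
        n = floor(1 / x)              # branch containing x
        digits.append(n)
        x = n * (n + 1) * x - n       # affine branch map onto (0, 1]
    return digits

print(luroth_itinerary(Fraction(3, 7), 12))
# -> [2, 1, 7, 1, 1, 1, 1, 1, 1, 1, 1, 1]: the itinerary of this rational point
#    becomes eventually periodic.
```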

Table: Key Representation Paradigms and Complexity Results

| Paradigm | Representation Form | Complexity/Property |
| --- | --- | --- |
| Homological/Triangulation | Sum over simplex functions $\tau(\Delta)$ | $N \leq 2n$ terms for $n$-simplex support (Calegari, 11 May 2025) |
| Max-Min/Arity Bound | Sum of $\max$-terms, arity $\leq n+1$ | Tight in dimension $n$ (Koutschan et al., 2024) |
| Abs-Normal/State-Space | $z = c + Zx + L\lvert z\rvert$; $y = b + Jx + Y\lvert z\rvert$ | Universal for PL, supports Newton-type solvers (Griewank et al., 2017) |
| Local Hat Basis | Nodal sum on triangulation vertices | Riesz basis, optimal $L_2$ conditioning (Goujon et al., 2022) |
| Block-Kronecker (PMLR) | Nested Kronecker expansion on grid | Exact interpolation, factorizes dimensionally (Rajput et al., 2022) |
| ReLU Neural Networks | Network of ReLU units with depth, width | $O(q^2)$ neurons for $q$ pieces; $O(p)$ for $p$ polygons in $\mathbb{R}^2$ (Chen et al., 2022, Zanotti, 17 Mar 2025) |

References

  • “Triangulating PL functions and the existence of efficient ReLU DNNs” (Calegari, 11 May 2025)
  • “Improved Bounds on Neural Complexity for Representing Piecewise Linear Functions” (Chen et al., 2022)
  • “Stable Parametrization of Continuous and Piecewise-Linear Functions” (Goujon et al., 2022)
  • “Representing Piecewise-Linear Functions by Functions with Minimal Arity” (Koutschan et al., 2024)
  • “Linear-Size Neural Network Representation of Piecewise Affine Functions in $\mathbb{R}^2$” (Zanotti, 17 Mar 2025)
  • “Piecewise Linear Neural Networks and Deep Learning” (Tao et al., 2022)
  • “Solving piecewise linear equations in abs-normal form” (Griewank et al., 2017)
  • “Nonlinear Control Allocation Using A Piecewise Multi-Linear Representation” (Rajput et al., 2022)
  • “Equivariant neural networks and piecewise linear representation theory” (Gibson et al., 2024)
  • “Piecewise Linear Functions Representable with Infinite Width Shallow ReLU Neural Networks” (McCarty, 2023)
  • “Symbolic dynamics, shadowing and representation of real numbers with some countably piecewise linear Markov maps” (Kalocsai, 23 Aug 2025)
  • “Fully piecewise linear vector optimization problem” (Zheng et al., 2020)

Piecewise Linear Representation Theory thus offers a rigorous, algorithmically constructive toolkit foundational for nonlinear optimization, neural computation, control, and symbolic dynamics, with a broad spectrum of representations tied closely to combinatorial, homological, and group-theoretic structure.
