ReLU Cartier Divisor in Neural Networks
- The ReLU Cartier divisor is an algebraic-geometric construct that represents piecewise linear functions of unbiased ReLU networks as support functions on toric varieties.
- It leverages hyperplane arrangements and slope vectors to form a ReLU fan, encoding activation patterns and defining equivalence classes under affine transformations.
- Intersection numbers with torus-invariant curves provide exact criteria for function realizability, linking network expressivity with principles of tropical geometry.
The ReLU Cartier divisor is an algebraic-geometric construct that encapsulates the piecewise linear structure induced by ReLU activation functions in feedforward neural networks. By reinterpreting the output function of a network as a support function for a (ℚ–)Cartier divisor on an associated toric variety, this framework builds a formal bridge between deep learning architectures and the mathematical apparatus of toric and tropical geometry. The notion is formulated for unbiased ReLU networks (networks without bias terms and with rational weights) and leverages the hyperplane arrangement determined by the network to define the polyhedral fan, toric variety, and corresponding divisor.
1. Formal Definition and Construction
Given a ReLU neural network with a fixed architecture and rational weights, the output function $f$ is continuous and finitely piecewise linear. The construction proceeds by associating to $f$ the so-called ReLU fan $\Sigma_f$, a polyhedral complex formed from the arrangement of bent hyperplanes in input space determined by the network’s activation patterns. Each maximal cone $\sigma$ corresponds to a region over which $f$ is affine linear, with slope vector $m_\sigma$.
The ReLU toric variety, $X_{\Sigma_f}$, is then built by gluing affine toric pieces corresponding to these cones. The ReLU Cartier divisor $D_f$ is defined by interpreting $f$ (modulo global affine terms) as a support function on the fan, specifically: $f(x) = \langle m_\sigma, x \rangle$ for $x \in \sigma$, where $m_\sigma$ represents the “Cartier data” for each cone $\sigma$. Thus, $D_f$ encodes the collection of slope vectors for $f$ across all regions of linearity.
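The slope vector attached to a region can be computed directly in the simplest case. The following is a minimal sketch, assuming a shallow (one hidden layer) unbiased network $f(x) = \sum_j c_j \max(0, \langle w_j, x \rangle)$ with rational weights; the names `relu_net`, `slope_vector`, `W`, and `c` are illustrative and not taken from the source. On the region containing a sample point `x`, the slope is the sum of `c_j * w_j` over the active neurons.

```python
from fractions import Fraction as F

def dot(u, v):
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def relu_net(W, c, x):
    """Shallow unbiased ReLU network: f(x) = sum_j c_j * max(0, <w_j, x>)."""
    return sum(cj * max(F(0), dot(wj, x)) for wj, cj in zip(W, c))

def slope_vector(W, c, x):
    """Slope of f on the linear region containing x.

    Assumes x lies off every hyperplane <w_j, x> = 0, so the activation
    pattern (which neurons are active) is well defined.
    """
    m = [F(0)] * len(x)
    for wj, cj in zip(W, c):
        if dot(wj, x) > 0:  # neuron j is active on this region
            m = [mi + cj * wi for mi, wi in zip(m, wj)]
    return m

# Hypothetical example: three hidden neurons in the plane, rational weights.
W = [(F(1), F(0)), (F(0), F(1)), (F(1), F(-1))]
c = [F(1), F(2), F(-1, 2)]
x = (F(3), F(1))

m = slope_vector(W, c, x)
print(m)                              # slope on the region containing x
print(relu_net(W, c, x), dot(m, x))   # the two values agree on that region
```

Exact rational arithmetic (`fractions.Fraction`) is used here because the construction is stated for rational weights; floating point would work for illustration but could misclassify points very close to a hyperplane.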
2. Algebraic-Geometric Framework and Equivalence Classes
Any two piecewise linear functions $f$ and $g$ differing by a global affine function define the same Cartier divisor on $X_{\Sigma_f}$. This invariance under addition of affine functions makes the divisor well adapted to capture only the “bending” information of $f$, i.e., the essential combinatorial data controlled by the network architecture and activation pattern. The toric approach is thus suited to questions of function expressivity for ReLU networks, reducing the investigation to equivalence classes of support functions modulo affine shifts.
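A short worked computation shows why only the bending information survives. The notation is an assumption of this sketch: $m^{f}_{\sigma}$ denotes the slope vector of $f$ on a maximal cone $\sigma$, and $x \mapsto \langle a, x \rangle + b$ is a global affine function.

```latex
\begin{aligned}
g(x) &= f(x) + \langle a, x \rangle + b
  &&\Longrightarrow\quad m^{g}_{\sigma} = m^{f}_{\sigma} + a
  \quad \text{for every maximal cone } \sigma, \\
m^{g}_{\sigma} - m^{g}_{\sigma'}
  &= \bigl(m^{f}_{\sigma} + a\bigr) - \bigl(m^{f}_{\sigma'} + a\bigr)
   = m^{f}_{\sigma} - m^{f}_{\sigma'}
  &&\text{for adjacent cones } \sigma, \sigma'.
\end{aligned}
```

Every slope vector shifts by the same vector $a$, so the differences of slopes across walls, which are the only quantities that record where and how sharply the function bends, are unchanged.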
3. Exact Realizability and Intersection-Theoretic Criteria
The core motivation for developing the ReLU Cartier divisor is to translate the exact realization problem (which piecewise linear functions can a network compute?) into the language of divisors and intersection theory. For a fixed network architecture, necessary and sufficient conditions for realization are stated in terms of intersection numbers of $D_f$ with torus-invariant curves in $X_{\Sigma_f}$.
Specifically, when two maximal-dimensional cones $\sigma$ and $\sigma'$ meet along a wall $\tau = \sigma \cap \sigma'$ of the fan, the intersection number with the corresponding torus-invariant curve $V(\tau)$ is given by: $D_f \cdot V(\tau) = \langle m_\sigma - m_{\sigma'}, u \rangle$, where $u$ is a primitive vector associated with the wall. For shallow unbiased architectures, realization requires that the intersection numbers computed over different walls of a fixed hyperplane must be equal, a uniformity expressing the symmetry constraints of the network’s connectivity.
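The uniformity condition can be checked by hand in the plane. The sketch below assumes a shallow unbiased network $f(x) = \sum_j c_j \max(0, \langle w_j, x \rangle)$ on $\mathbb{R}^2$ with primitive integer weight vectors $w_j$, so each hyperplane is a line through the origin with exactly two walls (its two opposite rays). For each wall it pairs the slope difference $m_\sigma - m_{\sigma'}$ with an integer vector $u$ satisfying $\langle w_j, u \rangle = 1$. The helper names, the choice of $u$, and the orientation (with $\sigma$ on the side where $\langle w_j, x \rangle > 0$) are assumptions of this illustration; the sign of the result depends on that convention.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def slope(W, c, x):
    """Slope of f(x) = sum_j c_j * max(0, <w_j, x>) on the region containing x."""
    m = [F(0), F(0)]
    for wj, cj in zip(W, c):
        if dot(wj, x) > 0:
            m = [mi + cj * wi for mi, wi in zip(m, wj)]
    return m

def transverse_vector(w):
    """Integer u with <w, u> = 1, for a primitive integer vector w in Z^2."""
    def egcd(a, b):  # returns (g, s, t) with a*s + b*t = g
        if b == 0:
            return (a, 1, 0)
        g, s, t = egcd(b, a % b)
        return (g, t, s - (a // b) * t)
    g, s, t = egcd(w[0], w[1])
    assert abs(g) == 1, "w must be primitive"
    return (s * g, t * g)

def wall_numbers(W, c, j):
    """Pairing <m_sigma - m_sigma', u> on each of the two walls of the line w_j^perp."""
    w = W[j]
    u = transverse_vector(w)
    numbers = []
    for r in [(-w[1], w[0]), (w[1], -w[0])]:  # the two opposite rays of the line
        # Step size small enough that only neurons vanishing on r change sign.
        eps = F(1)
        for wk in W:
            a, b = dot(wk, r), dot(wk, w)
            if a != 0 and b != 0:
                eps = min(eps, abs(F(a)) / (2 * abs(b)))
        p_plus = [ri + eps * wi for ri, wi in zip(r, w)]   # inside sigma
        p_minus = [ri - eps * wi for ri, wi in zip(r, w)]  # inside sigma'
        jump = [a - b for a, b in zip(slope(W, c, p_plus), slope(W, c, p_minus))]
        numbers.append(dot(jump, u))
    return numbers

# Hypothetical example: integer primitive weights, rational output coefficients.
W = [(1, 0), (0, 1), (1, -1)]
c = [F(1), F(2), F(-1, 2)]
for j in range(len(W)):
    print(j, wall_numbers(W, c, j))  # the two entries agree for each hyperplane
```

For a network of this form the two numbers on a given line always coincide (here they equal the output coefficient $c_j$), which is the uniformity described above. A piecewise linear function on the same fan whose two numbers differ on some line fails this condition.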
4. Piecewise Linearity, Tropical Geometry, and Newton Polytopes
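The piecewise linearity of ReLU networks is reflected naturally in tropical geometry. Tropical polynomials are expressions of the form
$$p(x) = \bigoplus_{i} c_i \odot x^{\odot a_i} = \max_i \bigl( c_i + \langle a_i, x \rangle \bigr),$$
with tropical addition as $s \oplus t = \max(s, t)$ and multiplication as $s \odot t = s + t$, yielding continuous, convex, piecewise linear functions. These match ReLU outputs in the convex case; a general ReLU output is a difference of two such functions. The Newton polytope $\mathrm{Newt}(p)$, the convex hull of the exponents $a_i$, participates in the description of the ReLU Cartier divisor via the relation (for convex $f$, and up to the sign convention chosen for support functions)
$$P_{D_f} = \mathrm{Newt}(f),$$
where $P_{D_f}$ denotes the polytope associated to the Cartier divisor $D_f$. The mixed volume of $P_{D_f}$ equates to the algebro-geometric volume of the associated line bundle on the toric variety, a quantitative measure of network expressivity and combinatorial complexity.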
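The tropical viewpoint is easy to try on a small convex example. The sketch below assumes the standard max-plus conventions (tropical sum is `max`, tropical product is `+`) and represents a tropical polynomial as a list of `(coefficient, exponent)` pairs; the function names are illustrative. It checks that $\max(0, x, y, x + y)$ agrees with $\mathrm{ReLU}(x) + \mathrm{ReLU}(y)$ at sample points and computes the Newton polygon of its exponents with a plain monotone-chain convex hull.

```python
from fractions import Fraction as F

def trop_eval(terms, x):
    """Evaluate a tropical polynomial: max over terms of c + <a, x>."""
    return max(c + sum(ai * xi for ai, xi in zip(a, x)) for c, a in terms)

def convex_hull(points):
    """Vertices of the convex hull of planar points (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # counterclockwise, no repeated endpoints

def polygon_area(vertices):
    """Euclidean area of a polygon given by its vertices in order (shoelace formula)."""
    n = len(vertices)
    s = sum(vertices[i][0] * vertices[(i + 1) % n][1]
            - vertices[(i + 1) % n][0] * vertices[i][1] for i in range(n))
    return abs(F(s, 2))

def relu(t):
    return max(0, t)

# p(x, y) = max(0, x, y, x + y), which equals ReLU(x) + ReLU(y).
terms = [(0, (0, 0)), (0, (1, 0)), (0, (0, 1)), (0, (1, 1))]
samples = [(2, -3), (1, 1), (-1, -2), (F(1, 2), 4)]
for x, y in samples:
    assert trop_eval(terms, (x, y)) == relu(x) + relu(y)

newton = convex_hull([a for _, a in terms])
print(newton)                # vertices of the Newton polygon (the unit square)
print(polygon_area(newton))  # its area
```

Only the convex case is covered here; a general ReLU output would need two such polynomials and their difference.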
5. Intersection Numbers and Characterization of Network Expressivity
Intersection theory for the ReLU Cartier divisor provides explicit algebraic criteria for function realizability. In this framework, the divisor's intersection numbers with torus-invariant curves summarize the “bends” or transitions between the network's affine regions. By computing these numbers and demanding that they satisfy symmetry and uniformity conditions matching the network’s architecture, one obtains a characterization: only functions whose associated divisor exhibits the prescribed intersection pattern can be exactly realized by the network.
6. Connections to Toric and Tropical Geometry
The identification of ReLU Cartier divisors with support functions in toric geometry harmonizes the study of ReLU neural networks with tropical algebra, where max-linear structure governs function shape. This duality facilitates the use of combinatorial and algebraic invariants (volume, intersection number, equivalence class), traditionally reserved for toric varieties, to study neural network architectures. The encoding of network output as a divisor unifies neural network theory and algebraic geometry, providing new perspectives on expressivity and approximation in terms of classical mathematical structures.
7. Context and Broader Implications
The construction of the ReLU Cartier divisor does not imply a literal operational or computational use of divisors in neural network training; rather, it gives an invariant encoding of the output space, enabling exact analysis of realizability and expressivity. While superficial analogies exist between the piecewise nature of ReLU activations and divisors, the connection is made precise specifically within the toric and tropical frameworks. This suggests that the techniques developed may generalize to other settings involving piecewise linear approximations and support function analysis, and may provide analytic tools for studying the geometry of more advanced neural architectures as well as the structure of their function spaces.
In summary, the ReLU Cartier divisor formalizes the geometry of piecewise linear functions produced by feedforward ReLU networks as support functions on associated toric varieties, facilitating exact characterizations of network expressivity in terms of intersection-theoretic and combinatorial invariants (Fu, 7 Sep 2025).