Max-linear Bayesian Networks

Updated 15 November 2025

Max-linear Bayesian networks are directed graphical models where each variable is defined as a max-linear function of its parents and exogenous innovations, capturing extreme-value behavior.
They leverage tropical (max-plus) algebra to compute maximal-weight paths, which is key for identifying causal extremal dependencies and quantifying heavy-tailed effects.
Structure learning adapts classical d-separation into novel criteria like *- and C*-separation, enabling robust parameter estimation even with hidden variables.

Max-linear Bayesian networks are a class of directed graphical models designed to capture the causal and extremal dependencies in multivariate systems, especially those with heavy-tailed or extreme-value behavior. The defining feature is that each node variable is generated as a max-linear function of its parents in a directed acyclic graph (DAG), plus an exogenous innovation. This non-additive structure motivates alternative separation, identification, and estimation criteria distinct from classical Gaussian or discrete Bayesian networks.

1. Model Formulation and Algebraic Structure

A max-linear Bayesian network (MLBN) is specified by a DAG $\mathcal{D} = (V,E)$ on $d$ nodes, edge weights $c_{uv} > 0$ for each edge $u \to v \in E$, and independent innovations $Z_v \sim RV(\alpha)$ (regularly varying with index $\alpha > 0$), typically with Fréchet or Pareto tails. The recursive structural equation is:

$X_v = \bigvee_{u \in pa(v)} c_{uv} X_u \vee Z_v$

where $\vee$ denotes the maximum. Substituting iteratively along the DAG, which terminates because the graph is acyclic, gives the solution:

$X_v = \bigvee_{j \in An(v)} b_{vj} Z_j$

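where $An(v)$ contains the ancestors of $v$ together with $v$ itself, $b_{vv} = 1$, and for $j \neq v$ the coefficient $b_{vj}$ is the maximum, over all directed paths from $j$ to $v$, of the product of the edge weights along the path.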

In tropical algebra (max-times, or max-plus after taking logarithms), this can be written compactly as $X = C^* \odot Z$, where $C$ is the weighted adjacency matrix and $C^*$ is its Kleene star, whose entries are the maximal path weights (see also (Klüppelberg et al., 2019, Améndola et al., 15 Nov 2024)). The variables $X_1,\ldots,X_d$ thus reflect an extremal propagation mechanism along the DAG and inherit the heavy tails of $Z$.
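As an illustration, the Kleene star and the agreement between the recursive and path-based representations can be checked numerically. The following is a minimal sketch, not taken from the cited works. It assumes NumPy, zero-indexed nodes listed in topological order, a matrix `C` in which `C[u, v]` is the weight of edge $u \to v$ (0 meaning no edge), Fréchet innovations, and a hypothetical four-node DAG; all function names are illustrative.

```python
import numpy as np

def kleene_star(C):
    """Max-times Kleene star of a weighted DAG adjacency matrix.

    B[j, i] is the largest product of edge weights over directed paths
    j -> i, with B[i, i] = 1 (empty path) and 0 where no path exists.
    """
    d = C.shape[0]
    B = np.eye(d)
    P = np.eye(d)
    for _ in range(d - 1):  # a DAG on d nodes has no path with more than d - 1 edges
        # Max-times matrix product: P[j, i] <- max_k P[j, k] * C[k, i]
        P = (P[:, :, None] * C[None, :, :]).max(axis=1)
        B = np.maximum(B, P)
    return B

def simulate(C, n, alpha=1.0, seed=0):
    """Draw n samples via the path representation X_i = max_j B[j, i] * Z_j."""
    rng = np.random.default_rng(seed)
    d = C.shape[0]
    # If E ~ Exp(1), then E ** (-1 / alpha) follows a Frechet(alpha) law.
    Z = rng.exponential(size=(n, d)) ** (-1.0 / alpha)
    B = kleene_star(C)
    X = (Z[:, :, None] * B[None, :, :]).max(axis=1)
    return X, Z

# Hypothetical DAG: 0 -> 1, 0 -> 2, 0 -> 3, 1 -> 3, 2 -> 3.
C = np.zeros((4, 4))
C[0, 1], C[0, 2], C[0, 3] = 0.5, 2.0, 0.4
C[1, 3], C[2, 3] = 1.0, 0.3

B = kleene_star(C)
X, Z = simulate(C, n=1000)

# Cross-check against the recursive structural equation, node by node.
X_rec = Z.copy()
for v in range(4):
    for u in range(v):
        if C[u, v] > 0:
            X_rec[:, v] = np.maximum(X_rec[:, v], C[u, v] * X_rec[:, u])
```

In this example the best path from node 0 to node 3 runs through node 2 (weight $2.0 \times 0.3 = 0.6$), so `B[0, 3]` exceeds the direct edge weight 0.4, and `X` and `X_rec` agree up to floating-point error.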

2. Conditional Independence and Separation Criteria

MLBNs fundamentally deviate from classical Bayesian networks in their conditional independence (CI) structure. While classical models utilize d-separation for Markov and faithfulness properties, max-linear systems allow additional CI statements not detectable by d-separation due to max-weight path overlaps.

Three separation concepts arise:

d-separation: Classical, based on path blocking by conditioning.
*-separation: Paths are valid only if they contain at most one collider (Améndola et al., 2021, Améndola et al., 2020, Améndola et al., 19 Aug 2025), capturing more refined dependencies given the max operator.
$C^\ast$-separation: For weighted DAGs, only critical paths (maximal weight) avoiding the conditioning set must be considered for separation (Boege et al., 29 Apr 2025, Améndola et al., 15 Nov 2024).

$C^\ast$-separation precisely captures the compositional graphoid of the MLBN's CI structure, termed a "maxoid" (Boege et al., 29 Apr 2025). Every maxoid is associated with a stratification of the weight space by a polyhedral fan, and is in general not representable by Gaussian or discrete models. Faithfulness to d-separation is not generic (Klüppelberg et al., 2019), but $C^\ast$-separation achieves completeness for conditional independence.
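One simple way to see why faithfulness to d-separation can fail is to add a noncritical edge to a chain. The sketch below is an illustrative assumption-laden example, not code from the cited papers: it uses NumPy, a hypothetical three-node DAG with `C[u, v]` the weight of edge $u \to v$, and a helper `kleene_star` defined here. When the direct edge $0 \to 2$ is dominated by the path $0 \to 1 \to 2$, the Kleene star, and hence the map from $Z$ to $X$, is unchanged, so the chain's independence $X_0 \perp X_2 \mid X_1$ persists even though the edge $0 \to 2$ prevents d-separation.

```python
import numpy as np

def kleene_star(C):
    """Max-times Kleene star: B[j, i] = best path-weight product from j to i."""
    d = C.shape[0]
    B = np.eye(d)
    P = np.eye(d)
    for _ in range(d - 1):
        P = (P[:, :, None] * C[None, :, :]).max(axis=1)
        B = np.maximum(B, P)
    return B

# Chain 0 -> 1 -> 2 with path weight 2 * 3 = 6.
chain = np.zeros((3, 3))
chain[0, 1], chain[1, 2] = 2.0, 3.0

# Same chain plus a direct edge 0 -> 2 that is dominated by the path (1 < 6).
dominated = chain.copy()
dominated[0, 2] = 1.0

# Same chain plus a direct edge 0 -> 2 that beats the path (7 > 6).
critical = chain.copy()
critical[0, 2] = 7.0

# A dominated edge leaves X = C* (.) Z unchanged; a critical edge does not.
same_law = np.array_equal(kleene_star(chain), kleene_star(dominated))
changed_law = not np.array_equal(kleene_star(chain), kleene_star(critical))
```

Here `same_law` and `changed_law` are both `True`: only the edge that realizes a critical path alters the model.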

3. Parameter Estimation and Identifiability

Classical maximum likelihood estimation is generally not available for MLBNs: different parameter values can yield mutually singular distributions, so the model is not dominated by a common measure and no standard likelihood exists (Klüppelberg et al., 2019, Ferry, 8 Nov 2025). Instead, estimation exploits structural properties:

Minimum-ratio estimator: For each edge $j \to i$ with weight $c_{ji}$, $\hat{c}_{ji} = \min_{t} X_i^{(t)} / X_j^{(t)}$, the minimal nonzero ratio across observations $t = 1, \ldots, n$ (Klüppelberg et al., 2019, Ferry, 8 Nov 2025).
Scalings via spectral measure: Nonparametric estimation of the scaling matrix $\Sigma$ uses empirical measures on "tail angles" $\omega = X/\|X\|$ for extremal samples, estimating $\sigma_{ij}^2 = \int \omega_i \omega_j dH_X(\omega)$ (Klüppelberg et al., 2019). Parameters are then recursively recovered by solutions to linear systems involving these scalings.
Tropical/geometric methods: After a log-transform, the data lie in a tropical polytope (or polytrope). The minimum bounding polytrope algorithm computes the smallest tropical region enclosing the data, whose facets correspond to recoverable edge weights. Identifiability is determined by the weighted transitive reduction: only edges realizing unique critical paths can be estimated (Améndola et al., 15 Nov 2024, Ferry, 8 Nov 2025).
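The minimum-ratio idea can be tried on simulated data. The following is a hedged sketch rather than the estimator as implemented in the cited works: it assumes NumPy, noise-free observations, standard Fréchet innovations, nodes in topological order, and a hypothetical three-node DAG; `simulate_recursive` and `min_ratio` are illustrative names.

```python
import numpy as np

def simulate_recursive(C, n, seed=0):
    """Sample X_v = max(max_u C[u, v] * X_u, Z_v) with standard Frechet Z."""
    rng = np.random.default_rng(seed)
    d = C.shape[0]
    Z = 1.0 / rng.exponential(size=(n, d))  # 1 / Exp(1) is standard Frechet
    X = Z.copy()
    for v in range(d):          # nodes are assumed topologically ordered
        for u in range(v):
            if C[u, v] > 0:
                X[:, v] = np.maximum(X[:, v], C[u, v] * X[:, u])
    return X

def min_ratio(X):
    """R[j, i] = min over observations t of X_i^(t) / X_j^(t)."""
    return (X[:, None, :] / X[:, :, None]).min(axis=0)

# Hypothetical DAG: 0 -> 1 (weight 2), 1 -> 2 (weight 3), 0 -> 2 (weight 1).
# The direct edge 0 -> 2 is dominated by the path 0 -> 1 -> 2 of weight 6.
C = np.zeros((3, 3))
C[0, 1], C[1, 2], C[0, 2] = 2.0, 3.0, 1.0

X = simulate_recursive(C, n=5000)
R = min_ratio(X)
```

With this setup, `R[0, 1]` and `R[1, 2]` should match the edge weights 2 and 3 up to floating-point error, while `R[0, 2]` returns the maximal path weight 6 rather than the dominated edge weight 1. This is consistent with the statement that only edges realizing critical paths can be estimated.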

Identifiability is precisely characterized: the minimal DAG supporting the observed distribution is uniquely determined by the atomic structure of certain ratio statistics and the underlying polytrope (see section 4 for algorithms).
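To make the weighted transitive reduction concrete, the sketch below keeps an edge only when its weight strictly exceeds every alternative path between the same endpoints. It is an illustrative implementation under stated assumptions (NumPy, `C[u, v]` as the weight of edge $u \to v$, a hypothetical four-node DAG, illustrative function names), not the algorithm of any cited paper.

```python
import numpy as np

def kleene_star(C):
    """Max-times Kleene star: B[j, i] = best path-weight product from j to i."""
    d = C.shape[0]
    B = np.eye(d)
    P = np.eye(d)
    for _ in range(d - 1):
        P = (P[:, :, None] * C[None, :, :]).max(axis=1)
        B = np.maximum(B, P)
    return B

def weighted_transitive_reduction(C):
    """Boolean mask of edges j -> i that are the unique maximal-weight path j -> i."""
    d = C.shape[0]
    B = kleene_star(C)
    keep = np.zeros((d, d), dtype=bool)
    for j, i in zip(*np.nonzero(C)):
        # Best weight among paths from j to i that pass through another node k.
        detour = max(
            (B[j, k] * B[k, i] for k in range(d) if k != j and k != i),
            default=0.0,
        )
        keep[j, i] = C[j, i] > detour
    return keep

# Hypothetical DAG: the edge 0 -> 3 (0.4) is beaten by 0 -> 2 -> 3 (2.0 * 0.3).
C = np.zeros((4, 4))
C[0, 1], C[0, 2], C[0, 3] = 0.5, 2.0, 0.4
C[1, 3], C[2, 3] = 1.0, 0.3

keep = weighted_transitive_reduction(C)
kept_edges = [(int(j), int(i)) for j, i in zip(*np.nonzero(keep))]
```

In this example the reduction retains the edges (0, 1), (0, 2), (1, 3) and (2, 3) and drops (0, 3), whose weight does not affect the distribution of $X$.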

4. Structure Learning and Graph Recovery

Structure learning in max-linear Bayesian networks requires adaptation from classical methods. Generic constraint-based algorithms (e.g., PC algorithm) are inconsistent under d-separation for MLBNs due to non-faithfulness (Améndola et al., 19 Aug 2025). However, given oracle access to *-separation or $C^\ast$-separation, perfect recovery is possible:

PC* algorithm: Utilizes $C^\ast$-separation to reconstruct the weighted transitive reduction (critical edges) and orients colliders and induced cycles via separator sets. Complexity is polynomial in the number of vertices and maximal indegree (Améndola et al., 19 Aug 2025).
Set-cover and sample complexity: Full recovery of all edges (for known node ordering) is tied to sampling extreme datapoints that attain the critical ratios corresponding to each edge. The minimum number of samples required for best-case recovery equals the set-cover number of the associated tropical polytope's subdivision (Ferry, 8 Nov 2025).

Noncritical edges are in general not recoverable. Identifiability is robust to missing nodes only under specific path conditions: on trees of transitive tournaments, all parameters can be identified if all unobserved nodes are roots in at least one tournament and have at least two children (Segers et al., 2022).

5. Extensions: Noise, Hidden Variables, and Discrete Models

MLBNs admit substantial generalization:

Noise: Introducing multiplicative noise (e.g., log-normal, which becomes additive Gaussian noise in log-coordinates) preserves the tropical structure in log-coordinates and enables inference via Gaussian mixture EM or quadratic programming approaches. Edge parameter estimators are asymptotically normal if sufficient "edge-activating" samples are available (Adams et al., 1 May 2025).
Hidden nodes: The minimal representation for observed marginals is characterized by specific causal path properties; recovery is possible when the max-weighted paths can be observed without hidden confounders. Two-step thresholding methods enable asymptotically normal testing for extremal dependence even in the presence of latent variables (Krali et al., 2023).
Discrete max-linear networks: The discrete innovation case aligns with conjunctive Bayesian networks when all nodes are binary, and constitutes a family of toric varieties under Möbius inversion of joint distributions. Maximum-likelihood estimation reduces to toric optimization problems (Hollering et al., 2021).
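For the noisy extension listed above, the link between multiplicative noise and the tropical structure in log-coordinates can be checked directly. This is a minimal sketch under assumptions that are not drawn from the cited paper: NumPy, a hypothetical matrix `B` of maximal path weights, standard Fréchet innovations, and log-normal noise applied only to the observed values with an arbitrary scale `sigma`.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical maximal path weights: B[j, i] for paths j -> i, 0 meaning no path.
B = np.array([[1.0, 2.0, 6.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])
Z = 1.0 / rng.exponential(size=(1000, 3))  # standard Frechet innovations

# Max-times form: X_i = max_j B[j, i] * Z_j.
X = (Z[:, :, None] * B[None, :, :]).max(axis=1)

# Max-plus form in log-coordinates: log X_i = max_j (log B[j, i] + log Z_j).
with np.errstate(divide="ignore"):
    logB = np.log(B)                       # log 0 = -inf encodes "no path"
Y = (np.log(Z)[:, :, None] + logB[None, :, :]).max(axis=1)

# Multiplicative log-normal noise on X is additive Gaussian noise on log X.
sigma = 0.1                                # assumed noise scale
eps = rng.normal(scale=sigma, size=X.shape)
X_noisy = X * np.exp(eps)
Y_noisy = Y + eps
```

Both representations agree: `np.log(X)` equals `Y`, and `np.log(X_noisy)` equals `Y_noisy`, up to floating-point error.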

6. Applications and Illustrative Examples

Max-linear Bayesian networks have been applied across domains where extremal dependence is paramount:

Finance: Recovery of tail-risk architectures in industry portfolios; uniquely identifies root-risk propagators (e.g. Chemicals in US sector returns) and their influence paths (Klüppelberg et al., 2019).
Nutrition epidemiology: Directional DAGs among nutrients recovered from NHANES dietary intake data; e.g., geometric estimation recovers α-carotene $\to$ lutein+zeaxanthin $\to$ vitamin A (Klüppelberg et al., 2019, Ferry, 8 Nov 2025).
River-flood networks: Estimation of critical paths and polytropes for hydrological systems, robust to heavy-tailed data (Ferry, 8 Nov 2025).
Simulations: High recovery rates of structure and weights ($>95\%$ for sample sizes $n \geq 5{,}000$), but success is contingent on sufficient representation of each edge/path in the sample (Klüppelberg et al., 2019, Adams et al., 1 May 2025).

7. Connections to Tropical Geometry, Graphoids, and Future Research

MLBNs are closely linked with tropical geometry via the embedding of solution supports in polytropes. The conditional independence models ("maxoids") stratify the weight space into polyhedral fans, and form a strict superset of classical Markov graphoids. Every maxoid is attainable via transitive closures and reductions of the DAG (Boege et al., 29 Apr 2025, Améndola et al., 15 Nov 2024).

Questions remain regarding the algebraic structure of the tail dependence matrix, completeness of tropical rank constraints for CI, and extensions to time-series, max-stable models, and non-independent innovations. Algorithmic advances are targeting efficient constraint-based learning leveraging tropical convexity, as well as improved sample-efficient methods for extremal dependence recovery.

Max-linear Bayesian networks provide a mathematically rich and practically relevant framework for extremal causal modeling. They extend classical Bayesian networks to the setting of heavy-tailed dependence, come with a complete separation and estimation theory grounded in tropical geometry, and motivate new combinatorial, statistical, and geometric methods for data-driven graph recovery. Their applicability to domains where extremes dominate, especially finance, hydrology, and epidemiology, is a focal point of current research.