Hierarchical Minimum Variance Portfolios
- Hierarchical Minimum Variance Portfolios (HMVPs) are portfolio construction methods that use hierarchical clustering and covariance cleaning to enhance statistical robustness and scalability.
- They integrate bootstrapped clustering, recursive Schur complement decomposition, and two-step denoising to interpolate between heuristic diversification and global minimum variance solutions.
- Empirical studies show HMVPs reduce volatility, improve risk diversification, and gain computational efficiency from adaptive allocation and parallelizable algorithmic schemes.
Hierarchical Minimum Variance Portfolios (HMVPs) are a family of portfolio construction methodologies that leverage hierarchical structures—drawn from hierarchical clustering, graph theory, and recursive matrix decompositions—to improve both the statistical robustness and the computational scalability of minimum variance portfolio optimization. This paradigm encompasses several key strands: covariance cleaning via bootstrapped hierarchical clustering, recursive allocation via hierarchical graph structures and Schur complements, and estimator-agnostic two-step approaches that embed hierarchical denoising into high-dimensional covariance estimation. These advances permit practitioners to interpolate between fully robust, highly diversified heuristics and exact global minimum variance portfolios (GMVPs), with efficient algorithms and direct connections to the classical Markowitz framework.
1. Hierarchical Covariance Modeling and Clustering
Hierarchical Minimum Variance methodologies begin by imposing hierarchical structure on the asset covariance or correlation matrix. Several approaches codify this idea:
- Bootstrapped Average Hierarchical Clustering (BAHC): This method applies average-linkage agglomerative clustering to a distance matrix derived from the sample correlation matrix, so that highly correlated assets are close together. At each clustering level, the pair of clusters with minimum average distance is merged, yielding a dendrogram over the assets. The HCAL filter (average-linkage cluster-averaged) replaces each off-diagonal correlation with the average correlation of the dendrogram node in which the two assets first co-occur, ensuring positive semidefiniteness in the presence of a strong global mode (Bongiorno et al., 2020).
- Hierarchical Graphs and Block Structures: In advanced formulations, the covariance matrix is modeled as a weighted graph whose nodes represent assets and whose edge weights reflect covariances (Mograby, 16 Mar 2025). The node set is recursively partitioned into junction nodes (connecting clusters) and interior nodes (contained within a base cluster), organizing the covariance matrix into a sequence of block matrices suitable for recursive decomposition.
- Hierarchical-Nested Covariance Models: In synthetic and empirical studies, population covariance matrices are often constructed to possess explicit hierarchical structures, such as a loading matrix with nested banded form, resulting in eigenvalue spectra and dendrograms with deeply nested splits (García-Medina, 2024).
These hierarchical representations induce statistical regularization and localization, which concentrate the principal directions of risk and mitigate estimation error in high-dimensional settings.
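As an illustration of the cluster-averaging step, the sketch below assumes the common correlation distance $d_{ij} = 1 - \rho_{ij}$ (the published filter may use a different metric). With average linkage, one minus the cophenetic distance between two assets equals the average correlation across the two clusters merged at the node where the assets first co-occur, so the filtered matrix can be read directly off the dendrogram:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

def hcal_correlation(returns):
    """Cluster-averaged correlation filter (HCAL-style sketch).

    Assumes d_ij = 1 - rho_ij. With average linkage, the cophenetic
    distance between i and j is the mean distance between the two
    clusters merged at their first common node, so 1 - cophenetic
    distance is exactly the cluster-averaged correlation.
    """
    corr = np.corrcoef(returns, rowvar=False)
    dist = 1.0 - corr
    np.fill_diagonal(dist, 0.0)               # remove floating-point noise
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method="average")
    filtered = 1.0 - squareform(cophenet(tree))
    np.fill_diagonal(filtered, 1.0)
    return filtered
```

In practice this deterministic filter is the inner step that BAHC then averages over bootstrap resamples.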
2. Covariance Cleaning via Hierarchical Filters and k-BAHC
Accurate estimation of the covariance matrix is central to minimum variance portfolio construction. Hierarchical Minimum Variance approaches employ two primary denoising paradigms:
- BAHC and k-Fold Boosted BAHC: The BAHC estimator softens the deterministic output of a hierarchical filter by averaging over bootstrap resamples of the returns matrix, each subjected to the HCAL process. The k-fold boosted extension, k-BAHC, iteratively applies hierarchical filtering to the residual correlation structure at each level: writing $F$ for the HCAL filter, the first-order estimate is $C^{(1)} = F(C)$. For general $k$, this recursion is
$$C^{(k)} = C^{(k-1)} + F\!\left(C - C^{(k-1)}\right).$$
The final cleaned estimator is $C^{(k)}$ averaged across bootstraps. Small negative eigenvalues (which can arise for $k > 1$) are truncated to preserve PSDness (Bongiorno et al., 2020).
- Two-Step Hierarchical Denoising: A two-step procedure first applies high-dimensional shrinkage (Ledoit–Péché, Stein, or deterministic equivalents) to the sample covariance, then submits the denoised matrix to average-linkage clustering (ALCA). Reconstructed block-diagonalized covariances are used in subsequent optimizations. This approach further integrates random matrix theory–based denoising with hierarchical clustering, leading to improved out-of-sample and diversification statistics in both synthetic and S&P 500 settings (García-Medina, 2024).
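The bootstrap-averaging and eigenvalue-truncation steps common to these paradigms can be sketched as follows; the `filt` argument is a placeholder for the HCAL filter (any map from a correlation matrix to a filtered correlation matrix), and the row-resampling scheme and `n_boot` default are assumptions consistent with the description above:

```python
import numpy as np

def clip_negative_eigenvalues(C, floor=0.0):
    """Project a symmetric matrix onto the PSD cone by truncating
    negative eigenvalues (the PSD repair used for k > 1)."""
    vals, vecs = np.linalg.eigh(C)
    vals = np.maximum(vals, floor)
    return (vecs * vals) @ vecs.T

def bootstrap_cleaned_correlation(returns, filt, n_boot=100, seed=0):
    """Average a correlation filter over bootstrap resamples of the
    observation rows, then truncate small negative eigenvalues."""
    rng = np.random.default_rng(seed)
    T, N = returns.shape
    acc = np.zeros((N, N))
    for _ in range(n_boot):
        idx = rng.integers(0, T, size=T)   # resample rows with replacement
        acc += filt(np.corrcoef(returns[idx], rowvar=False))
    return clip_negative_eigenvalues(acc / n_boot)
```

Passing the identity as `filt` reduces this to a plain bootstrap-averaged correlation; plugging in a hierarchical filter reproduces the BAHC structure.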
3. Hierarchical Schur Complement and Recursive Weight Computation
Portfolio optimization is often recast as a recursive scheme exploiting the block structure of hierarchical covariance matrices:
- Schur Complement Decomposition: At any hierarchical level, the covariance matrix is organized into junction ($J$) and interior ($I$) blocks:
$$\Sigma = \begin{pmatrix} \Sigma_{JJ} & \Sigma_{JI} \\ \Sigma_{IJ} & \Sigma_{II} \end{pmatrix}.$$
The unconstrained minimum variance weights satisfy $\Sigma w = \lambda \mathbf{1}$, i.e., $w = \Sigma^{-1}\mathbf{1}/(\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1})$. Applying the Schur complement $S = \Sigma_{JJ} - \Sigma_{JI}\Sigma_{II}^{-1}\Sigma_{IJ}$, weights for junction nodes and interior nodes can be computed recursively:
$$w_J = \lambda\, S^{-1}\!\left(\mathbf{1}_J - \Sigma_{JI}\Sigma_{II}^{-1}\mathbf{1}_I\right), \qquad w_I = \Sigma_{II}^{-1}\!\left(\lambda\,\mathbf{1}_I - \Sigma_{IJ}\, w_J\right),$$
with $\lambda$ fixed by the budget constraint $\mathbf{1}^{\top} w = 1$.
Repetition of this process down to the base clusters permits computation of $w$ via repeated inversion of small submatrices, greatly reducing total computational cost (Mograby, 16 Mar 2025).
- Schur-Augmented Hierarchical Risk Allocation: More generally, the allocation at each recursion can be augmented with a parameter $\gamma \in [0, 1]$ that interpolates between pure hierarchical risk parity (HRP, $\gamma = 0$: blocks are recursively decoupled) and classical minimum variance ($\gamma = 1$: full use of off-block covariance). Theoretical results show that for full-rank $\Sigma$, the Schur-augmented recursion (HMV) with $\gamma = 1$ exactly recovers Markowitz GMVP allocations. As $\gamma$ is reduced, the solutions trade optimality for out-of-sample stability (Cotton, 2024).
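A minimal one-level sketch of the Schur recursion follows, with a damping parameter `gamma` that simply scales the off-block covariances. This is a simplified stand-in for the published Schur-augmented scheme (which treats the interpolation more carefully); `gamma=1` reproduces the direct GMVP exactly, while `gamma=0` decouples the blocks:

```python
import numpy as np

def gmvp_direct(cov):
    """Global minimum variance weights w proportional to Sigma^{-1} 1."""
    x = np.linalg.solve(cov, np.ones(len(cov)))
    return x / x.sum()

def gmvp_schur(cov, m, gamma=1.0):
    """One level of the Schur recursion: the first m assets are treated
    as junction nodes, the rest as interior nodes."""
    A = cov[:m, :m]
    B = gamma * cov[:m, m:]                  # damped off-block covariance
    D = cov[m:, m:]
    ones_J, ones_I = np.ones(m), np.ones(len(cov) - m)
    D_inv_Bt = np.linalg.solve(D, B.T)       # D^{-1} Sigma_IJ
    D_inv_1 = np.linalg.solve(D, ones_I)     # D^{-1} 1_I
    schur = A - B @ D_inv_Bt                 # Schur complement of D
    w_J = np.linalg.solve(schur, ones_J - B @ D_inv_1)
    w_I = D_inv_1 - D_inv_Bt @ w_J           # = D^{-1}(1_I - Sigma_IJ w_J)
    w = np.concatenate([w_J, w_I])
    return w / w.sum()
```

Only the $m \times m$ Schur complement and the interior block are ever inverted, which is the source of the computational savings when the recursion is applied level by level.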
4. Reactive, Empirical, and Algorithmic Implementations
HMVP methods support a wide range of practical workflows, including reactive updating and scalable algorithms:
- Reactive Estimation: In k-BAHC, covariance estimation is performed on a short rolling window of returns, often no longer than the number of assets, recalibrated at each rebalancing date (e.g., every 21 trading days). This reactivity allows rapid adjustment to shifting market conditions and asset compositions without reliance on longer-memory conditional volatility models. No additional smoothing or shrinkage is required beyond hierarchical and bootstrap regularization (Bongiorno et al., 2020).
- Efficient Algorithmic Schemes: The recursive application of Schur complements in hierarchical graphs, together with the divide-and-conquer allocation, enables computation of GMVP weights while only inverting submatrices of the size of the base clusters, which are typically far smaller than the full universe. Complexity thus falls from the $O(N^3)$ cost of inverting the full covariance matrix to a cost governed by the number of blocks, their sizes, and the depth of the hierarchy. Large synthetic examples (e.g., Sierpiński graphs at level 2) confirm that the recursive scheme exactly matches direct inversion while reducing computational cost (Mograby, 16 Mar 2025).
- Pseudocode Pipelines: Both covariance denoising and recursive allocation steps are implementable by compact, highly parallel algorithms. A canonical optimization pipeline consists of data windowing, covariance estimation (including shrinkage and hierarchical filtering), constrained/unconstrained quadratic program solution, and evaluation of out-of-sample performance metrics (variance, Sharpe, leverage, turnover, diversification) (García-Medina, 2024, Cotton, 2024, Mograby, 16 Mar 2025).
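A minimal rolling-window pipeline in this spirit might look as follows; the window length, rebalancing interval, and default sample-covariance estimator are placeholders for whichever cleaning scheme is plugged in:

```python
import numpy as np

def backtest_min_variance(returns, window=252, rebalance=21, estimator=None):
    """Rolling-window pipeline sketch: window the data, estimate the
    covariance, solve the unconstrained GMVP, and collect out-of-sample
    portfolio returns. `estimator` stands in for any cleaning scheme
    (sample, shrinkage, hierarchical); defaults to the sample covariance."""
    if estimator is None:
        estimator = lambda R: np.cov(R, rowvar=False)
    T, N = returns.shape
    oos = []
    w = np.ones(N) / N                         # start equal-weighted
    for t in range(window, T):
        if (t - window) % rebalance == 0:      # recalibrate periodically
            cov = estimator(returns[t - window:t])
            x = np.linalg.solve(cov, np.ones(N))
            w = x / x.sum()
        oos.append(returns[t] @ w)
    return np.array(oos)
```

Out-of-sample variance, Sharpe, leverage, and turnover statistics can then be computed on the returned series, mirroring the evaluation metrics listed above.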
5. Empirical Performance and Sensitivity Analysis
Hierarchical Minimum Variance approaches exhibit distinctive risk and allocation properties:
- Volatility and Sharpe Improvement: On US equity data (1999–2020), k-BAHC portfolios achieved realized volatilities up to 20% lower and Sharpe ratios up to 18% higher than cross-validated nonlinear shrinkage (CV) and sample benchmarks. The optimal hierarchical recursion order $k$ grows roughly linearly with the calibration window length, e.g., for long–short portfolios (Bongiorno et al., 2020).
- Risk Diversification and Concentration: Two-step hierarchical denoising systematically reduces concentration (Herfindahl index) and leverage in high-dimensional synthetic and S&P 500 data, lowering both the HHI and the leverage for large stock universes while maintaining lower realized variance than uniformly weighted portfolios (García-Medina, 2024).
- Stability and Symmetry Restoration: Schur-augmented HMV allocations smooth allocation asymmetries that afflict HRP under block-permutation, recovering full-MV symmetry as $\gamma \to 1$, with convergence in out-of-sample variance and improved stability across multiple random draws of the covariance matrix (Cotton, 2024).
- Trade-offs: Increased recursive depth ($k$ in k-BAHC) or cross-block coupling ($\gamma$ in HMV) typically lowers portfolio risk but raises turnover and concentration. Long-only and long–short modes both benefit, but gains are most pronounced for long–short portfolios and in periods of volatile, changing market conditions (Bongiorno et al., 2020, Cotton, 2024).
6. Connections to Theoretical Underpinnings and Related Methods
The hierarchical minimum variance framework unifies and generalizes earlier approaches:
- Markowitz and Minimum Variance: Classical mean–variance optimization is recovered as a limiting case where all cross-block covariances are harnessed and the hierarchy spans the entire universe.
- Hierarchical Risk Parity (HRP): The divide-and-conquer allocation procedure and tree-structured seriation are inherited from HRP. However, standard HRP discards cross-group correlations at recursion splits. Schur-augmented HMVP frameworks systematically re-integrate these cross–block effects via the Schur complement, allowing continuous interpolation (Cotton, 2024, Mograby, 16 Mar 2025).
- Random Matrix Theory and Denoising: Two-step hierarchical estimators and k-BAHC are compatible with shrinkage and nonparametric denoising techniques based on random matrix theory, addressing the curse of dimensionality when the number of assets is comparable to or exceeds the number of observations (García-Medina, 2024).
- Computational Scaling: By algorithmically resolving only small matrix inversions, HMVPs enable practical GMVP construction in large universes and support real-time recalibration—empowering applications in high-frequency and large-scale asset management (Mograby, 16 Mar 2025).
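As one concrete example of the random-matrix denoising ingredient, eigenvalue clipping at the Marchenko–Pastur edge can be sketched as follows (a standard technique, shown here as a generic illustration rather than the specific estimator of any one paper cited above):

```python
import numpy as np

def mp_clip_correlation(returns):
    """Clip the noise bulk of a sample correlation matrix.

    Eigenvalues below the Marchenko-Pastur edge
    lambda_+ = (1 + sqrt(N/T))**2 are treated as noise and replaced by
    their average (preserving the trace), then the matrix is rescaled
    back to unit diagonal."""
    T, N = returns.shape
    corr = np.corrcoef(returns, rowvar=False)
    lam_plus = (1.0 + np.sqrt(N / T)) ** 2
    vals, vecs = np.linalg.eigh(corr)
    noise = vals < lam_plus
    if noise.any():
        vals[noise] = vals[noise].mean()   # flatten the noise bulk
    cleaned = (vecs * vals) @ vecs.T
    d = np.sqrt(np.diag(cleaned))
    return cleaned / np.outer(d, d)        # renormalize to unit diagonal
```

In the two-step procedure, a matrix denoised along these lines (or via Ledoit–Péché-type shrinkage) would then be passed to the hierarchical clustering stage.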
7. Practical Recommendations and Implementation Considerations
Deployment of HMVPs entails several considerations:
- Parameter Selection: The recursive depth ($k$-fold order in k-BAHC) and the block-size threshold or interpolation parameter $\gamma$ in Schur-augmented schemes must be tuned to the calibration window and asset universe size. Rule-of-thumb fits and empirical cross-validation guide this process (Bongiorno et al., 2020).
- Transaction Costs and Turnover: Hierarchical cleaning and deeper hierarchies reduce volatility at the cost of higher turnover. Empirical results indicate transaction cost–adjusted outperformance remains robust in realistic backtests (2 bps round-trip assumed) (Bongiorno et al., 2020).
- Software and Parallelization: The computation is readily parallelizable across bootstrap samples and block inversions. Open-source implementations such as the "bahc" Python package (version 1.4) are available (Bongiorno et al., 2020).
- Estimation Stability: Covariance estimates should be regularized (via shrinkage or eigenvalue truncation) to maintain positive-definiteness at all hierarchy levels. Adaptive schemes may modulate recursion depth or off-diagonal dampening to accommodate noisy or nearly singular sub-blocks (Cotton, 2024, Mograby, 16 Mar 2025).
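Turnover and the resulting cost drag can be monitored with a few lines; the helper below assumes the 2 bps round-trip cost quoted above and, as a simplification, ignores intra-period weight drift:

```python
import numpy as np

def turnover_and_cost(weights, cost_bps=2.0):
    """One-way turnover between successive rebalances and the implied
    transaction cost. `cost_bps` is the round-trip cost in basis points
    (2 bps mirrors the backtest assumption in the text); each unit of
    traded notional therefore pays half the round-trip rate."""
    W = np.asarray(weights)
    turnover = np.abs(np.diff(W, axis=0)).sum(axis=1)       # L1 weight change
    cost_per_rebalance = 0.5 * (cost_bps / 1e4) * turnover  # one-way cost
    return turnover, cost_per_rebalance
```

Comparing cost-adjusted performance across recursion depths makes the volatility-versus-turnover trade-off discussed above directly measurable.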
Hierarchical Minimum Variance Portfolios, through their integration of hierarchical clustering, advanced covariance cleaning, recursive Schur-complement allocation, and connections to both robust heuristics and classical optimization, establish a transparent, scalable, and empirically validated approach to portfolio risk minimization in modern high-dimensional and nonstationary financial environments (Bongiorno et al., 2020, Mograby, 16 Mar 2025, García-Medina, 2024, Cotton, 2024).