Mergeable Trees in Computational Topology
- Mergeable trees are specialized rooted tree structures that represent hierarchical merging in sublevel set filtrations by encoding critical points and merge events.
- They support efficient algorithmic operations such as nearest common ancestor, insertions, and dynamic merges, often achieving O(log n) time complexity under various paradigms.
- Mergeable trees enable advanced statistical analysis by providing a geometric structure for computing barycenters, medians, and geodesics via the interleaving distance metric.
A mergeable tree is a specialized rooted tree structure that encodes the hierarchical merging of connected components in sublevel set filtrations of real-valued functions, particularly within the context of computational topology and topological data analysis (TDA). Mergeable trees synthesize combinatorial, algorithmic, and metric perspectives, supporting both dynamic data-structural operations and topological statistics grounded in the interleaving distance. Their mathematical formalism, algorithmic construction, connection to discrete Morse theory, and statistical implications have been established in foundational works, notably by Georgiadis et al. (data structures) (0711.1682), Munch–Stefanou (metric/geodesics/statistics) (Gasparovic et al., 2019), Scoville–Wen (realizability with homological sequences) (Scoville et al., 2024), and subsequent refinements on computation and averaging (Touli et al., 28 Feb 2026).
1. Formalism: Merge Trees and Mergeable Trees
Let $f: X \to \mathbb{R}$ be a continuous scalar function on a connected topological space $X$. For each $a \in \mathbb{R}$, the sublevel sets $X_a = f^{-1}((-\infty, a])$ define a filtration in which new components appear at minima and pairs of components merge at saddle points. A merge tree $(T, h)$, where $h$ assigns a function value to each node of $T$, encodes the genealogy of these components:
- $T$ is a finite rooted tree.
- $h$ is strictly decreasing along every path from the root toward leaves.
- Nodes correspond to critical points; edges encode mergings.
- The root has the maximal value, consistent with either orientation convention (e.g., time-to-root).
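A mergeable forest, as studied in algorithmics, is a heap-ordered collection of merge trees with the invariant that, for every edge $(v, p(v))$ from a node $v$ to its parent $p(v)$, the label (i.e., key) at $v$ is no less than that at $p(v)$ (0711.1682). Under this convention the root carries the smallest label, which is the reverse of the orientation used in the list above.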
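To make the sublevel-set construction concrete, the following is a minimal sketch for the simplest possible setting: a function sampled on a line, swept from low to high values with a union-find structure. The function name `merge_tree_1d` and the event encoding are illustrative assumptions, not notation from the cited works.

```python
def merge_tree_1d(values):
    """Sweep a sampled 1D function upward and record merge-tree events.

    Returns (leaves, merges): leaves are indices of local minima (births);
    merges are (height, rep_a, rep_b) tuples, where each rep is the index of
    the lowest sample in one of the two components that join at that height.
    """
    n = len(values)
    parent = [None] * n   # union-find forest; None = not yet in the sublevel set
    lowest = {}           # component root -> index of its lowest sample

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    leaves, merges = [], []
    # Visit samples in order of increasing value (the sublevel-set sweep).
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        lowest[i] = i
        roots = [find(j) for j in (i - 1, i + 1)
                 if 0 <= j < n and parent[j] is not None]
        if not roots:
            leaves.append(i)              # a new component is born at a minimum
        elif len(roots) == 2:
            # Two existing components meet at sample i: a merge (saddle) event.
            merges.append((values[i], lowest[roots[0]], lowest[roots[1]]))
        for r in roots:
            ri = find(i)
            lowest[r] = min(lowest[r], lowest[ri], key=lambda k: values[k])
            parent[ri] = r
    return leaves, merges


leaves, merges = merge_tree_1d([3, 1, 4, 0, 5, 2])
print(leaves)   # indices of the local minima
print(merges)   # merge events in order of increasing height
```

The last merge event plays the role of the root in the convention where function values decrease from the root toward the leaves.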
2. Data Structures and Algorithms for Mergeable Trees
The mergeable tree data structure addresses the efficient maintenance of dynamic forests under a rich suite of operations and, critically, supports a merge operation that fuses two upward paths into a single heap-ordered path, possibly altering many parent pointers at once. Supported operations (as defined in (0711.1682)) include:
- insert(v), parent(v), root(v)
- nca(v, w): nearest common ancestor query
- cut(v), delete(v)
- merge(v, w): path-merge as described above
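The semantics of these operations, and of the path merge in particular, can be illustrated with a deliberately naive sketch. It spends time proportional to the path lengths on every operation, so it does not reflect the efficient structures of (0711.1682); the class name `NaiveMergeableForest`, the `link` helper, and the assumption of distinct labels are illustrative choices made here.

```python
class NaiveMergeableForest:
    """Heap-ordered forest: label[v] >= label[parent[v]]; labels assumed distinct."""

    def __init__(self):
        self.parent = {}
        self.label = {}

    def insert(self, v, label):
        # Create a new single-node tree.
        self.parent[v] = None
        self.label[v] = label

    def link(self, v, p):
        # Hypothetical helper used only to build examples: make p the parent of v.
        if self.label[v] < self.label[p]:
            raise ValueError("link would violate heap order")
        self.parent[v] = p

    def root_path(self, v):
        # Nodes from v up to the root of its tree, inclusive.
        path = []
        while v is not None:
            path.append(v)
            v = self.parent[v]
        return path

    def root(self, v):
        return self.root_path(v)[-1]

    def nca(self, v, w):
        # Nearest common ancestor, or None if v and w lie in different trees.
        ancestors = set(self.root_path(v))
        for u in self.root_path(w):
            if u in ancestors:
                return u
        return None

    def merge(self, v, w):
        # Fuse the two root paths into one path that respects heap order.
        # Nodes off the two paths keep their parent pointers.
        nodes = set(self.root_path(v)) | set(self.root_path(w))
        order = sorted(nodes, key=lambda u: self.label[u], reverse=True)
        for child, par in zip(order, order[1:]):
            self.parent[child] = par
        self.parent[order[-1]] = None


forest = NaiveMergeableForest()
for name, lab in [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5), ("f", 6)]:
    forest.insert(name, lab)
forest.link("c", "a"); forest.link("e", "c"); forest.link("f", "c")
forest.link("d", "b")
forest.merge("e", "d")
print(forest.root_path("e"))
```

A single call to `merge` here rewrites several parent pointers at once, which is the behavior that makes the problem harder than standard dynamic trees.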
Three main algorithmic paradigms exist:
| Paradigm | Time per Merge | Restrictions |
|---|---|---|
| Dynamic trees + topmost | $O(\log^2 n)$ amortized | General (cuts allowed) |
| Partition by rank | $O(\log n)$ amortized | No cuts |
| Implicit representation | $O(\log n)$ | No cuts, no parent query |
The "dynamic trees + topmost" approach relies on augmenting classical dynamic trees with a topmost query and uses a potential-based amortized analysis. Partition-by-rank leverages solid/dashed path decompositions and finger-search trees, yielding $O(\log n)$ amortized time per merge when cuts are disallowed. The implicit method simulates mergeable trees in a link-cut tree or similar black-box framework and matches the $O(\log n)$ bound in all cases except those requiring parent queries.
Lower bounds follow from reductions from dynamic connectivity or sorting; for general merges with cuts, $\Omega(\log n)$ amortized time is required. In the special cases (no cuts, restricted queries), the upper bounds match these lower bounds (0711.1682).
3. Merge Trees as Topological Invariants and Discrete Morse Functions
Given a discrete Morse function $f$ on a finite tree $X$ (the combinatorial analog of a scalar field), the associated merge tree $M_f$ summarizes the sequence of topological changes in connected components:
- Leaves of $M_f$ correspond to births at critical vertices; interior nodes represent merges at critical edges.
- Each merge is assigned a chirality (left/right) determined by a merging rule, often governed by the minimum birth index ("elder rule").
- The merge tree is necessarily a full binary tree (every internal node has exactly two children).
Every merge tree can be realized as the output of some discrete Morse function on a path graph. Explicit constructive methods yield either an index-ordered or sublevel-connected Morse function $f$ such that $M_f \cong T$ for any binary tree $T$ (Brüggemann, 2021). Equivalence classes of Morse functions modulo symmetry or shuffle relations correspond precisely to merge trees with appropriate labelings (Ml-trees).
On general trees, the cm-equivalence defined in (Brüggemann, 2021) generalizes this correspondence: for every discrete Morse function (modulo component-merge equivalence), there exists a unique labeled merge tree.
4. Interleaving Distance and Metric Geometry of Merge Trees
The interleaving distance $d_I$ formalizes when two merge trees are "similar" in terms of their topological summary behavior. This metric, originally inspired by persistent homology, admits two equivalent formulations: via $\varepsilon$-interleaving maps (with control on ancestor relations and value shifts) and via $\varepsilon$-good maps (with shift, controlled fiber-diameter, and shallow-off-image conditions) (Gasparovic et al., 2019, Touli et al., 28 Feb 2026).
On the space of labeled merge trees with fixed label set, $d_I$ reduces to the $\ell^\infty$ distance on the corresponding "ultra-matrices", which are relaxed ultrametrics. In this $\ell^\infty$ model, the metric is strictly intrinsic: the linear interpolation of ultra-matrices yields geodesic paths and enables explicit barycenter and principal geodesic constructions. The unlabeled case requires combinatorial minimization over alignments but retains the intrinsic (geodesic) property (Gasparovic et al., 2019).
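As a small illustration of the labeled case, the sketch below builds a matrix whose $(i, j)$ entry is the height of the lowest common ancestor of the nodes carrying labels $i$ and $j$, and compares two such matrices entrywise. The dictionary-based tree encoding and the function names `induced_matrix` and `linf_distance` are assumptions made for this example; heights are taken to increase toward the root.

```python
def induced_matrix(parent, height, labels):
    """Matrix of LCA heights for a labeled merge tree.

    parent: dict node -> parent node (None for the root)
    height: dict node -> function value (increasing toward the root)
    labels: list of nodes; position i in the list is label i
    """
    def ancestors(v):
        chain = []
        while v is not None:
            chain.append(v)
            v = parent[v]
        return chain

    def lca(u, v):
        seen = set(ancestors(u))
        for a in ancestors(v):
            if a in seen:
                return a
        raise ValueError("nodes are in different trees")

    return [[height[lca(u, v)] for v in labels] for u in labels]


def linf_distance(A, B):
    """Largest absolute entrywise difference between two equal-sized matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


# Tree 1: x and y merge at s (height 3); s and z merge at the root r (height 5).
parent1 = {"x": "s", "y": "s", "s": "r", "z": "r", "r": None}
height1 = {"x": 0, "y": 1, "z": 2, "s": 3, "r": 5}
# Tree 2: y and z merge at s (height 4); s and x merge at the root r (height 5).
parent2 = {"y": "s", "z": "s", "s": "r", "x": "r", "r": None}
height2 = {"x": 0, "y": 1, "z": 2.5, "s": 4, "r": 5}

M1 = induced_matrix(parent1, height1, ["x", "y", "z"])
M2 = induced_matrix(parent2, height2, ["x", "y", "z"])
print(linf_distance(M1, M2))
```

The diagonal entries record the heights of the labeled nodes themselves, so the matrix captures both where labels sit and where their branches join.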
5. Statistical Analysis and Averaging in the Space of Merge Trees
The intrinsicness of $d_I$ equips the space of merge trees with a geometric structure supporting statistical summaries such as means, medians, and Fréchet barycenters (Gasparovic et al., 2019). Given two merge trees $T_1$, $T_2$ with $d_I(T_1, T_2) = \varepsilon$, a midpoint (average) merge tree $T_{1/2}$ is any merge tree $T$ such that $d_I(T_1, T) = \varepsilon/2$ and $d_I(T, T_2) = \varepsilon/2$ (Touli et al., 28 Feb 2026).
A constructive algorithm for a midpoint tree proceeds in four phases, where $\varepsilon$ denotes the interleaving distance between the two input trees:
- Augment the trees with Steiner nodes so that pairs of critical heights shared up to $\varepsilon$ are explicit; compute an $\varepsilon$-good correspondence.
- Shift the entire first tree upward by $\varepsilon/2$ in value.
- Graft outside-image subtrees from the second tree, shifting them downward by $\varepsilon/2$.
- Enforce $\varepsilon/2$-goodness at identified points via local lowest common ancestor adjustments.
This ensures the midpoint satisfies the interleaving conditions for all pairs of associated points. The complexity is dominated by the computation of the $\varepsilon$-good map (Touli et al., 28 Feb 2026).
For larger datasets, Gasparovic et al. (2019) describe an algorithm for the 1-center in labeled merge trees: compute the entrywise $\ell^\infty$ midpoint of the ultra-matrices, project back to a valid ultra-matrix, and construct the merge tree via minimum-spanning-tree sublevel-set algorithms.
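The first step of this procedure can be sketched as follows, under two assumptions that are not spelled out in the text above: the entrywise midpoint is read as the value halfway between the entrywise minimum and maximum, and a valid ultra-matrix is taken to be a symmetric matrix $M$ with $M_{ii} \le M_{ij}$ and $M_{ij} \le \max(M_{ik}, M_{kj})$. The projection and tree-reconstruction steps are omitted; the function names are illustrative.

```python
def entrywise_midpoint(matrices):
    """Halfway point between the entrywise minimum and maximum of the inputs."""
    n = len(matrices[0])
    return [[(max(M[i][j] for M in matrices) + min(M[i][j] for M in matrices)) / 2
             for j in range(n)]
            for i in range(n)]


def is_ultra_matrix(M):
    """Check the assumed validity conditions for a relaxed ultrametric matrix."""
    n = len(M)
    for i in range(n):
        for j in range(n):
            if M[i][j] != M[j][i] or M[i][i] > M[i][j]:
                return False
            for k in range(n):
                if M[i][j] > max(M[i][k], M[k][j]):
                    return False
    return True


# Two small valid matrices on three labels (toy data, not taken from the paper).
A = [[0, 3, 5], [3, 1, 5], [5, 5, 2]]
B = [[0, 5, 5], [5, 1, 4], [5, 4, 2.5]]

mid = entrywise_midpoint([A, B])
print(mid)
print(is_ultra_matrix(A), is_ultra_matrix(B), is_ultra_matrix(mid))
```

For this toy input the entrywise midpoint fails the validity check even though both inputs pass it, which illustrates why a projection back to a valid ultra-matrix is part of the procedure.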
These developments form the analytical backbone for statistics and clustering in merge-tree spaces, directly enabling applications in uncertainty quantification and hypothesis testing.
6. Realizability with Prescribed Homological Sequences
The merge tree $M_f$ and the $0$th Betti number sequence (homological sequence) $B_f$ induced by a discrete Morse function $f$ on a tree $X$ are distinct yet related descriptors of the filtration's topology (Scoville et al., 2024). For a full binary merge tree $T$ and any sequence $B$ dominating the canonical homological sequence $B_T$ associated to $T$, there exists a construction yielding a discrete Morse function $f$ on a tree realizing both invariants: $M_f \cong T$ and $B_f = B$.
The explicit algorithm assigns a total order to $T$'s nodes according to $B$, attaches vertices/edges in the prescribed order, and ensures at each stage that the number of components matches $B$ by attaching accordingly. Construction is possible whenever $B$ dominates $B_T$ pointwise and matches it in length, but not every pair is realizable; incompatibilities arise if these conditions fail.
A plausible implication is that, for merge trees derived from generic Morse functions, the pair $(M_f, B_f)$ of merge tree and homological sequence summarizes all information needed for certain persistent homological inferences, modulo construction constraints (Scoville et al., 2024).
7. Applications and Broader Significance
Mergeable trees unify algorithmic, algebraic, and metric approaches to understanding hierarchical mergers in data. In computational topology, their structure underpins efficient dynamic connectivity, topological simplification, and fast statistical summaries. In topological data analysis, the existence of canonical averages, explicit geodesics, and the reconciliation of Morse-theoretic with persistent homological information position mergeable trees as central objects for scalable quantification, uncertainty modeling, and machine learning over hierarchical topological summaries (0711.1682, Gasparovic et al., 2019, Touli et al., 28 Feb 2026, Scoville et al., 2024, Brüggemann, 2021).