TreeSeg: Hierarchical Segmentation & Interval Queries

Updated 2 March 2026

TreeSeg names two tree-based methods: a hierarchical segmentation algorithm for noisy ASR transcripts and the dynamic BITS-Tree for segment storage.
The segmentation algorithm uses recursive divisive clustering over windowed embeddings to partition a transcript into semantically coherent segments, handling noise and an unknown number of segments.
The BITS-Tree supports point and range queries in logarithmic time plus output size, which suits dynamic interval querying over large datasets.

TreeSeg refers to two distinct concepts in the literature: (1) an algorithm for hierarchical topic segmentation of large transcripts, and (2) a dynamic data structure for efficient segment storage and interval queries (the BITS-Tree). Both rely on tree-based representations to organize, partition, or index sequential data. This article reviews both for completeness.

1. Hierarchical Topic Segmentation: TreeSeg Algorithm

TreeSeg is an approach for hierarchical, structure-preserving topic segmentation of long, noisy transcripts, notably those generated by Automatic Speech Recognition (ASR) systems. The core objective is to partition a temporally ordered sequence of utterances $U = [U_1, \ldots, U_T]$ into $K$ contiguous, semantically coherent segments $(P_1, \ldots, P_K)$, despite noise, uncertain ground-truth $K$, and large $T$ (Gklezakos et al., 2024).

The method addresses three principal challenges: persistent ASR noise, variable and ambiguous segment counts, and computational efficiency for large transcripts. TreeSeg outputs a binary tree segmentation, allowing flexible control of granularity by tree cutting at arbitrary depth.

2. Mathematical Formulation and Algorithm

Each utterance $U_t$ is embedded via a windowed approach using a pre-trained, frozen embedding model $f$ (e.g., ADA, SBERT, RoBERTa): the block $B_t = [U_{\max(1, t-W)}, \ldots, U_t]$ with window width $W$ (the utterance plus up to $W$ preceding utterances) is mapped to $e_t = f(B_t) \in \mathbb{R}^d$.
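The windowing step can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the space-joined block text, and the `toy_embed` stand-in for the frozen model $f$ are all assumptions made here.

```python
from typing import Callable, List, Sequence


def build_blocks(utterances: Sequence[str], window: int) -> List[str]:
    """Return one text block per utterance: the utterance plus up to
    `window` preceding utterances (0-based analogue of max(1, t - W))."""
    blocks = []
    for t in range(len(utterances)):
        start = max(0, t - window)
        # Assumption: a block is formed by joining utterances with spaces.
        blocks.append(" ".join(utterances[start : t + 1]))
    return blocks


def embed_blocks(
    utterances: Sequence[str],
    window: int,
    embed: Callable[[str], List[float]],
) -> List[List[float]]:
    """Apply a frozen embedding function to every block."""
    return [embed(block) for block in build_blocks(utterances, window)]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a real embedding model."""
    return [float(len(text))]


blocks = build_blocks(["a", "b", "c", "d"], window=2)
# blocks == ["a", "a b", "a b c", "b c d"]
```

In practice `embed` would wrap whichever pre-trained model is used; the rest of the pipeline only needs one vector per utterance.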

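TreeSeg applies divisive clustering recursively. For a segment $E_v = [e_s, \ldots, e_e]$ (where the subscripts $s$ and $e$ are the segment's start and end indices), it searches for the split index $i$ that minimizes the total within-cluster squared Euclidean distance to the two cluster means: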

$\mathcal{L}_v(i) = \sum_{t=s}^{i-1} \|e_t - \mu_L\|^2_2 + \sum_{t=i}^{e} \|e_t - \mu_R\|^2_2$

where

$\mu_L = \frac{1}{i - s} \sum_{t=s}^{i-1} e_t, \quad \mu_R = \frac{1}{e - i + 1} \sum_{t=i}^{e} e_t$
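The cumulative-sum precomputation listed in the complexity analysis makes each candidate split cheap to score, which follows from the standard sum-of-squares identity. The symbols $S$, $Q$, $n$, $a$, and $b$ below are introduced here for illustration and do not appear in the source. For any contiguous range $a, \ldots, b$ with mean $\mu = S/n$:

$\sum_{t=a}^{b} \|e_t - \mu\|_2^2 = Q - \frac{\|S\|_2^2}{n}, \quad S = \sum_{t=a}^{b} e_t, \quad Q = \sum_{t=a}^{b} \|e_t\|_2^2, \quad n = b - a + 1$

Applying this to the left range $s, \ldots, i-1$ and the right range $i, \ldots, e$ gives

$\mathcal{L}_v(i) = \left(Q_L - \frac{\|S_L\|_2^2}{i - s}\right) + \left(Q_R - \frac{\|S_R\|_2^2}{e - i + 1}\right)$

Since $S$ and $Q$ for any range are differences of two prefix sums, each candidate $i$ can be evaluated in $O(d)$ time once the prefix sums are available.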

A split is valid only if both sides contain at least $M$ utterances, where $M$ is the minimum segment size ($s+M\leq i \leq e-M+1$). The binary tree is constructed recursively: at each step, the “best” leaf to split and its best split point are selected (via a min-heap over candidate losses), and splitting proceeds until a user-specified number of leaves $K$ is reached or no remaining segment is large enough to split.

This recursive, batch-divisive procedure yields a hierarchical representation, allowing for efficient “zooming” into transcript structure at arbitrary resolutions.
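The procedure described above can be sketched in a few dozen lines. This is an illustrative reading of the description, not the reference implementation: all function names are hypothetical, indices are 0-based and half-open, and the heap is keyed on each candidate's post-split loss as the text states (keying on loss reduction would be another plausible choice).

```python
import heapq

import numpy as np


def segment_sse(csum, csq, s, e):
    """Sum of squared distances to the mean for rows s..e-1,
    computed from zero-padded prefix sums in O(d)."""
    vec = csum[e] - csum[s]
    return float(csq[e] - csq[s] - vec @ vec / (e - s))


def best_split(csum, csq, s, e, min_size):
    """Return (loss, i) for the best valid split of rows s..e-1, or None."""
    best = None
    for i in range(s + min_size, e - min_size + 1):
        loss = segment_sse(csum, csq, s, i) + segment_sse(csum, csq, i, e)
        if best is None or loss < best[0]:
            best = (loss, i)
    return best


def treeseg_sketch(embeddings, num_leaves, min_size=1):
    """Greedy divisive segmentation; returns (leaves, splits).
    leaves: sorted (start, end) pairs; splits: (start, split, end) in order."""
    E = np.asarray(embeddings, dtype=float)
    T, d = E.shape
    csum = np.vstack([np.zeros(d), np.cumsum(E, axis=0)])
    csq = np.concatenate([[0.0], np.cumsum((E ** 2).sum(axis=1))])

    leaves = {(0, T)}
    heap = []  # entries: (candidate loss, start, end, split index)

    def push(s, e):
        cand = best_split(csum, csq, s, e, min_size)
        if cand is not None:
            heapq.heappush(heap, (cand[0], s, e, cand[1]))

    push(0, T)
    splits = []
    while len(leaves) < num_leaves and heap:
        _, s, e, i = heapq.heappop(heap)
        leaves.remove((s, e))
        leaves.update({(s, i), (i, e)})
        splits.append((s, i, e))  # records the tree: parent range and cut
        push(s, i)
        push(i, e)
    return sorted(leaves), splits


# Three well-separated toy clusters of three points each.
toy = [[0, 0]] * 3 + [[10, 0]] * 3 + [[0, 10]] * 3
leaves, splits = treeseg_sketch(toy, num_leaves=3)
```

The `splits` list records the order in which nodes were divided, so a coarser segmentation can be recovered by replaying only its first few entries.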

3. Noise Robustness, Embedding Model Integration, and Computational Properties

By embedding blocks instead of isolated utterances and averaging in clustering, TreeSeg attenuates the effect of local ASR noise. The model $f$ is not fine-tuned; no additional dimensionality reduction or smoothing is applied except for windowing and averaging. This fosters portability and reproducibility.

The complexity is as follows, for $T$ utterances, $d$-dimensional embeddings, and $S$ final segments:

Embedding: $O(Td)$
Precomputation (cumulative sums): $O(Td)$
All splitting: $O(T\log S)$ (expected, from geometric shrinkage)
Heap maintenance: $O(S\log S)$

Overall, the cost is $O(Td + T\log S + S\log S)$, which is near-linear in $T$ for practical $S \ll T$.

Memory usage is $O(Td)$ for embeddings, $O(T)$ for accumulated statistics, and $O(S)$ for tree nodes and heap.

4. Empirical Evaluation and Results

TreeSeg was evaluated on three datasets:

ICSI: 75 meetings with up to 4-level segmentation, mean 1454 utterances
AMI: >100 hours, up to 3 levels, mean 636 utterances
TinyRec: 21 technical sessions, 2-level segmentation, mean 267 utterances

Comparisons included RandomSeg, EquiSeg, HyperSeg (TextTiling-style, hyperdimensional embeddings), and BertSeg (TextTiling-style, BERT blocks). Performance was assessed using $P_k$ and WindowDiff metrics (lower is better).
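For readers unfamiliar with the two metrics, the sketch below computes them on toy segmentations. It follows the usual textbook definitions and is not taken from the TreeSeg evaluation code: the function names, the per-position segment-id representation, and the default window (half the mean reference segment length, rounded) are assumptions made here.

```python
from typing import List, Optional


def labels_from_lengths(lengths: List[int]) -> List[int]:
    """Expand segment lengths, e.g. [2, 3], into per-position ids [0, 0, 1, 1, 1]."""
    labels: List[int] = []
    for seg_id, n in enumerate(lengths):
        labels.extend([seg_id] * n)
    return labels


def default_window(ref: List[int]) -> int:
    """Half the mean reference segment length, at least 1."""
    num_segments = ref[-1] + 1
    return max(1, round(len(ref) / (2 * num_segments)))


def pk(ref: List[int], hyp: List[int], k: Optional[int] = None) -> float:
    """Fraction of windows where ref and hyp disagree on whether the two
    window ends lie in the same segment."""
    if len(ref) != len(hyp):
        raise ValueError("segmentations must cover the same positions")
    k = default_window(ref) if k is None else k
    windows = len(ref) - k
    errors = sum(
        (ref[i] == ref[i + k]) != (hyp[i] == hyp[i + k]) for i in range(windows)
    )
    return errors / windows


def window_diff(ref: List[int], hyp: List[int], k: Optional[int] = None) -> float:
    """Fraction of windows where ref and hyp contain a different number
    of boundaries (ids increase by one at each boundary)."""
    if len(ref) != len(hyp):
        raise ValueError("segmentations must cover the same positions")
    k = default_window(ref) if k is None else k
    windows = len(ref) - k
    errors = sum(
        (ref[i + k] - ref[i]) != (hyp[i + k] - hyp[i]) for i in range(windows)
    )
    return errors / windows


ref = labels_from_lengths([4, 4])
hyp = labels_from_lengths([3, 1, 4])  # one near miss plus one spurious boundary
scores = (pk(ref, hyp), window_diff(ref, hyp))
```

On this toy pair WindowDiff is higher than $P_k$, because WindowDiff also penalizes windows in which the number of boundaries differs even when both segmentations agree that the window ends fall in different segments.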

Aggregate multi-level results:

| Corpus | Metric | TreeSeg | Next Best (BertSeg) |
|---------|---------|---------|---------------------|
| ICSI | $P_k$ | 0.310 | 0.388 |
| ICSI | WinDiff | 0.353 | 0.432 |
| AMI | $P_k$ | 0.355 | 0.443 |
| AMI | WinDiff | 0.396 | 0.480 |
| TinyRec | $P_k$ | 0.367 | 0.473 |
| TinyRec | WinDiff | 0.382 | 0.486 |

Per-level scores also favor TreeSeg, e.g., on ICSI level 1: $P_k=0.28$ (TreeSeg) vs $0.343$ (BertSeg), WinDiff $0.314$ vs $0.386$. TreeSeg outperforms all baselines across datasets and at multiple resolutions (Gklezakos et al., 2024).

5. Limitations and Future Directions in Transcript Segmentation

Current validation is limited to structured meeting corpora (ICSI, AMI); diversity and scale in transcript types remain open for exploration. Only ADA embeddings were tested systematically. Direct evaluations against M³Seg—or more recent hierarchical segmentation baselines—are constrained by code availability.

Proposed future work includes:

Systematic embedding model comparisons (SBERT, RoBERTa, LLaMA, etc.)
Extending the hierarchical tree for downstream applications: multi-level summarization, chapter labeling, knowledge extraction
Application to less-structured, noisier, or conversational transcript domains

6. The BITS-Tree Data Structure (TreeSeg in Segment Storage Context)

In the context of dynamic segment storage and interval queries, TreeSeg refers to the BITS-Tree (Balanced Inorder Threaded Segment Tree) (Easwarakumar et al., 2015). This structure maintains a height-balanced (AVL) binary search tree in which each node stores a non-overlapping interval and the list of original segments containing that interval.

Key characteristics:

Insertion/Deletion: $O(\log n + k)$, where $n$ is the segment count and $k$ is the number of affected nodes (those overlapping the inserted or deleted segment). Segments can extend beyond previous tree bounds.
Query:
- Point (stabbing) query: $O(\log n + k)$, with $k$ as output size.
- Range query: $O(\log n + m)$, where $m$ is the number of nodes intersecting the query.
Node count: At most $2n-1$, much lower than $2(U_{\max}-U_{\min})-1$ for classic dynamic segment trees built over a universe $U$.
Space: Worst-case total segment-list storage $O(n^2)$.
Height: $O(\log n)$.
Threading: Inorder threads enable efficient traversal for range queries.
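The idea of storing non-overlapping intervals together with the segments that cover them can be illustrated with a much simpler, static structure. The sketch below is not the BITS-Tree: it replaces the balanced, threaded tree with a sorted array and binary search, supports no insertion or deletion, and assumes half-open segments `[lo, hi)` with `lo < hi`. The class and method names are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


class ElementaryIntervalIndex:
    """Static sketch: consecutive distinct endpoints define elementary
    intervals, and each one lists the ids of the segments covering it."""

    def __init__(self, segments: Sequence[Tuple[float, float]]):
        # n segments give at most 2n endpoints, hence at most 2n - 1 slots.
        self.points = sorted({p for lo, hi in segments for p in (lo, hi)})
        # Slot j covers [points[j], points[j + 1]).
        self.lists: List[List[int]] = [[] for _ in range(max(0, len(self.points) - 1))]
        for seg_id, (lo, hi) in enumerate(segments):
            j = bisect_right(self.points, lo) - 1  # slot starting at lo
            while j < len(self.lists) and self.points[j] < hi:
                self.lists[j].append(seg_id)
                j += 1

    def stab(self, x: float) -> List[int]:
        """Ids of segments containing point x (binary search, then one list)."""
        j = bisect_right(self.points, x) - 1
        if j < 0 or j >= len(self.lists):
            return []
        return list(self.lists[j])

    def range_query(self, lo: float, hi: float) -> List[int]:
        """Ids of segments intersecting [lo, hi). The left-to-right walk over
        adjacent slots plays the role that inorder threads play in a tree."""
        found = set()
        j = max(0, bisect_right(self.points, lo) - 1)
        while j < len(self.lists) and self.points[j] < hi:
            found.update(self.lists[j])
            j += 1
        return sorted(found)


index = ElementaryIntervalIndex([(1, 5), (3, 8), (10, 12)])
hits = index.stab(4)               # segments 0 and 1 both contain 4
nearby = index.range_query(4, 11)  # all three segments intersect [4, 11)
```

The per-slot lists also show where the quadratic worst-case storage comes from: heavily nested segments each appear in many slots.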

This structure is particularly advantageous when the global universe $U$ is large and $n$ is comparatively small, minimizing node count while supporting efficient dynamic updates and fast queries (Easwarakumar et al., 2015).

7. Relationship to Other Tree-Based Segmentation Techniques

The term “TreeSeg” is also used in TreeSegNet for image segmentation, notably in adaptive CNNs constructed according to class confusion statistics (Yue et al., 2018). Despite methodologically divergent goals, these approaches share the strategic use of tree structures—either for recursively partitioning data (topic segmentation, BITS-Tree) or as an architectural prior in neural segmentation networks (TreeSegNet).

A plausible implication is that tree-based representations enable scalable, multiresolution partitioning or specialization in domains where hierarchical or ambiguous boundaries are intrinsic to the data.

References:

Gklezakos et al. (2024). TreeSeg: Hierarchical Topic Segmentation of Large Transcripts.
Easwarakumar et al. (2015). BITS-Tree-An Efficient Data Structure for Segment Storage and Query Processing.
Yue et al. (2018). TreeSegNet: Adaptive Tree CNNs for Subdecimeter Aerial Image Segmentation.
