
Optimal Semantic Encoding Tree

Updated 27 November 2025
  • Optimal Semantic Encoding Tree is a multi-resolution map abstraction that preserves essential semantic information while pruning irrelevant details.
  • It employs semantic octree structures, probabilistic belief propagation, and information-theoretic cost functions to balance compression and fidelity.
  • Empirical evaluations indicate up to 80% reduction in leaf cells with high mutual information retention, enhancing motion planning efficiency.

An optimal semantic encoding tree is a multi-resolution environment representation constructed and compressed to maximally preserve information about relevant semantic classes while aggressively discarding distinctions pertaining to irrelevant classes. The construction builds on a semantic octree data structure, probabilistic belief propagation over the tree, and information-theoretic cost functions. The optimal tree emerges from a pruning algorithm controlled by semantic priority weights and a complexity penalty, systematically tracing out the Pareto frontier between compression and semantic fidelity (Larsson et al., 2022).

1. Semantic Octree Data Structure

A semantic octree models a bounded spatial domain $\Omega \subset \mathbb{R}^3$ as a hierarchical, multi-resolution tree $T_0$, where each leaf corresponds to a cell in the finest grid subdivision. Each node $n \in T_0$ (interior or leaf) represents a cubic subvolume, with up to eight children denoting its octants. The occupancy and semantic belief at a leaf $x$ obey a categorical distribution $p(S = k \mid x)$ over classes $k \in \{0, \dots, K\}$, where class 0 denotes free space and $k > 0$ corresponds to object categories. Typically, only the three highest-probability classes and a residual bin are stored per cell.

A random variable $X : \Omega \to T_0$ assigns to each location the index of the finest cell containing it, with prior probability $p(x)$. For each relevant semantic class $i$, an indicator variable $Y_i : \Omega \to \{0, 1\}$ is defined so that $\mathbb{P}[Y_i = 1 \mid X = x] = p(S = i \mid x)$. Irrelevant classes $j$ are encoded analogously by indicators $Z_j$.

For any interior node $n$, child beliefs aggregate as

$$p(n) = \sum_{c \in \text{children}(n)} p(c),$$

and

$$p(S = i \mid n) = \sum_{c \in \text{children}(n)} \frac{p(c)}{p(n)}\, p(S = i \mid c),$$

enabling computation of $p(Y_i = 1 \mid n)$ and $p(Z_j = 1 \mid n)$ through weighted averages.
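
This aggregation step can be illustrated with a short sketch. The `OctreeNode` class and `aggregate_children` function below are hypothetical names used only for illustration, and per-class beliefs are stored as plain dictionaries rather than the truncated three-class-plus-residual encoding described above.

```python
# Hypothetical sketch of a semantic octree node and upward belief aggregation.
class OctreeNode:
    def __init__(self, p=0.0, class_probs=None, children=None):
        self.p = p                            # probability mass p(n) of the subvolume
        self.class_probs = class_probs or {}  # p(S = k | n) for stored classes k
        self.children = children or []        # up to eight child octants


def aggregate_children(node):
    """Aggregate p(n) and p(S = k | n) from the children of an interior node."""
    if not node.children:
        return
    node.p = sum(c.p for c in node.children)   # p(n) = sum_c p(c)
    probs = {}
    if node.p > 0.0:
        for c in node.children:
            w = c.p / node.p                   # Pi_c = p(c) / p(n)
            for k, pk in c.class_probs.items():
                probs[k] = probs.get(k, 0.0) + w * pk
    node.class_probs = probs                   # weighted average gives p(S = k | n)
```

In this representation, $p(Y_i = 1 \mid n)$ is simply `node.class_probs[i]` (or 0 if class $i$ is absent at the node).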

2. Information-Theoretic Cost Functions

The construction trades off tree complexity against semantic fidelity. For each interior node $n$ with $p(n) > 0$,

  • The relative-weight vector over children: $\Pi_c = p(c)/p(n)$, $c \in \text{children}(n)$.
  • Coding-cost increment (penalizing complexity): $\Delta I_X(n) = p(n)\, H(\Pi)$, with $H(\Pi) = -\sum_c \Pi_c \log \Pi_c$.
  • For a relevant class $i$, semantic-loss increment: $\Delta I_{Y_i}(n) = p(n)\, \mathrm{JS}\big(p(Y_i \mid c_1), \ldots, p(Y_i \mid c_m)\big)$, where

$$\mathrm{JS}(p_1, \ldots, p_\ell) = \sum_r \omega_r\, \mathrm{KL}\Big(p_r \,\Big\|\, \sum_s \omega_s p_s\Big)$$

and

$$\mathrm{KL}(p \,\|\, q) = \sum_x p(x) \log \frac{p(x)}{q(x)},$$

with mixture mean $\bar{p}(Y_i) = \sum_u \Pi_{c_u}\, p(Y_i \mid c_u)$. For an irrelevant class $j$, analogous increments $\Delta I_{Z_j}(n)$ are defined.

Semantic-loss increments quantify the cost of merging child nodes that differ in their semantic class distributions: when these distributions are nearly identical, the semantic loss is minimal, which encourages pruning.
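
As a concrete illustration, the per-node increments might be computed as in the sketch below, which treats each indicator $Y_i$ as a Bernoulli variable per child. The function names, the natural-logarithm convention, and the clamping constant are assumptions of this sketch rather than details from the paper.

```python
import math

def entropy(weights):
    """Shannon entropy H(Pi) of the child weight vector (in nats)."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)

def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

def delta_I_X(p_n, weights):
    """Coding-cost increment Delta I_X(n) = p(n) * H(Pi)."""
    return p_n * entropy(weights)

def delta_I_class(p_n, weights, child_probs):
    """Semantic-loss increment p(n) * JS(...) for one class indicator.

    child_probs[r] = p(Y_i = 1 | c_r) and weights[r] = Pi_{c_r}.
    """
    mix = sum(w * q for w, q in zip(weights, child_probs))   # mixture mean p_bar(Y_i)
    js = sum(w * kl_bernoulli(q, mix) for w, q in zip(weights, child_probs))
    return p_n * js
```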

3. Optimization Formulation and Pruning Algorithm

The optimization problem seeks a subtree $T \subseteq T_0$ with the same root, maximizing

$$J(T) = \sum_{i \in I_Y} \beta_i\, I_{Y_i}(T) - \sum_{j \in I_Z} \gamma_j\, I_{Z_j}(T) - \alpha\, I_X(T),$$

with

  • $I_{Y_i}(T) = I(Y_i; N)$ as the mutual information between $Y_i$ and the leaf variable $N$;
  • $I_{Z_j}(T) = I(Z_j; N)$ for irrelevant classes;
  • $I_X(T) = H(N)$ as the coding rate (leaf entropy).

Crucially, these objectives decompose:

  • $I_{Y_i}(T) = \sum_{n \in \text{interior}(T)} \Delta I_{Y_i}(n)$,
  • $I_X(T) = \sum_{n \in \text{interior}(T)} \Delta I_X(n)$,
  • and similarly for $I_{Z_j}(T)$.

The binary decision at each interior node $n$, whether to expand or prune, proceeds via a local one-step reward for expansion:

$$\Delta J(n) = \sum_i \beta_i\, \Delta I_{Y_i}(n) - \sum_j \gamma_j\, \Delta I_{Z_j}(n) - \alpha\, \Delta I_X(n),$$

with a recursive $G$-value:

$$G(n) = \max \left\{ \Delta J(n) + \sum_{c \in \text{children}(n)} G(c),\ 0 \right\}.$$

Expansion occurs iff $G(n) > 0$; otherwise, node $n$ is pruned to a leaf.
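
The recursion and the pruning decision can be sketched as follows. The sketch assumes each node carries precomputed increments (`delta_I_Y` and `delta_I_Z` as dictionaries keyed by class, `delta_I_X` as a scalar), for instance produced by the increment functions above; all attribute and function names here are illustrative.

```python
def delta_J(node, beta, gamma, alpha):
    """One-step expansion reward Delta J(n) from precomputed increments."""
    gain = sum(beta[i] * node.delta_I_Y.get(i, 0.0) for i in beta)
    penalty = sum(gamma[j] * node.delta_I_Z.get(j, 0.0) for j in gamma)
    return gain - penalty - alpha * node.delta_I_X

def compute_G(node, beta, gamma, alpha):
    """Recursive G-value: G(n) = max(Delta J(n) + sum_c G(c), 0)."""
    if not node.children:                  # a leaf of T_0 offers no further expansion
        node.G = 0.0
        return node.G
    total = delta_J(node, beta, gamma, alpha)
    total += sum(compute_G(c, beta, gamma, alpha) for c in node.children)
    node.G = max(total, 0.0)
    return node.G

def prune(node):
    """Top-down pruning: expand only where G(n) > 0, otherwise collapse to a leaf."""
    if getattr(node, "G", 0.0) <= 0.0:
        node.children = []                 # node n becomes a leaf of the compressed tree
        return
    for c in node.children:
        prune(c)
```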

4. Class Weighting Scheme

Tree construction is governed by three types of weights:

  • $\beta_i > 0$: weight for retaining information on relevant class $i$,
  • $\gamma_j > 0$: weight penalizing retention of irrelevant class $j$,
  • $\alpha > 0$: penalty per bit in the leaf encoding (controls overall tree size).

Increased $\beta_i$ enforces retention of distinctions for class $i$, while increased $\gamma_j$ motivates pruning of nodes differing only in class $j$. Higher $\alpha$ prioritizes tree compactness. Tuning these parameters facilitates explicit tradeoffs between semantic fidelity and resolution.
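
For concreteness, these weights might be grouped into a single configuration passed to the pruning routine sketched above; the class names echo the experiment in Section 6, but the numeric values for $\gamma$ and $\alpha$ are purely illustrative.

```python
# Illustrative weight configuration; gamma and alpha values are made up for the example.
beta = {"asphalt": 1.0}                 # retain fine distinctions for relevant classes
gamma = {"grass": 0.5, "trees": 0.5}    # encourage merging of cells differing only here
alpha = 0.1                             # per-bit penalty on the leaf coding rate
```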

5. Joint Build-and-Compress Algorithm

New semantic point-cloud observations incrementally update the octree. The build-and-compress process executes as follows (a code sketch follows the list):

  1. Leaf Update: Each new observation updates occupancy and class probabilities at the corresponding leaf $x$ using a Bayesian semantic-octomap update.
  2. Upwards Propagation: For each ancestor node $n$ on the path from $x$ to the root:
    • Reconstruct the full $p(S \mid c)$ for the children, distributing the residual probability over un-stored classes.
    • Aggregate child beliefs to obtain $p(n)$, $p(Y_i \mid n)$, $p(Z_j \mid n)$.
    • Compute the cost increments $\Delta I_{Y_i}(n)$, $\Delta I_{Z_j}(n)$, $\Delta I_X(n)$.
    • Update $G(n)$ recursively.
  3. Top-Down Pruning: Beginning at the root, nodes are expanded only if $G(n) > 0$; otherwise, their descendants are pruned and $n$ becomes a leaf.
  4. Return: The final pruned tree $T^*$ is produced, balancing compression and semantic retention.
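
A high-level orchestration of these steps, composed from the sketches above, could look as follows; `locate_leaf`, `update_leaf`, and `compute_increments` are hypothetical helpers standing in for the semantic-octomap update and the increment computations, and recomputing $G$ over the whole subtree at each ancestor is a simplification.

```python
def insert_observation(tree, point, class_label, beta, gamma, alpha):
    """Sketch of one incremental build-and-compress update (helper names are assumed)."""
    leaf = tree.locate_leaf(point)            # finest cell containing the observed point
    update_leaf(leaf, class_label)            # 1. Bayesian semantic-octomap leaf update

    node = leaf.parent
    while node is not None:                   # 2. upwards propagation to the root
        aggregate_children(node)              #    p(n), p(Y_i | n), p(Z_j | n)
        compute_increments(node)              #    Delta I_{Y_i}, Delta I_{Z_j}, Delta I_X
        compute_G(node, beta, gamma, alpha)   #    refresh G(n) along the path
        node = node.parent

    prune(tree.root)                          # 3. top-down pruning from the root
    return tree                               # 4. pruned tree T*
```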

6. Empirical Evaluation and Practical Impact

In experiments on large, semantically rich outdoor environments with 25 classes, setting "asphalt" as the relevant class ($\beta_{\text{asphalt}} = 1$), penalizing "grass" and "trees" ($\gamma_{\text{grass}}, \gamma_{\text{trees}} > 0$), and tuning $\alpha$ yielded multi-resolution maps that retain high detail for roads while coarsening vegetation. Increasing $\alpha$ (stronger compression) decreased the number of leaf cells by up to 80%, while mutual information about "asphalt" remained above 90% over a broad range of $\alpha$. Irrelevant semantic information attenuated more rapidly.

In motion planning, generating a search graph from the compressed semantic octree (as opposed to uniform Halton sampling) led to approximately 10% faster solution times and a 60% reduction in their variance. This suggests that informative semantic compression offers tangible computational advantages in downstream planning tasks.

7. Context, Significance, and Applications

The optimal semantic encoding tree yields a task-adaptive map abstraction with user-tunable priorities. By adjusting the weights $\{\beta_i, \gamma_j, \alpha\}$, practitioners realize multi-resolution compression that selectively retains or discards semantic features. This abstraction underpins integrated perception and planning for autonomous robotics, especially in environments where fine semantic distinctions are critical for navigation and coarse representations suffice for other regions. The framework establishes a principled, information-theoretic basis for multi-class map compression and demonstrates operational superiority over uninformed sampling methods such as Halton sequences (Larsson et al., 2022).
