
Optimal Semantic Encoding Tree

Updated 27 November 2025
  • Optimal Semantic Encoding Tree is a multi-resolution map abstraction that preserves essential semantic information while pruning irrelevant details.
  • It employs semantic octree structures, probabilistic belief propagation, and information-theoretic cost functions to balance compression and fidelity.
  • Empirical evaluations indicate up to 80% reduction in leaf cells with high mutual information retention, enhancing motion planning efficiency.

An optimal semantic encoding tree is a multi-resolution environment representation constructed and compressed to maximally preserve information about relevant semantic classes while aggressively discarding distinctions pertaining to irrelevant classes. The construction builds on a semantic octree data structure, probabilistic belief propagation over the tree, and information-theoretic cost functions. The optimal tree emerges from a pruning algorithm controlled by semantic priority weights and a complexity penalty, systematically tracing out the Pareto frontier between compression and semantic fidelity (Larsson et al., 2022).

1. Semantic Octree Data Structure

A semantic octree models a bounded spatial domain $\Omega \subset \mathbb{R}^3$ as a hierarchical, multi-resolution tree $T_0$, where each leaf corresponds to a cell in the finest grid subdivision. Each node $n \in T_0$ (interior or leaf) represents a cubic subvolume, with up to eight children denoting its octants. The occupancy and semantic belief at a leaf $x$ obey a categorical distribution $p(S = k \mid x)$ over classes $k \in \{0, \dots, K\}$, where class 0 denotes free space and $k > 0$ corresponds to object categories. Typically, only the three highest-probability classes and a residual bin are stored per cell.

A random variable $X : \Omega \to T_0$ assigns to each location the index of the finest cell containing it, with prior probability $p(x)$. For each relevant semantic class $i$, an indicator variable $Y_i : \Omega \to \{0, 1\}$ is defined so that $\mathbb{P}[Y_i = 1 \mid X = x] = p(S = i \mid x)$. Irrelevant classes $j$ are encoded analogously by indicators $Z_j$.

For any interior node $n$, child beliefs aggregate as

$$p(n) = \sum_{c \in \text{children}(n)} p(c),$$

and

$$p(S = i \mid n) = \sum_{c \in \text{children}(n)} \frac{p(c)}{p(n)}\, p(S = i \mid c),$$

enabling computation of $p(Y_i = 1 \mid n)$ and $p(Z_j = 1 \mid n)$ through weighted averages.
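
This aggregation step can be illustrated with a short sketch. The `OctreeNode` class and `aggregate_children` function below are hypothetical names used only for illustration, and per-class beliefs are stored as plain dictionaries rather than the truncated three-class-plus-residual encoding described above.

```python
# Hypothetical sketch of a semantic octree node and upward belief aggregation.
class OctreeNode:
    def __init__(self, p=0.0, class_probs=None, children=None):
        self.p = p                            # probability mass p(n) of the subvolume
        self.class_probs = class_probs or {}  # p(S = k | n) for stored classes k
        self.children = children or []        # up to eight child octants


def aggregate_children(node):
    """Aggregate p(n) and p(S = k | n) from the children of an interior node."""
    if not node.children:
        return
    node.p = sum(c.p for c in node.children)   # p(n) = sum_c p(c)
    probs = {}
    if node.p > 0.0:
        for c in node.children:
            w = c.p / node.p                   # Pi_c = p(c) / p(n)
            for k, pk in c.class_probs.items():
                probs[k] = probs.get(k, 0.0) + w * pk
    node.class_probs = probs                   # weighted average gives p(S = k | n)
```

In this representation, $p(Y_i = 1 \mid n)$ is simply `node.class_probs[i]` (or 0 if class $i$ is absent at the node).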

2. Information-Theoretic Cost Functions

The construction trades off tree complexity against semantic fidelity. For each interior node $n$ with $p(n) > 0$,

  • The relative-weight vector over children: $\Pi_c = p(c)/p(n)$, $c \in \text{children}(n)$.
  • Coding-cost increment (penalizing complexity): $\Delta I_X(n) = p(n)\, H(\Pi)$, with $H(\Pi) = -\sum_c \Pi_c \log \Pi_c$.
  • For a relevant class $i$, semantic-loss increment: $\Delta I_{Y_i}(n) = p(n)\, \mathrm{JS}\big(p(Y_i \mid c_1), \ldots, p(Y_i \mid c_m)\big)$, where

$$\mathrm{JS}(p_1, \ldots, p_\ell) = \sum_r \omega_r\, \mathrm{KL}\Big(p_r \,\Big\|\, \sum_s \omega_s p_s\Big)$$

and

$$\mathrm{KL}(p \,\|\, q) = \sum_x p(x) \log \frac{p(x)}{q(x)},$$

with mixture mean $\bar{p}(Y_i) = \sum_u \Pi_{c_u}\, p(Y_i \mid c_u)$. For an irrelevant class $j$, analogous increments $\Delta I_{Z_j}(n)$ are defined.

Semantic-loss increments quantify the cost of merging child nodes that differ in their semantic class distributions: when these distributions are nearly identical, the semantic loss is minimal, which encourages pruning.
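
As a concrete illustration, the per-node increments might be computed as in the sketch below, which treats each indicator $Y_i$ as a Bernoulli variable per child. The function names, the natural-logarithm convention, and the clamping constant are assumptions of this sketch rather than details from the paper.

```python
import math

def entropy(weights):
    """Shannon entropy H(Pi) of the child weight vector (in nats)."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)

def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

def delta_I_X(p_n, weights):
    """Coding-cost increment Delta I_X(n) = p(n) * H(Pi)."""
    return p_n * entropy(weights)

def delta_I_class(p_n, weights, child_probs):
    """Semantic-loss increment p(n) * JS(...) for one class indicator.

    child_probs[r] = p(Y_i = 1 | c_r) and weights[r] = Pi_{c_r}.
    """
    mix = sum(w * q for w, q in zip(weights, child_probs))   # mixture mean p_bar(Y_i)
    js = sum(w * kl_bernoulli(q, mix) for w, q in zip(weights, child_probs))
    return p_n * js
```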

3. Optimization Formulation and Pruning Algorithm

The optimization problem seeks a subtree $T \subseteq T_0$ with the same root, maximizing

$$J(T) = \sum_{i \in I_Y} \beta_i\, I_{Y_i}(T) - \sum_{j \in I_Z} \gamma_j\, I_{Z_j}(T) - \alpha\, I_X(T),$$

with

  • $I_{Y_i}(T) = I(Y_i; N)$ as the mutual information between $Y_i$ and the leaf variable $N$;
  • $I_{Z_j}(T) = I(Z_j; N)$ for irrelevant classes;
  • $I_X(T) = H(N)$ as the coding rate (leaf entropy).

Crucially, these objectives decompose:

  • $I_{Y_i}(T) = \sum_{n \in \text{interior}(T)} \Delta I_{Y_i}(n)$,
  • $I_X(T) = \sum_{n \in \text{interior}(T)} \Delta I_X(n)$,
  • and similarly for $I_{Z_j}(T)$.

The binary decision at each interior node $n$, whether to expand or prune, proceeds via a local one-step reward for expansion:

$$\Delta J(n) = \sum_i \beta_i\, \Delta I_{Y_i}(n) - \sum_j \gamma_j\, \Delta I_{Z_j}(n) - \alpha\, \Delta I_X(n),$$

with a recursive $G$-value:

$$G(n) = \max \left\{ \Delta J(n) + \sum_{c \in \text{children}(n)} G(c),\ 0 \right\}.$$

Expansion occurs iff $G(n) > 0$; otherwise, node $n$ is pruned to a leaf.
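
The recursion and the pruning decision can be sketched as follows. The sketch assumes each node carries precomputed increments (`delta_I_Y` and `delta_I_Z` as dictionaries keyed by class, `delta_I_X` as a scalar), for instance produced by the increment functions above; all attribute and function names here are illustrative.

```python
def delta_J(node, beta, gamma, alpha):
    """One-step expansion reward Delta J(n) from precomputed increments."""
    gain = sum(beta[i] * node.delta_I_Y.get(i, 0.0) for i in beta)
    penalty = sum(gamma[j] * node.delta_I_Z.get(j, 0.0) for j in gamma)
    return gain - penalty - alpha * node.delta_I_X

def compute_G(node, beta, gamma, alpha):
    """Recursive G-value: G(n) = max(Delta J(n) + sum_c G(c), 0)."""
    if not node.children:                  # a leaf of T_0 offers no further expansion
        node.G = 0.0
        return node.G
    total = delta_J(node, beta, gamma, alpha)
    total += sum(compute_G(c, beta, gamma, alpha) for c in node.children)
    node.G = max(total, 0.0)
    return node.G

def prune(node):
    """Top-down pruning: expand only where G(n) > 0, otherwise collapse to a leaf."""
    if getattr(node, "G", 0.0) <= 0.0:
        node.children = []                 # node n becomes a leaf of the compressed tree
        return
    for c in node.children:
        prune(c)
```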

4. Class Weighting Scheme

Tree construction is governed by three types of weights:

  • $\beta_i > 0$: weight for retaining information on relevant class $i$,
  • $\gamma_j > 0$: weight penalizing retention of irrelevant class $j$,
  • $\alpha > 0$: penalty per bit in the leaf encoding (controls overall tree size).

Increased $\beta_i$ enforces retention of distinctions for class $i$, while increased $\gamma_j$ motivates pruning of nodes differing only in class $j$. Higher $\alpha$ prioritizes tree compactness. Tuning these parameters facilitates explicit tradeoffs between semantic fidelity and resolution.
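
For concreteness, these weights might be grouped into a single configuration passed to the pruning routine sketched above; the class names echo the experiment in Section 6, but the numeric values for $\gamma$ and $\alpha$ are purely illustrative.

```python
# Illustrative weight configuration; gamma and alpha values are made up for the example.
beta = {"asphalt": 1.0}                 # retain fine distinctions for relevant classes
gamma = {"grass": 0.5, "trees": 0.5}    # encourage merging of cells differing only here
alpha = 0.1                             # per-bit penalty on the leaf coding rate
```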

5. Joint Build-and-Compress Algorithm

New semantic point-cloud observations incrementally update the octree. The build-and-compress process executes as follows (a code sketch follows the list):

  1. Leaf Update: Each new observation updates occupancy and class probabilities at the corresponding leaf $x$ using a Bayesian semantic-octomap update.
  2. Upwards Propagation: For each ancestor node $n$ on the path from $x$ to the root:
    • Reconstruct the full $p(S \mid c)$ for the children, distributing the residual probability over un-stored classes.
    • Aggregate child beliefs to obtain $p(n)$, $p(Y_i \mid n)$, $p(Z_j \mid n)$.
    • Compute the cost increments $\Delta I_{Y_i}(n)$, $\Delta I_{Z_j}(n)$, $\Delta I_X(n)$.
    • Update $G(n)$ recursively.
  3. Top-Down Pruning: Beginning at the root, nodes are expanded only if $G(n) > 0$; otherwise, their descendants are pruned and $n$ becomes a leaf.
  4. Return: The final pruned tree $T^*$ is produced, balancing compression and semantic retention.
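
A high-level orchestration of these steps, composed from the sketches above, could look as follows; `locate_leaf`, `update_leaf`, and `compute_increments` are hypothetical helpers standing in for the semantic-octomap update and the increment computations, and recomputing $G$ over the whole subtree at each ancestor is a simplification.

```python
def insert_observation(tree, point, class_label, beta, gamma, alpha):
    """Sketch of one incremental build-and-compress update (helper names are assumed)."""
    leaf = tree.locate_leaf(point)            # finest cell containing the observed point
    update_leaf(leaf, class_label)            # 1. Bayesian semantic-octomap leaf update

    node = leaf.parent
    while node is not None:                   # 2. upwards propagation to the root
        aggregate_children(node)              #    p(n), p(Y_i | n), p(Z_j | n)
        compute_increments(node)              #    Delta I_{Y_i}, Delta I_{Z_j}, Delta I_X
        compute_G(node, beta, gamma, alpha)   #    refresh G(n) along the path
        node = node.parent

    prune(tree.root)                          # 3. top-down pruning from the root
    return tree                               # 4. pruned tree T*
```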

6. Empirical Evaluation and Practical Impact

In experiments on large, semantically rich outdoor environments with 25 classes, setting "asphalt" as the relevant class ($\beta_{\text{asphalt}} = 1$), penalizing "grass" and "trees" ($\gamma_{\text{grass}}, \gamma_{\text{trees}} > 0$), and tuning $\alpha$ yielded multi-resolution maps that retain high detail for roads while coarsening vegetation. Increasing $\alpha$ (stronger compression) decreased the number of leaf cells by up to 80%, while mutual information about "asphalt" remained above 90% over a broad range of $\alpha$. Irrelevant semantic information attenuated more rapidly.

In motion planning, generating a search graph from the compressed semantic octree (as opposed to uniform Halton sampling) led to approximately 10% faster solution times and a 60% reduction in their variance. This suggests that informative semantic compression offers tangible computational advantages in downstream planning tasks.

7. Context, Significance, and Applications

The optimal semantic encoding tree yields a task-adaptive map abstraction with user-tunable priorities. By adjusting the weights $\{\beta_i, \gamma_j, \alpha\}$, practitioners realize multi-resolution compression that selectively retains or discards semantic features. This abstraction underpins integrated perception and planning for autonomous robotics, especially in environments where fine semantic distinctions are critical for navigation and coarse representations suffice for other regions. The framework establishes a principled, information-theoretic basis for multi-class map compression and demonstrates operational superiority over uninformed sampling methods such as Halton sequences (Larsson et al., 2022).
