- The paper introduces H-TokCom, which clusters semantically similar tokens and maps them to bits hierarchically to reduce semantic distortion at low SNR.
- It employs a two-tier bit mapping strategy, with prefix bits identifying the cluster and suffix bits selecting the token within it, so that bit errors tend to yield semantically proximate tokens.
- Cluster-prioritized power allocation, validated through numerical experiments, significantly enhances semantic preservation compared to conventional methods.
Semantics-Aware Hierarchical Token Communication: Clustering, Bit Mapping, and Power Allocation
Motivation and Context
Token communication (TokCom) represents a shift from bit-centric architectures to the transmission of semantic units that underpin modern transformer-based models. Unlike traditional communication systems, TokCom operates on tokens that carry context and meaning, so preserving semantic integrity over noisy channels becomes a central objective. Existing TokCom paradigms predominantly rely on AI-layer semantic recovery mechanisms, such as masked token completion using large language models (LLMs), while neglecting semantic structure in physical-layer design (bit mapping and power allocation). As a result, semantic distortion is substantial at low signal-to-noise ratio (SNR): token indices are mapped to bits without regard to meaning, so symbol errors cause semantically dissimilar tokens to be decoded.
The paper introduces Hierarchical Token Communication (H-TokCom), which embeds semantic structure directly into the communication layer by clustering semantically similar tokens and hierarchically assigning their bit representations. This approach—coupled with cluster-prioritized power allocation—enables robust preservation of semantics in the presence of channel noise.
Figure 1: Comparison of naïve TokCom (a) versus H-TokCom (b); H-TokCom errors result in semantically similar outputs by retaining cluster indices.
Hierarchical Semantic Clustering and Bit Mapping
H-TokCom capitalizes on token-level semantic proximity via embedding-based clustering. The vocabulary is partitioned into K clusters (with controlled maximal size T), where tokens within a cluster are close in the embedding space. Starting from singleton clusters, iterative merging based on average inter-cluster cosine distance yields the semantic hierarchy.
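The merging step described above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the function names, the toy embeddings, and the rule of skipping any merge that would exceed the size cap T are assumptions made for the example.

```python
import math

def cosine_distance(u, v):
    """Cosine distance between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def capped_average_linkage(embeddings, num_clusters, max_size):
    """Merge singleton clusters by smallest average inter-cluster cosine distance.

    Assumption: a merge is skipped if the merged cluster would exceed max_size.
    """
    n = len(embeddings)
    dist = [[cosine_distance(embeddings[i], embeddings[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                if len(clusters[a]) + len(clusters[b]) > max_size:
                    continue
                pairs = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                avg = sum(pairs) / len(pairs)
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        if best is None:
            break  # no admissible merge remains under the size cap
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Toy vocabulary of four tokens: two point roughly along x, two along y.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(capped_average_linkage(toy, num_clusters=2, max_size=2))  # -> [[0, 1], [2, 3]]
```

The exhaustive pair search is only practical for small vocabularies; a real vocabulary would need a more efficient linkage implementation.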
Bit mapping follows a two-tiered structure:
- Prefix Bits: Each cluster is assigned a unique q-bit prefix. Clusters are first ordered by their semantic centroids (e.g., via a Hilbert space-filling curve) and then labeled with a Gray code, so that the prefixes of consecutive clusters differ in a single bit. This enables robust identification of semantic regions.
- Suffix Bits: Tokens within a cluster receive (L−q)-bit suffixes, where L is the total number of bits per token, chosen so that the Hamming distance between suffixes tracks semantic dissimilarity. An iterative swapping procedure reduces the mismatch between assigned and target Hamming distances, enforcing intra-cluster semantic proximity.
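One way to picture the suffix swapping step is a greedy pairwise search. The sketch below rests on assumptions: the squared-error cost, the `target` matrix of desired Hamming distances, and the function names are hypothetical, since the summary does not specify how target distances are derived from embeddings.

```python
import itertools

def hamming(a, b):
    """Number of differing bits between two integer codewords."""
    return bin(a ^ b).count("1")

def mapping_cost(codes, target):
    """Squared mismatch between assigned and target Hamming distances (assumed cost)."""
    return sum((hamming(codes[i], codes[j]) - target[i][j]) ** 2
               for i, j in itertools.combinations(range(len(codes)), 2))

def swap_refine(codes, target, max_passes=10):
    """Greedily swap the suffixes of token pairs while the cost strictly decreases."""
    codes = list(codes)
    cost = mapping_cost(codes, target)
    for _ in range(max_passes):
        improved = False
        for i, j in itertools.combinations(range(len(codes)), 2):
            codes[i], codes[j] = codes[j], codes[i]
            new_cost = mapping_cost(codes, target)
            if new_cost < cost:
                cost, improved = new_cost, True
            else:
                codes[i], codes[j] = codes[j], codes[i]  # undo the swap
        if not improved:
            break
    return codes, cost
```

Greedy swapping only reaches a local optimum, so the result can depend on the initial assignment.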
Figure 2: Hierarchical bit mapping illustration: (a) semantic clustering, (b) cluster-prefix assignment, (c) intra-cluster token suffix allocation.
With this hierarchical assignment, channel-induced bit errors that are confined to the suffix map to another token within the same cluster, which is semantically similar to the original, thus limiting semantic loss.
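The two-tier codeword layout can be sketched as below. The sketch assumes the clusters are already ordered so that neighbors are semantically close (the centroid ordering itself is not implemented here), and it uses a plain slot index as a stand-in for the optimized suffix. All names and token ids are illustrative.

```python
def gray_code(n):
    """Standard binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def hamming(a, b):
    """Number of differing bits between two integer codewords."""
    return bin(a ^ b).count("1")

def build_codebook(ordered_clusters, q, suffix_bits):
    """Map each token id to an integer codeword: Gray-coded prefix, then suffix.

    ordered_clusters: list of token-id lists, assumed pre-ordered by semantic centroid.
    """
    if len(ordered_clusters) > 2 ** q:
        raise ValueError("too many clusters for a q-bit prefix")
    codebook = {}
    for rank, tokens in enumerate(ordered_clusters):
        if len(tokens) > 2 ** suffix_bits:
            raise ValueError("cluster does not fit in the suffix bits")
        prefix = gray_code(rank)
        for slot, token in enumerate(tokens):
            # Placeholder suffix: the slot index, not a semantically optimized code.
            codebook[token] = (prefix << suffix_bits) | slot
    return codebook

# Four clusters of two tokens each: 2 prefix bits + 1 suffix bit per token.
book = build_codebook([[10, 11], [20, 21], [30, 31], [40, 41]], q=2, suffix_bits=1)
# Flipping the single suffix bit of token 10 lands on token 11 in the same cluster.
assert book[10] ^ 1 == book[11]
```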
Cluster-Prioritized Power Allocation
Transmission robustness is further enhanced by an unequal power allocation strategy. The symbols carrying cluster-prefix bits, which are critical for preserving coarse semantics, are assigned higher transmit power; the symbols carrying suffix bits, responsible for fine-grained token distinction, receive the remaining budget. The target symbol error rate (SER) for the prefix is chosen as a function of symbol SNR, balancing semantic-region reliability against the overall power constraint.
The target prefix SER follows an exponential function $\varepsilon^*(\gamma) = \eta e^{-\theta\gamma}$, where $\gamma$ is the SNR and $\eta$, $\theta$ are empirically fitted parameters, enabling adaptive protection as a function of channel conditions. The allocation converges to equal power at high SNR but favors prefix reliability at lower SNR.
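To make the idea concrete, the sketch below turns the exponential target SER into a prefix/suffix power split. It relies on several assumptions that the summary does not state: BPSK over an AWGN channel (SER = Q(sqrt(2·SNR))), an SNR expressed in linear scale inside the exponential, unit noise power, and hypothetical values for η and θ. It should be read as an illustration of the mechanism, not as the paper's allocation rule.

```python
import math
from statistics import NormalDist

def target_prefix_ser(snr_linear, eta, theta):
    """Exponential target SER for prefix symbols: eta * exp(-theta * snr)."""
    return eta * math.exp(-theta * snr_linear)

def allocate_power(snr_db, q, total_bits, eta=0.05, theta=0.5):
    """Split a per-token power budget between q prefix and (total_bits - q) suffix symbols.

    eta and theta are hypothetical placeholders, not fitted values from the paper.
    Returns (prefix_snr, suffix_snr) per symbol in linear scale.
    """
    snr = 10 ** (snr_db / 10)  # average per-symbol SNR under equal power
    eps = min(max(target_prefix_ser(snr, eta, theta), 1e-300), 0.5)
    # Assumed BPSK/AWGN model: SER = Q(sqrt(2*snr))  =>  snr_needed = Qinv(eps)^2 / 2
    q_inv = -NormalDist().inv_cdf(eps)
    needed = q_inv ** 2 / 2
    prefix_snr = max(needed, snr)                         # never below the equal share
    prefix_snr = min(prefix_snr, snr * total_bits / q)    # never above the whole budget
    suffix_snr = (total_bits * snr - q * prefix_snr) / (total_bits - q)
    return prefix_snr, suffix_snr
```

Under these assumptions the prefix receives extra power only when equal allocation would miss the target SER, so the split falls back to equal power once the channel is good enough.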
Figure 3: Optimal target SER $\varepsilon^*$ vs. SNR, showing exponential fit and design bounds for cluster-prefix protection.
Semantic-Aware Token Reconstruction
Decoding splits each received bit sequence into prefix and suffix segments. The prefix identifies the candidate cluster, and the received suffix is matched to the token in that cluster whose suffix has the minimum Hamming distance. As long as the prefix is recovered correctly, the output token therefore stays semantically close to the original even when suffix bits are in error. The final reconstructed sentence is compared to the original via cosine similarity of sentence embeddings.
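The decoding rule can be sketched in a few lines. The data layout (a dictionary from prefix to a dictionary of suffix codes), the example tokens, and the fallback to the nearest valid prefix when the received prefix is unused are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of differing bits between two integer codewords."""
    return bin(a ^ b).count("1")

def decode_token(received, suffix_bits, clusters):
    """Decode one received codeword.

    clusters: dict mapping prefix -> dict mapping suffix code -> token.
    """
    prefix = received >> suffix_bits
    suffix = received & ((1 << suffix_bits) - 1)
    if prefix not in clusters:
        # Assumed fallback: pick the valid prefix closest in Hamming distance.
        prefix = min(clusters, key=lambda p: hamming(p, prefix))
    members = clusters[prefix]
    best_suffix = min(members, key=lambda s: hamming(s, suffix))
    return members[best_suffix]

# Hypothetical codebook: 2 prefix bits, 2 suffix bits.
clusters = {
    0b00: {0b00: "cat", 0b01: "kitten", 0b11: "feline"},
    0b01: {0b00: "car", 0b01: "truck"},
}
print(decode_token(0b0001, 2, clusters))  # -> kitten
print(decode_token(0b0111, 2, clusters))  # -> truck
```

In the second call the received suffix `11` is not assigned in the vehicle cluster, so the decoder returns the nearest assigned suffix and the output remains a vehicle.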
Numerical Results
Evaluation across COCO Captions, Flickr30k, and QQP datasets demonstrates consistent semantic similarity gains by H-TokCom over conventional TokCom and semantic error correction (SEC)-enhanced TokCom. Hierarchical mapping alone significantly improves robustness, and cluster-prioritized power allocation further enhances performance, particularly at low-to-moderate SNR.
At $\gamma = 3$ dB on COCO, the semantic similarity increases from $0.206$ to $0.279$ (a relative gain of $35.4\%$). Similar behavior is observed across all datasets, and SEC methods underperform at low SNR due to unreliable contextual correction.
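As a quick check, the quoted relative gain follows directly from the two similarity values:

$$\frac{0.279 - 0.206}{0.206} = \frac{0.073}{0.206} \approx 0.354 \quad (35.4\%)$$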
Figure 4: Semantic similarity vs. SNR for all baseline and proposed schemes, demonstrating the advantages of hierarchical mapping and power allocation.
Implications and Future Directions
The integration of semantic structure into physical-layer design marks a substantive advance in semantic communication, bridging AI-driven recovery with communication-layer robustness. H-TokCom's techniques allow for semantic-preserving transmission without relying on large AI modules, reducing computational complexity and improving practical performance in distributed settings or bandwidth-constrained environments.
This framework lays groundwork for further research, including hierarchy-aware semantic error correction, adaptive clustering schemes, and extensions to multi-user or multi-modal communication scenarios. The theoretical implications suggest new approaches to the joint optimization of semantic-aware mapping, channel coding, and transmit resource management, potentially informing the design of future AI-native communication systems.
Conclusion
H-TokCom introduces a semantics-aware hierarchical framework for token communication, jointly addressing clustering, bit mapping, and power allocation at the communication layer. Through embedding-based clustering, hierarchical mapping, and adaptive power prioritization, the paradigm significantly enhances semantic robustness in noisy channels. The results establish H-TokCom as an effective method for semantically reliable transmission, with broad implications for AI-native and token-centric communication architectures. Future directions include synergy with advanced SEC methods and further integration of AI and communication strategies for robust cross-modal and agentic systems.