
Context-Aware Iterative Token Detection and Masked Transmission for Wireless Token Communication

Published 25 Jan 2026 in eess.SP and cs.AI | (2601.17770v1)

Abstract: The success of large-scale language models has established tokens as compact and meaningful units for natural-language representation, which motivates token communication over wireless channels, where tokens are considered fundamental units for wireless transmission. We propose a context-aware token communication framework that uses a pretrained masked language model (MLM) as a shared contextual probability model between the transmitter (Tx) and receiver (Rx). At Rx, we develop an iterative token detection method that jointly exploits MLM-guided contextual priors and channel observations based on a Bayesian perspective. At Tx, we additionally introduce a context-aware masking strategy which skips transmission of highly predictable tokens to reduce the transmission rate. Simulation results demonstrate that the proposed framework substantially improves reconstructed sentence quality and supports effective rate adaptation under various channel conditions.

Summary

  • The paper introduces a probabilistic framework using masked language models to adapt transmission rates by selectively masking predictable tokens.
  • It applies an iterative MAP-based detection method at the receiver, integrating channel likelihoods with contextual priors for enhanced token inference.
  • Empirical results demonstrate significant semantic preservation and robust rate adaptation under various channel conditions and masking ratios.

Introduction

This paper presents a probabilistic framework for wireless transmission of language tokens that uses masked language models (MLMs) for both adaptive transmission and robust token detection. The growing use of token-based representations in downstream applications motivates a shift in communication design: tokens are not all treated equally, and contextual redundancy is exploited both to reduce the transmission rate and to counteract channel errors. The MLM plays a dual role. At the transmitter (Tx), it guides context-aware masking that omits highly predictable tokens; at the receiver (Rx), it enables iterative MAP-based token inference that fuses channel observations with contextual priors.

Figure 1: An illustration of the proposed token communication framework.

System Model

The system models a wireless channel where a sequence of language tokens is encoded, masked, and modulated using standard digital communication schemes (e.g., QAM) before experiencing Rayleigh block fading and AWGN. Crucially, a subset of tokens is masked at the Tx based on MLM-driven entropy metrics, reducing physical-layer bandwidth. The unmasked tokens are converted to binary, mapped onto signal constellations, and sent over the air. At the Rx, both the channel likelihoods and MLM-based priors are integrated to reconstruct the original sequence. Masked positions receive no physical channel observation, and their likelihoods are set uniformly over the vocabulary. The framework is agnostic to the tokenizer and modular in terms of modulation and channel models.
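The transmit chain and the per-token channel likelihood described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the fixed-length binary token encoding, the Gray-coded bit mapping, the choice of one fading gain per token, perfect knowledge of that gain at the Rx, and all function names are assumptions introduced here.

```python
import math
import random

# Assumed Gray-coded 4-PAM levels per bit pair; 16-QAM uses one pair per axis.
PAM4 = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}
NORM = math.sqrt(10)  # unnormalized 16-QAM has average symbol energy 10


def token_to_bits(token_id, n_bits):
    """Fixed-length binary representation, most significant bit first."""
    return [(token_id >> k) & 1 for k in range(n_bits - 1, -1, -1)]


def modulate(token_id, n_bits):
    """Map one token to n_bits / 4 unit-average-energy 16-QAM symbols."""
    bits = token_to_bits(token_id, n_bits)  # n_bits must be a multiple of 4
    return [
        complex(PAM4[(bits[k], bits[k + 1])], PAM4[(bits[k + 2], bits[k + 3])]) / NORM
        for k in range(0, n_bits, 4)
    ]


def channel(symbols, snr_db, rng):
    """Rayleigh block fading (assumed: one gain per token) plus complex AWGN."""
    h = complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
    sigma2 = 10 ** (-snr_db / 10)  # noise variance for unit signal energy
    s = math.sqrt(sigma2 / 2)      # per-dimension noise standard deviation
    y = [h * x + complex(rng.gauss(0, s), rng.gauss(0, s)) for x in symbols]
    return y, h, sigma2


def log_likelihoods(y, h, sigma2, vocab_size, n_bits):
    """log P(y_i | w_i = v), up to an additive constant, for every candidate v."""
    if y is None:  # masked position: no observation, so a uniform likelihood
        return [0.0] * vocab_size
    return [
        -sum(abs(a - h * b) ** 2 for a, b in zip(y, modulate(v, n_bits))) / sigma2
        for v in range(vocab_size)
    ]


# Example: send token 37 from a toy 256-token vocabulary (8 bits, 2 symbols).
rng = random.Random(0)
y, h, sigma2 = channel(modulate(37, 8), snr_db=20, rng=rng)
ll = log_likelihoods(y, h, sigma2, vocab_size=256, n_bits=8)
ml_token = max(range(256), key=ll.__getitem__)  # channel-only (ML) estimate
```

A real tokenizer vocabulary is far larger than 256 entries, so a practical receiver would likely restrict the candidate set rather than score every token exhaustively.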

Context-Aware Inference and Masking

Iterative Bayesian Token Detection at the Receiver

The main technical innovation at the Rx is an iterative variant of MAP estimation for tokens. In each iteration, the Rx uses the MLM to compute conditional priors $P(w_i \mid \hat{\mathbf{w}}_{-i})$, leveraging the current hypothesis sequence. These priors are combined with the observed channel likelihoods $P(\mathbf{y}_i \mid w_i)$ in a Bayesian posterior scoring, refining token estimates. The iterative process is analogous to turbo-style single-sequence approximations, where the maximization over all possible context sequences $\mathbf{w}_{-i}$ is approximated by the current best sequence, addressing the intractable dependencies in natural language token sequences.
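One way to read the update is that each position is rescored as $\log P(\mathbf{y}_i \mid w_i) + \log P(w_i \mid \hat{\mathbf{w}}_{-i})$ and the highest-scoring token becomes the new estimate. The sketch below illustrates that loop under several assumptions that are not taken from the paper: all positions are updated in parallel from the previous hypothesis, the loop stops early when nothing changes, and `log_prior_fn` is a hypothetical interface standing in for the shared MLM (a toy cyclic prior is used here so the example runs without a model).

```python
import math


def iterative_map_detect(log_lik, log_prior_fn, n_iters=5):
    """Iterative single-sequence MAP approximation (illustrative sketch).

    log_lik[i][v]          : log P(y_i | w_i = v); uniform for masked positions.
    log_prior_fn(i, w_hat) : list of log P(w_i = v | w_hat_{-i}) over all v.
    """
    n = len(log_lik)
    vocab = range(len(log_lik[0]))
    # Initialize with the channel-only (maximum-likelihood) estimate.
    w_hat = [max(vocab, key=log_lik[i].__getitem__) for i in range(n)]
    for _ in range(n_iters):
        new = []
        for i in range(n):
            prior = log_prior_fn(i, w_hat)
            # Posterior score = channel log-likelihood + contextual log-prior.
            new.append(max(vocab, key=lambda v: log_lik[i][v] + prior[v]))
        if new == w_hat:  # assumed stopping rule: hypothesis no longer changes
            break
        w_hat = new
    return w_hat


def toy_log_prior(i, w_hat):
    """Hypothetical stand-in for an MLM: favors the cyclic pattern 0, 1, 2, ..."""
    if i > 0:
        favored = (w_hat[i - 1] + 1) % 3
    else:
        favored = (w_hat[1] - 1) % 3
    return [math.log(0.8 if v == favored else 0.1) for v in range(3)]


# Position 1 carries no channel information (e.g. it was masked at the Tx).
log_lik = [[0.0, -10.0, -10.0], [0.0, 0.0, 0.0], [-10.0, -10.0, 0.0]]
detected = iterative_map_detect(log_lik, toy_log_prior)  # -> [0, 1, 2]
```

In this toy case the channel-only initialization is `[0, 0, 2]`, and the contextual prior corrects the uninformed middle position in the first iteration.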

Context-Aware Masking at the Transmitter

At the Tx, the same MLM provides entropy scores quantifying the predictability of each token given the rest of the sequence. Tokens with the lowest entropy are greedily selected for masking, subject to a global masking ratio $r$. Only the non-masked tokens are encoded and transmitted. This entropy-based selection is justified by the capacity of the MLM to capture both local and long-range dependencies, which empirical results show preserves task-relevant semantic information even at high masking rates.
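The selection rule can be sketched as below, assuming the MLM's predictive distribution for each position is already available as a list of probabilities. Two simplifications are assumptions made here rather than details from the paper: entropies are computed once and ranked (instead of being recomputed after each masking decision), and the number of masked tokens is the floor of $rN$. The function names are illustrative.

```python
import math


def entropy(probs):
    """Shannon entropy in nats of one predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_mask(pred_dists, r):
    """Return the sorted positions to mask: the floor(r * N) lowest-entropy ones."""
    n_mask = int(r * len(pred_dists))
    ranked = sorted(range(len(pred_dists)), key=lambda i: entropy(pred_dists[i]))
    return sorted(ranked[:n_mask])


# Four positions over a toy 4-token vocabulary; position 0 is nearly certain.
pred_dists = [
    [0.97, 0.01, 0.01, 0.01],
    [0.25, 0.25, 0.25, 0.25],
    [0.70, 0.10, 0.10, 0.10],
    [0.40, 0.30, 0.20, 0.10],
]
masked = select_mask(pred_dists, r=0.5)                          # -> [0, 2]
transmitted = [i for i in range(len(pred_dists)) if i not in masked]  # -> [1, 3]
```

Low entropy means the MLM is confident about the token from context alone, so those positions are the cheapest to omit.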

Experimental Results

The framework is evaluated on standard corpora (Europarl, WikiText-103), with BERT as the shared MLM and sentence-level cosine similarity (SIM) as the main measure of semantic fidelity between transmitted and reconstructed sentences. Modulation is 16-QAM over Rayleigh fading.
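For reference, cosine similarity between two sentence embeddings can be computed as in the sketch below. The sentence encoder that produces the embeddings is not named in this summary, so the vectors here are placeholders and the function name is illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, nonzero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Placeholder embeddings of a transmitted and a reconstructed sentence.
sim = cosine_similarity([0.2, 0.9, 0.1], [0.25, 0.8, 0.0])
```

A value of 1 means the two embeddings point in the same direction, and lower values indicate that the reconstruction has drifted semantically from the original.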

Figure 2: Performance of the proposed iterative token detection strategy (Rx-side only) under different iteration counts.

Results indicate substantial improvements over conventional maximum-likelihood (ML) symbol detection, particularly at low SNR and as the number of Rx iterations increases. For full-token transmission (no masking), the iterative procedure raises SIM by up to 0.1769 (Europarl) and 0.1558 (WikiText-103) relative to ML detection, with diminishing returns beyond five iterations. This indicates that fusing channel evidence with semantic priors mitigates the effects of noise and fading.

Figure 3: Performance comparison of the joint Tx–Rx strategy with different masking ratios and masking policies.

In the rate-adaptive regime, context-aware masking exhibits strong robustness compared to random masking baselines. At a masking ratio $r = 0.1$, the joint Tx–Rx approach achieves minimal SIM degradation relative to full transmission, whereas random masking causes substantial semantic loss. Even for $r = 0.3$, context-aware masking maintains significantly higher SIM, which highlights the importance of choosing semantically redundant tokens to omit rather than omitting tokens at random.

Theoretical and Practical Implications

The framework integrates NLP-based priors into the wireless communication stack, formalizing a form of semantic communication in which probabilistic context models drive adaptation at both the encoder and the decoder. This narrows the gap between context-agnostic physical-layer methods and the requirements of language-based downstream tasks, and it offers a MAP-based interface that can adapt to varying SNR and application requirements.

In practice, this line of work enables:

  • Semantic-aware rate adaptation in language-oriented IoT, conversational AI, and collaborative edge devices.
  • Joint source-channel coding mechanisms inherently robust to packet loss, erasure, and adversarial environments, as the context model can fill in semantic gaps.
  • Extensions beyond text, motivating the use of large multimodal models as context priors in joint vision-language communication.

Conclusion

A Bayesian-contextual framework for wireless token transmission is introduced, leveraging MLM priors for both mask-based rate adaptation at the Tx and MAP-based iterative token detection at the Rx. Empirical results confirm improved semantic preservation and rate-efficiency under a range of challenging channel conditions. The modularity of the MLM prior paves the way for integrating more advanced or task-specific models, as well as the extension to multimodal and cross-lingual communication scenarios.
