
Token-Domain Multiple Access (ToDMA)

Updated 27 February 2026
  • Token-Domain Multiple Access (ToDMA) is a semantic communications framework that encodes data into token sequences using shared pre-trained tokenization and modulation codebooks.
  • It achieves high transmission efficiency and low end-to-end latency, with semantic quality metrics (PSNR for images, BERTScore for text) close to those of orthogonal schemes.
  • The framework employs compressed sensing, channel clustering, and masked token prediction via multimodal language models to effectively resolve over 90% of token collisions.

Token-Domain Multiple Access (ToDMA) is a large-model-driven grant-free multiple access framework for semantic communications in which a massive set of devices encode, transmit, and recover information in the token domain. ToDMA leverages pre-trained tokenization and modulation codebooks, compressed sensing for token detection, channel state information (CSI) clustering for user separation, and context-aware masked token prediction using multimodal LLMs (MLLMs) to resolve token collisions. This pipeline achieves high transmission efficiency and low end-to-end latency, with semantic-level quality superior to both orthogonal and context-unaware non-orthogonal communication schemes, across text and image modalities (Qiao et al., 16 May 2025, Qiao et al., 10 Feb 2025).

1. Architecture and Signal Processing Pipeline

ToDMA defines a unified transmitter and receiver pipeline implemented as follows:

  • Tokenization (Source Coding): Each device encodes its source data $s$ (such as an image patch or text sequence) into a sequence of tokens via a shared learned tokenizer,

$$T: s \mapsto (t_1, \dots, t_N)$$

where each $t_n \in \{1, \dots, Q\}$ indexes a common token codebook of size $Q$. Tokens are equivalently represented as one-hot vectors $\mathbf{b}_n \in \{0,1\}^Q$ (Qiao et al., 16 May 2025, Qiao et al., 10 Feb 2025).

  • Modulation (Channel Coding): Token indices are mapped by a shared complex-valued modulation codebook $\mathbf{U} \in \mathbb{C}^{L \times Q}$,

$$m: t_n \mapsto \mathbf{x}_n = \mathbf{U}\mathbf{b}_n \in \mathbb{C}^L$$

so each device transmits over $N$ time slots the codeword matrix $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_N] \in \mathbb{C}^{L \times N}$. Orthonormality $\mathbf{U}^H \mathbf{U} = \mathbf{I}_Q$ is enforced for efficient projection and separation (Qiao et al., 10 Feb 2025).

  • Channel Model: In a grant-free uplink, $K \ll K_T$ active devices simultaneously transmit to an $M$-antenna base station (BS). The received signal at slot $n$ is

$$\mathbf{Y}_n = \sum_{k=1}^K \mathbf{x}_{k,n} \mathbf{h}_k^T + \mathbf{Z}_n = \mathbf{U}\left( \sum_{k=1}^K \mathbf{b}_{k,n} \mathbf{h}_k^T \right) + \mathbf{Z}_n$$

where $\mathbf{h}_k \in \mathbb{C}^M$ is the (slot-invariant) channel of device $k$ and $\mathbf{Z}_n$ is i.i.d. additive white Gaussian noise (Qiao et al., 16 May 2025, Qiao et al., 10 Feb 2025).
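As a concrete toy sketch of this transmitter pipeline (shared orthonormal codebook, one-hot token modulation, superimposed uplink reception), consider the following NumPy illustration. All dimensions, the QR-based codebook construction, and the noiseless matched-filter check are our own illustrative assumptions, not details from the papers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy dimensions: codebook size Q, spreading length L >= Q
# (so U^H U = I_Q is achievable), M antennas, K active devices, N slots.
Q, L, M, K, N = 16, 32, 4, 3, 5

# Shared modulation codebook with orthonormal columns via QR decomposition.
A = rng.standard_normal((L, Q)) + 1j * rng.standard_normal((L, Q))
U, _ = np.linalg.qr(A)                       # U: L x Q, with U^H U = I_Q

tokens = rng.integers(0, Q, size=(K, N))     # each device's token sequence
H = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))  # CSI h_k

def received_signal(n, sigma=0.0):
    """Superimposed uplink signal Y_n = sum_k U b_{k,n} h_k^T + Z_n."""
    Y = np.zeros((L, M), dtype=complex)
    for k in range(K):
        # U b_{k,n} selects the column of U indexed by device k's token.
        Y += np.outer(U[:, tokens[k, n]], H[k])
    return Y + sigma * (rng.standard_normal((L, M))
                        + 1j * rng.standard_normal((L, M)))

# Noiseless sanity check: matched filtering U^H Y_n yields a row-sparse
# Q x M matrix whose nonzero rows are exactly the tokens sent in slot 0.
Y0 = received_signal(0)
H_hat = U.conj().T @ Y0
row_energy = np.linalg.norm(H_hat, axis=1)
active = {int(t) for t in tokens[:, 0]}
```

In the noiseless case the inactive rows of $\hat{\mathbf{H}}_0$ vanish exactly, which is the row-sparsity that the compressed-sensing detector in the next section exploits.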

2. Receiver Design: Token Detection and Channel Clustering

The base station implements a multi-step recovery procedure:

  • Compressed Sensing Token Detection: For each time slot, the receiver projects $\mathbf{Y}_n$ onto the columns of $\mathbf{U}$, forming

$$\hat{\mathbf{H}}_n = \mathbf{U}^H \mathbf{Y}_n = \mathbf{H}_n + \mathbf{U}^H \mathbf{Z}_n$$

where $\mathbf{H}_n \in \mathbb{C}^{Q \times M}$ is row-sparse, encoding which tokens are active. Approximate message passing (AMP) is employed to detect the active token set $\hat{\mathcal{P}}_n$ and estimate the corresponding CSI vectors $\{\hat{\mathbf{h}}_{\phi, n}\}$ (Qiao et al., 16 May 2025).

  • Channel Clustering for Source Assignment: Each device's channel remains constant over the $N$ slots. For robust token-to-user assignment, the receiver clusters the estimated CSI vectors $\widehat{\mathcal{F}} = \{\hat{\mathbf{h}}_{\phi, n}\}$ into $K$ groups using K-means++,

$$\min_{\{\mathcal{C}_k\}} \sum_{k=1}^K \sum_{\mathbf{h} \in \mathcal{C}_k} \|\mathbf{h} - \mathbf{c}_k\|_2^2$$

Each detected token is then assigned to the nearest cluster center (user), yielding a partially reconstructed token sequence per device (Qiao et al., 16 May 2025).
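The clustering step can be sketched as follows. This is a minimal K-means with k-means++ seeding on complex CSI vectors; the simulated estimates (true channels plus small noise), dimensions, and noise level are hypothetical stand-ins for AMP's per-token CSI outputs:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: K devices with distinct channels in C^M; across slots
# the detector yields one noisy CSI estimate per detected collision-free
# token, each close to some device's true channel h_k.
K, M, N = 3, 8, 40
H_true = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))
labels = rng.integers(0, K, size=N)          # which device produced each estimate
F = H_true[labels] + 1e-3 * (rng.standard_normal((N, M))
                             + 1j * rng.standard_normal((N, M)))

def kmeans_pp(X, k, iters=20, rng=rng):
    """Minimal K-means with k-means++ seeding on complex vectors."""
    # k-means++ seeding: later seeds are drawn proportional to squared
    # distance from the nearest already-chosen seed.
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min([np.sum(np.abs(X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    C = np.array(centers)
    # Lloyd iterations: assign to nearest center, then recompute centers.
    for _ in range(iters):
        assign = np.argmin([np.sum(np.abs(X - c) ** 2, axis=1) for c in C], axis=0)
        C = np.array([X[assign == j].mean(axis=0) if np.any(assign == j) else C[j]
                      for j in range(k)])
    return assign

assign = kmeans_pp(F, K)
```

Each resulting cluster collects the CSI estimates of one device, so every detected token inherits a user label from its nearest cluster center.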

3. Semantic Orthogonality and Collision Mitigation

  • Semantic Orthogonality: Token sequences from different devices occupy distant regions in a learned embedding space, as measured by low average cosine similarity or large distributional divergence:

$$\text{sim}(B_i,B_j) = \frac{1}{N}\sum_{n=1}^N \frac{\langle E_{t_{i,n}}, E_{t_{j,n}} \rangle}{\|E_{t_{i,n}}\| \cdot \|E_{t_{j,n}}\|}$$

This separation enables disentangling of overlapping transmissions even when individual tokens collide (Qiao et al., 10 Feb 2025).

  • Collision Handling via Masked Prediction: When multiple devices choose the same token in the same slot, clustering cannot resolve the assignment, and the corresponding positions are masked ($[\mathrm{MASK}]$). Each device's partial token sequence is then completed by a pre-trained multimodal LLM (MLLM), e.g., BERT (text) or MaskGIT (image), which predicts candidate tokens in context, restricted to the detected collision set $\widetilde{\mathcal{P}}_n$:

$$\hat{t}_m = \arg\max_{q \in \widetilde{\mathcal{P}}_n} P(t_m = q \mid \text{context})$$

This targeted search reduces masked-token prediction complexity from $O(Q)$ to $O(|\widetilde{\mathcal{P}}_n|)$ (Qiao et al., 16 May 2025, Qiao et al., 10 Feb 2025).
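The restricted argmax itself is straightforward once the context model has produced per-token scores. In this sketch the logits array stands in for the output of a hypothetical MLLM forward pass; the vocabulary size and values are invented for illustration:

```python
import numpy as np

def resolve_masked_token(logits, collision_set):
    """Pick the most probable token for a masked slot, restricted to the
    detected collision candidates: O(|collision_set|) instead of O(Q)."""
    cand = np.fromiter(collision_set, dtype=int)
    return int(cand[np.argmax(logits[cand])])

# Toy example with Q = 10: the context model's globally preferred token
# (index 7) is not among the colliding candidates {2, 5}, so the restricted
# argmax picks the best-scoring candidate instead.
logits = np.array([0.1, 0.0, 0.8, 0.2, 0.1, 1.2, 0.3, 2.0, 0.0, 0.4])
chosen = resolve_masked_token(logits, {2, 5})
```

Restricting the search to $\widetilde{\mathcal{P}}_n$ also acts as a consistency constraint: the completion can only use tokens that were actually observed colliding in that slot.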

4. Bayesian Interpretation and Receiver Inference

Maximum a posteriori (MAP) inference for the joint set of tokens transmitted by all devices,

$$\{\hat{t}_{k,n}\} = \arg\max_{\{t_{k,n}\}} P\left(\{t_{k,n}\} \mid \{\mathbf{Y}_n\}_{n=1}^N, \mathbf{U}, \{\mathbf{h}_k\}\right),$$

is formally intractable. ToDMA operationalizes an approximate MAP solution by chaining:

  • Matched-filter token detection and energy thresholding;
  • Nearest-CSI channel assignment (clustering);
  • Contextual masked token re-prediction, as a conditional MAP estimate restricted to detected collision candidates, with the transformer acting as a probabilistic scorer (Qiao et al., 10 Feb 2025).
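Under strong assumptions (conditional independence across slots and detected tokens; this notation is ours, not the papers'), the chained procedure can be read as greedily maximizing a factored surrogate of the intractable joint posterior:

```latex
P\big(\{t_{k,n}\} \mid \{\mathbf{Y}_n\}\big) \;\approx\;
\underbrace{\prod_{n=1}^{N} P\big(\hat{\mathcal{P}}_n \mid \mathbf{Y}_n, \mathbf{U}\big)}_{\text{token detection}}
\cdot
\underbrace{\prod_{n=1}^{N} \prod_{\phi \in \hat{\mathcal{P}}_n} P\big(k_\phi \mid \hat{\mathbf{h}}_{\phi,n}, \{\mathbf{c}_k\}\big)}_{\text{CSI clustering}}
\cdot
\underbrace{\prod_{m \in \mathcal{M}} P\big(t_m \mid \text{context}, \widetilde{\mathcal{P}}_n\big)}_{\text{masked prediction}}
```

Each factor is maximized in turn given the previous stage's output, which is why the final masked-prediction step is a conditional MAP estimate over the collision candidates rather than a joint one.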

5. Practical Implementation and Complexity

A single shared tokenizer and modulation codebook ensure coherent symbol mapping and efficient grant-free random access across a large device set. Transformer-based tokenizers provide context-sensitive token quantization, increasing semantic robustness (Qiao et al., 10 Feb 2025, Qiao et al., 16 May 2025).

Per-slot receiver complexity is determined by:

| Operation | Complexity | Dominant Factors |
| --- | --- | --- |
| Token Detection | $O(NLM + QM)$ | Matrix projections |
| Assignment | $O(K \lvert\mathcal{P}_n\rvert M)$ | Channel clustering; $\lvert\mathcal{P}_n\rvert \approx K$ |
| Masked Prediction | $O(KNd^2)$ per sample | Transformer embedding dimension $d$; reduced by candidate restriction |

The candidate-set restriction for masked prediction permits efficient use of large token vocabularies ($Q \approx 10^3$–$10^4$) and long sequences (Qiao et al., 10 Feb 2025).

Recommended system design principles include:

  • $M \geq Q$ BS antennas for under-determined least-squares separation;
  • Codebook orthonormality and sufficient transformer depth (8–12 layers) for effective mask completion;
  • Thresholds ($T_h \approx 2\sigma^2$) calibrated for detection/assignment accuracy (Qiao et al., 10 Feb 2025).
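The threshold rule can be illustrated with a small simulation. Here we assume $T_h \approx 2\sigma^2$ is applied to the per-antenna average energy of each row of the projected matrix $\hat{\mathbf{H}}_n$; the dimensions, noise level, and this per-row formulation are our own assumptions, since the source does not spell out the calibration procedure:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed setup: Q codebook rows, M antennas, K active tokens, noise
# variance sigma^2 per complex entry after projection.
Q, M, K, sigma2 = 64, 64, 4, 0.01
active = rng.choice(Q, size=K, replace=False)

# Noise-only rows of U^H Y_n have per-antenna average energy near sigma^2;
# active rows additionally carry device channel energy far above it.
H_hat = (np.sqrt(sigma2 / 2)
         * (rng.standard_normal((Q, M)) + 1j * rng.standard_normal((Q, M))))
H_hat[active] += (rng.standard_normal((K, M))
                  + 1j * rng.standard_normal((K, M))) / np.sqrt(2)

avg_energy = np.mean(np.abs(H_hat) ** 2, axis=1)
T_h = 2 * sigma2                       # the T_h ~ 2 sigma^2 rule of thumb
detected = np.flatnonzero(avg_energy > T_h)
```

With many antennas the noise-row energies concentrate tightly around $\sigma^2$, so a margin of $2\sigma^2$ separates active from inactive rows with high probability.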

6. Performance Evaluation

Simulation studies in image (ImageNet-100, VQ-GAN, MaskGIT) and text (QUOTES500K, BERT) transmission scenarios confirm the following performance characteristics (Qiao et al., 16 May 2025, Qiao et al., 10 Feb 2025):

  • Token Error Rate (TER): ToDMA maintains a TER close to that of error-free orthogonal QAM (Orth-Com) as the active device count $K$ grows, whereas context-unaware non-orthogonal schemes degrade rapidly under collisions.
  • Semantic Quality: ToDMA's image PSNR remains within 1–2 dB of Orth-Com up to $K = 80$ (Orth-Com at PSNR ≈ 32 dB for $K = 20$); text BERTScore ≈ 0.92 at $K = 20$ (vs. 0.78 for the non-contextual baseline). Visual reconstructions are nearly indistinguishable from the orthogonal scheme (Qiao et al., 16 May 2025).
  • Latency: ToDMA achieves $\sim 4\times$ lower end-to-end latency than Orth-Com at typical BER targets, owing to non-orthogonal, grant-free access and minimal pre-transmission coordination.
  • Collision Recovery: MLLM-based masked token prediction successfully resolves >90% of token collisions, as shown in token-map heatmaps and output quality (Qiao et al., 16 May 2025, Qiao et al., 10 Feb 2025).

7. Insights, Limitations, and Design Implications

Key properties of ToDMA, supported by empirical results and theoretical design, include:

  • Exploitation of semantic orthogonality enables disentanglement of overlapping codewords at the receiver, leveraging the low inter-sequence similarity in embedding space.
  • Joint source-channel coding is realized in the token domain, bypassing the inefficiency of separate bit-wise coding.
  • Grant-free operation with global codebook/tokenizer coordination dramatically reduces signaling overhead and scales to large device populations.
  • Leveraging masked tokens and MLLMs for context-driven completion enables robust resolution of collisions with tractable inference complexity.
  • A plausible implication is that future ToDMA extensions could adaptively tune codebook size, sequence length, and transformer depth to modality and application requirements.

ToDMA currently assumes shared tokenization/modulation infrastructure and well-calibrated model/antenna configurations; deviations from these assumptions may degrade separation or completion accuracy. The reliance on large pre-trained context models (e.g., MaskGIT, BERT) imposes scalability constraints in resource-limited settings. These considerations delineate active research directions within semantic communications leveraging token-domain access mechanisms (Qiao et al., 16 May 2025, Qiao et al., 10 Feb 2025).
