Full-Duplex Models: Advances in Wireless & Dialogue
- Full-duplex models are communication architectures enabling simultaneous bidirectional transmission by mitigating self-interference and cross-link interference.
- In wireless systems, these models use cascaded analog and digital cancellation to achieve near-doubling of throughput and significant rate enhancements.
- In spoken dialogue, full-duplex designs synchronize listening and speaking to support natural, overlapping interactions with improved responsiveness.
Full-duplex (FD) models, in the context of communication systems and spoken dialogue agents, refer to architectures, protocols, and analytical frameworks enabling simultaneous bidirectional transmission—either in the electromagnetic spectrum (wireless radios, cellular networks, mmWave links) or in real-time interactive systems (spoken dialogue models)—without the conventional half-duplex constraint of strict time or channel separation. Realizing practical full-duplex operation entails overcoming self-interference (SI), nontrivial cross-link interference (CLI), and implications for scheduling, rate allocation, or dialogue turn-taking. Recent engineering and machine learning advances have driven FD from conceptual hardware demonstrations to large-scale networking protocols, detailed analytical models, and spoken LLMs capable of overlapping multi-speaker interaction. Below is a comprehensive survey of full-duplex models spanning wireless communications and interactive neural systems.
1. Full-Duplex in Wireless Communication: System Models and Signal Processing
Early full-duplex models established the canonical baseband equation at a radio node as
$$y = h_{\mathrm{SI}}\, x + s + n,$$
with $x$ the node's own transmission, $h_{\mathrm{SI}}$ the self-interference channel, $s$ the signal-of-interest, and $n$ additive noise/impairment terms (Ahmed et al., 2013). In practical prototypes, such as the SDR system in (Chung et al., 2015), analog and digital SI cancellation are cascaded:
- Analog cancellation: Uses dual-polarized antennas for initial 42 dB passive SI suppression and active analog filters for tunable amplitude, phase, and delay control, achieving an aggregate 60 dB analog-domain suppression.
- Digital cancellation: After digitization, an adaptive self-interference channel estimate reconstructs and subtracts the SI based on knowledge of the transmitted symbols, pushing residual SI power below the thermal noise floor (e.g., 103 dB total suppression for near-ideal FD gains).
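The digital stage described above can be illustrated with a minimal sketch. It assumes a single-tap (frequency-flat) SI channel and a training interval in which only the node's own signal and noise are received; the function names, the channel value, and the noise level are illustrative assumptions, not the estimator used in (Chung et al., 2015).

```python
import cmath
import math
import random


def estimate_si_channel(tx, rx):
    """Least-squares estimate of a single-tap SI channel: h = <tx, rx> / <tx, tx>."""
    num = sum(x.conjugate() * y for x, y in zip(tx, rx))
    den = sum(abs(x) ** 2 for x in tx)
    return num / den


def cancel_si(tx, rx, h_est):
    """Reconstruct the SI from the known transmit samples and subtract it."""
    return [y - h_est * x for x, y in zip(tx, rx)]


def power_db(samples):
    """Average sample power in dB (relative to unit power)."""
    p = sum(abs(s) ** 2 for s in samples) / len(samples)
    return 10 * math.log10(p)


def demo(n=4000, seed=0):
    rng = random.Random(seed)
    h_si = 0.8 * cmath.exp(1j * 0.6)   # assumed residual SI channel after analog stage
    noise_std = 1e-3                   # assumed per-component receiver noise
    # Known QPSK transmit symbols with unit power.
    tx = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2)
          for _ in range(n)]
    rx = [h_si * x + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
          for x in tx]
    h_est = estimate_si_channel(tx, rx)
    residual = cancel_si(tx, rx, h_est)
    return power_db(rx), power_db(residual)
```

Calling `demo()` returns the received power before and after cancellation; in this idealized setting the residual sits at roughly the noise level. Real SI channels are frequency-selective and nonlinear, so practical cancellers use multi-tap and nonlinear models.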
Performance is quantified via the achievable rate and compared to half-duplex rates, with measured gains close to the theoretical 2× limit, provided residual SI is below the noise floor (Chung et al., 2015).
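To see why the gain approaches 2× only when residual SI falls below the noise floor, consider a simplified symmetric two-way link. The sketch below assumes Shannon-rate expressions, equal SNR in both directions, and half-duplex time sharing; these are textbook simplifications rather than the measurement setup of (Chung et al., 2015), and all function names are illustrative.

```python
import math


def db_to_linear(db):
    return 10 ** (db / 10)


def fd_sum_rate(snr_db, inr_db):
    """Two-way sum rate (bps/Hz) when both directions transmit at once.

    inr_db is the residual-SI-to-noise ratio; each direction sees
    SINR = SNR / (1 + INR).
    """
    snr = db_to_linear(snr_db)
    inr = db_to_linear(inr_db)
    return 2 * math.log2(1 + snr / (1 + inr))


def hd_sum_rate(snr_db):
    """Two-way sum rate (bps/Hz) with each direction active half the time."""
    return math.log2(1 + db_to_linear(snr_db))


def fd_gain(snr_db, inr_db):
    """Ratio of full-duplex to half-duplex sum rate."""
    return fd_sum_rate(snr_db, inr_db) / hd_sum_rate(snr_db)
```

Under these assumptions, `fd_gain(20, -20)` (residual SI 20 dB below the noise) is just under 2, while `fd_gain(20, 20)` (residual SI 20 dB above the noise) drops below 1, meaning full duplex performs worse than half duplex.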
Analog baseband cancellation architectures have also been validated experimentally, with patch-antenna RF isolation (30–40 dB) followed by analog baseband adaptive filtering, yielding measurable gains up to 10 dB in cancellation depth, 2.5 bps/Hz in rate, and orders-of-magnitude BER improvement compared to RF-only cancellation (Kaufman et al., 2013, Kaufman et al., 2013). The frequency-selectivity of the SI channel, bandwidth of the cancellation network, and group-delay mismatches are captured in channel- and device-specific models, and guide the joint analog/digital design trade-offs.
2. Analytical and Capacity Models: Rate, DoF, and Resource Allocation
More general FD system models capture realistic SI and interference dynamics. For MIMO full-duplex relaying systems, (Shende et al., 2013) introduces a self-interference power law in which cancellation efficacy is parameterized by experimental parameters for passive/active suppression. Achievable rates and degrees of freedom (DoF) are derived for various antenna/RF-chain constraints, yielding design formulas for antenna splits, power scaling, and mode switching. Key results include the full-duplex DoF relative to the half-duplex DoF, with regime boundaries determined by the cancellation efficacy and SNR scaling (Shende et al., 2013).
In multi-channel (OFDM) FD architectures relevant for small-form-factor devices, self-interference suppression becomes frequency-selective. The residual SI power at the mobile on each subchannel is modeled as a function of the subchannel and of the placement of the analog cancellation null. Joint optimization over power allocation across channels and null placement becomes a nontrivial biconvex program, with near-optimal polynomial-time algorithms leveraging biconcavity and bounded gradients in (Marašević et al., 2015). Measured RFIC platforms closely track this model and demonstrate the necessity of at least 80–100 dB aggregate cancellation (analog + digital) to realize significant FD rate gains on realistic mobile hardware (Marašević et al., 2015).
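The role of null placement can be sketched with a toy example. The quadratic residual-SI profile below is a hypothetical stand-in chosen only for illustration, not the model of (Marašević et al., 2015), and the power allocation is held fixed so that only the null position is searched. This corresponds to one half of an alternating scheme for a biconvex problem.

```python
import math


def residual_si(k, null_pos, floor=1e-3, curvature=0.05):
    """Hypothetical residual-SI gain on subchannel k.

    Smallest at the cancellation null and growing quadratically with the
    distance from it (an assumed shape, for illustration only).
    """
    return floor + curvature * (k - null_pos) ** 2


def downlink_sum_rate(null_pos, rx_snr, tx_power, noise=1.0):
    """Sum rate (bps/Hz) received at the mobile while it also transmits.

    rx_snr[k] is the SI-free SNR on subchannel k; tx_power is the mobile's
    own per-subchannel transmit power, which leaks in as residual SI.
    """
    total = 0.0
    for k, snr in enumerate(rx_snr):
        interference = tx_power * residual_si(k, null_pos)
        total += math.log2(1.0 + snr * noise / (noise + interference))
    return total


def best_null(rx_snr, tx_power):
    """Exhaustive search over integer null positions for fixed powers."""
    candidates = range(len(rx_snr))
    return max(candidates, key=lambda c: downlink_sum_rate(c, rx_snr, tx_power))
```

With a flat SNR profile such as `best_null([100.0] * 9, 10.0)`, the search places the null on the center subchannel, which keeps every subchannel as close to the null as possible. A full solver would alternate this step with a power-allocation update.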
3. Interference Management, Network Protocols, and Infrastructure Implications
Full-duplex operation in networked environments necessitates careful management of cross-link interference (CLI), inter-node interference (INI), and resource allocation. Empirical and analytical models (Kim et al., 8 Feb 2024) detail:
- Signal-level residual SI: Post-cancellation power typically , with –120 dB cumulative suppression (antenna, analog, digital).
- Cross-link interference (CLI): Arises where a BS's uplink user's transmission creates interference at the downlink user, parametrized by path loss and log-normal shadowing.
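As an illustration of the CLI parametrization, the sketch below uses a generic log-distance path-loss model with log-normal shadowing. The exponent, reference loss, and shadowing standard deviation are placeholder values assumed for the example, not parameters from (Kim et al., 8 Feb 2024).

```python
import math
import random


def path_loss_db(distance_m, exponent=3.5, ref_loss_db=40.0, ref_distance_m=1.0):
    """Log-distance path loss in dB (all default parameters are assumptions)."""
    return ref_loss_db + 10 * exponent * math.log10(distance_m / ref_distance_m)


def cli_power_dbm(tx_power_dbm, distance_m, shadowing_db=0.0):
    """CLI power at the downlink user caused by an uplink user's transmission."""
    return tx_power_dbm - path_loss_db(distance_m) - shadowing_db


def sample_cli_dbm(tx_power_dbm, distance_m, sigma_db=8.0, rng=None):
    """Draw one CLI realization; shadowing is Gaussian in dB (log-normal in linear scale)."""
    rng = rng or random.Random()
    return cli_power_dbm(tx_power_dbm, distance_m, rng.gauss(0.0, sigma_db))
```

For example, a 23 dBm uplink user 10 m from the downlink user gives `cli_power_dbm(23, 10)` of -52 dBm under these placeholder parameters. User pairing schemes aim to keep this value low by scheduling well-separated uplink/downlink pairs.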
Full-duplex MAC protocols extend CSMA/CA to enable FD opportunities, exploiting the AP's ability to reply simultaneously. Markov-chain analyses (Doost-Mohammady et al., 2015) yield closed-form throughput expressions, explicitly modeling covered/hidden terminal collision domains and showing network throughput improvements of 35–40% over half-duplex (HD) even under non-idealities. Probabilistic MAC scheduling (LP-based with optimal allocation of full- and half-duplex epochs) can attain up to 2.70× the throughput of legacy HD scheduling, accommodating fairness, heterogeneity, and distributed operation (Chen et al., 2016).
At the network layer, analytical models encompass user pairing, joint UL/DL scheduling, and inter-cell interference and CLI, and are routinely mapped into 3D ray-traced system-level simulation (SLS) environments for performance evaluation under realistic urban densities, path loss, and shadowing (Kim et al., 8 Feb 2024, Psomas et al., 2015). Outage probability models for two- and three-node architectures rigorously quantify the effect of residual SI, path-loss exponents, user density, transmit power, and SI suppression on the overall rate distribution (Psomas et al., 2015).
4. Full-Duplex in Spoken Dialogue: Models, Synchronization, and Evaluation
Full-duplex spoken dialogue models (FD-SLMs) are defined by their ability to listen and speak concurrently, crucial for natural conversational timing, backchanneling, real-time interruption, and repair (Chen et al., 18 Sep 2025). Architectures fall into two principal categories:
A. Engineered Synchronization:
These systems use explicit controllers, such as finite-state machines (FSM), voice-activity detectors (VAD), or neural FSMs implemented as special tokens, to manage listen/speak arbitration atop sequential ASR/LLM/TTS cascades (Wang et al., 29 May 2024). For instance, the neural FSM in (Wang et al., 29 May 2024) uses state/control tokens to switch between LISTEN and SPEAK, governed by a transition function (see Eq. (1)). This FSM-aware LLM, operating as a next-token predictor, achieves lower response latency than half-duplex baselines and improves interruption-handling precision by 8% over the best commercial competitor at comparable model scale.
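The token-driven listen/speak arbitration can be sketched as a two-state machine. The control-token strings and function names below are hypothetical placeholders; the actual token vocabulary and transition function of (Wang et al., 29 May 2024) are not reproduced here.

```python
LISTEN, SPEAK = "LISTEN", "SPEAK"

# Hypothetical control tokens; a real model defines its own special tokens.
TOK_START_SPEAK = "<speak>"
TOK_STOP_SPEAK = "<listen>"
CONTROL_TOKENS = {TOK_START_SPEAK, TOK_STOP_SPEAK}


def transition(state, token):
    """Return the next state given the current state and a predicted token."""
    if state == LISTEN and token == TOK_START_SPEAK:
        return SPEAK
    if state == SPEAK and token == TOK_STOP_SPEAK:
        return LISTEN
    return state


def run(tokens, state=LISTEN):
    """Consume a predicted token stream; only content tokens in SPEAK are voiced."""
    spoken = []
    for tok in tokens:
        if tok not in CONTROL_TOKENS and state == SPEAK:
            spoken.append(tok)  # would be forwarded to streaming TTS
        state = transition(state, tok)
    return state, spoken
```

For instance, `run(["hmm", "<speak>", "hello", "there", "<listen>", "ignored"])` ends in the LISTEN state having voiced only `"hello"` and `"there"`. In a deployed system, the stop token would typically be predicted when the user barges in, which is what makes interruption handling a next-token prediction problem.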
B. Learned Synchronization:
These are end-to-end architectures trained directly on synchronous bidirectional data, using joint or conditional sequence modeling (e.g., NTPP, dGSLM, Moshi) and discrete or continuous representations of audio streams (Chen et al., 18 Sep 2025). These models often approach human-level MOS but still lag on temporal metrics (e.g., first-turn offset, interruption response) and on semantic coherence under high-overlap conditions.
Unified evaluation frameworks, such as Full-Duplex-Bench (Lin et al., 6 Mar 2025) and FD-Bench v1.5 (Lin et al., 30 Jul 2025), stress-test models on overlap management: interruption handling, backchannel robustness, prosodic adaptation, and latency/quality trade-offs. Metrics include behavioral arbitration (respond/resume/etc.), stop/response latency, overlap-WER, and prosodic modulation. Notably, models exhibit a dichotomy between repair-first strategies (rapid yielding to user barge-in) and continuity-first (maintaining response coherence under overlap), with different suitability depending on application context (Lin et al., 30 Jul 2025). Current FD-SLMs, both modular and end-to-end, remain challenged on fluid response to interruptions, cross-talker distinction, and acoustic robustness under noise and far-field speech (Lin et al., 6 Mar 2025, Peng et al., 25 Jul 2025).
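Latency-style metrics of this kind can be computed from event timestamps. The definitions below are illustrative assumptions (stop latency as the time from barge-in onset to the agent falling silent, response latency as the gap between the end of the user's turn and the start of the agent's reply) and are not the exact definitions used by the cited benchmarks.

```python
def stop_latency(interrupt_start, agent_stop):
    """Seconds from user barge-in onset until the agent stops speaking.

    Returns None when the agent never yields (agent_stop is None).
    """
    if agent_stop is None:
        return None
    return max(0.0, agent_stop - interrupt_start)


def response_latency(user_turn_end, agent_response_start):
    """Gap in seconds before the agent replies; negative values mean overlap."""
    return agent_response_start - user_turn_end


def summarize_interruptions(events):
    """Aggregate (interrupt_start, agent_stop) pairs.

    Returns (mean stop latency over yielded cases, fraction of cases yielded).
    """
    latencies = [stop_latency(s, e) for s, e in events]
    yielded = [lat for lat in latencies if lat is not None]
    mean = sum(yielded) / len(yielded) if yielded else None
    rate = len(yielded) / len(events) if events else 0.0
    return mean, rate
```

A repair-first model would show a high yield rate with short stop latencies, whereas a continuity-first model would show a lower yield rate under the same interruptions.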
5. Design Trade-offs, Hardware and Implementation Constraints
Across domains, full-duplex models are constrained by:
- Residual SI and dynamic range: Analog/digital cancellation must suppress SI below the noise floor to avoid overwhelming receiver ADCs or masking the signal-of-interest. This is exacerbated in high-power or wideband systems, with hardware impairments (PA nonlinearity, I/Q imbalance, phase noise) further complicating SI cancellation (Chung et al., 2015).
- Scheduling complexity: For MAC and network layers, practical algorithms (e.g., biconvex optimization, LP scheduling, distributed contention mechanisms) balance throughput, fairness, and implementation overhead (Marašević et al., 2015, Chen et al., 2016).
- Dialogic fluidity: In FD-SLMs, fast event-loop orchestration (ASR chunking, streaming TTS, LLM stateful decoding) and precise state control are critical to avoid premature takeovers or incoherent interruptions (Wang et al., 29 May 2024, Chen et al., 18 Sep 2025).
- Resource constraints: FPGA/CPU limitations for real-time signal processing, VAD/ASR chunk size trade-offs for dialogue latency, and memory/compute capacity for LLM-based FD dialogue remain key constraints in deployment (Chung et al., 2015, Wang et al., 29 May 2024).
6. Impact, Evaluation Benchmarks, and Open Challenges
Full-duplex designs have nearly doubled measured throughput in prototype wireless systems (Chung et al., 2015), supported 35–40% network-wide gains in CSMA/CA-based WLANs (Doost-Mohammady et al., 2015), and enabled real-time overlapping interaction in neural dialogue agents (Wang et al., 29 May 2024, Lin et al., 6 Mar 2025). Comprehensive benchmarks for FD spoken dialogue now incorporate behavioral, latency, and quality axes, with large-scale, synthetically or empirically constructed test corpora (Peng et al., 25 Jul 2025, Chen et al., 18 Sep 2025).
Persisting open challenges include:
- Synchronous data scarcity: Realistic large-scale, multi-channel training corpora for FD-SLMs are lacking (Chen et al., 18 Sep 2025).
- Robust real-time SI cancellation: Especially under hardware impairments and for wideband/millimeter-wave links with strong near-field coupling and complex environmental reflections (Roberts et al., 2023).
- Evaluation beyond rate/MOS: Multi-dimensional metrics targeting conversational repair, proactive behaviors, and safety in dialogue agents, and fairness/coverage in wireless resources (Chen et al., 18 Sep 2025).
- Dynamic mode switching: Hybrid HD/FD transceivers that adapt their operating mode in situ to traffic, SI, and interference are reported to offer the best trade-offs for joint communication-sensing (ISAC) or for dialogue fluidity versus performance (Wang et al., 2022).
Ongoing research continues to integrate hybrid architectures, advanced SI estimation/cancellation, end-to-end FD machine learning, and comprehensive multi-axis benchmarks, driving FD models toward robust deployment in both wireless and interactive human-AI systems.