Two-Stage Multi-Beam Training Method

Updated 14 January 2026

Wang et al. (7 Jan 2026) report that two-stage multi-beam training can reduce beam training measurements by up to 81% compared to exhaustive search while maintaining high success rates.
The method is sequential: an initial coarse multi-beam sweep identifies candidate directions, and a refinement stage uses dense measurements for accurate beam selection.
The method balances training overhead and accuracy across various architectures, including analog, hybrid, and digital systems, enabling efficient directional communication.

A two-stage multi-beam training method is a sequential beam selection and refinement scheme designed to minimize training overhead while preserving the accuracy of beam alignment for directional communication, particularly in mmWave and THz wireless systems. Such approaches divide the beamspace search into two coordinated phases: an initial coarse candidate-identification stage using multi-beam patterns or analog/digital codebooks, followed by a disambiguation or refinement stage using narrowed candidate sets, higher-resolution probing, or algorithmic cross-validation. This design targets the fundamental challenge of balancing rapid link establishment against the growth in required measurements with large antenna arrays and dense angular grids. Two-stage multi-beam training has been shown to substantially reduce the number of required beam training measurements relative to exhaustive or single-stage approaches while retaining, and sometimes even improving, the success rate of correct beam identification (Wang et al., 7 Jan 2026, Zhou et al., 2024, Yang et al., 2021, Ghassemi et al., 2024, Yang et al., 2022).

1. Core Principles and General Methodology

Two-stage multi-beam training schemes rest on the observation that, for large arrays and dense angular grids, most candidate beam directions can be ruled out after an initial set of measurements. The first stage typically uses spatially multiplexed or structured multi-beam patterns to identify a manageable set of potential user directions, drawing on sparse subarray geometries, multi-modal sensor data, or hierarchical codebooks. The second stage then resolves ambiguity among these candidates using more focused measurements, often with dense subarrays, digital refinement, cross-validation, or learning-based decision policies.

The general workflow can be characterized as follows:

Stage I (Multi-beam sweeping with sparse/analog or partitioned codebooks):
- The antenna array is partitioned (often sparsely) and excited to form high-gain, multiple simultaneous beams covering the angular domain in an overlapping or interleaved fashion.
- Received measurement energy across multi-beam codewords is used to create a candidate set of possible target angles (or, in near-field, range–angle pairs).

Stage II (Refinement with dense/analog/digital or adaptive codebooks):
- The array is re-partitioned (densely) or digitally controlled to form wider but steerable beams, or to iteratively probe within candidate sectors.
- Algorithms such as cross-validation, Q-learning, or clustering are applied to efficiently resolve ambiguity.
- Final beam(s) or beam group index are selected for data transmission.

This approach can be implemented with pure analog, hybrid analog–digital, or fully digital architectures, and is compatible with hierarchical, learning-based, or cross-domain sensor fusion paradigms (Wang et al., 7 Jan 2026, Ghassemi et al., 2024, Yang et al., 2022).
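The workflow above can be illustrated with a small toy model. The sketch below is not any cited paper's algorithm: it assumes a noise-free channel, an interleaved assignment of grid directions to multi-beam codewords, and a Stage II that simply probes each surviving candidate with a single beam. The names `two_stage_search` and `make_channel` are hypothetical.

```python
def two_stage_search(power_of, num_dirs, beams_per_codeword):
    """Toy two-stage search over `num_dirs` grid directions.

    power_of(directions) returns the measured power when one codeword
    simultaneously covers the listed directions (hypothetical interface).
    Returns (selected_direction, measurements_used).
    """
    assert num_dirs % beams_per_codeword == 0
    num_codewords = num_dirs // beams_per_codeword

    # Stage I: codeword l covers the interleaved set {l, l + L, l + 2L, ...}.
    codewords = [list(range(l, num_dirs, num_codewords))
                 for l in range(num_codewords)]
    stage1 = [power_of(cw) for cw in codewords]
    best = max(range(num_codewords), key=stage1.__getitem__)
    candidates = codewords[best]

    # Stage II: probe every candidate with a single beam and keep the strongest.
    stage2 = [power_of([d]) for d in candidates]
    winner = candidates[max(range(len(candidates)), key=stage2.__getitem__)]
    return winner, num_codewords + len(candidates)


def make_channel(true_dir):
    """Assumed noise-free channel: transmit power is split across the beams."""
    def power_of(directions):
        return 1.0 / len(directions) if true_dir in directions else 0.0
    return power_of


winner, used = two_stage_search(make_channel(37), num_dirs=64, beams_per_codeword=8)
print(winner, used)
```

In this toy setting the search needs 8 + 8 measurements instead of 64. Real schemes replace the per-candidate Stage II sweep with cheaper procedures such as cross-validation.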

2. Representative Mathematical Formulations

A canonical example is the method of Wang et al. (7 Jan 2026), which structures the procedure as follows:

Stage I (Sparse-Subarray Multi-Beam Sweeping):

Partition the $N$-element ULA into $M^{(I)}$ interleaved sparse subarrays, each of size $N_1 = N/M^{(I)}$, with inter-element spacing $d_1 = M^{(I)} d_0$ within each subarray ($d_0 = \lambda/2$).
Each codeword $\mathbf w^{(I)}$ is an interleaved stack of $M^{(I)}$ subarray vectors, producing $Q^{(I)} = (M^{(I)})^2$ narrow beams.
Full coverage of the angular domain requires $L^{(I)} = N/(M^{(I)})^2$ codewords.
Candidate angles for user $k$ are determined by received power maximization:

$\ell_k = \arg\max_\ell P_k^{(I)}(\ell), \qquad \Omega_k^{(I)} = \{ -1+\tfrac{2\ell_k-1}{N} + \tfrac{2(q-1)}{(M^{(I)})^2} : q=1,\dots, Q^{(I)} \}$
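As a minimal sketch, the candidate set $\Omega_k^{(I)}$ can be evaluated directly from the expression above. The function name `candidate_angles` is hypothetical, and exact fractions are used only to make the grid structure easy to inspect.

```python
from fractions import Fraction


def candidate_angles(ell_k, N, M):
    """Candidate spatial angles for the best Stage I codeword index ell_k (1-based).

    Implements -1 + (2*ell_k - 1)/N + 2*(q - 1)/M**2 for q = 1, ..., M**2.
    """
    Q = M * M
    base = Fraction(-1) + Fraction(2 * ell_k - 1, N)
    return [base + Fraction(2 * (q - 1), Q) for q in range(1, Q + 1)]


N, M = 256, 4
omega = candidate_angles(ell_k=1, N=N, M=M)
print(len(omega), omega[0], omega[1] - omega[0])

# Sanity check: over all N / M**2 codewords, the candidate sets tile the N-point grid.
all_angles = {a for ell in range(1, N // (M * M) + 1)
              for a in candidate_angles(ell, N, M)}
print(len(all_angles))
```

For $N=256$ and $M=4$ each codeword yields 16 candidates spaced $1/8$ apart, and the 16 codewords together cover all 256 grid angles exactly once.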

Stage II (Dense-Subarray Cross-Validation):

Re-partition the array into $M^{(II)}$ dense subarrays of size $N_2 = N/M^{(II)}$ with $d_2 = d_0$.
Each wide-beam codeword $\mathbf w^{(II)}$ forms $M^{(II)}$ wider beams, each with 3 dB beamwidth $2M^{(II)}/N$.
An iterative cross-validation procedure subdivides the angular domain and selects among candidates with a binary search–like overhead of $1 + \log_2 M^{(II)}$.
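The logarithmic term in the Stage II overhead can be understood through a generic halving search. The sketch below is not the cross-validation procedure of the cited paper: it assumes an idealized yes/no probe (`contains_target`, a hypothetical callback) that reports whether the true direction lies in a subset of candidates, and it does not model the additional measurement in the $1 + \log_2 M^{(II)}$ count.

```python
def halving_search(candidates, contains_target):
    """Narrow `candidates` to a single item using yes/no probes on the lower half.

    Returns (remaining_candidate, number_of_probes).
    """
    remaining = list(candidates)
    probes = 0
    while len(remaining) > 1:
        half = len(remaining) // 2
        lower, upper = remaining[:half], remaining[half:]
        probes += 1  # one measurement decides which half survives
        remaining = lower if contains_target(lower) else upper
    return remaining[0], probes


target = 5
found, probes = halving_search(range(8), lambda subset: target in subset)
print(found, probes)
```

With 8 candidates the loop halves the set three times, so the probe count grows as $\log_2$ of the number of candidates rather than linearly.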

Overhead:

$T = \frac{N}{M^2} \left(2 + \frac12 \log_2 Q\right)$

where $M = M^{(I)} = M^{(II)}$ and $Q = M^2$.

3. Implementation Variants and Extensions

Multiple high-impact papers illustrate alternative strategies for two-stage multi-beam training:

Near-Field Sparse Activation: Zhou et al. (2024) introduce sparse linear array (SLA) multi-beam codebooks that exploit grating lobes in the near field. Stage I uses $Q$ SLA codewords, each producing $M$ simultaneous beams to cover the range–angle sector, while Stage II sweeps $M$ single-beam codewords across candidate locations for ambiguity resolution.
Deep Learning–Based Hierarchical Codebooks: Learning-based hierarchical codebooks (Yang et al., 2022, Yang et al., 2023) implement a two-tier procedure: Tier-1 probes the channel using a coarse codebook and selects a candidate group via an MLP selector; Tier-2 sweeps a fine codebook within the selected group. The full DNN is trained in two steps: first the tier-1 codebook and selector, then the fine codebooks and final predictors.
Reinforcement Learning and Multi-Modal Fusion: Ghassemi et al. (2024) combine a multi-modal transformer (MMT) for group prediction with Q-learning for intra-group beam selection. The MMT reduces the decision space from 64 to 8 beam groups, using transformer-encoded sensor data, after which a tabular RL agent selects the optimal beam index within the selected group.
Analog/Digital Hybrid Codebooks: In large-scale MIMO, hybrid analog–digital beam training alternates analog multi-beam scanning across subarrays with digital combining for fine discrimination, as in the THBT method (Chen et al., 2023). The first stage is performed per subarray using far-field approximations; digital combining then selects the optimal joint codeword using stored subarray measurements, followed by BRPSS (closed-form phase tracking) for post-selection refinement.
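The group-then-beam selection described for the reinforcement learning variant above can be sketched with a small tabular agent. This is an illustration under several assumptions, not the cited implementation: the 64 beams are assumed to split into 8 contiguous groups of 8, the group index is supplied directly (the transformer predictor is not modeled), the agent is a stateless epsilon-greedy Q-learner, and `gain_of` is a hypothetical noise-free reward function.

```python
import random


def select_beam_in_group(gain_of, group, beams_per_group=8,
                         episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Pick a beam inside `group` with stateless epsilon-greedy Q-learning.

    gain_of(beam_index) returns the reward (e.g. normalized beamforming gain).
    """
    rng = random.Random(seed)
    q = [0.0] * beams_per_group  # one Q-value per beam in the group
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.randrange(beams_per_group)                   # explore
        else:
            action = max(range(beams_per_group), key=q.__getitem__)   # exploit
        reward = gain_of(group * beams_per_group + action)
        q[action] += alpha * (reward - q[action])  # no next state, so no bootstrap term
    best_local = max(range(beams_per_group), key=q.__getitem__)
    return group * beams_per_group + best_local


# Assumed reward: gain peaks at beam 43 and decays with index distance.
def gain_of(beam):
    return 1.0 / (1 + abs(beam - 43))


predicted_group = 43 // 8  # stand-in for the Stage I group predictor
print(select_beam_in_group(gain_of, predicted_group))
```

The point of the split is that the agent only has to explore 8 actions instead of 64, so far fewer probing episodes are needed than for a flat search.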

4. Performance Metrics and Comparative Analysis

The effectiveness of two-stage multi-beam training is measured by the tradeoff between training overhead, correct beam-identification rate (success rate), and spectral efficiency. Representative findings include:

Scheme ($N=256$, $K=5$)	Overhead $T$	Success rate $R_s$ at SNR = -18.3 dB
Single-beam exhaustive	256	0.98
Dense-subarray multi-beam	64	0.72
Antenna sparse-activation	96	0.60
2-tier hierarchical	64	0.65
Two-stage multi-beam (Q=16)	48	0.87

Table: Representative results from (Wang et al., 7 Jan 2026).
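The overhead savings implied by the table can be checked with a few lines of arithmetic. The sketch below only re-uses the numbers listed in the table; the helper name `overhead_reduction` is hypothetical.

```python
EXHAUSTIVE_OVERHEAD = 256  # single-beam exhaustive search, from the table

# (overhead T, success rate R_s) copied from the table above
schemes = {
    "Dense-subarray multi-beam": (64, 0.72),
    "Antenna sparse-activation": (96, 0.60),
    "2-tier hierarchical": (64, 0.65),
    "Two-stage multi-beam (Q=16)": (48, 0.87),
}


def overhead_reduction(overhead, baseline=EXHAUSTIVE_OVERHEAD):
    """Fractional saving in measurements relative to the baseline."""
    return 1 - overhead / baseline


for name, (overhead, success) in schemes.items():
    print(f"{name}: {overhead_reduction(overhead):.2%} fewer measurements, "
          f"R_s = {success}")
```

The two-stage entry corresponds to 1 - 48/256 = 81.25%, which matches the roughly 81% reduction quoted for this method.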

Key findings across systems and SNR regimes:

Two-stage methods reduce measurement overhead by up to 81% relative to exhaustive search at a modest cost in identification accuracy; in the table above, the success rate falls from 0.98 to 0.87 at -18.3 dB (Wang et al., 7 Jan 2026, Zhou et al., 2024).
At that SNR, the two-stage scheme outperforms the dense-subarray, sparse-activation, and hierarchical benchmarks in the table by 15, 27, and 22 percentage points, respectively.
Hybrid analog–digital and learning-based two-stage methods demonstrate similar order-of-magnitude reduction in RF training overhead with high robustness to noise and multipath (Yang et al., 2021, Yang et al., 2022, Ghassemi et al., 2024).

5. Algorithmic and Structural Tradeoffs

The success of two-stage strategies relies critically on choices of subarray partitioning, codebook structure, and candidate set size:

Sparse subarrays in Stage I reduce the number of codewords needed for full angular coverage, because each codeword radiates several narrow beams at once, but they may introduce inter-beam interference if not properly designed.
Dense subarrays in Stage II produce wider, highly steerable beams used for ambiguity resolution and rapid cross-validation.
The optimal number of subarrays and per-stage codewords is problem-dependent. For $N=256$ and $Q=16$, taking $M=4$ yields the best tradeoff for moderate $K$ (Wang et al., 7 Jan 2026).
Cross-validation and iterative candidate reduction (as opposed to full exhaustive or simple top-k selection) significantly contribute to reduced overhead and high identification rates (Wang et al., 7 Jan 2026, Ghassemi et al., 2024).
Deep learning or reinforcement learning can serve as either the group predictor (analogous to Stage I) or the fine selector (analogous to Stage II), with or without traditional codebooks (Ghassemi et al., 2024, Yang et al., 2022).
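The contrast between sparse and dense subarrays noted above can be seen in the array factor of a uniformly weighted subarray. The following is a minimal sketch under assumed sizes ($N=64$, $M=4$, chosen for illustration and not taken from the cited papers); it counts how many directions in $[-1, 1)$ receive essentially full gain when the element spacing is $M d_0$ versus $d_0$.

```python
import cmath
import math


def array_gain(u, u0, num_elements, spacing):
    """Normalized array-factor magnitude of a uniformly weighted subarray.

    u, u0   : observed and steered spatial angles (sin theta), in [-1, 1)
    spacing : element spacing in units of d0 = lambda/2
    """
    total = sum(cmath.exp(1j * math.pi * spacing * n * (u - u0))
                for n in range(num_elements))
    return abs(total) / num_elements


def count_main_lobes(u0, num_elements, spacing, grid=256, threshold=0.999):
    """Count grid angles in [-1, 1) where the gain is essentially at its peak."""
    angles = [-1 + 2 * i / grid for i in range(grid)]
    return sum(array_gain(u, u0, num_elements, spacing) > threshold
               for u in angles)


N, M = 64, 4  # illustrative sizes only
sparse_lobes = count_main_lobes(-0.75, N // M, spacing=M)  # spacing M * d0
dense_lobes = count_main_lobes(-0.75, N // M, spacing=1)   # spacing d0
print(sparse_lobes, dense_lobes)
```

Under these assumptions the sparse subarray produces $M$ equal-gain lobes (grating lobes spaced $2/M$ apart), while the dense subarray with the same number of elements produces a single, wider lobe. This is why the sparse stage can cover many directions per codeword but leaves an ambiguity for the dense stage to resolve.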

6. Limitations, Generalizations, and Outlook

While two-stage multi-beam training methods have proven efficacy, several limitations and domain-specific caveats are noted:

On-grid quantization: The achievable resolution is fundamentally bounded by the codebook grid; users located between grid angles suffer a gain loss that can reduce the final accuracy (Zhou et al., 2024, Wang et al., 7 Jan 2026).
Noise and SNR robustness: At low SNR, Stage I candidate selection can dominate the miss probability; robust codeword design or increased gain per beam may be required.
Inter-beam interference: Simultaneous (multi-beam) patterns may exacerbate angular ambiguities in channels with high spatial correlation or strong scatterers.
System calibration: Hybrid methods require careful calibration between analog/digital stages; digital combining may require significant computational resources in massive MIMO.
Sensor and data constraints: Learning-based and multi-modal methods rely on the availability of suitably calibrated input modalities and sufficient training data (Ghassemi et al., 2024).
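The on-grid quantization point above can be made concrete with the standard array-factor expression for a half-wavelength ULA. The sketch assumes a single full-array beam on a grid with spacing $2/N$ in the spatial angle domain, so the worst-case offset is $1/N$; the function name `offgrid_gain` is hypothetical.

```python
import math


def offgrid_gain(N, delta):
    """Normalized power gain of an N-element half-wavelength ULA beam
    when the user is offset by `delta` (in sin-theta units) from the beam center."""
    if delta == 0:
        return 1.0
    num = math.sin(math.pi * N * delta / 2)
    den = N * math.sin(math.pi * delta / 2)
    return (num / den) ** 2


N = 256
worst_case_db = 10 * math.log10(offgrid_gain(N, 1 / N))  # user midway between beams
print(f"{worst_case_db:.2f} dB")
```

For large $N$ the worst-case gain approaches $(2/\pi)^2$, a loss of roughly 3.9 dB, which indicates how much margin an on-grid codebook may give up for a user midway between two grid angles.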

Future directions include integrating these methods into joint channel estimation/tracking pipelines, fully unifying online learning and sensing paradigms, and further adapting to the demands of near-field, ultra-massive MIMO, and high-mobility scenarios. The general two-stage design principle is expected to remain central as the scale and heterogeneity of communications systems increase.