
Targeted Beam Alignment Strategy (2PHTS)

Updated 4 February 2026
  • Targeted beam alignment is an algorithmic framework using multi-armed bandit theory and phased search to efficiently pinpoint optimal beams in high-dimensional spaces.
  • It exploits angular correlation and heteroscedastic reward modeling to greatly reduce alignment latency and measurement overhead compared to exhaustive search methods.
  • The 2PHTS approach demonstrates a 4x–20x reduction in probe count and over 97% beam detection accuracy in practical mmWave systems.

A targeted beam alignment strategy is an algorithmic framework for efficiently and reliably identifying optimal beam directions in wireless, optical, or particle beam delivery systems. Central to these strategies are mechanisms that minimize alignment latency and measurement overhead while attaining a high probability of correct beam selection, often by exploiting correlation and structure in the angular domain or the beam codebook. Contemporary approaches blend pure-exploration multi-armed bandit theory, heteroscedastic reward modeling, and phased search procedures, as exemplified by the Two-Phase Heteroscedastic Track-and-Stop (2PHTS) algorithm (Wei et al., 2022). These techniques are particularly salient in millimeter-wave (mmWave) systems, where the beam space is large and probing resources are scarce, but they also apply to THz links and to general high-dimensional alignment settings.

1. Problem Formulation: Beam Alignment as Structured Pure Exploration

Targeted beam alignment is cast as a pure-exploration multi-armed bandit (MAB) problem. The transmitter is equipped with a fixed analog beamforming codebook $\mathcal{C} = \{ \mathbf{f}_0,\ldots,\mathbf{f}_{K-1} \}$, each $\mathbf{f}_k$ representing a unit-norm $N$-antenna beam. Probing an arm $k$ means transmitting a pilot on $\mathbf{f}_k$ and receiving

$$R(\mathbf{f}_k) = | \sqrt{p}\, \mathbf{h}^\mathsf{H} \mathbf{f}_k + n |^2 \sim \mathcal{N}(\mu_k, 2\sigma^2 \mu_k),$$

with mean $\mu_k = p |\mathbf{h}^\mathsf{H} \mathbf{f}_k|^2$ and noise $n \sim \mathcal{CN}(0,\sigma^2)$. The objective is $(\delta,J)$-PAC: to find $k^* = \arg\max_k \mu_k$ such that $P(k^\pi = k^*) \ge 1-\delta$, where $k^\pi$ is the beam the algorithm recommends, while minimizing the total number of required probes $\tau$. The method leverages local angular correlation, recognizing that beams within a window of $J$ consecutive indices have similar expected rewards.
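To make the measurement model concrete, the following sketch simulates probing a single beam and compares the sample statistics with the Gaussian model. The half-wavelength uniform linear array, the DFT-style codebook, the single-path channel, and every function name here are illustrative assumptions, not details taken from Wei et al. (2022).

```python
import cmath
import math
import random

def steering_vector(n_antennas, spatial_freq):
    """Unit-norm response of a half-wavelength uniform linear array (assumed geometry)."""
    scale = 1.0 / math.sqrt(n_antennas)
    return [scale * cmath.exp(1j * math.pi * spatial_freq * m) for m in range(n_antennas)]

def make_codebook(n_antennas, n_beams):
    """K unit-norm beams with spatial frequencies spread uniformly over [-1, 1)."""
    return [steering_vector(n_antennas, -1.0 + 2.0 * k / n_beams) for k in range(n_beams)]

def inner(h, f):
    """h^H f for two equal-length complex vectors."""
    return sum(hm.conjugate() * fm for hm, fm in zip(h, f))

def mean_reward(h, f, p):
    """mu_k = p * |h^H f_k|^2."""
    return p * abs(inner(h, f)) ** 2

def probe(h, f, p, sigma2, rng):
    """One noisy reward |sqrt(p) h^H f + n|^2 with n ~ CN(0, sigma2)."""
    s = math.sqrt(sigma2 / 2.0)  # per-component std of the complex noise
    noise = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
    return abs(math.sqrt(p) * inner(h, f) + noise) ** 2

# Toy setup (all values are illustrative assumptions).
rng = random.Random(0)
N, K, p, sigma2 = 16, 32, 1.0, 0.01
codebook = make_codebook(N, K)
h = [math.sqrt(N) * x for x in steering_vector(N, 0.25)]  # single-path channel
mus = [mean_reward(h, f, p) for f in codebook]
k_star = max(range(K), key=mus.__getitem__)

# Probe the best beam repeatedly and compare with the model variance 2*sigma2*mu.
samples = [probe(h, codebook[k_star], p, sigma2, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / (len(samples) - 1)
print(k_star, mus[k_star], sample_mean, sample_var, 2 * sigma2 * mus[k_star])
```

For this noise model the exact mean and variance of the reward are $\mu_k + \sigma^2$ and $2\sigma^2\mu_k + \sigma^4$, so the Gaussian model $\mathcal{N}(\mu_k, 2\sigma^2\mu_k)$ is best read as a high-SNR approximation; the sketch uses a high-SNR setting where the two agree closely.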

2. Metric Model and Confidence Bounds: Heteroscedasticity and KL Analysis

Unlike classic MAB settings with homoscedastic Gaussian noise, beam alignment rewards exhibit heteroscedasticity:

  • Each base-arm $k$ yields $R_k \sim \mathcal{N}(\mu_k, 2\sigma^2 \mu_k)$.
  • Super-arms $S$ (sets of $J$ consecutive beams) aggregate rewards: $R_S \sim \mathcal{N}(\mu_S, 2\sigma^2 \mu_S)$ with $\mu_S = p |\mathbf{h}^\mathsf{H} (\sum_{k \in S} \mathbf{f}_k)|^2$.

Discrimination between arms is quantified by the KL divergence of heteroscedastic Gaussians,

$$D_{HG}(\mu_i,\mu_j) = \frac{1}{2} \ln\frac{\mu_j}{\mu_i} + \frac{\mu_i-\mu_j}{2\mu_j} + \frac{(\mu_j-\mu_i)^2}{4\sigma^2 \mu_j},$$

which is the KL divergence from $\mathcal{N}(\mu_i, 2\sigma^2\mu_i)$ to $\mathcal{N}(\mu_j, 2\sigma^2\mu_j)$ and is zero when $\mu_i = \mu_j$.

The stopping rule controls the error probability through the threshold $\beta(t,\delta,\alpha) = \ln(\alpha t/\delta)$.
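A minimal sketch of these two quantities follows. It computes the divergence directly from the general KL formula for scalar Gaussians, substituting the variances $2\sigma^2\mu$ of the heteroscedastic reward model; the function names are illustrative assumptions.

```python
import math

def gaussian_kl(m1, v1, m2, v2):
    """KL( N(m1, v1) || N(m2, v2) ) for scalar Gaussians with variances v1, v2 > 0."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def d_hg(mu_i, mu_j, sigma2):
    """KL between the heteroscedastic reward models N(mu, 2*sigma2*mu); means must be positive."""
    return gaussian_kl(mu_i, 2.0 * sigma2 * mu_i, mu_j, 2.0 * sigma2 * mu_j)

def beta(t, delta, alpha=1.0):
    """Stopping threshold ln(alpha * t / delta)."""
    return math.log(alpha * t / delta)

print(d_hg(1.0, 2.0, 0.5), d_hg(2.0, 1.0, 0.5), beta(100, 0.01))
```

The divergence is zero for identical means and is not symmetric in its arguments, which is why the order of $\mu_i$ and $\mu_j$ matters when it is used in a stopping statistic.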

3. Two-Phase Track-and-Stop Procedure

The 2PHTS strategy partitions alignment into two sequential phases:

Phase I: Beam-Set (Super-arm) Selection

  • Group the $K$ beams into $G = K/J$ non-overlapping sets $S_g$, each of $J$ consecutive beams.
  • Apply Heteroscedastic Track-and-Stop (HT&S) to select $g^*$:
    • Track the number of pulls $T_g$ and the empirical means $\hat{\mu}_g$.
    • Pull the least-tested super-arms, or otherwise allocate pulls according to the optimal sampling strategy.
    • Stop when the discrimination variable $Z(t)$ between the best super-arm and its alternatives exceeds the bound $\beta(t, \delta_1, 1)$.
    • Select the super-arm with the highest empirical mean.
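The loop structure of these steps can be sketched as follows. This is a deliberately simplified stand-in, not the HT&S procedure of Wei et al. (2022): it samples round-robin instead of tracking optimal proportions, and it evaluates the pairwise stopping statistic at the pull-weighted pooled mean rather than at the exact minimizer, so it carries no $\delta$-PAC guarantee. The `sample` callback and all names are hypothetical.

```python
import math
import random

def d_hg(mu_i, mu_j, sigma2):
    """KL( N(mu_i, 2*sigma2*mu_i) || N(mu_j, 2*sigma2*mu_j) ); means must be positive."""
    v_i, v_j = 2.0 * sigma2 * mu_i, 2.0 * sigma2 * mu_j
    return 0.5 * (math.log(v_j / v_i) + (v_i + (mu_i - mu_j) ** 2) / v_j - 1.0)

def simplified_track_and_stop(sample, n_arms, sigma2, delta, alpha=1.0, max_pulls=100000):
    """Return (selected_arm, total_pulls, empirical_means)."""
    pulls = [0] * n_arms
    sums = [0.0] * n_arms
    floor = 1e-9  # keeps empirical means positive so the logarithm is defined
    t = 0
    while t < max_pulls:
        # Simplification: always pull the least-pulled arm (round-robin order).
        arm = min(range(n_arms), key=pulls.__getitem__)
        sums[arm] += sample(arm)
        pulls[arm] += 1
        t += 1
        if min(pulls) == 0:
            continue  # every arm needs at least one observation first
        means = [max(s / n, floor) for s, n in zip(sums, pulls)]
        best = max(range(n_arms), key=means.__getitem__)
        z = math.inf
        for b in range(n_arms):
            if b == best:
                continue
            # Simplification: pairwise statistic evaluated at the pooled mean.
            pooled = (pulls[best] * means[best] + pulls[b] * means[b]) / (pulls[best] + pulls[b])
            z = min(z, pulls[best] * d_hg(means[best], pooled, sigma2)
                       + pulls[b] * d_hg(means[b], pooled, sigma2))
        if z > math.log(alpha * t / delta):  # threshold beta(t, delta, alpha)
            break
    means = [max(s / max(n, 1), floor) for s, n in zip(sums, pulls)]
    best = max(range(n_arms), key=means.__getitem__)
    return best, t, means

# Toy run with hypothetical super-arm means and noise level.
rng = random.Random(1)
true_means = [1.0, 1.0, 2.0, 1.0]
sigma2 = 0.02

def sample(g):
    mu = true_means[g]
    return rng.gauss(mu, math.sqrt(2.0 * sigma2 * mu))

g_star, n_pulls, est = simplified_track_and_stop(sample, len(true_means), sigma2, delta=0.05)
print(g_star, n_pulls)
```

In an actual Phase I, `sample(g)` would correspond to transmitting a pilot on the aggregated beam of super-arm $S_g$ and measuring the received power.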

Phase II: Beam Identification Within Set

  • Form the candidate set $S_f$ as the union of $S_{g^*}$ and its neighbor (the adjacent set with the higher $\hat{\mu}$), for a total of at most $2J$ beams.
  • Reapply HT&S with risk $\delta_2 = \delta - \delta_1$ to select $k^*$.
  • Output $k^*$ as the recommended beam.

This phased structure exploits local smoothness and beam grouping, reducing the number of candidate arms from $K$ to at most $G + 2J$ (for example, from 120 to 46 when $K = 120$ and $J = 3$).
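The grouping and the Phase II candidate set can be sketched with plain index arithmetic. The helper names are illustrative, and the sketch assumes that adjacency does not wrap around at the ends of the codebook; a circular angular codebook would need a modular neighbor rule instead.

```python
def make_super_arms(n_beams, window):
    """Partition beam indices 0..K-1 into consecutive non-overlapping groups of length J."""
    return [list(range(s, min(s + window, n_beams))) for s in range(0, n_beams, window)]

def phase_two_candidates(groups, g_star, group_means):
    """Union of the winning group and its better adjacent group (no wrap-around assumed)."""
    neighbors = [g for g in (g_star - 1, g_star + 1) if 0 <= g < len(groups)]
    if not neighbors:
        return list(groups[g_star])
    best_neighbor = max(neighbors, key=group_means.__getitem__)
    return sorted(groups[g_star] + groups[best_neighbor])

# Example with K = 120 beams and window J = 3.
groups = make_super_arms(120, 3)
group_means = [0.0] * len(groups)          # hypothetical Phase I estimates
group_means[9], group_means[10], group_means[11] = 1.0, 5.0, 2.0
candidates = phase_two_candidates(groups, 10, group_means)
print(len(groups), candidates, len(groups) + len(candidates))
```

With these numbers, Phase I works with 40 super-arms and Phase II with 6 beams, so at most 46 distinct arms are ever considered instead of 120.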

4. Theoretical Guarantees and Optimality

2PHTS admits the following performance bounds:

  • Lower bound: For any $(\delta,J)$-PAC algorithm,

$$\mathbb{E}[\tau] \ge c^*(\nu) \ln\frac{1}{4\delta},$$

where $c^*(\nu)^{-1} = \sup_{w \in \Delta_K} \inf_{u \in \text{Alt}(\nu)} \sum_{k=0}^{K-1} w_k D_{HG}(\mu_k, \mu_k^u)$.

  • Upper bound: $\limsup_{\delta \to 0} \mathbb{E}[\tau]/\ln(1/\delta) \le c_u^*(s) + c_u^*(b)$, separating the costs of the super-arm and base-arm phases.

Crucially, the phase decomposition substantially reduces the number of probes relative to exhaustive search; the simulations in Section 5 report 4x to 20x fewer probes.

5. Practical Implementation, Parameters, and Simulation Results

Parameters:

  • $\delta$ is split between phases ($\delta_1 + \delta_2 = \delta$).
  • The choice of $J$ is determined by the codebook ($J = 2\lceil K/N\rceil - 1$ for $N$ antennas).
  • Overlapping variants in Phase II use $2J$ arms to prevent boundary effects.
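As a quick check of these parameter rules, the sketch below computes the window size and a risk split. The even split between phases is an assumption made for illustration; the source only requires that the two risks sum to $\delta$.

```python
import math

def window_size(n_beams, n_antennas):
    """J = 2 * ceil(K / N) - 1."""
    return 2 * math.ceil(n_beams / n_antennas) - 1

def split_risk(delta, fraction=0.5):
    """Split delta into (delta_1, delta_2); the default even split is an assumption."""
    delta_1 = fraction * delta
    return delta_1, delta - delta_1

J = window_size(120, 64)
delta_1, delta_2 = split_risk(0.1)
print(J, 120 // J, delta_1, delta_2)
```

For $K = 120$ and $N = 64$ this gives $J = 3$ and $G = 40$ super-arms, matching the reported simulation setup.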

Simulation, as reported in (Wei et al., 2022):

  • $N=64$ Tx antennas, $K=120$ beams, $J=3$.
  • Channels with $L=3$ paths; noise power $\sigma^2 = -80$ dBm.
  • Across synthetic and "ray-tracing city" scenarios, 2PHTS requires $4\times$ to $20\times$ fewer probes than exhaustive search or vanilla track-and-stop, achieving $>97\%$ beam-detection probability.

6. Underlying Principles: Correlation Exploitation, Heteroscedasticity, and Latency Reduction

Targeted strategies leverage:

  • Angular correlation: spatially adjacent beams have similar rewards, so wide-area sweeps are unnecessary.
  • Heteroscedastic modeling: the measurement noise variance scales with the mean reward, and accounting for this yields sharper confidence intervals.
  • Phased search: coarse localization in super-arm space, followed by refined identification among base arms, reduces the expected probe count for a target confidence.

In practical mmWave contexts, where the coherence time ($\sim 35\,\mu$s) permits $>10^4$ slots, 2PHTS typically uses $O(10^2)$ to $O(10^3)$ slots for full alignment.

7. Significance and Extensions

This paradigm extends to hierarchical codebooks, side-information pre-filtering, and adaptive grouping for non-uniform angular statistics. The method generalizes to broader pure-exploration problems in high-dimensional search, provided the reward topology admits sufficient local smoothness. Hyperparameter tuning (e.g., $\alpha$ in $\beta(t,\delta,\alpha)$, initial pulls, overlapping window size) is implementation-driven; rigorous bounds serve as design guidance for practical system deployment.

The targeted beam alignment strategy, as exemplified by 2PHTS, marks an intersection of multi-armed bandit theory, angular correlation modeling, and phased search logic that achieves provable latency minimization and high-confidence operation in wide-band beamforming systems (Wei et al., 2022).
