Squint in Wireless, Learning & Robotics

Updated 4 July 2026
  • Squint is a multifaceted topic covering beam squint in wideband arrays, a second-order, parameter-free online learning algorithm, and a visual soft actor–critic method in robotics.
  • In communications, squint introduces frequency-dependent deviations in beam direction, affecting spectral efficiency and prompting novel codebook and beamforming designs.
  • In learning and robotics, Squint achieves adaptive second-order regret control and fast sim-to-real transfer via resolution squinting and optimized visual pipelines.

Squint denotes several technical concepts in current arXiv literature. In wireless communications it most often refers to beam squint, the frequency-dependent shift of a beam’s main lobe or focal point in wideband arrays. In online learning it names a second-order, parameter-free algorithm for prediction with expert advice and later extensions for changing environments and global second-order control. In robotics it names a visual Soft Actor–Critic method engineered for fast wall-clock training and zero-shot sim-to-real transfer (Cai et al., 2017, Neuteboom et al., 2022, Almuzairee et al., 24 Feb 2026).

1. Technical senses of the term

In the literature represented here, the term is used in three distinct ways.

| Domain | Meaning | Representative arXiv source |
| --- | --- | --- |
| Wideband wireless/array processing | Beam squint: frequency-dependent beam or focus displacement | (Cai et al., 2017) |
| Online learning | Squint: second-order expert-advice algorithm and later variants | (Neuteboom et al., 2022) |
| Sim-to-real robotics | Squint: visual SAC method with “resolution squinting” | (Almuzairee et al., 24 Feb 2026) |

The first usage is by far the broadest. It spans switched-beam codebooks, hybrid beamforming, THz and sub-THz systems, near-field XL-MIMO, IRS design, integrated sensing and communications, and wideband OTFS. The second usage is specific to sequential decision-making with expert advice, where Squint is defined through a mixture over learning rates and a second-order potential. The third usage is a proper algorithm name in reinforcement learning, where “resolution squinting” denotes a deliberate render-then-downsample observation pipeline rather than any electromagnetic effect (Cai et al., 2017, Luo, 3 Mar 2026, Almuzairee et al., 24 Feb 2026).

2. Beam squint as a wideband array phenomenon

In phased arrays, beam squint arises because the phase shifters are typically fixed for the carrier frequency and do not realize the exact time delays required at other frequencies. For a ULA with $N$ antennas, spacing $d$, incident angle $\theta$, and frequency $f$, the steering vector is

$$a(f,\theta)=\bigl[1,e^{j2\pi f(d\sin\theta)/c},\dots,e^{j2\pi f(N-1)(d\sin\theta)/c}\bigr]^T.$$

If the beamformer is fixed at $f_c$ to focus at $\theta_F$, the phase shifts for $n=1,\dots,N$ are

$$\beta_n=-2\pi f_c\,d\,(n-1)\sin\theta_F/c.$$

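Using the virtual-angle notation $\psi=\sin\theta$ and the normalized frequency $\xi=f/f_c$, the maximum gain occurs when $\xi\psi=\psi_F$, where $\psi_F=\sin\theta_F$, so the effective pointing direction is $\psi_F/\xi$, and the squint in virtual angle is

$$\Delta\psi=\frac{\psi_F}{\xi}-\psi_F=\psi_F\,\frac{1-\xi}{\xi}.$$

For small fractional offset, $\Delta\psi\approx-\varepsilon\,\psi_F$, with $\xi=1+\varepsilon$ and $|\varepsilon|\ll 1$ (Cai et al., 2017).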
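The shift of the gain peak can be checked numerically. The following is a minimal sketch, assuming a half-wavelength ULA and the steering vector and carrier-fixed phase shifts given above; the carrier frequency, array size, and function names are illustrative choices, not values from the cited paper.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def array_gain(f, psi, f_c, psi_f, n_ant, d):
    """Normalized gain of a phase-shifter beam focused at psi_f for carrier f_c,
    evaluated at frequency f and virtual angle psi = sin(theta)."""
    total = 0j
    for n in range(n_ant):  # n = 0..N-1 plays the role of (n - 1) in the text
        beta_n = -2.0 * math.pi * f_c * d * n * psi_f / C
        total += cmath.exp(1j * (2.0 * math.pi * f * n * d * psi / C + beta_n))
    return abs(total) / n_ant


def peak_virtual_angle(f, f_c, psi_f, n_ant, d, grid=2001):
    """Grid search over psi in [-1, 1] for the direction of maximum gain."""
    candidates = [-1.0 + 2.0 * i / (grid - 1) for i in range(grid)]
    return max(candidates, key=lambda psi: array_gain(f, psi, f_c, psi_f, n_ant, d))


if __name__ == "__main__":
    f_c = 60e9                 # hypothetical carrier frequency
    d = C / (2.0 * f_c)        # half-wavelength spacing at the carrier
    xi = 1.05                  # normalized frequency f / f_c
    peak = peak_virtual_angle(xi * f_c, f_c, 0.5, 64, d)
    print("peak virtual angle:", peak, "vs psi_F / xi =", 0.5 / xi)
```

In this toy setting the peak at $f=1.05\,f_c$ moves from $\psi_F=0.5$ toward $\psi_F/\xi$, and the gain that remains in the original direction $\psi_F$ falls well below its carrier-frequency value.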

An equivalent formulation in wideband OFDM uses the subcarrier frequencies

dd5

and the equivalent spatial angle

dd6

The dimensionless squint factor is

dd7

so dd8. This makes explicit that wider bandwidths cause different subcarriers to “see” different steering directions even when the array geometry is fixed (Yu et al., 2021).

The same mechanism appears beyond ULAs. In UPAs, the normalized array gain becomes the product of two Dirichlet terms, one per spatial dimension, and the paper on THz communications summarizes the severity through a Beam Squint Ratio,

dd9

With half-wavelength spacing and fixed total θ\theta0, the BSR is minimized by a square UPA, and the paper states

θ\theta1

In a separate wideband large-scale MIMO analysis, the closed-form beam-squint ratio for a ULA is

θ\theta2

which again scales linearly with antenna count and fractional bandwidth (Ma et al., 2023, Ma et al., 2022).

Near-field wideband systems generalize angular squint to spatial squint. A phase-shifter vector that focuses at θ\theta3 for θ\theta4 only perfectly focuses the central subcarrier; the remaining subcarriers shift to different θ\theta5. In near-field ISAC work this is described as a continuous spatial trajectory of beam foci across subcarriers, and in wideband XL-MIMO it is the basis for controllable beam squint with true-time-delay lines (Luo et al., 2023, Lei et al., 2024).

Performance degradation follows directly from this frequency dependence. For a single-path OFDM channel, the wideband capacity with beam squint,

θ\theta6

satisfies

θ\theta7

with strict inequality except at broadside. In fixed-size codebooks with idealized beams, the average spectral efficiency for small θ\theta8 satisfies

θ\theta9

so the drop is linear in the squint factor (Cai et al., 2017, Yu et al., 2021).

3. Compensation and suppression in communication systems

A first line of work treats beam squint as a codebook-design problem. In switched-beam systems, each beam ff0 is assigned a coverage set

ff1

and the objective is to minimize codebook size subject to complete angular coverage. The resulting beam-alignment procedure starts at broadside, finds coverage edges by binary search, mirrors beams symmetrically, and continues until the target interval is covered. The paper reports that, for ff2, ff3 GHz, and ff4 GHz, the squint-aware design yields up to ff5 higher minimum capacity; at the same operating point, roughly ff6 beams are needed rather than ff7 if squint is ignored. It also identifies a supremum ff8 beyond which no finite codebook can satisfy the capacity constraint (Cai et al., 2017).

A second line assumes the codebook size is fixed and optimizes beam shapes. One formulation samples the expanded angular interval seen under squint, introduces weights ff9, and maximizes a weighted sum of a(f,θ)=[1,ej2πf(dsinθ)/c,,ej2πf(N1)(dsinθ)/c]T.a(f,\theta)=\bigl[1,e^{j2\pi f(d\sin\theta)/c},\dots,e^{j2\pi f(N-1)(d\sin\theta)/c}\bigr]^T.0 subject to a(f,θ)=[1,ej2πf(dsinθ)/c,,ej2πf(N1)(dsinθ)/c]T.a(f,\theta)=\bigl[1,e^{j2\pi f(d\sin\theta)/c},\dots,e^{j2\pi f(N-1)(d\sin\theta)/c}\bigr]^T.1. Because these constraints are non-convex, the paper applies the Concave–Convex Procedure, linearizes each quadratic form around the current iterate, and solves the resulting convex program in CVX. The reported outcome is that enlarged coverage alone only partially recovers edge-subcarrier rates, whereas the full optimization slows the spectral-efficiency degradation; at a(f,θ)=[1,ej2πf(dsinθ)/c,,ej2πf(N1)(dsinθ)/c]T.a(f,\theta)=\bigl[1,e^{j2\pi f(d\sin\theta)/c},\dots,e^{j2\pi f(N-1)(d\sin\theta)/c}\bigr]^T.2, the proposed codebook recovers a(f,θ)=[1,ej2πf(dsinθ)/c,,ej2πf(N1)(dsinθ)/c]T.a(f,\theta)=\bigl[1,e^{j2\pi f(d\sin\theta)/c},\dots,e^{j2\pi f(N-1)(d\sin\theta)/c}\bigr]^T.3 more spectral efficiency than DFT, and the design guideline given is a(f,θ)=[1,ej2πf(dsinθ)/c,,ej2πf(N1)(dsinθ)/c]T.a(f,\theta)=\bigl[1,e^{j2\pi f(d\sin\theta)/c},\dots,e^{j2\pi f(N-1)(d\sin\theta)/c}\bigr]^T.4 at worst-case bandwidth (Yu et al., 2021).

Hybrid beamforming generalizes the mitigation problem to array architecture. For THz UPAs, one proposed design builds the frequency-flat analog combiner from the dominant eigenvectors of the subcarrier-averaged sample covariance and then applies phase-only projection. Because the analog part is derived from all subcarriers rather than only the carrier, it is less sensitive to squint. The same study emphasizes that a square UPA is intrinsically more robust than a ULA of the same aperture. In switch-based HBF, the reported contrast is sharper: the expected array gain of PS-based beamforming decreases monotonically with BSR and approaches a(f,θ)=[1,ej2πf(dsinθ)/c,,ej2πf(N1)(dsinθ)/c]T.a(f,\theta)=\bigl[1,e^{j2\pi f(d\sin\theta)/c},\dots,e^{j2\pi f(N-1)(d\sin\theta)/c}\bigr]^T.5, whereas the switch-based approximation is

a(f,θ)=[1,ej2πf(dsinθ)/c,,ej2πf(N1)(dsinθ)/c]T.a(f,\theta)=\bigl[1,e^{j2\pi f(d\sin\theta)/c},\dots,e^{j2\pi f(N-1)(d\sin\theta)/c}\bigr]^T.6

which stays above a(f,θ)=[1,ej2πf(dsinθ)/c,,ej2πf(N1)(dsinθ)/c]T.a(f,\theta)=\bigl[1,e^{j2\pi f(d\sin\theta)/c},\dots,e^{j2\pi f(N-1)(d\sin\theta)/c}\bigr]^T.7 when many antennas are connected. Under a(f,θ)=[1,ej2πf(dsinθ)/c,,ej2πf(N1)(dsinθ)/c]T.a(f,\theta)=\bigl[1,e^{j2\pi f(d\sin\theta)/c},\dots,e^{j2\pi f(N-1)(d\sin\theta)/c}\bigr]^T.8 GHz, a(f,θ)=[1,ej2πf(dsinθ)/c,,ej2πf(N1)(dsinθ)/c]T.a(f,\theta)=\bigl[1,e^{j2\pi f(d\sin\theta)/c},\dots,e^{j2\pi f(N-1)(d\sin\theta)/c}\bigr]^T.9, fcf_c0, and bandwidth up to fcf_c1 GHz, the proposed SW-HBF is reported to achieve over fcf_c2 higher spectral efficiency than fcf_c3-bit FC-PS-HBF and energy-efficiency gains exceeding fcf_c4 at fcf_c5 dB in the single-user case (Ma et al., 2023, Ma et al., 2022).

Other mitigations use diversity, delay elements, or geometric reconfiguration. A constant-modulus beamformer designed by semidefinite relaxation can switch between direct beamforming and Alamouti STBC based on the rank structure of the relaxed solution; at fcf_c6 GHz and fcf_c7, the reported band-edge gain loss is reduced from fcf_c8 dB to fcf_c9 dB, with throughput gains of θF\theta_F0 at θF\theta_F1 GHz bandwidth and θF\theta_F2 at θF\theta_F3 GHz bandwidth. Delay-adjustable metasurfaces for THz IRS communications choose per-element phase shifts and time delays to cancel the affine-in-frequency phase terms; in the representative θF\theta_F4 GHz, θF\theta_F5 GHz, θF\theta_F6 setting, the beam-gain ripple is reported as less than θF\theta_F7 in the far field and restored to within θF\theta_F8 of the peak in the near field. Wideband near-field suppression via movable antennas formulates a max–min analog-gain problem over antenna positions and solves it with an SGDA-based block-coordinate procedure, producing an almost perfectly flat gain curve across θF\theta_F9–βn=2πfcd(n1)sinθF/c.\beta_n=-2\pi f_c\,d\,(n-1)\sin\theta_F/c.0 GHz. A related THz design jointly optimizes analog beamforming and 3D rotation; its reported minimum-gain improvement is βn=2πfcd(n1)sinθF/c.\beta_n=-2\pi f_c\,d\,(n-1)\sin\theta_F/c.1 dB versus no rotation (Liu et al., 2018, Hao et al., 2022, Zhu et al., 2024, Xie et al., 11 Mar 2025).

Beam squint can also couple with other wideband impairments. In massive MIMO-OTFS, beam squint and Doppler squint form a doubly-squint effect. The cited work derives a peak-index-based channel estimator and a hybrid precoder with TTD, phase-shifter, and OTFS-domain compensation, and reports that the proposed design outperforms Doppler-only, delay–phase, and conventional PDMA precoding in achievable delay–Doppler-grid rate (Duan et al., 11 Apr 2025).

4. Beam squint as a sensing and localization resource

A recurrent theme in recent ISAC work is that beam squint need not only be mitigated; it can be engineered as a frequency-domain scanner. In a wideband massive-MIMO OFDM system with TTD lines, one design chooses the phase-shifter setting βn=2πfcd(n1)sinθF/c.\beta_n=-2\pi f_c\,d\,(n-1)\sin\theta_F/c.2 and delays βn=2πfcd(n1)sinθF/c.\beta_n=-2\pi f_c\,d\,(n-1)\sin\theta_F/c.3 so that the beam points to βn=2πfcd(n1)sinθF/c.\beta_n=-2\pi f_c\,d\,(n-1)\sin\theta_F/c.4 at βn=2πfcd(n1)sinθF/c.\beta_n=-2\pi f_c\,d\,(n-1)\sin\theta_F/c.5 and to βn=2πfcd(n1)sinθF/c.\beta_n=-2\pi f_c\,d\,(n-1)\sin\theta_F/c.6 at βn=2πfcd(n1)sinθF/c.\beta_n=-2\pi f_c\,d\,(n-1)\sin\theta_F/c.7. The resulting main-lobe direction satisfies

βn=2πfcd(n1)sinθF/c.\beta_n=-2\pi f_c\,d\,(n-1)\sin\theta_F/c.8

so the subcarriers sweep monotonically across the desired angular interval. If inter-element spacing is enlarged beyond βn=2πfcd(n1)sinθF/c.\beta_n=-2\pi f_c\,d\,(n-1)\sin\theta_F/c.9, beam split creates additional lobes and expands the sensing range. The claimed operational consequence is that ψ=sinθ\psi=\sin\theta0 frequency-domain beams can be transmitted within a single OFDM symbol, reducing over-the-air training time by roughly a factor of ψ=sinθ\psi=\sin\theta1; with one extra intersection repeat for beam-split disambiguation, only two OFDM symbols are needed (Xu et al., 2022).
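One way to see how a phase shifter plus a true-time-delay line can pin the beam direction at two frequencies is the sketch below. It assumes a simplified model in which element $n$ receives phase $n\varphi$ and delay $n\tau$; the closed forms for $\varphi$ and $\tau$ are derived here from that model and are not taken from the cited paper, and all numerical values are hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def design_sweep(f_low, f_high, psi_low, psi_high, d):
    """Per-element phase step phi (rad) and delay step tau (s) such that the
    beam points at psi_low for f_low and at psi_high for f_high.
    Assumed model: phase of element n is n * (2*pi*f*(d*psi/C - tau) + phi)."""
    phi = (2.0 * math.pi * (psi_high - psi_low) * (d / C)
           * f_low * f_high / (f_high - f_low))
    tau = psi_low * d / C + phi / (2.0 * math.pi * f_low)
    return phi, tau


def pointing(f, phi, tau, d):
    """Virtual angle of the main lobe at frequency f under the assumed model."""
    return (C / d) * (tau - phi / (2.0 * math.pi * f))


def gain(f, psi, phi, tau, n_ant, d):
    """Normalized array gain at frequency f toward virtual angle psi."""
    step = 2.0 * math.pi * f * (d * psi / C - tau) + phi
    total = sum(cmath.exp(1j * n * step) for n in range(n_ant))
    return abs(total) / n_ant


if __name__ == "__main__":
    f_c, bandwidth = 30e9, 3e9          # hypothetical carrier and bandwidth
    f_low, f_high = f_c - bandwidth / 2, f_c + bandwidth / 2
    d = C / (2.0 * f_c)                 # half-wavelength spacing
    phi, tau = design_sweep(f_low, f_high, -0.5, 0.5, d)
    for k in range(5):
        f = f_low + k * bandwidth / 4
        print(f"{f / 1e9:.2f} GHz -> psi = {pointing(f, phi, tau, d):+.3f}")
```

Under this model the pointing direction is a monotone function of frequency between the two anchors, which matches the qualitative sweep described above. A physical implementation would add a common offset so that every per-element delay is non-negative.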

Near-field ISAC uses an analogous idea in joint angle–range space. For a phase-only near-field beamformer designed at ψ=sinθ\psi=\sin\theta2 and reference frequency ψ=sinθ\psi=\sin\theta3, the squinted focus on subcarrier ψ=sinθ\psi=\sin\theta4 follows

ψ=sinθ\psi=\sin\theta5

With TTDs, the start and end points of the trajectory can be fixed at chosen anchors ψ=sinθ\psi=\sin\theta6 and ψ=sinθ\psi=\sin\theta7, allowing the system to “draw” a trajectory of beam foci across the near-field volume. Localization then reduces to identifying the peak-power subcarrier and, in higher-accuracy variants, combining multiple sweeps and phase differences. The reported CBS-Low method reduces beam sweeps by ψ=sinθ\psi=\sin\theta8, while CBS-High with ψ=sinθ\psi=\sin\theta9 sweeps yields angle RMSE ξ=f/fc\xi=f/f_c0 and range RMSE ξ=f/fc\xi=f/f_c1 m at ξ=f/fc\xi=f/f_c2 dB SNR (Luo et al., 2023).

Wideband XL-MIMO localization extends this idea further by combining controllable beam squint and deep learning. The cited formulation derives CRBs for joint angle–range estimation under spatial non-stationarity, then proposes a three-stage CBS-based beam-training procedure: coarse angle, angular refinement by subcarrier grouping, and iterative range refinement. A ConvNeXt model then consumes the measurements and coarse estimates and regresses ξ=f/fc\xi=f/f_c3 directly. The reported performance is centimeter-level accuracy, with ξ=f/fc\xi=f/f_c4 cm and ξ=f/fc\xi=f/f_c5 rad at ξ=f/fc\xi=f/f_c6 dB (Lei et al., 2024).

These results correct a common oversimplification in beam-squint discussions. The phenomenon is indeed harmful to communication gain and rate under carrier-designed phase-only beamforming, but the same frequency dependence becomes informative when subcarriers are deliberately assigned distinct directions or focal points. This is the central methodological bridge between the mitigation literature and the sensing/localization literature (Xu et al., 2022, Luo et al., 2023, Lei et al., 2024).

5. Squint in online learning with expert advice

In online learning, Squint is a second-order algorithm for the expert problem. At round ξ=f/fc\xi=f/f_c7, the learner chooses ξ=f/fc\xi=f/f_c8, observes losses ξ=f/fc\xi=f/f_c9, and incurs dd00. The instantaneous regret to expert dd01 is

dd02

with cumulative regret dd03 and second-order term dd04. The defining Squint potential is

dd05

and the original update is

dd06

As summarized in later notes, this produces a simultaneous dd07-quantile regret guarantee in terms of the variance of the dd08-quantile expert (Luo, 3 Mar 2026).
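A minimal sketch of the expert-advice loop follows, using the commonly stated form of the Squint weights, in which each expert's weight is proportional to its prior times an average over learning rates of $\eta\,e^{\eta R - \eta^2 V}$. The discrete geometric grid of learning rates in $(0, 1/2]$ is an assumption made here for simplicity and stands in for a continuous prior; losses are assumed to lie in $[0,1]$, and the function names are illustrative.

```python
import math


def squint_weights(regrets, variances, prior, etas):
    """Weights proportional to prior[k] * mean over eta of
    eta * exp(eta * R_k - eta^2 * V_k)."""
    raw = [
        p * sum(eta * math.exp(eta * r - eta * eta * v) for eta in etas) / len(etas)
        for r, v, p in zip(regrets, variances, prior)
    ]
    total = sum(raw)
    return [x / total for x in raw]


def run_squint(loss_rounds, etas=None):
    """Run the sketch on a list of per-round loss vectors (values in [0, 1]).
    Returns the final weights, cumulative regrets R, and second-order terms V."""
    n_experts = len(loss_rounds[0])
    if etas is None:
        # Assumed discrete grid of learning rates: 1/2, 1/4, ..., 1/1024.
        etas = [0.5 * 2.0 ** (-i) for i in range(10)]
    prior = [1.0 / n_experts] * n_experts
    regrets = [0.0] * n_experts
    variances = [0.0] * n_experts
    for losses in loss_rounds:
        weights = squint_weights(regrets, variances, prior, etas)
        learner_loss = sum(w * l for w, l in zip(weights, losses))
        for k, loss_k in enumerate(losses):
            r = learner_loss - loss_k      # instantaneous regret to expert k
            regrets[k] += r
            variances[k] += r * r          # second-order term
    return squint_weights(regrets, variances, prior, etas), regrets, variances


if __name__ == "__main__":
    # Toy data: expert 0 is always right, expert 1 is always wrong.
    weights, regrets, _ = run_squint([[0.0, 1.0]] * 50)
    print("final weights:", weights)
    print("regret to best expert:", regrets[0])
```

On this toy sequence the weight concentrates on the better expert, and the regret to that expert stays far below the number of rounds.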

The 2022 changing-environment extension, Squint-CE, begins from the observation that a conventional black-box meta-wrapper destroys Squint’s favorable second-order behavior: the induced overhead dd09 dominates the sublinear advantages coming from variance adaptation. Squint-CE therefore intertwines Squint’s surrogate reduction with a single layer of exponential-weights meta-combination over geometric intervals. For every contiguous interval dd10, it guarantees

dd11

where

dd12

In big-$O$ form, the bound is

dd14

so the changing-environment version preserves second-order dependence on interval variance up to logarithmic factors (Neuteboom et al., 2022).

A 2026 note proposes a simple variant that replaces the expert-specific dd15 by a single global dd16. The algorithm still computes

dd17

but then updates dd18, where dd19 is chosen as the root of

dd20

with

dd21

The resulting dd22-quantile regret bound has the same form as the original except that it depends on the global dd23 rather than dd24, and the note states that it resembles the guarantee obtained by Freund et al. for a variant of NormalHedge (Luo, 3 Mar 2026).

Within this literature, the main conceptual distinction is therefore between per-expert and global second-order control, and between static and changing-environment regret. The term “Squint” refers to the family of algorithms built around the same potential-based, learning-rate-mixture construction rather than to a single fixed update rule (Neuteboom et al., 2022, Luo, 3 Mar 2026).

6. Squint in fast visual reinforcement learning for robotics

In robotics, Squint is an off-policy, vision-based actor–critic algorithm built on Soft Actor–Critic and designed to minimize wall-clock training time in massively parallel GPU simulation while transferring zero-shot to a real dd25 DoF SO-101 robot arm. The high-level loop uses dd26 parallel ManiSkill3 environments at dd27 Hz, renders wrist-camera RGB images at dd28, downsamples them to dd29, appends proprioception, and stores transitions in a GPU-resident replay buffer of size dd30 M. After each environment step it performs dd31 gradient updates of a shared two-layer CNN encoder, two C51-style distributional critics and their EMA targets, a stochastic Gaussian policy, and an entropy temperature dd32 (Almuzairee et al., 24 Feb 2026).

The defining design choice is resolution squinting. The observation is not rendered directly at low resolution; instead the simulator renders at dd33 and applies area-interpolation downsampling to dd34,

dd35

The paper attributes two effects to this pipeline: lower compute for the two-layer CNN encoder and natural anti-aliasing that preserves object shape under heavy domain randomization. This component is coupled with LayerNorm after every linear layer in actor and critic heads, a tuned update-to-data ratio of roughly dd36, and a systems stack using torch.compile, CUDA Graphs, mixed-precision dd37 convolutions, and an entirely on-GPU replay buffer. The implementation is reported to achieve more than a dd38 end-to-end speed-up over a naive off-policy visual agent (Almuzairee et al., 24 Feb 2026).
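The anti-aliasing effect of render-then-downsample can be illustrated with a small sketch. It assumes the render resolution is an integer multiple of the target resolution, in which case area interpolation reduces to averaging non-overlapping pixel blocks; the nested-list image format, the function name, and the toy 4x4 input are illustrative and are not taken from the paper's implementation.

```python
def area_downsample(image, factor):
    """Downsample an H x W x C image (nested lists) by an integer factor,
    averaging each factor x factor block of pixels per channel."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of the factor")
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            pixel = []
            for c in range(channels):
                block_sum = sum(
                    image[y][x][c]
                    for y in range(top, top + factor)
                    for x in range(left, left + factor)
                )
                pixel.append(block_sum / (factor * factor))
            row.append(pixel)
        out.append(row)
    return out


if __name__ == "__main__":
    # A one-pixel checkerboard: the worst case for naive subsampling.
    checker = [[[float((x + y) % 2)] for x in range(4)] for y in range(4)]
    print(area_downsample(checker, 2))   # every output pixel is 0.5
    print([[checker[y][x] for x in range(0, 4, 2)] for y in range(0, 4, 2)])
```

Picking every second pixel of the checkerboard returns only zeros and loses the pattern entirely, whereas block averaging returns a uniform mid-gray that still reflects the content of each block.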

The learning objectives retain standard SAC structure but add a distributional critic loss. The critic minimizes a soft Bellman residual, the actor minimizes

dd39

the temperature is adapted with a target-entropy loss, and each critic also minimizes a C51 cross-entropy to a projected Bellman target distribution. The encoder itself is deliberately small: two dd40 convolutional layers with channels dd41, ReLU activations, and batch size dd42 (Almuzairee et al., 24 Feb 2026).
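The projected Bellman target used by a C51-style critic can be sketched as follows. This is the standard categorical projection onto a fixed support, written in plain Python for a single transition; the support bounds, atom count, and function names are hypothetical, and the entropy term that a soft (SAC-style) target would add is omitted here.

```python
import math


def project_distribution(next_probs, reward, discount, done, v_min, v_max):
    """Project the distribution of reward + discount * Z onto the fixed atoms
    z_j = v_min + j * dz, splitting each mass between the two nearest atoms."""
    n_atoms = len(next_probs)
    dz = (v_max - v_min) / (n_atoms - 1)
    target = [0.0] * n_atoms
    for j, p in enumerate(next_probs):
        z = v_min + j * dz
        tz = reward if done else reward + discount * z
        tz = min(max(tz, v_min), v_max)            # clip to the support
        b = min(max((tz - v_min) / dz, 0.0), n_atoms - 1.0)
        lower, upper = int(math.floor(b)), int(math.ceil(b))
        if lower == upper:                          # lands exactly on an atom
            target[lower] += p
        else:
            target[lower] += p * (upper - b)
            target[upper] += p * (b - lower)
    return target


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between the projected target and the predicted distribution."""
    return -sum(t * math.log(q + eps) for t, q in zip(target, predicted))


if __name__ == "__main__":
    next_probs = [0.0, 0.0, 1.0, 0.0, 0.0]          # all mass on z = 2
    target = project_distribution(next_probs, 0.5, 1.0, False, 0.0, 4.0)
    print(target)                                   # mass split between z = 2 and z = 3
    print(cross_entropy(target, [0.2] * 5))
```

In a batched implementation the same projection is applied per sample to the target critic's output, and the cross-entropy is averaged over the batch.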

Empirically, Squint is evaluated on eight SO-101 manipulation tasks—Reach Cube, Reach Can, Lift Cube, Lift Can, Place Cube, Place Can, Stack Cube, Stack Can—with heavy visual and physical domain randomization. Policies are trained for dd43 minutes on a single NVIDIA RTX 3090 GPU. The reported simulation mean success rate over all eight tasks after dd44 minutes is dd45, and most tasks converge in under dd46 minutes. In the real world, the zero-shot success rate across dd47 trials is dd48, which the paper describes as a dd49 absolute improvement over state-to-visual DAgger at dd50 when accounting for the time to train its state-based expert. A visual robustness ablation reports that removing color jitter reduces success from dd51 to dd52 (Almuzairee et al., 24 Feb 2026).

In this usage, “Squint” is not related to beam steering or expert-advice regret. It is the proper name of a visual SAC system whose central innovations are parallel simulation, a distributional critic, anti-aliased low-resolution observations, tuned update-to-data ratio, and GPU-level implementation choices (Almuzairee et al., 24 Feb 2026).
