Ultra-Massive MIMO: Next-Gen Antenna Systems

Updated 5 January 2026

UM-MIMO is an antenna technology featuring thousands of elements in uniform planar array (UPA) or array-of-subarrays (AoSA) configurations to achieve spatial multiplexing gains orders of magnitude beyond conventional massive MIMO for 6G and THz systems.
It leverages near-field and hybrid-field propagation with innovative channel estimation and hybrid beamforming techniques to optimize performance across diverse operational regimes.
Integration with AI methods and reconfigurable surfaces enhances joint sensing and communication, enabling precise localization and reduced hardware complexity.

Ultra-massive multiple-input multiple-output (UM-MIMO) denotes antenna systems in which the number of elements reaches the thousands or tens of thousands, typically in ultra-dense planar or array-of-subarrays (AoSA) configurations. Targeting next-generation wireless, particularly 6G and THz band systems, UM-MIMO enables spatial multiplexing orders of magnitude beyond conventional massive MIMO, with key use cases in integrated sensing and communication, Tbps wireless, and sub-cm positioning. The following sections systematically review the central architectural models, algorithmic advances, signal processing, and implementation considerations defining the present state of the art in UM-MIMO.

1. System Architectures and Physical Models

UM-MIMO arrays are realized as extremely large uniform planar arrays (UPAs) or, often at THz, as AoSAs, where each subarray comprises $M$ plasmonic or nano-antenna elements and $K$ subarrays are coordinated for coherent transmission, for a total of $N=K\cdot M$ antenna elements (Faisal et al., 2019). At THz, graphene-based plasmonic nano-antennas leverage surface plasmon polariton modes ($\lambda_{\mathrm{SPP}}\ll\lambda_0$), enabling sub-millimeter inter-element spacing and thus extreme densification.

In the near-field, propagation is inherently spherical, and a generic baseband channel model is

$\mathbf{H}=\sum_{\ell=1}^{L}\gamma_{\ell}\,\mathbf{a}_r(\theta_{\ell, r},\phi_{\ell, r},r_{\ell, r})\,\mathbf{a}_t^H(\theta_{\ell, t},\phi_{\ell, t},r_{\ell, t}),$

where each of the $L$ paths is characterized by its path gain $\gamma_{\ell}$ and by its own angle/range tuples at the receiver and transmitter. The steering vector encodes spherical wavefronts, e.g., for the $i$th transmit element,

$[\mathbf{a}_t(\theta,\phi,r)]_i = \frac{1}{\sqrt{N_t}}\exp\left(-j\frac{2\pi}{\lambda}\|\mathbf{p}_{t,i}-\mathbf{p}(\theta,\phi,r)\|\right)$

with $\mathbf{p}_{t,i}$ and $\mathbf{p}(\theta,\phi,r)$ as antenna and target positions (Wan et al., 29 Dec 2025, Cao et al., 2023).
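
The spherical-wave steering vector above can be evaluated directly from element and target coordinates. The following is a minimal NumPy sketch, not a reference implementation: the helper names (`upa_positions`, `target_position`, `spherical_steering`), the array lying in the x-y plane, and the angle convention ($\theta$ measured from the array normal, $\phi$ as azimuth) are all assumptions made for illustration.

```python
import numpy as np

def upa_positions(nx, ny, spacing):
    """Element coordinates, shape (nx*ny, 3), of a planar array centred at the origin in the x-y plane."""
    x = (np.arange(nx) - (nx - 1) / 2) * spacing
    y = (np.arange(ny) - (ny - 1) / 2) * spacing
    gx, gy = np.meshgrid(x, y, indexing="ij")
    return np.stack([gx.ravel(), gy.ravel(), np.zeros(nx * ny)], axis=1)

def target_position(theta, phi, r):
    """Cartesian point p(theta, phi, r); theta is measured from the z axis (assumed convention)."""
    return r * np.array([np.sin(theta) * np.cos(phi),
                         np.sin(theta) * np.sin(phi),
                         np.cos(theta)])

def spherical_steering(positions, theta, phi, r, wavelength):
    """[a]_i = exp(-j * 2*pi/lambda * ||p_i - p(theta, phi, r)||) / sqrt(N)."""
    dist = np.linalg.norm(positions - target_position(theta, phi, r), axis=1)
    return np.exp(-1j * 2 * np.pi / wavelength * dist) / np.sqrt(len(positions))

# Illustrative values only: 32 x 32 elements, half-wavelength spacing, 1 mm wavelength.
wavelength = 1e-3
pos = upa_positions(32, 32, wavelength / 2)
a_t = spherical_steering(pos, theta=0.3, phi=0.5, r=2.0, wavelength=wavelength)
```

A single path of $\mathbf{H}$ is then the rank-one product of a receive and a transmit vector built this way, scaled by the path gain.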

At the transceiver level, AoSA arranges nano-antenna subarrays (SAs) in a rectangular or hexagonal planar grid, with element spacing $\delta\sim\lambda_{\mathrm{SPP}}$ for the highest packing density; each SA can perform coherent beamforming, while switching/hybrid architectures may flexibly activate subsets of SAs for spatial modulation or energy savings (Sarieddeen et al., 2019).

2. Near-Field and Hybrid-Field Propagation Regimes

The boundary between the near and far field is conventionally taken as the Rayleigh distance $r_{\mathrm{RD}}=2D^2/\lambda$, with $D$ the array aperture. In UM-MIMO, $r_{\mathrm{RD}}$ can reach tens to hundreds of meters, so near-field effects are present even outdoors (Hussain et al., 18 Mar 2025). Further, the effective beam-focused Rayleigh distance (EBRD) refines the operational definition by considering finite beam-depth and path-loss compensation, specifying the regime where focal (range-angle selective) beams are physically feasible.
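
As a quick numerical check of the Rayleigh-distance formula, the sketch below evaluates $2D^2/\lambda$ for two aperture/frequency pairs. The pairs (0.5 m at 28 GHz, 0.1 m at 300 GHz) and the function name are illustrative assumptions, not values taken from the cited works.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def rayleigh_distance(aperture_m, freq_hz):
    """Rayleigh distance 2 * D**2 / lambda for aperture D (metres) at carrier freq_hz."""
    wavelength = SPEED_OF_LIGHT / freq_hz
    return 2 * aperture_m ** 2 / wavelength

# Assumed example apertures and carriers.
print(f"{rayleigh_distance(0.5, 28e9):.1f} m")   # about 46.7 m
print(f"{rayleigh_distance(0.1, 300e9):.1f} m")  # about 20.0 m
```

Because the distance grows with $D^2$ and inversely with $\lambda$, even a 10 cm aperture at 300 GHz keeps users within about 20 m in the near field.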

The MIMO channel thus exists as a mixture of planar and spherical modes. This motivates hybrid-field estimation, where the full array is modeled as supporting both planar (far) and spherical (near) eigenfunctions, and the link may cross regime boundaries over operational distances. Channel models and processing (e.g., beamforming codebooks, channel estimation) must adapt to this hybrid field (Yu et al., 2022, Tarboush et al., 2023, Gao et al., 2024).

3. Waveform, Signal Processing, and Channel Estimation Techniques

UM-MIMO channel estimation and data transmission leverage wideband waveforms (e.g., OFDM, OCDM) and innovative pilot-sharing, but are fundamentally challenged by dimensionality, hybrid regimes, and RF front-end constraints.

Sparsity-Exploiting Estimation:

For THz AoSA, compressed sensing (CS) is used to recover sparse channel representations in appropriate dictionaries (angular for far-field, polar for near-field) (Tarboush et al., 2023).
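Cross-field strategies adaptively select the channel model used for estimation, namely the spherical wave model (SWM), the planar wave model (PWM), or the hybrid spherical-planar wave model (HSPWM), according to transceiver geometry, and further reduce the CS dictionary size via local support extraction, enabling 3–4$\times$ lower computational cost (Tarboush et al., 2023).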
Model-driven deep learning approaches (e.g., fixed-point networks) embed trainable DNN denoisers into iterative recovery, offering linear convergence and robustness to field regime (Yu et al., 2022, Yu et al., 2024).
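
To make the sparse-recovery idea concrete, here is a minimal sketch of orthogonal matching pursuit (OMP) over a far-field angular dictionary. It is a generic textbook CS routine, not the estimator of the cited papers. The uniform linear array, half-wavelength spacing, on-grid paths, noiseless pilots, and random Gaussian combining matrix `W` are all simplifying assumptions, and the function names are hypothetical.

```python
import numpy as np

def angular_dictionary(n_ant, n_grid):
    """Far-field ULA dictionary: half-wavelength steering vectors on a uniform grid in sin(angle)."""
    u = np.linspace(-1, 1, n_grid, endpoint=False)
    n = np.arange(n_ant)[:, None]
    return np.exp(1j * np.pi * n * u[None, :]) / np.sqrt(n_ant)

def omp(A, y, n_paths):
    """Greedy recovery of an n_paths-sparse x such that A @ x approximates y."""
    residual = y.copy()
    support = []
    coef = np.zeros(0, dtype=complex)
    for _ in range(n_paths):
        corr = np.abs(A.conj().T @ residual)
        if support:
            corr[support] = 0.0                      # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Least-squares refit on the current support, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coef
    return x

# Toy setup (assumed sizes): 64 antennas, 32 pilot measurements, 3 on-grid paths.
rng = np.random.default_rng(0)
n_ant, n_grid, n_meas, n_paths = 64, 64, 32, 3
D = angular_dictionary(n_ant, n_grid)
true_idx = rng.choice(n_grid, n_paths, replace=False)
gains = rng.standard_normal(n_paths) + 1j * rng.standard_normal(n_paths)
h = D[:, true_idx] @ gains
W = (rng.standard_normal((n_meas, n_ant))
     + 1j * rng.standard_normal((n_meas, n_ant))) / np.sqrt(2 * n_meas)
y = W @ h                                            # noiseless compressed pilots
h_hat = D @ omp(W @ D, y, n_paths)
print("NMSE:", np.linalg.norm(h - h_hat) ** 2 / np.linalg.norm(h) ** 2)
```

For near-field users, the same routine would be run with a polar (angle-range) dictionary in place of `angular_dictionary`, at the cost of a larger and more coherent set of atoms.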

Near-Field Beam Training:

Far-field DFT codebooks, while mismatched in the near-field, can still efficiently uncover spatial focus due to high path-gain at short ranges. Correlation interferometry (CI-DFT) measures the angular spread of the received DFT beams and infers angle and range from it, reducing training overhead by up to 87.5% while preserving achievable rate (Hussain et al., 18 Mar 2025).
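
The effect that such schemes exploit can be illustrated numerically: a near-field user spreads its energy over many DFT beams, while a far-field user at the same angle occupies essentially one. The sketch below only demonstrates this spreading and does not implement CI-DFT itself. The linear array, the 3 dB counting threshold, and all names and numerical values are assumptions.

```python
import numpy as np

def near_field_ula_response(n_ant, spacing, wavelength, angle, r):
    """Spherical-wave response of a ULA on the x axis to a source at (angle from broadside, range r)."""
    x = (np.arange(n_ant) - (n_ant - 1) / 2) * spacing
    src_x, src_z = r * np.sin(angle), r * np.cos(angle)
    dist = np.hypot(x - src_x, src_z)
    return np.exp(-1j * 2 * np.pi / wavelength * dist) / np.sqrt(n_ant)

def dft_beam_powers(a):
    """Power captured by each beam of an orthonormal DFT codebook."""
    return np.abs(np.fft.fft(a) / np.sqrt(len(a))) ** 2

def beam_spread(powers, drop_db=3.0):
    """Number of beams within drop_db of the strongest beam."""
    return int(np.sum(powers >= powers.max() * 10 ** (-drop_db / 10)))

wavelength = 0.01  # assumed 30 GHz carrier
for r in (5.0, 1000.0):
    a = near_field_ula_response(256, wavelength / 2, wavelength, angle=0.0, r=r)
    print(f"range {r:7.1f} m -> beams within 3 dB of peak: {beam_spread(dft_beam_powers(a))}")
```

The width of the occupied beam set shrinks as the range grows, which is the kind of range cue that can be read from a far-field sweep without a dedicated near-field codebook.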

Hybrid Precoding and Beamforming:

Hybrid analog-digital transceivers, often with $N_{\mathrm{RF}}\ll N$ due to cost/power, use ADMM-based or DNN-based methods to factorize unconstrained digital precoders/beamformers into analog (constant modulus/quantized) and baseband components (Pavia et al., 2021, Murshed et al., 2022, Murshed et al., 2024).
Hybrid dynamic subarrays (HDS) structure the RX as a set of programmable subarrays, enabling reduced-dimension signal acquisition while preserving spatial DoF for accurate direction-of-arrival (DOA) estimation (Tian et al., 30 Jan 2025).
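
The factorization step in hybrid precoding can be sketched as follows. This is a simple alternating least-squares heuristic with phase projection, swapped in for illustration; it is not the ADMM or DNN methods of the cited papers and carries no convergence guarantee. The function name, the fully connected unit-modulus analog stage, and the Frobenius-norm power normalization are assumptions.

```python
import numpy as np

def hybrid_factorize(F_opt, n_rf, n_iter=50, seed=0):
    """Approximate F_opt (N x Ns) by F_rf (N x n_rf, constant modulus) @ F_bb (n_rf x Ns)."""
    rng = np.random.default_rng(seed)
    n_ant = F_opt.shape[0]
    # Random-phase initialisation of the analog precoder.
    F_rf = np.exp(1j * rng.uniform(0, 2 * np.pi, (n_ant, n_rf))) / np.sqrt(n_ant)
    for _ in range(n_iter):
        # Baseband step: unconstrained least squares for fixed F_rf.
        F_bb, *_ = np.linalg.lstsq(F_rf, F_opt, rcond=None)
        # Analog step: least-squares F_rf for fixed F_bb, then keep only the phases.
        F_rf_ls = F_opt @ np.linalg.pinv(F_bb)
        F_rf = np.exp(1j * np.angle(F_rf_ls)) / np.sqrt(n_ant)
    F_bb, *_ = np.linalg.lstsq(F_rf, F_opt, rcond=None)
    # Match the transmit power of the unconstrained precoder.
    F_bb = F_bb * (np.linalg.norm(F_opt) / np.linalg.norm(F_rf @ F_bb))
    return F_rf, F_bb

# Toy example (assumed sizes): 64 antennas, 2 streams, 4 RF chains.
rng = np.random.default_rng(1)
F_opt = rng.standard_normal((64, 2)) + 1j * rng.standard_normal((64, 2))
F_rf, F_bb = hybrid_factorize(F_opt, n_rf=4)
print("relative error:", np.linalg.norm(F_opt - F_rf @ F_bb) / np.linalg.norm(F_opt))
```

Quantized phase shifters could be modeled by rounding `np.angle(F_rf_ls)` to the nearest allowed phase before exponentiating.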

Joint Sensing and Communications (ISAC):

UM-MIMO enables precise near-field radar imaging via dedicated sensing subcarriers and virtual bistatic fusion, where every TX–RX pair forms a virtual bistatic radar; fusing range/velocity across pairs enables sub-m 3D localization and simultaneous channel sensing (Wan et al., 29 Dec 2025, Cao et al., 2023).
Sensor-aided channel estimation, using ISAC-derived location/range priors to initialize channel dictionaries, lowers pilot overhead and NMSE compared to blind approaches (Wan et al., 29 Dec 2025).

4. Architectures for Hardware Efficiency

The hardware and energy cost per antenna is a central barrier to practical UM-MIMO.

Fully connected architectures provide maximal flexibility, but scale poorly in RF chain and phase-shifter count.
Array-of-subarrays and dynamic array-of-subarrays (DAoSA) architectures restrict each RF chain to a subarray or dynamically assigned sets, reducing complexity and power by over 70% with moderate spectral efficiency loss (Pavia et al., 2021).
Hybrid dynamic subarray (HDS) allows a trade-off between spatial information and RF-chain count via a tunable switch proportion $\rho$; Cramér-Rao bounds quantify the effect of RF-chain and snapshot count on estimation accuracy (Tian et al., 30 Jan 2025).
Reconfigurable intelligent surface (RIS) and reconfigurable holographic surface (RHS) architectures take the RF-to-antenna mapping paradigm to the limit, using passive metasurfaces with active amplitude or low-resolution phase control, thereby realizing ultra-thin, low-cost, and ultra-dense apertures (Zeng et al., 2023, Di et al., 2024). Experimental RIS-based testbeds demonstrate dual-stream 5 Gbps links at 26 GHz with an order-of-magnitude reduction in power (Zeng et al., 2023).
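
The scaling gap between fully connected and array-of-subarrays front ends can be seen from a simple component count. The sketch below assumes one phase shifter per RF-chain-to-antenna connection and equal-sized subarrays; it counts components only and does not model power, so it does not reproduce the savings figure cited above. The function name and example sizes are hypothetical.

```python
def hardware_counts(n_ant, n_rf):
    """Phase-shifter counts for fully connected vs. array-of-subarrays analog front ends."""
    if n_ant % n_rf:
        raise ValueError("n_ant must be divisible by n_rf for equal-sized subarrays")
    return {
        # Every RF chain drives every antenna.
        "fully_connected": {"rf_chains": n_rf, "phase_shifters": n_ant * n_rf},
        # Each RF chain drives only its own subarray.
        "array_of_subarrays": {
            "rf_chains": n_rf,
            "phase_shifters": n_ant,
            "elements_per_subarray": n_ant // n_rf,
        },
    }

counts = hardware_counts(n_ant=1024, n_rf=16)
print(counts["fully_connected"]["phase_shifters"])     # 16384
print(counts["array_of_subarrays"]["phase_shifters"])  # 1024
```

Under these assumptions the phase-shifter count drops by a factor equal to the number of RF chains, which is why subarray-based designs scale better as arrays grow.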

5. AI and Deep Learning for Physical Layer Design

AI-enhanced approaches address the complexity, modeling, and measurement bottlenecks endemic to THz UM-MIMO.

Model-driven DL integrates neural modules into bottlenecks of classical iterative algorithms (OAMP, ZF), guided by physics-based priors, and benefits from linear convergence guarantees and data/compute efficiency (Yu et al., 2024, He et al., 2022).
Foundation models for channel state information (CSI)—score-based neural networks trained via denoising score matching—act as priors in various tasks (estimation, sampling, detection), unifying module design and improving adaptation to site-specific dynamics (Yu et al., 2024).
Self-supervised pretraining techniques (e.g., contrastive learning) yield hybrid beamformers that are robust to severe CSI errors, maintaining near-optimal spectral efficiency even under 20 dB CSI SNR degradation (Murshed et al., 2024).
DNN architectures (CNN-LSTM, GNN-enhanced AMP) accelerate hybrid beamforming and MIMO data detection by over $10^4\times$ at minimal SE loss and scale robustly over arrays or user load (Murshed et al., 2022, He et al., 2022).
Pre-trained LLMs facilitate cross-layer orchestration, automated simulation/generation, and protocol adaptation in deployment (Yu et al., 2024).

6. Implementation Constraints and Physical Layer Limits

UM-MIMO system design must be grounded in electromagnetic and circuit-theory-consistent modeling. Salient constraints and guidelines include:

Tightly coupled arrays (spacing-to-radius ratio $\delta/a\approx1.93$ ) harness mutual coupling to sustain broadband operation and maximize achievable rate (bandwidth gain), in contrast to half-wavelength “disconnected” paradigms (Akrout et al., 2022).
To maximize DoF per aperture, the number of antennas should sample no more than the physical DoF, e.g., $\eta_{2D}=\pi L_xL_y/\lambda^2$ for an $L_x\times L_y$ array (Björnson et al., 2024).
Hybrid and beam-domain processing relaxes required RF chain count from $M$ to channel rank $L\ll M$ with commensurate power/complexity savings (Feng et al., 11 Jan 2025).
Quantized or ON/OFF metasurface elements (e.g., 1–2 bit PIN/varactor loaded) in RHS architectures provide sufficient aperture-field control for most beamforming tasks, with demonstrated sum-rate and energy efficiency surpassing conventional phased arrays at scale (Di et al., 2024).
Near-/far-field pilot assignment, codebook design, subarray layout, and analog hardware constraints must be jointly optimized for performance/complexity/energy trade-offs, with detailed design heuristics reported in (Ning et al., 2021, Feng et al., 11 Jan 2025, Tian et al., 30 Jan 2025, Di et al., 2024).
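
The planar DoF expression $\eta_{2D}=\pi L_xL_y/\lambda^2$ from the list above can be compared with the element count of a half-wavelength-spaced array on the same aperture. This is a small sketch under assumed values (a 10 cm by 10 cm aperture at a 1 mm wavelength); the function names are hypothetical.

```python
import math

def planar_dof(lx, ly, wavelength):
    """Spatial DoF of an lx-by-ly planar aperture: pi * lx * ly / lambda**2."""
    return math.pi * lx * ly / wavelength ** 2

def half_wavelength_elements(lx, ly, wavelength):
    """Element count when the same aperture is sampled at lambda/2 spacing."""
    return round(lx / (wavelength / 2)) * round(ly / (wavelength / 2))

lx = ly = 0.1          # metres (assumed)
wavelength = 1e-3      # metres (assumed, roughly 300 GHz)
dof = planar_dof(lx, ly, wavelength)
n_elem = half_wavelength_elements(lx, ly, wavelength)
print(f"DoF ~ {dof:.0f}, half-wavelength elements = {n_elem}, ratio = {dof / n_elem:.3f}")
```

The ratio equals $\pi/4\approx0.785$ regardless of aperture size, so half-wavelength sampling already slightly exceeds the DoF limit, and denser packing adds elements without adding spatial DoF.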

7. Future Directions and Open Challenges

Frontier research issues in UM-MIMO encompass:

Unified near-/far-field channel and DoF theory for arbitrary arrays, metasurfaces, and practical apertures (Di et al., 2024, Björnson et al., 2024).
Cross-field (SWM, HSPWM, PWM) channel modeling, estimation, and codebooks for multi-regime operation (Tarboush et al., 2023, Gao et al., 2024, Hussain et al., 18 Mar 2025).
Joint ISAC frameworks robust to spatial non-stationarity, mobility, clutter, and non-ideal hardware (Wan et al., 29 Dec 2025, Cao et al., 2023, Yu et al., 2024).
Real-world testbeds with $\mathcal{O}(10^3)$ to $\mathcal{O}(10^4)$ antenna elements, enabling analysis of near-field beamfocusing, system-level networking, and energy/thermal constraints (Di et al., 2024, Zeng et al., 2023).
AI foundation models scaling to extreme array sizes with rigorous generalization guarantees and efficient online/federated adaptation (Yu et al., 2024).
Co-design of hybrid beams, power management, fronthaul compression, and antenna geometries for hardware-limited and green 6G deployments (Feng et al., 11 Jan 2025, Tian et al., 30 Jan 2025, Yu et al., 2024).

UM-MIMO is a multi-disciplinary technology whose trajectory fuses array physics, scalable signal processing, learning-based inference, and circuit-constrained architectures, underpinning physical-layer advances for 6G and beyond (Wan et al., 29 Dec 2025, Cao et al., 2023, Feng et al., 11 Jan 2025, Yu et al., 2024).