MHACL for Secure SIM-MIMO Systems
- The paper introduces a framework integrating geometric product-manifold optimization with multi-agent continual learning to boost secrecy rates in secure MIMO systems.
- MHACL utilizes Riemannian gradients and dual-scale policy updates to efficiently solve high-dimensional, non-convex joint optimization problems under practical hardware constraints.
- SIMHACL, a reduced-complexity variant, achieves millisecond-level training and near-optimal performance, making it viable for dynamic and secure wireless communications.
Manifold-Enhanced Heterogeneous Multi-Agent Continual Learning (MHACL) is a framework for solving high-dimensional, non-convex joint optimization problems that arise in secure, stacked intelligent metasurface (SIM)-assisted multi-user multiple-input multiple-output (MIMO) wireless systems. MHACL integrates geometric product-manifold optimization, multi-agent continual policy learning, and dual-scale adaptive policy updates to efficiently maximize weighted sum secrecy rate (WSSR) under practical physical and computational constraints. A reduced-complexity template termed SIMHACL further enables millisecond-level training and near-optimal communication secrecy in dynamic environments (Shi et al., 2 Feb 2026).
1. System Model and Optimization Formulation
MHACL is derived for a MIMO downlink system where a base station, referred to as Alice, is equipped with multiple transmit antennas and an $M$-layer SIM. Each SIM layer comprises $N$ nearly-passive meta-atoms. $K$ single-antenna legitimate users (Bobs) and a single-antenna eavesdropper (Eve) are located in the far field of the SIM.
Each antenna transmits an independent Gaussian data stream. Wave-based beamforming is realized exclusively in the electromagnetic domain through phase shift manipulation:
- Phase-Shift Matrices: $\boldsymbol{\Phi}^{(m)} = \mathrm{diag}(e^{j\theta_1^{(m)}}, \ldots, e^{j\theta_N^{(m)}})$ for layer $m$; each phase element $e^{j\theta_n^{(m)}}$ has unit modulus.
- Inter-Layer Coupling: Encoded by fixed complex-valued matrices $\mathbf{W}^{(m)}$ for the $m$-th layer ($\mathbf{W}^{(1)}$ connects the antennas to the first SIM layer).
- Overall SIM Beamformer: $\mathbf{G} = \boldsymbol{\Phi}^{(M)}\mathbf{W}^{(M)} \cdots \boldsymbol{\Phi}^{(1)}\mathbf{W}^{(1)}$.
The composite downlink channel for user $k$ is $\mathbf{h}_k^H \mathbf{G}$, where $\mathbf{h}_k$ denotes the channel from the SIM's output layer to user $k$.
The $k$-th Bob's and Eve's per-stream SINRs $\gamma_k^{B}$ and $\gamma_k^{E}$ follow from the composite channels, and the $k$-th secrecy rate is the clipped rate difference $R_k^{\mathrm{sec}} = \left[\log_2(1+\gamma_k^{B}) - \log_2(1+\gamma_k^{E})\right]^{+}$.
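Under the definitions above, the wave-domain cascade and the clipped secrecy-rate difference can be sketched numerically. All dimensions, the random coupling matrices, and the single-user SINR form below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration (not from the paper):
M, N, Nt = 3, 16, 4   # SIM layers, meta-atoms per layer, transmit antennas

# Fixed inter-layer coupling matrices: W[0] maps the Nt antennas to layer 1,
# W[1..M-1] couple successive layers (random placeholders here).
W = [rng.standard_normal((N, Nt)) + 1j * rng.standard_normal((N, Nt))]
W += [rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
      for _ in range(M - 1)]

def sim_beamformer(theta):
    """Wave-domain cascade G = Phi_M W_M ... Phi_1 W_1."""
    G = np.eye(Nt, dtype=complex)
    for m in range(M):
        G = np.diag(np.exp(1j * theta[m])) @ W[m] @ G
    return G                      # shape (N, Nt)

def secrecy_rate(h_bob, h_eve, G, p=1.0, noise=1.0):
    """Clipped rate difference [R_bob - R_eve]^+ for one user."""
    snr_b = p * np.abs(h_bob.conj() @ G) ** 2 / noise
    snr_e = p * np.abs(h_eve.conj() @ G) ** 2 / noise
    return max(np.log2(1 + snr_b.sum()) - np.log2(1 + snr_e.sum()), 0.0)

theta = rng.uniform(0, 2 * np.pi, size=(M, N))
h_bob = rng.standard_normal(N) + 1j * rng.standard_normal(N)
h_eve = 0.3 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
print(secrecy_rate(h_bob, h_eve, sim_beamformer(theta)))
```

The clipping to zero reflects that a negative rate difference yields no secrecy, matching the $[\cdot]^+$ operator.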
The main objective, joint power and wave-based precoding optimization, is $\max_{\{p_k\},\,\{\theta_n^{(m)}\}} \sum_{k} w_k R_k^{\mathrm{sec}}$, subject to the total power budget $\sum_k p_k \le P_{\max}$, $p_k \ge 0$, and discretized unit-modulus phase constraints $\theta_n^{(m)} \in \mathcal{Q}_b$ for a $b$-bit codebook. This formulation is highly non-convex due to variable coupling, discrete unit-modulus phases, and the large solution space (Shi et al., 2 Feb 2026).
2. Product-Manifold Geometry and Riemannian Gradients
Phase coordination in MHACL is handled via geometric optimization over the product manifold $\mathcal{M} = (\mathcal{S}^1)^{MN}$, with each factor $\mathcal{S}^1$ corresponding to one phase element on the unit circle. This approach inherently enforces the unit-modulus constraint for each phase shift, avoiding extraneous parameterization, and reduces the search space to $MN$ real dimensions.
- Riemannian Gradient: For the WSSR objective $f$, the Euclidean gradient is backpropagated through the beamformer. The Riemannian gradient at each unit-modulus phase scalar $\theta$ is $\operatorname{grad} f(\theta) = \nabla f(\theta) - \operatorname{Re}\{\nabla f(\theta)\,\theta^{*}\}\,\theta$, i.e., the Euclidean gradient projected onto the tangent space for geometric consistency.
This manifold-based optimization preserves physical feasibility and supports efficient phase updates. It eliminates the need for auxiliary constraints and enables hardware-compatible phase mask updates (Shi et al., 2 Feb 2026).
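The tangent-space projection and normalization-based retraction on the product of unit circles can be sketched as follows; the toy objective $\mathrm{Re}\{\mathbf{a}^H\boldsymbol{\theta}\}$ and the step size are assumptions chosen only to make the update verifiable:

```python
import numpy as np

def riemannian_grad(theta, egrad):
    """Project the Euclidean gradient onto the tangent space of the unit
    circle at each phase element theta (with |theta| = 1):
        grad f = egrad - Re{egrad * conj(theta)} * theta
    """
    return egrad - np.real(egrad * np.conj(theta)) * theta

def retract(theta):
    """Map a point back onto the manifold by elementwise normalization."""
    return theta / np.abs(theta)

# Toy check: Riemannian ascent on f(theta) = Re{a^H theta}, whose maximizer
# over unit-modulus theta is theta_n = a_n / |a_n|.
rng = np.random.default_rng(1)
a = rng.standard_normal(8) + 1j * rng.standard_normal(8)
theta = np.exp(1j * rng.uniform(0, 2 * np.pi, 8))
for _ in range(200):
    theta = retract(theta + 0.1 * riemannian_grad(theta, a))
print(np.real(np.conj(a) @ theta))   # approaches sum of |a_n|
```

Note that the iterates stay exactly unit-modulus at every step, which is the hardware-feasibility property the text emphasizes.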
3. Multi-Agent Continual and Dual-Scale Policy Learning
MHACL leverages a heterogeneous multi-agent formulation: each agent is associated with a decision variable—either BS power allocation or SIM phase shifts per layer. Continual learning is realized by separating adaptation into two timescales:
- Local (Fast) Updates: At each timeslot, agents perform a fixed number of Riemannian gradient steps on phase and power variables, using comparatively large inner-loop step sizes.
- Global (Slow) Meta-Updates: After several slots or iterations, network-level parameters (masks, preconditioners, Transformer weights) are updated via Adam with smaller step sizes, driven by gradients accumulated from recent activity.
This dual-scale architecture enables rapid response to fast channel variations while ensuring long-term stability and policy consolidation via meta-learning. Continual learning is enforced through a regularizer penalizing deviation from previous solutions, stored in a prioritized memory buffer (Shi et al., 2 Feb 2026).
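A minimal sketch of the continual-learning penalty, assuming a plain list as the memory buffer with uniform weights (the paper's prioritized weighting is not reproduced) and a hypothetical regularization weight `lam`:

```python
import numpy as np

def continual_objective(task_loss, theta, memory, lam=0.1):
    """Instantaneous loss = task loss (e.g. negative WSSR) plus a quadratic
    penalty tying the current phase solution to solutions replayed from the
    memory buffer, which discourages drift from previously learned policies."""
    penalty = sum(np.linalg.norm(theta - t_old) ** 2 for t_old in memory)
    return task_loss + lam * penalty

# Two timescales: many fast inner steps per slot, occasional slow meta step.
# The meta step size is chosen much smaller than the inner step size,
# matching the convergence condition discussed later in the article.
ALPHA_FAST, ALPHA_SLOW = 0.1, 0.001

theta = np.exp(1j * np.linspace(0.0, 1.0, 4))
memory = [theta.copy()]
print(continual_objective(-2.5, theta, memory))   # -> -2.5 (zero penalty)
```

When the current solution matches the buffered one, the penalty vanishes and only the task loss remains; any deviation strictly increases the objective.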
4. Algorithm Structure and SIMHACL Low-Complexity Template
MHACL Algorithm Steps
- Initialization: Uniform power allocation; phases initialized from prior memory.
- CSI Observation: Gradient tensors for power and phase are computed from the observed channel state.
- Inner Loop: Iterative Riemannian descent on power and phase, respecting system and manifold constraints.
- Instantaneous Loss Evaluation: Includes task loss and regularization.
- Meta-Update: Periodically aggregate inner-loop gradients to update higher-level network parameters.
- Memory Update: Update buffer and proceed to the next time slot.
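The per-timeslot inner loop above can be sketched end to end on a toy surrogate objective. The surrogate, step sizes, and dimensions are assumptions; only the structure (Riemannian phase step, retraction, power normalization) mirrors the listed steps:

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, P_MAX = 8, 4, 1.0   # phase elements, users, power budget (illustrative)

def rgrad(theta, egrad):
    """Riemannian gradient on the product of unit circles."""
    return egrad - np.real(egrad * np.conj(theta)) * theta

def timeslot(theta, p, a, inner_steps=5, alpha=0.1):
    """One slot of the inner loop on a toy surrogate objective
    Re{a^H theta} + sum(log(1 + p)): Riemannian ascent on phases,
    projected ascent on powers under the sum-power budget."""
    for _ in range(inner_steps):
        theta = theta + alpha * rgrad(theta, a)
        theta = theta / np.abs(theta)            # retraction to the manifold
        p = np.maximum(p + alpha / (1.0 + p), 0.0)
        p = p * (P_MAX / p.sum())                # renormalize to power budget
    return theta, p

theta = np.exp(1j * rng.uniform(0, 2 * np.pi, N))         # phases from "memory"
p = np.full(K, P_MAX / K)                                 # uniform power init
a = rng.standard_normal(N) + 1j * rng.standard_normal(N)  # per-slot gradient source
theta, p = timeslot(theta, p, a)
```

After each slot the iterates remain feasible by construction: phases stay unit-modulus and powers stay nonnegative within the budget, so no projection back into the feasible set is ever needed.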
SIMHACL Variant
SIMHACL reduces complexity by:
- Embedding all $MN$ phases in a single compact coordinate vector and enforcing unit modulus via Riemannian flows.
- Replacing cubic-cost matrix inversions with diagonal preconditioners, reducing the per-update cost of phase updates from cubic to linear in $MN$.
- Utilizing Proposition 1 for power: per-iteration normalization is sufficient, so power updates reduce to a single linear-cost rescaling.
The combined effect yields linear per-iteration complexity for SIMHACL, compared to the cubic-order inversion costs incurred by base MHACL and classical alternating optimization (Shi et al., 2 Feb 2026).
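The cost gap between full and diagonal preconditioning can be illustrated directly; the SPD curvature matrix below is a random placeholder, not the paper's preconditioner:

```python
import numpy as np

n = 300                      # stands in for the M*N phase coordinates
rng = np.random.default_rng(3)
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)  # SPD curvature matrix (placeholder)
g = rng.standard_normal(n)   # gradient vector (placeholder)

# Full preconditioning: solve H s = g, a cubic-cost inversion per update.
s_full = np.linalg.solve(H, g)

# SIMHACL-style diagonal preconditioning: linear cost per update.
s_diag = g / np.diag(H)

# Both yield descent directions for an SPD H (positive inner product with g).
print(g @ s_full > 0, g @ s_diag > 0)   # -> True True
```

The diagonal variant discards curvature cross-terms but keeps a valid descent direction, which is why near-optimal quality can survive the cubic-to-linear cost reduction.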
5. Convergence, Complexity, and Theoretical Guarantees
MHACL's updates on the product manifold ensure that, under Lipschitz-continuous Riemannian gradients and sufficiently small step sizes, the iterates converge to a first-order stationary point. The addition of a continual-learning regularizer is shown to maintain bounded deviation from previously learned solutions, so the overall process converges to an $\epsilon$-stationary regime when the meta-update step size is much smaller than the inner-loop step size.
The low-complexity SIMHACL variant further attains near-optimal solutions with provably linear per-iteration cost, reducing hardware overhead and learning latency (Shi et al., 2 Feb 2026).
6. Simulation Setup and Performance Outcomes
MHACL and SIMHACL were validated in a multi-user downlink in which Alice employs multiple transmit antennas and a SIM with up to $M = 6$ layers of meta-atoms. Environmental conditions included a $28$ GHz carrier, $10$ MHz bandwidth, quasi-static correlated Rayleigh fading, and an eavesdropper situated at the user cluster center.
Key performance metrics and results are summarized below:
| Metric | MHACL | SIMHACL |
|---|---|---|
| Convergence | Within 1% of final WSSR in a small number of iterations | Comparable |
| Training time per iteration | $1.4$ ms | $1.0$ ms (≈30% reduction) |
| WSSR gain vs. layer count ($M$ up to 6) | Gains saturate beyond moderate $M$ | Comparable |
| Phase quantization penalty | ∼10% WSSR loss at 1 bit | <2% gap at 4 bits |
| User scaling (WSSR) | Peaks when user count matches transmit antennas | Similar trend |
| WSSR at low transmit power | Baseline | Below MHACL; gap closes as power increases |
Beyond a certain number of layers, inter-layer loss saturates the WSSR improvement. One-bit phase quantization causes a ∼10% WSSR loss; this drops below 2% at 4-bit phase resolution. WSSR is maximized when the number of users matches the number of transmit antennas, with degradation beyond this point due to power dilution and increased inter-user interference. SIMHACL approaches MHACL performance as transmit power increases (Shi et al., 2 Feb 2026).
7. Context and Implications
The MHACL family enables efficient, scalable, and resource-conscious learning and adaptation in secure MIMO systems with SIMs, directly leveraging the physical geometry and system constraints. A plausible implication is that the product manifold and continual multi-agent learning principles embedded in MHACL may generalize to other high-dimensional, non-convex wireless optimization scenarios, particularly those involving hardware-constrained programmable metasurfaces. The linear per-iteration cost, millisecond response time, and near-optimal secrecy metrics position the approach as a candidate for future 6G secure communication deployments (Shi et al., 2 Feb 2026).