Sum-MSE Minimization Techniques

Updated 20 February 2026

Sum-MSE minimization is a strategy that reduces aggregate mean-squared error across multiple channels by optimizing system parameters in communications and sensor networks.
It leverages methodologies such as Lagrange multipliers, KKT conditions, and duality to achieve efficient designs via water-filling and convex optimization techniques.
Applications include MIMO transceiver design, distributed estimation, and ISAC waveform tuning, ensuring robust and scalable performance under practical constraints.

Sum-MSE minimization refers to the design and optimization of linear or nonlinear systems—often communication or estimation systems—so as to minimize the aggregate mean-squared error (MSE) across multiple data streams, users, or system components. It is fundamental in MIMO transceiver design, distributed estimation, network coordination, relay systems, ISAC waveform design, and sensor selection. The sum-MSE objective often enables tractable formulations and admits closed-form or efficiently solvable subproblems using tools from convex optimization, Lagrange duality, majorization theory, quadratic matrix programming, and stochastic approximation.

1. Formal Definition and System Models

Sum-MSE minimization arises in systems where multiple outputs (streams, users, sensors) contribute individual MSE terms that are aggregated into a single objective: $\mathrm{Sum\!-\!MSE} = \sum_{k=1}^K \mathbb{E}\|s_k - \hat{s}_k\|^2$, where $s_k$ and $\hat{s}_k$ are the true and estimated signals for index $k$. In MIMO communications, the standard linear model is $\mathbf{y} = \mathbf{H} \mathbf{F} \mathbf{s} + \mathbf{n}$, with channel $\mathbf{H}$, precoder $\mathbf{F}$, input $\mathbf{s}$, and noise $\mathbf{n}$. With unit-covariance input and the LMMSE equalizer $\mathbf{G}$ applied at the receiver, the sum-MSE is the trace of the MSE matrix and depends on the precoder only: $J(\mathbf{F}) = \mathrm{Tr}\left[\, (\mathbf{F}^H\mathbf{H}^H\mathbf{R}_n^{-1}\mathbf{H}\mathbf{F}+\mathbf{I})^{-1} \right]$, where $\mathbf{R}_n$ is the noise covariance (Xing et al., 2016). Similar quadratic forms appear in distributed estimation (sum-MSE over sensors), relay networks, and integrated sensing-and-communications (ISAC) systems (Fauß et al., 2019, Cui et al., 2022).
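As a numerical illustration of the trace expression, the following NumPy sketch evaluates it for a random channel and compares it with the MSE obtained by forming the LMMSE equalizer explicitly. The dimensions, the random channel and precoder, the white noise level, and the function names are assumptions made for this example only; the input covariance is taken to be the identity.

```python
import numpy as np

def sum_mse_trace(H, F, Rn):
    """Tr[(F^H H^H Rn^{-1} H F + I)^{-1}], assuming E[s s^H] = I."""
    A = F.conj().T @ H.conj().T @ np.linalg.solve(Rn, H @ F)
    return np.trace(np.linalg.inv(A + np.eye(F.shape[1]))).real

def sum_mse_direct(H, F, Rn):
    """Sum-MSE computed from the LMMSE equalizer G explicitly."""
    Heff = H @ F                               # effective channel H F
    Ry = Heff @ Heff.conj().T + Rn             # covariance of y
    G = Heff.conj().T @ np.linalg.inv(Ry)      # LMMSE equalizer
    E = np.eye(F.shape[1]) - G @ Heff          # MSE matrix under LMMSE
    return np.trace(E).real

# Hypothetical example setup (not taken from the cited works).
rng = np.random.default_rng(0)
Nr, Nt, K = 4, 4, 3
H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)
F = (rng.standard_normal((Nt, K)) + 1j * rng.standard_normal((Nt, K))) / np.sqrt(2)
Rn = 0.1 * np.eye(Nr)

j_trace = sum_mse_trace(H, F, Rn)
j_direct = sum_mse_direct(H, F, Rn)
print(j_trace, j_direct)
```

The two values agree by the matrix inversion lemma, which is why the equalizer does not need to appear in the objective once it is set to its LMMSE form.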

2. Optimization Frameworks and Algorithmic Techniques

2.1. Lagrange Multiplier and KKT-Based Approaches

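Sum-MSE minimization with power constraints leads to KKT conditions that yield the optimal SVD structure for the precoder. Once the precoder is aligned with the right singular vectors of the noise-whitened channel, the objective under a sum-power constraint reduces to $\sum_i (1 + h_i^2 f_i^2)^{-1}$, and the KKT conditions give the water-filling power allocation $f_i^2 = \left(\frac{1}{\sqrt{\mu} h_i} - \frac{1}{h_i^2}\right)^+$, where $(x)^+ = \max(x, 0)$, $h_i$ are the channel singular values, and the Lagrange multiplier $\mu$ is chosen so that the powers $f_i^2$ meet the total power budget (Xing et al., 2016).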
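The water-filling rule can be sketched numerically as follows. This is a minimal illustration, assuming a sum-power budget `P` and a bisection search on $\mu$; the function name, the bisection bracket, and the example singular values are choices made here, not details from the cited work.

```python
import numpy as np

def sum_mse_waterfilling(h, P, iters=100):
    """Return f_i^2 = (1/(sqrt(mu) h_i) - 1/h_i^2)^+ with sum_i f_i^2 = P."""
    h = np.asarray(h, dtype=float)

    def alloc(mu):
        return np.maximum(1.0 / (np.sqrt(mu) * h) - 1.0 / h**2, 0.0)

    # Total allocated power decreases as mu grows; at mu = max(h)^2 it is zero.
    lo, hi = 1e-12, np.max(h) ** 2
    for _ in range(iters):
        mid = np.sqrt(lo * hi)          # bisection on a log scale
        if alloc(mid).sum() > P:
            lo = mid
        else:
            hi = mid
    return alloc(hi), hi

# Hypothetical singular values and power budget.
h = np.array([2.0, 1.0, 0.3])
f_sq, mu = sum_mse_waterfilling(h, P=1.0)
mse_wf = np.sum(1.0 / (1.0 + h**2 * f_sq))
mse_equal = np.sum(1.0 / (1.0 + h**2 / 3.0))
print(f_sq, mse_wf, mse_equal)
```

In this example the weakest eigenmode receives no power, which is the characteristic thresholding behavior of the $(\cdot)^+$ operator.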

2.2. Majorization and Schur-Concavity

Majorization theory provides conditions for optimal precoder structure by examining Schur-concavity of the sum-MSE as a function of eigenvalues. When all weights are equal, the problem decomposes into eigenmode water-filling (Xing et al., 2016).

2.3. Duality and Geometric Programming

For multiuser MIMO broadcast, sum-MSE minimization is often non-convex in the downlink, but it can be recast as a convex uplink problem via MSE duality. The power allocation can then be handled via geometric programming (GP) for per-antenna, per-user, or per-symbol constraints (Bogale et al., 2013). Strong duality facilitates alternating optimization between uplink and downlink domains.

2.4. Quadratic Matrix Programming (QMP)

When the MSE is quadratic in block variables (e.g., transceivers, relay matrices), each block update can be cast as a QMP subproblem: $\min_{X} \mathrm{Tr}(D_0 X^H A_0 X)+2\mathrm{Re}\{\mathrm{Tr}(B_0^H X)\}+c_0$ with various quadratic constraints. Solution approaches include SDP relaxation, SOCP reformulation, or exploitation of closed-form updates for single or no constraints (Xing et al., 2012).
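For the unconstrained case, the closed-form update follows from setting the gradient with respect to $X^*$ to zero, which gives $A_0 X D_0 + B_0 = 0$. The sketch below assumes $A_0$ and $D_0$ are Hermitian positive definite (so the stationary point is the unique minimizer); the helper names and random test data are illustrative only.

```python
import numpy as np

def qmp_objective(X, A0, B0, D0, c0=0.0):
    """Tr(D0 X^H A0 X) + 2 Re{Tr(B0^H X)} + c0."""
    val = np.trace(D0 @ X.conj().T @ A0 @ X) + 2 * np.trace(B0.conj().T @ X).real + c0
    return val.real

def qmp_unconstrained(A0, B0, D0):
    """Stationary point of the unconstrained QMP: X = -A0^{-1} B0 D0^{-1}."""
    return -np.linalg.solve(A0, B0) @ np.linalg.inv(D0)

def random_hpd(n, rng):
    """Random Hermitian positive-definite matrix (example data only)."""
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return M @ M.conj().T + n * np.eye(n)

rng = np.random.default_rng(0)
n, m = 4, 3
A0, D0 = random_hpd(n, rng), random_hpd(m, rng)
B0 = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
X_opt = qmp_unconstrained(A0, B0, D0)
print(qmp_objective(X_opt, A0, B0, D0))
```

With one or more quadratic constraints this closed form no longer applies directly, and the SDP or SOCP routes mentioned above would be needed.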

2.5. Stochastic Projection and Nonconvexity

In models where MSE is nonconvex or not analytically tractable, e.g., with mixed-Gaussian noise or nonlinearities, stochastic approximation algorithms (e.g., Robbins–Monro) are used, drawing samples to estimate gradients and applying projected updates (Flåm et al., 2012). This is effective for pilot/precoder design under mixed Gaussian input/output.
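The mechanics of a projected Robbins–Monro iteration can be shown on a deliberately simple toy problem. The sketch below is not the pilot or precoder design of the cited work: it assumes independent scalar streams $y_k = h_k s_k + n_k$ with two-component Gaussian-mixture noise, learns per-stream receive gains $g_k$ from sample gradients of $\sum_k (s_k - g_k y_k)^2$, and projects onto a box. All parameter values and names are assumptions. Because the estimator is linear, a closed-form optimum exists here and serves as a check.

```python
import numpy as np

def projected_robbins_monro(h, n_iter=20000, g_max=2.0, seed=0):
    """Projected stochastic approximation for per-stream gains g (toy model)."""
    rng = np.random.default_rng(seed)
    h = np.asarray(h, dtype=float)
    g = np.zeros_like(h)
    for t in range(n_iter):
        s = rng.choice([-1.0, 1.0], size=h.shape)        # unit-power symbols
        wide = rng.random(h.shape) < 0.1                 # pick mixture component
        n = rng.standard_normal(h.shape) * np.where(wide, 1.0, np.sqrt(0.1))
        y = h * s + n
        grad = -2.0 * (s - g * y) * y                    # sample gradient of the squared error
        step = 1.0 / (t + 20)                            # diminishing step size
        g = np.clip(g - step * grad, 0.0, g_max)         # projection onto the box [0, g_max]
    return g

h = np.array([1.0, 0.5])
noise_var = 0.9 * 0.1 + 0.1 * 1.0      # variance of the assumed mixture
g_star = h / (h**2 + noise_var)        # closed-form linear MMSE gains
g_hat = projected_robbins_monro(h)
print(g_hat, g_star)
```

In genuinely nonconvex settings no such closed-form reference is available, and the iteration may settle in a local minimum depending on the starting point.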

2.6. Block Coordinate Descent and Efficient Lattice Methods

When practical constraints introduce discrete variables (e.g., discrete-phase RIS, quantized precoders), sum-MSE minimization becomes a mixed-integer problem. Block coordinate descent (BCD) alternates between the variable sets, with the discrete lattice subproblems solved via sphere decoding (SESD); this delivers stationary-point convergence with tractable complexity in realistic scenarios (Ramezani et al., 2024).
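The alternating structure can be illustrated with a much simpler stand-in for the lattice step. The sketch below replaces sphere decoding with an exhaustive search over the phase levels of one RIS element at a time, so it is a plain coordinate search, not SESD. It assumes an effective channel $H_d + G\,\mathrm{diag}(\theta)\,H_r$, an identity precoder, and an LMMSE receiver; the channel model, dimensions, and names are hypothetical.

```python
import numpy as np

def sum_mse(theta, Hd, G, Hr, sigma2):
    """Sum-MSE with LMMSE receiver for Heff = Hd + G diag(theta) Hr."""
    Heff = Hd + G @ (theta[:, None] * Hr)
    K = Heff.shape[1]
    return np.trace(np.linalg.inv(np.eye(K) + Heff.conj().T @ Heff / sigma2)).real

def coordinate_search(Hd, G, Hr, sigma2, bits=2, sweeps=3):
    """Update one discrete phase at a time by exhaustive search over its levels."""
    levels = np.exp(2j * np.pi * np.arange(2**bits) / 2**bits)
    theta = np.ones(G.shape[1], dtype=complex)      # start from zero phase shift
    history = [sum_mse(theta, Hd, G, Hr, sigma2)]
    for _ in range(sweeps):
        for n in range(theta.size):
            costs = []
            for phi in levels:
                trial = theta.copy()
                trial[n] = phi
                costs.append(sum_mse(trial, Hd, G, Hr, sigma2))
            theta[n] = levels[int(np.argmin(costs))]  # keep the best level
        history.append(sum_mse(theta, Hd, G, Hr, sigma2))
    return theta, history

def crandn(rng, *shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

rng = np.random.default_rng(0)
Nr, N, K = 4, 8, 2                                   # receive antennas, RIS elements, streams
Hd, G, Hr = crandn(rng, Nr, K), crandn(rng, Nr, N), crandn(rng, N, K)
theta, history = coordinate_search(Hd, G, Hr, sigma2=1.0)
print(history)
```

Because the current phase is always among the candidates, the recorded sum-MSE never increases from one sweep to the next; the search only guarantees a coordinate-wise optimum, not a global one.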

3. Extensions and Robustness Considerations

3.1. Weighted Sum-MSE and Matrix-Field Models

Weighted sum-MSE directly generalizes the objective by assigning a significance to each stream or user: $J(\mathbf{F}) = \sum_{i=1}^N w_i\,d_i$, where $w_i \ge 0$ is the weight of stream $i$ and $d_i$ is its MSE. The matrix-field weighted-MSE model further extends this, enabling objectives such as min-max MSE, sum-rate maximization, or nonlinear MIMO strategies. Solutions often require iterative updates over additional parameter blocks (Xing et al., 2013).

3.2. Robust Sum-MSE Under Uncertainty

For systems with channel uncertainty, norm-bounded or stochastic error models are incorporated. The robust sum-MSE minimization then either takes the worst-case or mean MSE over the error set. Duality persists but the effective channel/noise structures are modified, and care is taken in imposing per-antenna, per-user, or aggregate power constraints (Bogale et al., 2013).

3.3. Energy Allocation Jointly with Sum-MSE

In time-varying or block-fading systems, the energy split (e.g., between training and data phases) can be optimized jointly with the sum-MSE transceiver design. With MMSE channel estimation the two problems decouple, so the optimal energy partition is available in closed form and can be plugged into any standard MSE-minimizing algorithm (Tenenbaum et al., 2010).

4. Applications and System Contexts

MIMO transceiver and relay networks: Alternating optimization or closed-form updates for sum-MSE over multi-hop, multiuser, and coordinated MIMO systems (Xing et al., 2016, Wang et al., 2012, Xing et al., 2013, Xing et al., 2012).
RIS/Meta-Surfaces and Large-Intelligent Surfaces: Joint optimization of RIS phases, quantized precoders, and receiver filters for sum-MSE; block coordinate descent with exact lattice search for discrete constraints (Ramezani et al., 2024).
ISAC waveform design: Joint design of communication and sensing via sum-MSE minimization, with DoF-completion (waveform augmentation) guaranteeing closure of the sensing MSE gap versus communication-only precoding (Cui et al., 2022).
Distributed estimation and sensor networks: Weighted sum of MMSEs with the prior restricted to an uncertainty ball; tight bounds are derived under KL divergence constraints, and saddle-point conditions yield minimax-robust estimators (Fauß et al., 2019).
Over-the-air computation: Design of transmit/receive scaling laws to minimize worst-case sum-MSE in simultaneous sum recovery over shared multiple-access channels (Kakar et al., 2020).
Sensor selection: Cardinality-constrained minimization of sum-MSE in Kalman smoothing, where the (non-)submodularity structure admits explicit performance guarantees for greedy algorithms (Kohara et al., 2020).
Heterogeneous networks: Multi-tier sum-MSE optimization across macro and small cells, including alternating optimization, constraint-relaxed and normalization-based algorithms, and separate MSE strategies for scalable complexity (Dai et al., 2016).
Holographic beamforming: Hybrid analog/digital designs leveraging quadratic dependence of sum-MSE on surface weights to admit closed-form, low-complexity updates, scaling linearly with surface size (Sheemar et al., 2025).

5. Key Insights, Challenges, and Performance Guarantees

5.1. Structural Properties

Decoupling in alternating optimization: For fixed receive/transmit variables, each subproblem is convex and often admits water-filling- or eigenstructure-based solutions if the objective and constraints preserve convexity (Xing et al., 2016, Xing et al., 2012).
Uplink-downlink duality underpinning convex relaxation and power allocation (Bogale et al., 2013).
Nonconvexity in certain channel models (e.g., mixed-Gaussian) necessitates randomized restarts or stochastic update methods due to prevalence of local minima (Flåm et al., 2012).

5.2. Robust Design and Bounds

Tight upper and lower bounds on the weighted sum of MMSEs can be computed by restricting the input prior to a divergence ball; the extremal distributions are constructed via Riccati-type or fixed-point equations, and the resulting linear estimators are minimax-robust (Fauß et al., 2019).
In sensor selection, the sum-MSE objective is not truly submodular; this is compensated by introducing the submodularity ratio and curvature. The performance guarantees of greedy selection algorithms then depend explicitly on spectral properties of the problem data (Kohara et al., 2020).
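A greedy selection loop of the kind these guarantees apply to can be sketched as follows. To keep the example short it assumes a static linear-Gaussian model (a state $x \sim \mathcal{N}(0, P_0)$ observed through scalar sensors $y_i = a_i^T x + v_i$) instead of the Kalman smoothing setting of the cited work, and it does not compute the submodularity ratio or curvature. All names and data are hypothetical.

```python
import numpy as np

def posterior_sum_mse(A, selected, prior_cov, noise_var):
    """Trace of the posterior covariance given the selected sensor rows of A."""
    info = np.linalg.inv(prior_cov)
    if selected:
        As = A[list(selected)]
        info = info + As.T @ As / noise_var
    return np.trace(np.linalg.inv(info))

def greedy_select(A, k, prior_cov, noise_var=1.0):
    """Pick k sensors, each time adding the one that lowers the sum-MSE most."""
    selected = []
    trace_hist = [posterior_sum_mse(A, selected, prior_cov, noise_var)]
    for _ in range(k):
        rest = [i for i in range(A.shape[0]) if i not in selected]
        best = min(rest, key=lambda i: posterior_sum_mse(A, selected + [i], prior_cov, noise_var))
        selected.append(best)
        trace_hist.append(posterior_sum_mse(A, selected, prior_cov, noise_var))
    return selected, trace_hist

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 4))          # 10 candidate sensors, 4-dimensional state
selected, trace_hist = greedy_select(A, k=3, prior_cov=np.eye(4))
print(selected, trace_hist)
```

Each added sensor can only add information, so the recorded sum-MSE is non-increasing; how close the greedy set comes to the best size-$k$ subset is what the submodularity-ratio and curvature bounds quantify.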

5.3. Numerical and System Performance

Optimized energy allocation between training and data phases yields a 2–4 dB SNR-equivalent improvement in practical MIMO downlink settings (Tenenbaum et al., 2010).
For relay and distributed MIMO networks, separable relaxations and closed-form designs yield scalable algorithms for 5G and beyond (Darsena et al., 2020).
In ISAC, DoF-completion eliminates the sensing MSE error floor that plagues communication-only precoding (Cui et al., 2022), and the corresponding SDP relaxations are globally tight.
Joint optimization of discrete and continuous control variables in RIS-MIMO or holographic systems via specialized algorithms (e.g., BCD+SESD, alternating MMSE) attains strict performance gains over quantized or random-phase baselines (Ramezani et al., 2024, Sheemar et al., 2025).

6. Comparative Overview and Algorithmic Summary

Problem Context	Key Methodology	Structural Feature
Point-to-point / MU-MIMO	Water-filling, SVD	Schur-concavity, KKT
Robust MIMO downlink/uplink	Geometric Programming	Duality, alternating opt.
Relay/AF-MIMO networks	QMP, SDP/SOCP, SVD	Matrix-monotone fns
RIS/Metasurface-assisted MIMO	Lattice decoding, BCD	SESD, block alternation
ISAC joint comm-sense	SDR, DoF completion	Tight lower-bound closure
Distributed estimation	KL-ball, Riccati eqs	Minimax robustness
Sensor selection	Greedy, spectral ratio	Submod. ratio/curvature
Holographic surfaces	Quadratic updates	Closed-form per element

7. Open Issues and Ongoing Research Directions

Handling extreme-scale or highly nonconvex MSE minimization in massive MIMO, LIS, and hybrid analog-digital architectures, especially with hardware nonlinearities and low-resolution quantization.
Tighter analytical performance bounds in the presence of practical constraints such as discrete phase shifts, finite rate control, or low-latency requirements.
Joint optimization across communication, sensing, and control objectives, moving beyond convex decoupling schemes to full end-to-end system performance maximization.

Sum-MSE minimization serves as a unifying objective across diverse high-dimensional and multi-agent networked systems, enabling both tractable optimization and robust, high-performance practical designs when correctly integrated with structural and algorithmic insights (Xing et al., 2016, Xing et al., 2012, Bogale et al., 2013, Wang et al., 2012, Ramezani et al., 2024, Cui et al., 2022, Tenenbaum et al., 2010, Fauß et al., 2019, Sheemar et al., 2025).