
Binary Robust Least Squares Overview

Updated 15 October 2025
  • BRLS is a robust least squares estimation framework that explicitly addresses binary, adversarial, and quantization uncertainties.
  • It employs a minimax formulation with supermodular and submodular structures, enabling efficient convex-concave relaxations and combinatorial strategies.
  • Advanced techniques like hard thresholding, SDP reformulation, and greedy search yield provable risk bounds and enhanced recovery across noisy, real-world applications.

Binary Robust Least Squares (BRLS) refers to a broad class of least squares estimation procedures that explicitly address robustness in the presence of binary, adversarial, or quantized uncertainties in the data, labels, or measurement operators. These methods are designed to minimize the detrimental effects of structured, discrete, or heavy-tailed noise, including mislabels, sign flips, binary corruption, and quantization errors, which can severely impair classical least squares solutions. The BRLS paradigm encompasses both algorithmic and theoretical innovations—ranging from hard-thresholding and moment matching, to minimax regret and combinatorial optimization—enabling more reliable recovery, estimation, or classification in settings where canonical assumptions (e.g., Gaussian noise, continuous outputs) are violated.

1. BRLS Formulation and Problem Classes

The canonical mathematical formulation for BRLS is a minimax optimization problem:

$$\min_{x \in \mathcal{X}} \max_{y \in \mathcal{Y}} \Theta(x, y) = \frac{1}{2} \|F(x) - C y\|^2$$

where:

  • $\mathcal{X} \subset \mathbb{R}^m$ is a convex (often compact) set of parameters,
  • $\mathcal{Y} = \{0,1\}^n$ encodes binary uncertainties (e.g., mislabels, sign-flip indices, binary perturbations),
  • $F: \mathbb{R}^m \to \mathbb{R}^r$ models continuous regression or measurement prediction,
  • $C \in \mathbb{R}^{r \times n}$ (the "noise propagation matrix") determines the mode and structure of binary noise injected into the measurements.
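
For concreteness, here is a minimal NumPy sketch of the objective above, with a linear $F(x) = Ax$, a diagonal $C$, and a brute-force inner maximization (tractable only for small $n$); all instance data and function names are illustrative:

```python
import itertools

import numpy as np

def theta(x, y, F, C):
    """BRLS objective Theta(x, y) = 0.5 * ||F(x) - C y||^2."""
    r = F(x) - C @ y
    return 0.5 * float(r @ r)

def worst_case_y(x, F, C):
    """Brute-force inner maximization over y in {0,1}^n (small n only)."""
    n = C.shape[1]
    best_y, best_val = None, -np.inf
    for bits in itertools.product((0.0, 1.0), repeat=n):
        y = np.array(bits)
        val = theta(x, y, F, C)
        if val > best_val:
            best_y, best_val = y, val
    return best_y, best_val

# Toy instance: linear F(x) = A x; a diagonal C models binary label flips.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))         # r = 5 measurements, m = 3 parameters
C = np.diag(rng.uniform(0.5, 1.5, 5))   # diagonal noise propagation (n = r = 5)
x = rng.standard_normal(3)
y_star, val = worst_case_y(x, lambda x: A @ x, C)
print("worst-case y:", y_star, " objective:", round(val, 3))
```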

This bi-level structure generalizes robust least squares with adversarial, quantized, or binary targets, and subsumes practical settings such as robust classification under uncertain labels, sparse recovery from 1-bit compressed measurements, or estimation from quantized data (Zhou et al., 13 Oct 2025). Special cases include:

  • $C = 0$: classical least squares,
  • $C$ diagonal: robust classification with binary label flips,
  • $F$ nonlinear: robust phase retrieval (e.g., in signal and imaging domains),
  • $C$ a quantization operator: least squares with operator uncertainty (Clancy et al., 2020).

The critical property is the explicit minimization over control variables $x$ coupled with worst-case binary (or discrete) maximization over $y$.

2. Geometric Structure: Supermodularity and Submodularity

The geometry of $C$ provides insight into the tractability and algorithmic strategies for the BRLS inner maximization. By analyzing the pairwise angles $\theta_{ij} = \arccos\left( \frac{c_i^T c_j}{\|c_i\| \|c_j\|} \right)$ between columns $c_i, c_j$ of $C$, two major regimes emerge:

  • Acute case ($\theta_{ij} \leq \frac{\pi}{2}$ for all $i \neq j$): $\Theta(x, \cdot)$ is supermodular; adding multiple coordinates to $y$ consistently increases the objective. This structure enables convex-concave relaxations using the Lovász extension, supporting gradient-based minimax optimization with theoretical guarantees (e.g., $\epsilon$-global minimax points in $O(\epsilon^{-2})$ steps) (Zhou et al., 13 Oct 2025).
  • Obtuse case ($\theta_{ij} \geq \frac{\pi}{2}$ for all $i \neq j$): $\Theta(x, \cdot)$ is submodular; additional binary noise or flips can counteract previous contributions. This justifies combinatorial algorithms, notably the deterministic double greedy algorithm, yielding $(1/3, \epsilon)$-approximate minimax points in $O(\epsilon^{-2})$ iterations, with exact recovery for orthogonal $C$. A sketch of the regime check follows this list.
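
A minimal sketch of this regime check, computing the pairwise column angles of $C$ from its Gram matrix (the function name and tolerance are illustrative; orthogonal columns satisfy both conditions and are reported as acute here):

```python
import numpy as np

def classify_regime(C, tol=1e-12):
    """Classify the BRLS inner maximization by the pairwise column angles of C:
    all angles <= pi/2 (cosines >= 0)  -> acute / supermodular regime,
    all angles >= pi/2 (cosines <= 0)  -> obtuse / submodular regime."""
    G = C.T @ C                        # Gram matrix of the columns of C
    norms = np.sqrt(np.diag(G))
    cos = G / np.outer(norms, norms)   # cosines of pairwise angles theta_ij
    off = cos[~np.eye(C.shape[1], dtype=bool)]
    if np.all(off >= -tol):
        return "acute (supermodular)"
    if np.all(off <= tol):
        return "obtuse (submodular)"
    return "mixed"

print(classify_regime(np.abs(np.random.default_rng(1).standard_normal((6, 4)))))
print(classify_regime(np.eye(4)))      # orthogonal boundary case
```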

This modularity classification directly informs optimization design and complexity bounds for BRLS, and generalizes to settings involving hypercube-constrained adversarial noise.
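
For the obtuse case, the following is a sketch of the deterministic double greedy routine in the style of Buchbinder et al., applied to the inner problem $y \mapsto \Theta(x, y)$ for fixed $x$; the names and toy orthogonal instance are illustrative, and the exact subsolver in (Zhou et al., 13 Oct 2025) may differ in details:

```python
import numpy as np

def double_greedy(f, n):
    """Deterministic double greedy for unconstrained submodular maximization
    (1/3-approximation). f maps a boolean inclusion mask to a value."""
    lo = np.zeros(n, dtype=bool)       # candidate grown from the empty set
    hi = np.ones(n, dtype=bool)        # candidate shrunk from the full set
    for i in range(n):
        lo_on = lo.copy(); lo_on[i] = True
        hi_off = hi.copy(); hi_off[i] = False
        gain_add = f(lo_on) - f(lo)    # marginal gain of adding i to lo
        gain_del = f(hi_off) - f(hi)   # marginal gain of dropping i from hi
        if gain_add >= gain_del:
            lo = lo_on                 # commit to including coordinate i
        else:
            hi = hi_off                # commit to excluding coordinate i
    return lo                          # lo == hi at termination

# Orthogonal C = I: Theta(x, y) = 0.5 * ||F(x) - y||^2, recovered exactly.
fx = np.array([0.8, -0.3, 1.2, -0.1])              # F(x) for some fixed x
f = lambda mask: 0.5 * float(np.sum((fx - mask) ** 2))
print(double_greedy(f, 4).astype(int))             # worst-case flip pattern
```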

3. Algorithmic Frameworks and Solution Techniques

BRLS leverages specialized algorithms that accommodate discrete maximization and robustness constraints:

  • Projected Gradient Algorithms: In the supermodular setting (acute $C$), minimax optimization is performed over continuous extensions (the Lovász extension; see the sketch after this list). This permits saddle-point computation for the convex-concave relaxation, enabling efficient convergence to global solutions, even with nonlinear $F$ (provided differentiability).
  • Double Greedy Subsolvers: For submodular (obtuse $C$) problems, the double greedy algorithm (sketched at the end of Section 2) constructs lower-bound and upper-bound solutions by iteratively deciding inclusion or exclusion of binary perturbations, yielding provable approximation bounds.
  • Hard Thresholding for Binary Corruptions: In adversarial label or response corruption settings, algorithms such as TORRENT (Bhatia et al., 2015) iteratively threshold residuals to select “clean” subsets, alternately updating model parameters and active sets to efficiently recover robust predictors even when 40–50% of outputs are corrupted (see the sketch at the end of this section).
  • SDP Reformulation: In BRLS with bounded uncertainty in the data matrix and outputs, minimax regret formulations enable SDP-based design, bridging worst-case robustness and average-case performance (Vanli et al., 2013, Kalantarova et al., 2012).
  • Greedy Search for Bayesian Binary Selection: The Single Best Replacement (SBR) method in Bayesian $\ell_0$-regularized LS provides computationally scalable robust variable selection, exploiting the binary nature of spike-and-slab priors (Polson et al., 2017).
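
As a complement to the projected-gradient item above, here is a sketch of evaluating the Lovász extension that underlies the acute-case relaxation; the quadratic set function and instance below are illustrative:

```python
import numpy as np

def lovasz_extension(f, z):
    """Evaluate the Lovasz extension of a set function f at z in [0,1]^n
    via the standard sorted telescoping sum; for supermodular f this
    extension is concave in z, making the relaxed inner problem tractable."""
    order = np.argsort(-z)             # coordinates in decreasing order of z
    mask = np.zeros(len(z), dtype=bool)
    prev = f(mask)                     # f(empty set)
    val = prev
    for i in order:
        mask[i] = True
        cur = f(mask)
        val += z[i] * (cur - prev)     # telescoping marginal gain
        prev = cur
    return val

# Acute instance: nonnegative columns of C give pairwise cosines >= 0.
rng = np.random.default_rng(2)
C = np.abs(rng.standard_normal((4, 3)))
fx = rng.standard_normal(4)            # F(x) for some fixed x
f = lambda mask: 0.5 * float(np.sum((fx - C @ mask.astype(float)) ** 2))
print(lovasz_extension(f, np.array([0.7, 0.2, 0.9])))
```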

These algorithmic advances are supported by theoretical guarantees on risk bounds (often achieving $O(d/n)$ rates without extra logarithmic factors (Audibert et al., 2010)), approximation factors, sampling complexity, and support recovery.
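
A minimal sketch of the hard-thresholding loop in the spirit of TORRENT follows; the initialization, iteration count, and assumed clean-set size are illustrative tuning choices, and the published algorithm differs in details and carries formal guarantees:

```python
import numpy as np

def torrent_style(X, y, n_clean, iters=20):
    """Alternating hard-thresholding loop: fit LS on a presumed-clean subset,
    then re-select the n_clean points with the smallest absolute residuals."""
    active = np.arange(n_clean)            # arbitrary initial active set
    for _ in range(iters):
        w, *_ = np.linalg.lstsq(X[active], y[active], rcond=None)
        resid = np.abs(y - X @ w)
        active = np.argsort(resid)[:n_clean]
    return w

# Toy demo: 30% of responses hit by adversarial sign flips.
rng = np.random.default_rng(3)
n, d = 200, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true
bad = rng.choice(n, size=int(0.3 * n), replace=False)
y[bad] = -y[bad]
w_hat = torrent_style(X, y, n_clean=int(0.6 * n))
print("parameter error:", np.linalg.norm(w_hat - w_true))
```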

4. Theoretical Performance and Risk Bounds

BRLS methodologies obtain strong statistical performance, especially in heavy-tailed, quantized, or adversarial scenarios. Key results include:

  • Risk Concentration: The excess risk, the gap between the estimator's risk and the oracle risk, can concentrate at $O(d/n)$ (dimension over sample size) without requiring exponential moment conditions on the noise (Audibert et al., 2010). PAC-Bayesian truncation analysis enables exponential deviation bounds even under only second- or fourth-moment conditions.
  • Minimax Estimation in Quantized and Binary Sensing: In 1-bit compressed sensing, cardinality-constrained LS decoding achieves the same minimax estimation rate $\sqrt{s \log n / m}$ as unquantized sensing; support recovery (exact variable selection) is achieved in $O(\log s)$ steps if signal magnitudes are above uncertainty thresholds (Ding et al., 2020). A toy decoding sketch follows this list.
  • Least Squares Decoding for Generative Priors: For signals with low intrinsic dimension measured via binary, noisy, or sign-flipped observations, LS decoding coupled with deep generative models achieves optimal error $O(\sqrt{(k \log(Ln))/m})$ provided $m \gg k \log(Ln)$ (Jiao et al., 2021).
  • Robustness to Structured Noise: Numerical experiments on health status prediction and phase retrieval under hypercube uncertainty demonstrate BRLS substantially outperforms classical LS and LASSO in worst-case error and support recovery across increasing noise ratios (Zhou et al., 13 Oct 2025).
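
As a toy illustration of the 1-bit setting, the sketch below generates sign measurements and decodes with standard iterative hard thresholding as a stand-in for cardinality-constrained LS decoding; dimensions and step size are illustrative, and only the signal direction is identifiable from signs:

```python
import numpy as np

def iht_decode(A, y, s, iters=100):
    """Iterative hard thresholding for min ||y - A x||^2 s.t. ||x||_0 <= s,
    used here as a simple stand-in for cardinality-constrained LS decoding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # safe step from the spectral norm
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = x + step * (A.T @ (y - A @ x))   # gradient step on the LS loss
        keep = np.argsort(-np.abs(g))[:s]    # indices of the s largest entries
        x = np.zeros_like(g)
        x[keep] = g[keep]
    return x

# 1-bit sensing: only the signs of the linear measurements are observed.
rng = np.random.default_rng(4)
m, n, s = 300, 1000, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[:s] = 1.0 / np.sqrt(s)                # unit-norm s-sparse signal
y = np.sign(A @ x_true)                      # 1-bit measurements
x_hat = iht_decode(A, y, s)
x_hat /= max(np.linalg.norm(x_hat), 1e-12)   # only the direction is recoverable
print("direction error:", np.linalg.norm(x_hat - x_true))
```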

5. Specialized BRLS Variants and Application Domains

BRLS encompasses numerous variants and application contexts:

  • Binary Classification with Class Rebalancing: Non-iteratively Reweighted Recursive LS (NR-RLS) enforces class-specific weights for imbalanced binary classification problems, converging deterministically to batch solutions with $O(d^2)$ update cost per sample (Jang, 2023).
  • Mixtures of Binary Regression Models: Robust LS moment matching (LSMM/GLSMM) enables consistent and asymptotically normal estimation in finite mixtures of GLMs with binary output, using cross-moment tensors up to order 3 (Auder et al., 2018).
  • Quantized Neural Networks: Least squares binary quantization algorithms provably minimize the mean squared error of quantized network weights and activations, supporting computational efficiency via bitwise operations and reducing post-quantization accuracy gaps at inference (Pouransari et al., 2020); the single-bit closed form is sketched after this list.
  • Robust Symbolic Regression in Wearable Data: Hybrid neural-symbolic architectures (e.g., transformer compression models) achieve significant robustness over standard LS, particularly in health monitoring data with structured asymmetric noise (Gutierrez et al., 4 Aug 2025).
  • Robust LS for Quantized Data Matrices: Convex minimax formulations directly address operator uncertainty arising from finite precision, via a closed-form robust objective with parameter selection tied to quantization granularity (Clancy et al., 2020).
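
For the single-bit case, the least squares quantizer has a simple closed form, sketched below; the multi-bit algorithms of Pouransari et al. (2020) generalize this, and the demo vector is illustrative:

```python
import numpy as np

def ls_binary_quantize(v):
    """Single-bit least squares quantization: argmin over scale a > 0 and
    signs s in {-1, +1}^n of ||v - a * s||^2, solved in closed form by
    s = sign(v) and a = mean(|v|)."""
    s = np.sign(v)
    s[s == 0] = 1.0                  # break ties for exact zeros
    a = float(np.mean(np.abs(v)))
    return a, s

rng = np.random.default_rng(5)
v = rng.standard_normal(1024)        # stand-in for a layer's weight vector
a, s = ls_binary_quantize(v)
print(f"scale = {a:.4f}, quantization MSE = {np.mean((v - a * s) ** 2):.4f}")
```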

6. Comparative Evaluation and Practical Impact

BRLS methods consistently outperform classical least squares and classical robust regression approaches, such as Huber and SoftL1 loss minimization, direct $L_1$ penalization, or simple subsampling, in settings where the noise is binary, adversarial, or quantized (a small baseline comparison is sketched at the end of this section):

  • Outlier and Heavy-Tailed Robustness: Soft-truncation, hard thresholding, and supermodular minimax designs reduce risk and variance by either down-weighting or suppressing the influence of extreme or misclassified points (Audibert et al., 2010, Bhatia et al., 2015, McWilliams et al., 2014).
  • Structured vs. Unstructured Noise: BRLS achieves improved worst-case and average case performance under both structured (e.g., hypercube, adversarial blocks) and unstructured noise, as evidenced in simulation and practical health prediction benchmarks (Zhou et al., 13 Oct 2025).
  • Efficient Recovery and Support Identification: For compressed sensing, sparse regression, and generative modeling from binary measurements, BRLS methods with appropriate combinatorial or generative priors yield sharp recovery (both in $\ell_2$ error and in support) with reduced computational overhead.
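
A small baseline comparison in this spirit, using SciPy's built-in Huber and SoftL1 losses against plain LS under sign-flip corruption; the data generation, corruption rate, and f_scale are illustrative, and this is a sanity-check sketch rather than a reproduction of the cited benchmarks:

```python
import numpy as np
from scipy.optimize import least_squares

# Sign-flip corruption: plain LS vs. Huber and SoftL1 baselines.
rng = np.random.default_rng(6)
n, d = 300, 4
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true
bad = rng.choice(n, size=int(0.25 * n), replace=False)
y[bad] = -y[bad]                                  # 25% of responses flipped

resid = lambda w: X @ w - y
for loss in ("linear", "huber", "soft_l1"):       # "linear" is classical LS
    w = least_squares(resid, np.zeros(d), loss=loss, f_scale=1.0).x
    print(f"{loss:8s} parameter error = {np.linalg.norm(w - w_true):.3f}")
```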

The convergence of combinatorial optimization, PAC-Bayesian theory, robust regression techniques, and neural architectures in BRLS research offers versatile solutions adaptable to rapidly evolving data modalities in signal processing, health informatics, neural networks, and large-scale online learning.

7. Future Research Directions

Open problems and next steps for BRLS include:

  • Extension to general types of structured uncertainty (e.g., hypergraphs, mixed discrete–continuous adversaries);
  • Further integration with deep generative priors and large-scale neural architectures for complex data;
  • Combinatorial frameworks for nonlinear and nonconvex robust LS problems outside supermodular/submodular regimes;
  • Sharper risk bounds under model misspecification, parameter coupling, and real-world, non-i.i.d. noise profiles.

Recent results (Zhou et al., 13 Oct 2025) suggest that continued exploration of the geometric and combinatorial properties of noise propagation and minimax structure in BRLS will yield increasingly powerful and flexible robust optimization algorithms for diverse statistical and machine learning applications.
