
BaLSI: Batch Least-Squares Identification

Updated 27 January 2026
  • BaLSI is a parameter estimation method that uses finite batches to minimize a quadratic cost function, yielding closed-form or convex solutions.
  • It is applicable to linear and nonlinear parameter-affine models, ensuring unbiased and consistent estimates under persistent excitation.
  • The method supports adaptive and safety-critical control through efficient, event-triggered updates and reduced computational complexity compared to recursive approaches.

Batch Least-Squares Identification (BaLSI) is a class of identification methods that estimate unknown parameters in linear or nonlinear models by minimizing a quadratic cost constructed from a finite batch of observed data. BaLSI approaches admit closed-form or convex solutions and play a central role in system identification, adaptive control, and model-based safety-critical control, with well-characterized statistical and computational properties in both deterministic and stochastic settings.

1. Mathematical Foundations and Formulations

BaLSI applies to both linear time-series models and nonlinear systems whose parameters enter affinely through a regression vector. In discrete-time autoregressive (AR) models, BaLSI estimates parameters by forming a normal equation from a stack of regression equations over a batch:

y(t) = \zeta(t)\,\theta + \psi(t), \qquad t \geq n,

where y(t) is the output, \zeta(t) the regressor (composed of lagged outputs or inputs), \theta the parameter vector, and \psi(t) the innovation process. For AR(1)/AR(2) models this yields the canonical stacked regression Y = \Phi\theta + \Psi, with batch-least-squares cost

J(\theta) = \frac{1}{2}\,\lVert Y - \Phi\theta \rVert^2

whose stationary point gives the closed-form solution

\hat{\theta} = (\Phi^\top \Phi)^{-1} \Phi^\top Y

provided \Phi^\top \Phi is nonsingular (Wafi, 2023).
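As a concrete illustration, the stacked-regression solution above can be sketched in a few lines of NumPy; the AR(2) coefficients, noise level, and batch size here are illustrative assumptions, not values from the cited work:

```python
# Batch LS identification of an AR(2) model via the normal equations.
# phi1, phi2, sigma, and N are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
phi1, phi2, sigma, N = 0.5, -0.3, 0.1, 2000

# Simulate y(t) = phi1*y(t-1) + phi2*y(t-2) + psi(t)
y = np.zeros(N)
for t in range(2, N):
    y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + sigma * rng.standard_normal()

# Stack the batch regression Y = Phi @ theta + Psi
Y = y[2:]
Phi = np.column_stack([y[1:-1], y[:-2]])   # rows zeta(t) = [y(t-1), y(t-2)]

# Closed-form solution theta_hat = (Phi^T Phi)^{-1} Phi^T Y
theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ Y)
```

With a batch this long, `theta_hat` lands close to the simulated coefficients, consistent with the unbiasedness discussed below.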

For matrix-parameterized linear systems, BaLSI generalizes to

Y_i \approx P X_i, \qquad P \in \mathbb{R}^{p \times n}

whose closed-form minimizer is

P^* = S_{yx} S_{xx}^{-1}, \qquad S_{yx} = \sum_{i=1}^N Y_i X_i^\top, \quad S_{xx} = \sum_{i=1}^N X_i X_i^\top

with analytical and computational simplifications compared to vec-permutation-based treatments (Lai et al., 2024).
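A minimal sketch of the matrix-form estimator, accumulating the sufficient statistics S_yx and S_xx directly; the dimensions, batch size, and noise scale are assumed for illustration:

```python
# Matrix-form batch LS: accumulate S_yx, S_xx and solve P* = S_yx S_xx^{-1}.
# Dimensions p, n, batch size N, and the noise scale are assumed.
import numpy as np

rng = np.random.default_rng(1)
p, n, N = 3, 4, 500
P_true = rng.standard_normal((p, n))

S_yx = np.zeros((p, n))
S_xx = np.zeros((n, n))
for _ in range(N):
    X = rng.standard_normal(n)                       # regressor X_i
    Y = P_true @ X + 0.01 * rng.standard_normal(p)   # noisy output Y_i
    S_yx += np.outer(Y, X)                           # sum of Y_i X_i^T
    S_xx += np.outer(X, X)                           # sum of X_i X_i^T

P_star = S_yx @ np.linalg.inv(S_xx)                  # closed-form minimizer
```

Note that only the p-by-n and n-by-n statistics are ever stored, which is the memory saving over vec-permutation treatments discussed in Section 4.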

In nonlinear, parameter-affine systems

\dot{x} = f(x) + \phi(x)^\top \theta + g(x)\,u

integration over a time interval or batch yields batch regression forms in the parameter, with BaLSI updates solving

\min_\vartheta \lVert \vartheta - \hat{\theta}_{\text{prev}} \rVert^2 \quad \text{subject to } G\vartheta = Z

where G, Z are batch-integrated data matrices (Karafyllis et al., 2018, Shen et al., 2024).
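This constrained update has a closed-form solution via the Moore-Penrose pseudoinverse: the correction G^+(Z - G\theta_prev) is the smallest step that makes the estimate consistent with the batch. A small sketch, with G and Z as synthetic stand-ins for the batch-integrated data matrices:

```python
# Minimum-distance BaLSI update: find the point closest to the previous
# estimate on the data-consistent set {v : G v = Z}. G and Z here are
# synthetic stand-ins for the batch-integrated data matrices.
import numpy as np

rng = np.random.default_rng(2)
p = 4
theta_true = rng.standard_normal(p)
theta_prev = rng.standard_normal(p)       # previous (incorrect) estimate

G = rng.standard_normal((2, p))           # under-determined batch: 2 < p rows
Z = G @ theta_true                        # data consistent with theta_true

# Closed-form projection using the Moore-Penrose pseudoinverse
theta_new = theta_prev + np.linalg.pinv(G) @ (Z - G @ theta_prev)
```

The update satisfies the constraint exactly while moving no farther from the previous estimate than any other feasible point, including the true parameter.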

2. Algorithmic Structure and Implementation

The BaLSI procedure consists of the following fundamental steps:

  1. Batch Selection: Choose batch size N and, when desirable, the number of independent data batches \kappa.
  2. Data Aggregation: Stack inputs and outputs into regression form (Y, \Phi or S_{yx}, S_{xx}, or in nonlinear cases cumulative G, Z).
  3. Solve Normal Equations: Compute \hat{\theta} (or matrix P^*) via closed-form inversion, or constrained projection if G is singular or parameter constraints apply.
  4. Empirical Analysis: If using multiple independent batches, compute means and covariances over batch estimates to study accuracy and convergence properties.
  5. Update and Convergence Check: Iterate with different batch sizes or groupings until stabilization criteria are met (Wafi, 2023).
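The steps above can be sketched end-to-end for a scalar AR(1) example with multiple independent batches, including step 4's empirical analysis; all model values are illustrative assumptions:

```python
# End-to-end sketch of the batch procedure for an AR(1) model: kappa
# independent batches of size N, then empirical mean/variance of the
# per-batch estimates (step 4). All model values are assumed.
import numpy as np

rng = np.random.default_rng(3)
phi, sigma = 0.7, 0.2
kappa, N = 50, 400                        # step 1: batch count and size

estimates = []
for _ in range(kappa):
    y = np.zeros(N)
    for t in range(1, N):
        y[t] = phi * y[t - 1] + sigma * rng.standard_normal()
    Phi = y[:-1, None]                    # step 2: regressor column y(t-1)
    Y = y[1:]
    theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ Y)   # step 3
    estimates.append(theta_hat[0])

estimates = np.asarray(estimates)
mean_est, var_est = estimates.mean(), estimates.var()     # step 4
```

The batch mean clusters around the true coefficient, and the spread across batches gives a direct empirical handle on estimator variance for step 5's convergence check.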

For regulation- or safety-triggered BaLSI (event-triggered identification), parameter updates are activated when Lyapunov or safety conditions are violated beyond a threshold, integrating all data since the last update (Karafyllis et al., 2018, Shen et al., 2024).

3. Statistical Properties and Convergence

Under standard linear regression with zero-mean white noise, the BaLSI estimate is unbiased with covariance

\mathrm{Cov}[\hat{\theta}] = \sigma^2 (\Phi^\top \Phi)^{-1}

Variance scales as O(1/N); aggregation across multiple batches further reduces estimator uncertainty. For colored (correlated) noise processes, bias and variance increase, and the simple covariance expression no longer applies, motivating the use of pre-whitening or more sophisticated prediction error methods in system identification (Wafi, 2023).
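The covariance formula can be checked by Monte Carlo for a fixed design matrix; dimensions, noise level, and trial count below are assumptions chosen for a quick experiment:

```python
# Monte-Carlo check of Cov[theta_hat] = sigma^2 (Phi^T Phi)^{-1} for a
# fixed design matrix Phi. Dimensions and noise level are assumed.
import numpy as np

rng = np.random.default_rng(4)
N, p, sigma = 200, 3, 0.5
Phi = rng.standard_normal((N, p))
theta = np.array([1.0, -2.0, 0.5])
mu = Phi @ theta                               # noiseless response
A = np.linalg.solve(Phi.T @ Phi, Phi.T)        # LS estimator as a linear map

trials = np.stack([A @ (mu + sigma * rng.standard_normal(N))
                   for _ in range(20000)])
cov_emp = np.cov(trials, rowvar=False)         # empirical covariance
cov_theory = sigma**2 * np.linalg.inv(Phi.T @ Phi)
```

The empirical covariance matches the theoretical one entrywise, and the trial mean matches theta, illustrating both unbiasedness and the stated covariance under white noise.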

In the matrix case, persistence of excitation in the regressor sequence (S_{xx} positive definite) is both necessary and sufficient for consistent identification: P_N \to P_{\mathrm{true}} as N \to \infty when the regressor is persistently exciting and measurement noise is sufficiently small (Lai et al., 2024).

In event-triggered adaptive control, BaLSI guarantees finite-time constancy of parameters: under batch-level excitation (rank increase in the batch Gram matrix), the true parameter is estimated within at most p correction events (where p is the parameter dimension), after which certainty equivalence prevails and no further correction is possible (Karafyllis et al., 2018, Shen et al., 2024).
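The "at most p correction events" bound can be illustrated by counting rank increases of the accumulated data matrix in a toy stream of regressors; this is a synthetic sketch of the counting argument, not the cited closed-loop scheme:

```python
# Toy illustration of the event bound: an update fires only when a new
# regressor raises the rank of the accumulated data matrix G, so at most
# p updates can ever occur. Regressors are synthetic (not a closed loop).
import numpy as np

rng = np.random.default_rng(6)
p = 5
theta_true = rng.standard_normal(p)

G = np.zeros((0, p))
Z = np.zeros(0)
events, rank = 0, 0
for _ in range(100):                       # stream of candidate batches
    g = rng.standard_normal((1, p))
    if np.linalg.matrix_rank(np.vstack([G, g])) > rank:
        G = np.vstack([G, g])              # rank increased: correction event
        Z = np.append(Z, g @ theta_true)
        rank += 1
        events += 1

theta_hat = np.linalg.lstsq(G, Z, rcond=None)[0]   # exact once rank == p
```

Despite 100 candidate batches, only p events fire, after which the accumulated data pin down the parameter exactly.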

4. Computational Complexity and Practicality

BaLSI methods are highly efficient. In the matrix setting, direct matrix BaLSI avoids the cubic computational and quadratic memory overhead of large Kronecker products or vec-permutation representations:

  • Matrix BaLSI: Complexity O(n^3 + N n^2 + N p n), memory O(n^2 + p n)
  • Vec-permutation: Complexity O((pn)^3 + N p^2 n^2), memory O(p^2 n^2)

Thus, direct matrix BaLSI achieves orders-of-magnitude savings in both computation and storage relative to vec-permutation approaches (Lai et al., 2024). All batch-least-squares methods benefit from closed-form updates, and can be extended to recursive (online) identification via recursions on Gram matrices and sufficient statistics.
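As a sketch of the recursive extension mentioned above, rank-one updates of the Gram matrices reproduce the batch solution; noiseless synthetic data (an assumption for this check) make the agreement exact:

```python
# Recursive variant via sufficient statistics: rank-one updates of the
# Gram matrices reproduce the batch matrix-LS solution. Noiseless
# synthetic data (assumed) make the check exact.
import numpy as np

rng = np.random.default_rng(5)
n, p, N = 3, 2, 100
P_true = rng.standard_normal((p, n))
X = rng.standard_normal((N, n))
Y = X @ P_true.T                           # noiseless outputs

S_xx = 1e-9 * np.eye(n)                    # tiny regularizer for invertibility
S_yx = np.zeros((p, n))
for x, y in zip(X, Y):                     # online, sample by sample
    S_xx += np.outer(x, x)
    S_yx += np.outer(y, x)

P_online = S_yx @ np.linalg.inv(S_xx)
```

Since only S_xx and S_yx are carried between samples, memory stays at O(n^2 + pn) regardless of how many samples are processed.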

For nonlinear, parameter-affine systems, the batch integrals required for BaLSI admit real-time computation via auxiliary ODEs, supporting event-driven updates and adaptive regulation or safety enforcement (Karafyllis et al., 2018, Shen et al., 2024).

5. Application Areas and Key Results

AR process identification

Simulation studies demonstrate that, for AR(1) and AR(2) processes of the form

y(t) = \phi_1 y(t-1) + [\phi_2 y(t-2)] + \psi(t)

BaLSI produces unbiased minimum-variance estimates under white noise. Doubling the batch size N roughly halves the estimation variance, consistent with the O(1/N) scaling. When the true model is AR(2), AR(2) BaLSI achieves lower mean-squared error than AR(1), but estimation variance increases in the presence of colored noise (Wafi, 2023).

Adaptive control

In regulation-triggered schemes, BaLSI delivers finite-time constancy of parameter estimates and ensures the closed-loop system tracks the nominal system associated with the final estimate, not the true parameter. Uniform exponential convergence rates are achieved without persistent excitation or parameter observability, as shown in state regulation for the wing-rock model (Karafyllis et al., 2018).

Adaptive safety-critical control

BaLSI enables robust parameter identification in adaptive safety control with Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs), ensuring forward invariance of safety constraints, bounding the number of correction events, and excluding Zeno behavior. Lyapunov-based parameter adaptation schemes may become ill-posed or overly conservative when relaxation terms are introduced for constraint handling, but BaLSI remains well-defined and data-driven even under such relaxations (Shen et al., 2024).

Model Predictive Control (MPC)

Matrix-form BaLSI is particularly effective in online identification of multiple-input, multiple-output (MIMO) systems for indirect adaptive MPC; avoiding Kronecker product overhead enables scalable real-time parameter adaptation in predictive-cost adaptive control architectures (Lai et al., 2024).

6. Theoretical Guarantees and Limitations

In batch LS with white noise, unbiasedness and minimum variance are guaranteed. Under batch-level persistent excitation, convergence to the true parameter is secured even in the presence of (sufficiently small) noise. In the nonlinear, event-triggered cases, BaLSI delivers finite-time exact cancellation of parameter error; the number of necessary correction events is upper bounded by the parameter dimension p. Parameter updates are triggered only by batch-level violation of trajectory constraints or Lyapunov/safety thresholds. After the final batch correction, error remains orthogonal to all future regressors, enforcing certainty equivalence for the adaptive controller (Karafyllis et al., 2018, Shen et al., 2024).

In practice, biases and variances may still arise due to colored noise, insufficient excitation, or model misspecification. When safety or regulation triggers are used, BaLSI updates are in general sparse (“event-triggered”) rather than continuous, which may reduce adaptation speed in highly time-varying systems.

Setting | Guarantee (White Noise) | Guarantee (Colored Noise)
AR(1/2) Time Series | Unbiased, min-variance, O(1/N) variance | Bias, increased variance; higher MSE
Nonlinear Event-Triggered | Finite-time constancy, certainty equivalence | At most p correction events; noise robustness demonstrated empirically
Matrix-ID (MIMO) | Consistency under PE | As above (with caveats)

BaLSI is closely related to recursive least-squares (RLS), classical prediction error methods, and subspace identification, but is distinguished by its batch-data structure and, when event-triggered, its adaptation strategy. In adaptive and safety-critical contexts, BaLSI's update paradigm contrasts with continuous Lyapunov-based adaptive laws, providing robustness to relaxation terms and constraint violations.

Recent research focuses on:

  • Extensions to large-scale and nonlinear/nonaffine systems
  • Integration with advanced control architectures (e.g., adaptive MPC, safety-critical QP controllers)
  • Reduction of computational overhead via matrix-structured BaLSI (Lai et al., 2024)
  • Empirical and theoretical guarantees for adversarial or colored noise
  • Management of estimation trade-offs (variance, bias, interval length, batch size)
  • Avoidance of overparameterization and exclusion of Zeno phenomena in event-triggered implementations (Shen et al., 2024)

A plausible implication is that BaLSI frameworks will continue to expand in both theory and application, especially where fast, robust, and certifiable parameter estimation is required for control under safety or performance constraints.
