
Distributed ARX Estimation Techniques

Updated 29 January 2026
  • Distributed ARX estimation is a framework enabling sensor networks to collaboratively identify unknown ARX model orders and parameters using local and neighboring data.
  • Techniques integrate local information criteria with recursive least squares and consensus diffusion to ensure strong convergence and order consistency.
  • The approach is robust to noise and weak individual sensor excitation, making it valuable for adaptive signal processing and decentralized control applications.

Distributed ARX estimation addresses the collaborative identification of both model order and parameters for autoregressive systems with exogenous inputs (ARX) in multi-agent sensor networks. This problem is fundamental in scenarios where networked agents must learn the dynamics of an unknown stochastic system using only local and neighboring information, and where the model complexity (orders) is also unknown. Modern distributed ARX estimation schemes integrate local statistical model selection, recursive least squares (RLS), information diffusion, and cooperative excitation concepts to achieve strong convergence guarantees under minimal stochastic assumptions, without requiring global data centralization or independent input processes (Gan et al., 2021, Kar et al., 2013).

1. ARX Model Structures and Distributed Observation Setting

In the prototypical distributed ARX context, each of $n$ sensors (agents) observes, at discrete time $t$,

y_{t+1,i} = \sum_{j=1}^{p_0} b_j\, y_{t+1-j,i} + \sum_{l=1}^{q_0} c_l\, u_{t+1-l,i} + w_{t+1,i},

where $p_0$ and $q_0$ are the (unknown) orders of the autoregressive and exogenous-input components, $b_j$ and $c_l$ are the unknown system parameters, and $w_{t+1,i}$ is zero-mean observation noise. The problem is to jointly estimate both $(p_0, q_0)$ and $\theta(p_0,q_0) = [b_1,\ldots,b_{p_0},c_1,\ldots,c_{q_0}]^T$ in a distributed fashion, leveraging the inter-agent communication graph for cooperation.

Model compactness for arbitrary candidate orders $(p, q)$ is achieved by defining the regression vector

\phi_{t,i}(p,q) = [y_{t,i},\ldots,y_{t+1-p,i},\, u_{t,i},\ldots,u_{t+1-q,i}]^T,

such that

y_{t+1,i} = \theta(p,q)^T \phi_{t,i}(p,q) + w_{t+1,i}.

Each agent maintains local estimates for candidate orders and parameters, and exchanges information with its neighborhood $N_i$ as specified by the network topology (Gan et al., 2021).
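The model and regressor construction above can be sketched as a short simulation. This is an illustrative NumPy snippet, not code from the cited papers; the function names and the i.i.d. Gaussian inputs are assumptions made purely for demonstration (the framework itself does not require independent inputs).

```python
import numpy as np

def simulate_arx_network(n_agents, T, b, c, noise_std=0.1, seed=0):
    """Generate y_{t+1,i} = sum_j b_j y_{t+1-j,i} + sum_l c_l u_{t+1-l,i} + w_{t+1,i}
    for each of n_agents sensors over T steps.  Inputs u are i.i.d. Gaussian
    here only for illustration."""
    rng = np.random.default_rng(seed)
    p0, q0 = len(b), len(c)
    y = np.zeros((T + 1, n_agents))
    u = rng.standard_normal((T + 1, n_agents))
    for t in range(max(p0, q0) - 1, T):
        for i in range(n_agents):
            ar = sum(b[j] * y[t - j, i] for j in range(p0))  # b_{j+1} * y_{t+1-(j+1),i}
            ex = sum(c[l] * u[t - l, i] for l in range(q0))  # c_{l+1} * u_{t+1-(l+1),i}
            y[t + 1, i] = ar + ex + noise_std * rng.standard_normal()
    return y, u

def regressor(y, u, t, i, p, q):
    """phi_{t,i}(p,q) = [y_{t,i}, ..., y_{t+1-p,i}, u_{t,i}, ..., u_{t+1-q,i}]^T."""
    return np.concatenate([y[t - p + 1:t + 1, i][::-1], u[t - q + 1:t + 1, i][::-1]])
```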

2. Local Information Criteria for Distributed Order Selection

Selection of the correct ARX order pair $(p_0, q_0)$ is realized via a distributed Local Information Criterion (LIC) framework. At each time $t$, agent $i$ computes, for each candidate $(p, q)$,

L_{t,i}(p,q) = \sigma_{t,i}\bigl(p, q, \theta_{t,i}(p,q)\bigr) + (p+q)\, a_t,

with

\sigma_{t,i}(p,q,\beta) = \sum_{j \in N_i} a_{ij} \left[ \sigma_{t-1,j}(p,q,\beta) + \left( y_{t,j} - \beta^T \phi_{t-1,j}(p,q) \right)^2 \right], \quad \sigma_{0,i}(\cdot) = 0,

where $(a_{ij})$ are neighbor weights and $a_t$ is a non-decreasing penalty sequence, typically $a_t \sim \log t$. The first term accumulates squared prediction errors (locally and from neighbors), while the penalty controls model complexity, suppressing overfitting as $t$ increases. The current model order estimate at node $i$ is

(p_{t,i}, q_{t,i}) = \underset{0 \leq p \leq p^*,\, 0 \leq q \leq q^*}{\operatorname{argmin}}\; L_{t,i}(p,q),

where $(p^*, q^*)$ are known upper bounds, or, if unknown, are replaced by an expanding search set (Gan et al., 2021).
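As a concrete illustration of the criterion, the following sketch maintains the diffused error sums for a grid of candidate orders and picks the minimizer. The function names and the dictionary layout are assumptions for this example, not the paper's implementation; non-neighbors are handled simply by zero weights $a_{ij} = 0$.

```python
import numpy as np

def lic_update(sigma_prev_all, errors, A, i):
    """One step of the error-sum recursion for agent i:
    sigma_{t,i}(p,q) = sum_j a_ij * [sigma_{t-1,j}(p,q) + e_{t,j}(p,q)^2],
    where sigma_prev_all[j][(p,q)] holds sigma_{t-1,j} and errors[j][(p,q)]
    holds the prediction error y_{t,j} - beta^T phi_{t-1,j}(p,q)."""
    out = {}
    for pq in sigma_prev_all[0]:
        out[pq] = sum(A[i, j] * (sigma_prev_all[j][pq] + errors[j][pq] ** 2)
                      for j in range(A.shape[0]))
    return out

def lic_order_select(sigma, a_t):
    """Minimize L_{t,i}(p,q) = sigma_{t,i}(p,q) + (p+q) * a_t over candidates."""
    return min(sigma, key=lambda pq: sigma[pq] + (pq[0] + pq[1]) * a_t)
```

Note how the penalty breaks ties between nested models: a larger $(p, q)$ must reduce the accumulated squared error by more than $a_t$ per extra parameter to be selected.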

3. Distributed Recursive Least Squares and Information Diffusion

Given an order selection $(p, q)$, each agent implements a consensus-type distributed Recursive Least Squares (RLS) algorithm. The "adaptation" step at sensor $i$ reads

\begin{aligned}
\bar\theta_{t+1,i} &= \theta_{t,i} + d_{t,i} P_{t,i} \phi_{t,i} \left[ y_{t+1,i} - \phi_{t,i}^T \theta_{t,i} \right], \\
\bar P_{t+1,i} &= P_{t,i} - d_{t,i} P_{t,i} \phi_{t,i} \phi_{t,i}^T P_{t,i}, \\
d_{t,i} &= \left[ 1 + \phi_{t,i}^T P_{t,i} \phi_{t,i} \right]^{-1},
\end{aligned}

followed by a "diffusion" (consensus) step:

\begin{aligned}
P_{t+1,i}^{-1} &= \sum_{j \in N_i} a_{ij} \bar P_{t+1,j}^{-1}, \\
\theta_{t+1,i} &= P_{t+1,i} \left( \sum_{j \in N_i} a_{ij} \bar P_{t+1,j}^{-1} \bar\theta_{t+1,j} \right).
\end{aligned}

Alternatively, gradient-form stochastic approximation updates or consensus+innovations laws can be used, as in the general distributed exponential family estimation framework (Kar et al., 2013). Here, the update at agent $n$ is

\mathbf{x}_n(t+1) = \mathbf{x}_n(t) - \beta_t \sum_{l \in \Omega_n(t)} \left[ \mathbf{x}_n(t) - \mathbf{x}_l(t) \right] + \alpha_t K_n(t)\, \nabla_\theta \log p_n\bigl( y_n(t) \mid \mathbf{x}_n(t) \bigr),

with innovation stepsize $\alpha_t$, consensus stepsize $\beta_t$, and adaptive gain $K_n(t)$.
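The adaptation–diffusion recursion translates almost line-for-line into NumPy. This is a minimal sketch under the assumption of a fixed row-stochastic weight matrix $A$ (with $a_{ij} = 0$ for non-neighbors); the explicit matrix inversions are kept for readability rather than efficiency.

```python
import numpy as np

def diffusion_rls_step(theta, P, phi, y_next, A):
    """One adaptation + diffusion step of the consensus-type distributed RLS.
    theta: (n, d) stacked local estimates; P: (n, d, d) local covariances;
    phi: (n, d) current regressors; y_next: (n,) new observations;
    A: (n, n) row-stochastic neighbor weight matrix (A[i, j] = a_ij)."""
    n, d = theta.shape
    theta_bar = np.empty_like(theta)
    P_bar_inv = np.empty_like(P)
    # --- adaptation: standard RLS update at every agent ---
    for i in range(n):
        Pi_phi = P[i] @ phi[i]
        d_i = 1.0 / (1.0 + phi[i] @ Pi_phi)
        theta_bar[i] = theta[i] + d_i * Pi_phi * (y_next[i] - phi[i] @ theta[i])
        P_bar = P[i] - d_i * np.outer(Pi_phi, Pi_phi)
        P_bar_inv[i] = np.linalg.inv(P_bar)
    # --- diffusion: fuse neighbors' information matrices and estimates ---
    theta_new = np.empty_like(theta)
    P_new = np.empty_like(P)
    for i in range(n):
        info = sum(A[i, j] * P_bar_inv[j] for j in range(n))
        P_new[i] = np.linalg.inv(info)
        theta_new[i] = P_new[i] @ sum(A[i, j] * P_bar_inv[j] @ theta_bar[j]
                                      for j in range(n))
    return theta_new, P_new
```

Fusing inverse covariances (information matrices) rather than the covariances themselves is what lets a well-excited neighbor compensate for a poorly excited agent, which is the mechanism behind the cooperative excitation condition discussed next.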

4. Cooperative Excitation and Global Identifiability

The cooperative excitation condition is devised to ensure identifiability of system orders and parameters even when regressors are correlated and/or nonstationary, i.e., it weakens classical persistent excitation. Formally, there exists a scalar sequence $a_t \to \infty$ such that, for the maximally over-parameterized settings $(p^*, q_0)$ and $(p_0, q^*)$,

\frac{\log r_t(p^*, q^*)}{a_t} \to 0, \qquad a_t\, \lambda_{\min}^{p,q}(t) \to \infty,

for all $(p, q) \in \{(p^*, q_0), (p_0, q^*)\}$ almost surely, with $r_t(p,q)$ and $\lambda_{\min}^{p,q}(t)$ defined as

r_t(p,q) = \lambda_{\max}\big[ P_0^{-1}(p,q) \big] + \sum_{i=1}^n \sum_{k=0}^{t-1} \| \phi_{k,i}(p,q) \|^2,

\lambda_{\min}^{p,q}(t) = \lambda_{\min} \left( \sum_{j=1}^n P_{0,j}^{-1}(p,q) + \sum_{j=1}^n \sum_{k=0}^{t-D_G-1} \phi_{k,j}(p,q)\, \phi_{k,j}(p,q)^T \right).

Collective network excitation, even in the presence of individually weak sensors, guarantees the statistical growth of the covariance matrices in all directions, ensuring convergence of both order and parameter estimates (Gan et al., 2021).
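The two quantities in the cooperative excitation condition can be computed directly from the regressor history. The sketch below assumes identical initial information matrices across agents and ignores the delay $D_G$ in the inner sum, so it is a simplified illustration rather than the paper's exact definition.

```python
import numpy as np

def excitation_stats(phis, P0_inv):
    """Compute r_t and lambda_min^{p,q}(t) for a stack of regressors.
    phis: (n, t, d) regressor history per agent; P0_inv: (d, d) initial
    information matrix, taken identical across agents for simplicity
    (and the D_G delay in the inner sum is ignored)."""
    n, t, d = phis.shape
    # r_t: max initial eigenvalue plus total squared regressor norm over the network
    r_t = np.linalg.eigvalsh(P0_inv).max() + np.sum(phis ** 2)
    # network-wide information matrix: sum of all outer products phi phi^T
    info = n * P0_inv + np.einsum('ntd,nte->de', phis, phis)
    lam_min = np.linalg.eigvalsh(info).min()
    return r_t, lam_min
```

Comparing a well-excited regressor history with one confined to a single direction shows what the condition demands: cooperation across agents must lift $\lambda_{\min}^{p,q}(t)$ in every direction, even if no single agent does so on its own.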

5. Statistical Guarantees and Convergence Theory

Under the martingale difference noise model and graph connectivity, the following consistency results are established:

  • Order Consistency: $(p_{t,i}, q_{t,i}) \rightarrow (p_0, q_0)$ almost surely for all $i$ (Theorem 3.1).
  • Parameter Consistency: $\theta_{t,i}(p_{t,i}, q_{t,i}) \rightarrow \theta(p_0, q_0)$ almost surely for all $i$ (Theorem 3.2).

Proof strategies combine martingale convergence arguments, stochastic Lyapunov techniques for RLS-type updates, and careful analysis of the local information criteria under correct and incorrect model orders. The double-array martingale limit theorem is crucial for establishing convergence when the model order itself is time-varying (Gan et al., 2021).

For fixed-order estimation, the consensus+innovations estimator achieves asymptotic efficiency (the inverse centralized Fisher information) under global observability and mean connectivity of the network. The estimate at each node attains

\sqrt{t}\left( \hat\theta(t) - \theta^* \right) \to_d \mathcal{N}\big( 0,\, I_c^{-1} \big),

with centralized Fisher information $I_c = \frac{1}{\sigma^2} \sum_{i=1}^N \mathbb{E}\big[ \phi_i(t)\phi_i(t)^T \big]$ (Kar et al., 2013).

6. Order and Parameter Estimation Without Upper Bounds

When prior upper bounds $(p^*, q^*)$ are unavailable, the order search space is incrementally enlarged, e.g., to $\{0, \ldots, \lfloor \log t \rfloor\}$. A nested minimization is applied:

  1. For $s = 0, \ldots, \lfloor \log t \rfloor$, run the diffusion-RLS at order $(s, s)$ and compute

\bar L_{t,i}(s,s) = \sigma_{t,i}\bigl( s, s, \theta_{t,i}(s,s) \bigr) + (2s)\, \bar a_t.

  2. Select $\hat m_{t,i} = \operatorname{argmin}_{0 \leq s \leq \lfloor \log t \rfloor} \bar L_{t,i}(s,s)$, then minimize over $p, q \leq \hat m_{t,i}$.
  3. Rerun the RLS at the chosen order $(\hat p_{t,i}, \hat q_{t,i})$.

A modified cooperative excitation condition and double-array martingale arguments yield that $\hat m_{t,i} \to m_0 = \max(p_0, q_0)$ and ultimately $(\hat p_{t,i}, \hat q_{t,i}) \to (p_0, q_0)$ almost surely (Gan et al., 2021).
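The nested minimization reduces to a small search routine; in this sketch the callable `L_bar` is a hypothetical stand-in for the diffused criterion $\bar L_{t,i}$, which in practice would be produced by the diffusion-RLS runs described above.

```python
import math

def expanding_order_search(L_bar, t):
    """Nested search over the expanding set {0, ..., floor(log t)}:
    first pick m_hat minimizing the diagonal criterion L_bar(s, s),
    then search all (p, q) with p, q <= m_hat.
    L_bar(p, q) stands in for the diffused criterion of an agent at time t."""
    S = int(math.floor(math.log(t)))
    m_hat = min(range(S + 1), key=lambda s: L_bar(s, s))
    p_hat, q_hat = min(((p, q)
                        for p in range(m_hat + 1)
                        for q in range(m_hat + 1)),
                       key=lambda pq: L_bar(*pq))
    return m_hat, (p_hat, q_hat)
```

The diagonal pass costs only $O(\log t)$ criterion evaluations, so the full search stays at $O((\log t)^2)$ per agent even without prior order bounds.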

7. Applications, Practical Considerations, and Extensions

Distributed ARX estimation is robust to stochastic feedback and correlated input scenarios, as it does not require independence or stationarity of the regression process. The cooperative excitation framework enables the network to succeed even when individual nodes fail to satisfy classical persistent excitation, highlighting the advantage of sensor cooperation.

Potential extensions of the distributed ARX estimation paradigm include:

  • Distributed ARMAX (inclusion of moving-average terms),
  • Time-varying parameter ARX models,
  • Nonlinear or kernelized ARX estimators (adapting the LIC penalty and local recurrence structures accordingly).

For the distributed consensus+innovations method, innovation stepsizes $\alpha_t = 1/(t+1)$ and consensus stepsizes $\beta_t = b/(t+1)^{\tau_2}$ with $0 < \tau_2 < 1/2$ (so that $\beta_t/\alpha_t \to \infty$) are recommended for achieving optimal rates and covariance properties. Adaptive gain tuning via Fisher-information consensus is practical when sensor models are heterogeneous (Kar et al., 2013).
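To illustrate the recommended stepsize schedules, here is a scalar consensus+innovations sketch for a Gaussian location model with identity gain $K_n(t) \equiv 1$. The unit-variance Gaussian likelihood (so that $\nabla_x \log p(y \mid x) = y - x$), the constant `b`, and the function name are assumptions made for this toy example, not values from the cited work.

```python
import numpy as np

def consensus_innovations(y, A, tau2=0.45, b=0.3):
    """Scalar consensus+innovations iteration: each agent n observes
    y_n(t) = theta* + noise and updates
      x_n(t+1) = x_n(t) - beta_t * sum_l [x_n(t) - x_l(t)] + alpha_t * (y_n(t) - x_n(t)),
    with alpha_t = 1/(t+1) and beta_t = b/(t+1)^tau2 as recommended above.
    y: (T, n) observations; A: (n, n) symmetric adjacency matrix."""
    T, n = y.shape
    x = np.zeros(n)
    L = np.diag(A.sum(axis=1)) - A   # graph Laplacian: sum_l (x_n - x_l) = (L x)_n
    for t in range(T):
        alpha = 1.0 / (t + 1)
        beta = b / (t + 1) ** tau2
        x = x - beta * (L @ x) + alpha * (y[t] - x)
    return x
```

The constant `b` is kept small here so that $\beta_t \lambda_{\max}(L) < 1$ throughout, keeping the early consensus steps stable on this small graph; the essential property is $\beta_t/\alpha_t \to \infty$, which forces the agents to agree asymptotically while the innovations average out the noise.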


Summary Table of Key Elements in Distributed ARX Estimation

| Component | Key Equation/Concept | Reference |
| --- | --- | --- |
| ARX model (node $i$) | $y_{t+1,i} = \theta^T \phi_{t,i} + w_{t+1,i}$ | (Gan et al., 2021) |
| Local Information Criterion | $L_{t,i}(p,q) = \sigma_{t,i} + (p+q)a_t$ | (Gan et al., 2021) |
| Distributed RLS update | Adaptation + diffusion (consensus) | (Gan et al., 2021) |
| Consensus+innovations | $\mathbf{x}_n(t+1) = \ldots$ | (Kar et al., 2013) |
| Cooperative excitation | $a_t\, \lambda_{\min}^{p,q}(t) \to \infty$ | (Gan et al., 2021) |
| Statistical guarantees | Strong consistency, asymptotic efficiency | (Gan et al., 2021; Kar et al., 2013) |

Distributed ARX estimation presents a unified framework for decentralized system identification in networked environments with unknown dynamics and is substantiated by rigorous convergence analysis, with broad applicability to adaptive signal processing, control, and sensor networks.
