Warped Linear Prediction for Count Time Series

Updated 18 February 2026

Warped Linear Prediction is a semiparametric approach that extends Gaussian dynamic linear models (DLMs) by applying a transformation and rounding mechanism to generate count-valued observations.
It achieves analytic conjugate inference through a selection-normal framework, supporting both batch and online estimation with efficient recursive updates.
The method rigorously handles exact count support, including zero-inflation and multivariate counts, while offering flexibility via nonparametric transformation options.

Warped Linear Prediction (WLP) is a semiparametric methodology that extends Gaussian Dynamic Linear Models (DLMs) to time series of counts by introducing a two-step warping mechanism—a flexible transformation followed by a rounding operator. This approach enables analytic conjugate inference for count-valued time series, supporting both batch and online algorithms, and yields a framework that unifies and extends a variety of discrete time series models while retaining probabilistic and recursive updating properties (King et al., 2021).

1. Gaussian Dynamic Linear Model Foundation

The backbone of WLP is the Gaussian DLM, specified in state-space form for a latent continuous process $\{z_t\}$ :

State Evolution: $x_t = G_t\, x_{t-1} + \omega_t$ , with $\omega_t \sim N_p(0, W_t)$
Observation Model: $z_t = F_t^\top x_t + \nu_t$ , with $\nu_t \sim N_n(0, V_t)$
Prior Distribution: $x_0 \sim N_p(a_0, R_0)$

Here, $x_t \in \mathbb{R}^p$ is the latent state, $z_t \in \mathbb{R}^n$ is the latent "observation," $F_t$ and $G_t$ are known design matrices, and $V_t$ , $W_t$ are known or estimated covariance matrices. In standard form, the Kalman filter and smoother yield closed-form recursive updates for the filtered and smoothed posterior moments of $x_t$ .
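
As a point of reference for the recursions mentioned above, here is a minimal sketch of the Kalman filter in the scalar case ($p = n = 1$) with time-invariant $F$, $G$, $V$, $W$. The function name `kalman_filter` and the default parameter values are illustrative assumptions, not part of the source.

```python
def kalman_filter(z, F=1.0, G=1.0, V=1.0, W=0.1, a0=0.0, R0=10.0):
    """Scalar Kalman filter (illustrative sketch).

    Returns the filtered means and variances of x_t given z_{1:t}.
    """
    m, C = a0, R0  # moments of x_0
    means, variances = [], []
    for z_t in z:
        # Prediction step: x_t | z_{1:t-1} ~ N(a, R)
        a = G * m
        R = G * C * G + W
        # One-step forecast of the latent observation: z_t | z_{1:t-1} ~ N(f, Q)
        f = F * a
        Q = F * R * F + V
        # Update step with Kalman gain K
        K = R * F / Q
        m = a + K * (z_t - f)
        C = R - K * F * R
        means.append(m)
        variances.append(C)
    return means, variances


means, variances = kalman_filter([1.2, 0.7, 1.9, 2.4])
```

In WLP the sequence $z_t$ is never observed directly, so this filter is only the Gaussian building block on which the count model sits.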

2. Warping Procedure: Transformation and Rounding

WLP introduces a two-step warping of the latent Gaussian sequence $\{z_t\}$ to generate count-valued observations $y_t \in \mathbb{N}^n$ :

Transformation $T$ : A continuous, strictly monotonic operator $T:\mathbb{R}^n \to \mathbb{R}^n$ . Options include parametric forms (e.g., log, sqrt, identity) or a nonparametric construction via the empirical CDF of $y$ :

$\widehat T(z) = \bar y + \hat s_y\, \Phi^{-1}(\tilde F_y(j)) \quad \text{for } z \in [j, j+1),$

with smooth interpolation to ensure monotonicity.

Rounding $R$ : A discrete operator $R:\mathbb{R}^n \to \mathbb{N}^n$ , typically the component-wise floor with a zero-modification:
- $R(w) = 0$ if $w < 0$ ,
- $R(w) = \lfloor w \rfloor$ otherwise.

The observed count is then:

$y_t = R\left(T\left(F_t^\top x_t + \nu_t\right)\right).$

The design admits exact count support, easily accommodating zero-inflation, upper bounds, and various marginal distributions.
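
The generative mechanism $y_t = R(T(z_t))$ can be sketched for a scalar model as follows. This is an illustration under stated assumptions: the identity transformation is used by default, the names `round_op` and `simulate_counts` are hypothetical, and the parameter values are arbitrary.

```python
import math
import random


def round_op(w):
    """Rounding operator R: floor, with all negative values mapped to 0."""
    return 0 if w < 0 else math.floor(w)


def simulate_counts(n_steps, T=lambda z: z, F=1.0, G=1.0, V=0.5, W=0.05,
                    a0=3.0, R0=1.0, seed=0):
    """Simulate counts from a scalar warped DLM (illustrative sketch)."""
    rng = random.Random(seed)
    x = rng.gauss(a0, math.sqrt(R0))              # x_0 ~ N(a0, R0)
    counts = []
    for _ in range(n_steps):
        x = G * x + rng.gauss(0.0, math.sqrt(W))  # state evolution
        z = F * x + rng.gauss(0.0, math.sqrt(V))  # latent Gaussian observation
        counts.append(round_op(T(z)))             # warp: transform, then round
    return counts


y = simulate_counts(50)
```

Every negative latent value maps to zero, which is how the rounding step places extra probability mass at zero.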

3. Conjugate Analytic Inference via Selection-Normal Family

Conditioning on a count $y_t$ is equivalent to conditioning on a truncation event in latent space: $z_t \in C_t = T^{-1}(R^{-1}(y_t))$ . Under this construction, all marginal and joint posteriors over latent states, such as $p(x_t \mid y_{1:t})$ and $p(x_{1:T} \mid y_{1:T})$ , fall within the Selection-Normal (SLCT-N) family:

$[x \mid z \in C] \sim \mathrm{SLCT\text{-}N}_{n,p}(\mu_z, \mu_x, \Sigma_z, \Sigma_x, \Sigma_{z\,x}, C).$

The SLCT-N density is proportional to

$\phi_p(x; \mu_x, \Sigma_x)\, \overline\Phi_n(C; \mu_z + \Sigma_{z\,x} \Sigma_x^{-1}(x - \mu_x), \Sigma_z - \Sigma_{z\,x} \Sigma_x^{-1} \Sigma_{z\,x}^\top),$

where $\phi_p$ is the $p$-dimensional normal density and $\overline\Phi_n(C; \mu, \Sigma)$ is the probability that an $N_n(\mu, \Sigma)$ random vector falls in the set $C$. The filtering and smoothing recursions for state prediction, filtering update, and joint smoothing maintain SLCT-N conjugacy with closed-form parameter updates.
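
To make the truncation event concrete, the sketch below computes the interval $C = T^{-1}(R^{-1}(y))$ for a single count and the probability that a scalar Gaussian latent variable falls in it. It assumes a strictly increasing $T$ whose inverse `T_inv` is supplied by the caller (identity by default). The function names are hypothetical.

```python
import math
from statistics import NormalDist


def truncation_interval(y, T_inv=lambda w: w):
    """Return (lower, upper) bounds of C = T^{-1}(R^{-1}(y)).

    With the zero-modified floor, R^{-1}(0) = (-inf, 1) and
    R^{-1}(y) = [y, y + 1) for y >= 1. Assumes T is strictly increasing.
    """
    lower = -math.inf if y == 0 else T_inv(y)
    upper = T_inv(y + 1)
    return lower, upper


def count_probability(y, mu, sigma, T_inv=lambda w: w):
    """P(z in C) for z ~ N(mu, sigma^2): the probability of observing y."""
    lower, upper = truncation_interval(y, T_inv)
    latent = NormalDist(mu, sigma)
    return latent.cdf(upper) - latent.cdf(lower)


pmf = [count_probability(y, mu=3.0, sigma=1.5) for y in range(12)]
```

In the scalar case this difference of normal CDFs is exactly $\overline\Phi_1(C; \mu, \sigma^2)$; the multivariate version replaces it with a Gaussian orthant-type probability.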

4. Batch and Online Computational Algorithms

Algorithmic inference in WLP leverages the SLCT-N structure for both batch (offline) and online (streaming) settings:

Direct Monte Carlo (MC) Sampling: For batch analysis, draws are generated from SLCT-N by:
1. Sampling $V_0 \sim N_n(0, \Sigma_z)$, truncated to $C - \mu_z$,
2. Sampling $V_1 \sim N_p(0, \Sigma_x - \Sigma_{z\,x}^\top \Sigma_z^{-1} \Sigma_{z\,x})$, independently of $V_0$,
3. Setting $x = \mu_x + V_1 + \Sigma_{z\,x}^\top \Sigma_z^{-1} V_0$, which yields exact draws from the desired selection-normal distribution.
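
The three sampling steps can be sketched in the scalar case ($n = p = 1$), where the truncated normal draw reduces to inverting the normal CDF on an interval. This is a minimal illustration: the function names are hypothetical, and a multivariate implementation would need a dedicated truncated multivariate normal sampler.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal


def sample_truncated_normal(rng, sd, lo, hi):
    """Draw from N(0, sd^2) restricted to [lo, hi) by CDF inversion."""
    p_lo, p_hi = STD.cdf(lo / sd), STD.cdf(hi / sd)
    u = rng.uniform(p_lo, p_hi)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf strictly inside (0, 1)
    return sd * STD.inv_cdf(u)


def sample_selection_normal(rng, mu_z, mu_x, var_z, var_x, cov_zx, lo, hi):
    """One draw of x | z in [lo, hi) for jointly Gaussian scalars (z, x)."""
    # Step 1: V0 ~ N(0, var_z) truncated to C - mu_z
    v0 = sample_truncated_normal(rng, math.sqrt(var_z), lo - mu_z, hi - mu_z)
    # Step 2: V1 ~ N(0, var_x - cov_zx^2 / var_z), independent of V0
    v1 = rng.gauss(0.0, math.sqrt(var_x - cov_zx ** 2 / var_z))
    # Step 3: combine
    return mu_x + v1 + (cov_zx / var_z) * v0


rng = random.Random(1)
draws = [sample_selection_normal(rng, 0.0, 0.0, 2.0, 1.0, 1.0, 1.0, 2.0)
         for _ in range(20000)]
```

Because each draw is exact and independent, no burn-in or convergence diagnostics are needed, in contrast to MCMC.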
Optimal Particle Filter: For online inference, at each time $t$ and for each particle $x_{t-1}^{(s)}$ , a new particle is drawn from

$q(x_t \mid x_{t-1}, y_t) = p(x_t \mid x_{t-1}, y_t) = \mathrm{SLCT\text{-}N}_{n,p}(\dots, C_t),$

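with weights proportional to $\overline\Phi_n(C_t; F_t^\top G_t x_{t-1}^{(s)}, V_t + F_t^\top W_t F_t)$, the probability that $z_t$ falls in $C_t$ given $x_{t-1}^{(s)}$. This choice of proposal minimizes the conditional variance of the importance weights, enhancing online filter efficiency.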
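
A single step of such a filter might look as follows in the scalar case with identity transformation. This is a sketch under several assumptions not stated in the source: multinomial resampling is performed at every step, the previous particles are equally weighted, and all function names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()


def interval_prob(mean, sd, lo, hi):
    """P(lo <= z < hi) for z ~ N(mean, sd^2)."""
    return STD.cdf((hi - mean) / sd) - STD.cdf((lo - mean) / sd)


def draw_truncated(rng, mean, sd, lo, hi):
    """Draw z ~ N(mean, sd^2) restricted to [lo, hi) by CDF inversion."""
    p_lo, p_hi = STD.cdf((lo - mean) / sd), STD.cdf((hi - mean) / sd)
    u = min(max(rng.uniform(p_lo, p_hi), 1e-12), 1.0 - 1e-12)
    z = mean + sd * STD.inv_cdf(u)
    return min(max(z, lo), hi)  # numerical guard for extreme tails


def optimal_pf_step(rng, particles, y, F=1.0, G=1.0, V=0.5, W=0.1):
    """One particle filter step with the optimal proposal (scalar sketch)."""
    # Truncation interval C_t for the identity transformation
    lo = -math.inf if y == 0 else float(y)
    hi = float(y + 1)
    var_z = F * W * F + V            # Var(z_t | x_{t-1})
    sd_z = math.sqrt(var_z)
    # Weights: probability that z_t lands in C_t given each particle
    weights = [interval_prob(F * G * x, sd_z, lo, hi) for x in particles]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("all particle weights are numerically zero")
    # Resample ancestors in proportion to the weights
    ancestors = rng.choices(particles, weights=weights, k=len(particles))
    gain = F * W / var_z             # Cov(x_t, z_t | x_{t-1}) / Var(z_t | x_{t-1})
    cond_sd = math.sqrt(W - gain * F * W)
    new_particles = []
    for x_prev in ancestors:
        # Draw z_t in C_t, then x_t | z_t, x_{t-1}: a selection-normal draw
        z = draw_truncated(rng, F * G * x_prev, sd_z, lo, hi)
        x = G * x_prev + gain * (z - F * G * x_prev) + rng.gauss(0.0, cond_sd)
        new_particles.append(x)
    # The mean weight estimates the one-step predictive probability of y
    return new_particles, total / len(particles)


rng = random.Random(0)
particles = [rng.gauss(3.0, 1.0) for _ in range(500)]
particles, lik = optimal_pf_step(rng, particles, y=3)
```

Because the weights depend only on $x_{t-1}^{(s)}$ and $y_t$, resampling can be done before propagation, so every propagated particle is already consistent with the observed count.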

5. Forecasting and Predictive Distribution

WLP supports both analytic and simulation-based forecasting for count time series:

Analytic One-Step Forecast: The one-step predictive likelihood is given by the ratio of model marginal probabilities:

$p(y_{t+1} \mid y_{1:t}) = \frac{p(y_{1:t+1})}{p(y_{1:t})} = \frac{\overline\Phi_{n(t+1)}(C_{1:t} \times C_{t+1}; \mu_z, \Sigma_z)}{\overline\Phi_{n t}(C_{1:t}; \mu'_z, \Sigma'_z)},$

where parameters come from joint smoothing up to $t+1$ . This is evaluated over a local grid of $y$ values.

Simulation-Based Forecast: Posterior samples $x_t^{(s)}$ are propagated forward:
1. $x_{t+1}^{(s)} \sim N(G_{t+1}x_t^{(s)}, W_{t+1})$,
2. $z_{t+1}^{(s)} \sim N(F_{t+1}^\top x_{t+1}^{(s)}, V_{t+1})$,
3. $y_{t+1}^{(s)} = R(T(z_{t+1}^{(s)}))$, constructing an empirical forecast distribution.
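
The forward propagation can be sketched for a scalar model as below. The function name `forecast_counts`, the identity default for $T$, and the parameter values are assumptions for illustration. The input `x_samples` stands in for posterior draws of $x_t$ obtained by any of the samplers above.

```python
import math
import random
from collections import Counter


def forecast_counts(x_samples, T=lambda z: z, F=1.0, G=1.0, V=0.5, W=0.1, seed=0):
    """Empirical one-step-ahead forecast distribution (scalar sketch)."""
    rng = random.Random(seed)
    draws = []
    for x_t in x_samples:
        x_next = rng.gauss(G * x_t, math.sqrt(W))     # state evolution noise W
        z_next = rng.gauss(F * x_next, math.sqrt(V))  # observation noise V
        w = T(z_next)
        draws.append(0 if w < 0 else math.floor(w))   # zero-modified floor R
    n = len(draws)
    # Relative frequencies form the empirical predictive pmf
    return {y: c / n for y, c in sorted(Counter(draws).items())}


pmf = forecast_counts([3.0] * 2000)
```

The resulting dictionary maps each simulated count to its relative frequency, so forecast intervals and point forecasts are guaranteed to lie on the count support.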

6. Strengths and Practical Limitations

WLP offers several key advantages:

Data Coherency: $y_t$ has exact count support, with direct handling of zero-inflation, discrete bounds, and heaping.
Flexible Marginals: The transformation $T(\cdot)$ can be nonparametric, supporting semiparametric modeling.
Analytic Recursions: Selection-Normal conjugacy allows closed-form updating for filtering, smoothing, and likelihood.
Multivariate and Missing Data Handling: Direct support for multivariate counts ( $\mathbb{N}^n$ ), missingness, and mixed-frequency data.
Computation: Algorithms are suitable for both offline (MC, batch Gibbs) and online (optimal particle filter) settings with minimal reliance on traditional MCMC.

Limitations include:

High-Dimensionality: Sampling from large-dimensional truncated normals may become a computational bottleneck.
Growing Filter Dimension: Naive filtering leads to latent block dimension growth as $t$ increases, requiring attention to algorithm scalability (e.g., marginal or particle implementations).
Transformation Estimation: Nonparametric $T$ requires (possibly repeated) estimation—especially challenging in streaming contexts.
Variance Estimation: Covariances $W_t$ , $V_t$ are typically estimated via marginal likelihood maximization or modeled as hyperpriors, adding a layer of inference.

In sum, Warped Linear Prediction applies a semiparametric warping to the Gaussian DLM, yielding a coherent, conjugate, and flexible framework for count time series modeling that preserves the structural and computational benefits of Kalman-type models while ensuring respect for the discreteness of observed data (King et al., 2021).


References

King et al. (2021). Warped Dynamic Linear Models for Time Series of Counts.
