ABC-Parametrization in HMMs & Regression

Updated 21 April 2026
  • ABC-parametrization refers to two distinct frameworks: one using approximate Bayesian computation for hidden Markov models and another employing abundance-based constraints in regression.
  • In HMMs, it replaces intractable likelihood evaluations with kernel-smoothed substitutes, balancing bias and Monte Carlo variance through the tuning parameter epsilon.
  • In regression, the method enforces weighted sum-to-zero constraints, ensuring invariant main effect interpretations and improved efficiency compared to classical coding schemes.

The term ABC-parametrization denotes two distinct frameworks in contemporary statistical methodology: (1) the use of Approximate Bayesian Computation (ABC) for static parameter inference in hidden Markov models (HMMs), and (2) the abundance-based constraints (ABC) parametrization for categorical effect modification in linear regression models. The two share an acronym and a goal, namely computationally feasible, interpretable, and robust parameter estimates under difficult modeling or inferential conditions, but they apply to different domains. The sections below address each framework, with technical detail and references to the foundational papers (Ehrlich et al., 2012; Kowal, 2024).

1. ABC-Parametrization in Hidden Markov Models

The ABC-parametrization for HMMs is motivated by scenarios where the observation (emission) density $g_\theta(y \mid x)$ is intractable but can be sampled from. Traditional maximum likelihood estimation (MLE) requires evaluating $g_\theta(y \mid x)$, which is prohibitive for complex or simulator-based models. ABC circumvents this by defining an auxiliary-variable HMM: for each latent state, pseudo-observations $u \sim g_\theta(\cdot \mid x)$ are generated and compared to the true observation $y$ using a kernel $K_\epsilon(y \mid u)$, yielding a kernel-smoothed likelihood substitute. The parameter $\epsilon > 0$ is the "ABC parameter," regulating the fidelity of the approximation: as $\epsilon \searrow 0$, the ABC approximation becomes exact, but smaller $\epsilon$ typically results in higher Monte Carlo variance in particle weights. The ABC-parametrized likelihood is given by

$$g_{\theta,\epsilon}(y \mid x) = \frac{\int K_\epsilon(y \mid u)\, g_\theta(u \mid x)\, du}{C_\epsilon}$$

where $C_\epsilon$ is a normalization constant, independent of state and parameter.
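As a minimal sketch of the kernel-smoothed substitute, the integral above can be approximated by averaging kernel values over simulated pseudo-observations. The Gaussian kernel, the toy emission model $y \mid x \sim N(x, \sigma^2)$, and the function names `gaussian_kernel`, `abc_likelihood`, and `simulate` are illustrative assumptions, not taken from Ehrlich et al. (2012).

```python
import numpy as np

def gaussian_kernel(y, u, eps):
    """Normalized Gaussian kernel K_eps(y|u) / C_eps with bandwidth eps (assumed choice)."""
    return np.exp(-0.5 * ((y - u) / eps) ** 2) / (eps * np.sqrt(2.0 * np.pi))

def abc_likelihood(y, x, simulate, eps, n_sim, rng):
    """Monte Carlo estimate of g_{theta,eps}(y|x): mean kernel value over simulated u."""
    u = simulate(x, n_sim, rng)            # pseudo-observations u ~ g_theta(.|x)
    return gaussian_kernel(y, u, eps).mean()

# Toy emission model (assumption): y | x ~ N(x, sigma^2). Only sampling is used,
# never the density itself, which mimics the intractable-likelihood setting.
sigma = 1.0

def simulate(x, n, rng):
    return x + sigma * rng.standard_normal(n)

rng = np.random.default_rng(0)
y_obs, x_state, eps = 0.3, 0.0, 0.5
estimate = abc_likelihood(y_obs, x_state, simulate, eps, 200_000, rng)

# For this toy model the smoothed density is available in closed form:
# a Gaussian convolved with a Gaussian kernel gives N(y; x, sigma^2 + eps^2).
var = sigma**2 + eps**2
exact = np.exp(-0.5 * (y_obs - x_state) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
print(f"Monte Carlo: {estimate:.4f}  closed form: {exact:.4f}")
```

The closed form makes the role of $\epsilon$ visible in this toy case: the kernel inflates the emission variance from $\sigma^2$ to $\sigma^2 + \epsilon^2$, so the approximation tightens as $\epsilon$ shrinks while fewer simulated draws land near the observation.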

Under strong regularity conditions (joint Lipschitz continuity and boundedness for the transition density $f_\theta$, the observation density $g_\theta$, and their gradients), the log-likelihood and its gradient computed from the ABC-parametrized model are biased by at most $O(n\epsilon)$, with $n$ the sample size (Ehrlich et al., 2012). This facilitates static parameter estimation without direct likelihood evaluations.

2. Sequential Monte Carlo (SMC) Methods under the ABC-Parametrization

The ABC-parametrized HMM is naturally amenable to SMC implementation for both filtering and marginal likelihood computation. At each time $t$, for a set of $N$ particles:

  • Predict latent states using the Markov transition,
  • Simulate associated pseudo-observations,
  • Assign weights via $K_\epsilon(y_t \mid u_t^{(i)})$,
  • Perform resampling if necessary.

The unbiased SMC estimator of the ABC marginal likelihood is

$$\hat{p}_{\theta,\epsilon}(y_{1:n}) = \prod_{t=1}^{n} \left[ \frac{1}{N} \sum_{i=1}^{N} \frac{K_\epsilon(y_t \mid u_t^{(i)})}{C_\epsilon} \right]$$

where $u_t^{(i)}$ is the pseudo-observation attached to particle $i$ at time $t$ and each factor is the mean (normalized) kernel weight over the particle set. Second-order Taylor bias correction is often applied to log-likelihood estimators. The computational cost is linear in $N$ per time step.
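The predict, simulate, weight, and resample steps above can be sketched as a bootstrap-style particle filter. This is a simplified illustration under stated assumptions: a Gaussian kernel, multinomial resampling at every step (rather than only when necessary), and a toy linear-Gaussian AR(1) model. The function `abc_smc_loglik` and the three sampler arguments are hypothetical names.

```python
import numpy as np

def abc_smc_loglik(y, sample_init, sample_transition, sample_obs, eps, n_particles, rng):
    """Log of the SMC estimate of the ABC marginal likelihood (Gaussian kernel assumed)."""
    x = sample_init(n_particles, rng)
    loglik = 0.0
    for t, y_t in enumerate(y):
        if t > 0:
            x = sample_transition(x, rng)        # predict latent states
        u = sample_obs(x, rng)                   # simulate pseudo-observations
        # log of the normalized kernel weight K_eps(y_t | u) / C_eps
        log_w = -0.5 * ((y_t - u) / eps) ** 2 - np.log(eps * np.sqrt(2.0 * np.pi))
        m = log_w.max()                          # stabilize the exponentials
        w = np.exp(log_w - m)
        loglik += m + np.log(w.mean())           # log of the mean kernel weight
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        x = x[idx]                               # multinomial resampling, every step
    return loglik

# Toy model (assumption): x_t = phi * x_{t-1} + sigma_x * v_t,  y_t = x_t + sigma_y * w_t.
rng = np.random.default_rng(0)
phi, sigma_x, sigma_y, T = 0.8, 1.0, 1.0, 30
x_true = np.zeros(T)
x_true[0] = rng.standard_normal()
for t in range(1, T):
    x_true[t] = phi * x_true[t - 1] + sigma_x * rng.standard_normal()
y = x_true + sigma_y * rng.standard_normal(T)

eps = 0.5
loglik_hat = abc_smc_loglik(
    y,
    sample_init=lambda n, r: r.standard_normal(n),
    sample_transition=lambda x, r: phi * x + sigma_x * r.standard_normal(x.shape),
    sample_obs=lambda x, r: x + sigma_y * r.standard_normal(x.shape),
    eps=eps,
    n_particles=5000,
    rng=rng,
)
print(f"ABC-SMC log-likelihood estimate: {loglik_hat:.2f}")
```

Only samplers are passed in, never a density, which is the point of the construction. The likelihood estimate is unbiased on the natural scale, so its logarithm carries a small downward bias that shrinks as the particle count grows.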

The choice of $\epsilon$ is pivotal: larger values mitigate weight degeneracy but increase bias, while small $\epsilon$ yields more precise approximations at the cost of greater Monte Carlo variance (Ehrlich et al., 2012).

3. SPSA-Based Static Parameter Estimation in ABC-HMMs

Simultaneous Perturbation Stochastic Approximation (SPSA) is utilized for MLE or recursive MLE in ABC-HMMs. At each iteration $k$, the gradient of the ABC log-likelihood with respect to $\theta$ is obtained by finite differencing:

  • Evaluate log-likelihood increments at $\theta_k \pm c_k \Delta_k$ using two SMC runs,
  • Compute the gradient estimate componentwise as $\hat{\nabla}_{k,j} = [\hat{\ell}(\theta_k + c_k \Delta_k) - \hat{\ell}(\theta_k - c_k \Delta_k)] / (2 c_k \Delta_{k,j})$,
  • Update $\theta_{k+1} = \theta_k + a_k \hat{\nabla}_k$.

Step-size sequences $a_k$ and $c_k$ follow standard stochastic approximation rules, and $\Delta_k$ are Rademacher vectors. The procedure avoids direct calculation of $g_\theta$ or its derivatives, making parameter learning practical even when $g_\theta$ is a black box (Ehrlich et al., 2012).
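A generic SPSA ascent loop of this kind might look as follows. This is a sketch, not the authors' implementation: the gain constants follow commonly used SPSA defaults, and the noisy quadratic `toy_loglik` is a hypothetical stand-in for the pair of ABC-SMC log-likelihood evaluations that would be used in practice.

```python
import numpy as np

def spsa_maximize(noisy_objective, theta0, n_iter, rng,
                  a=0.5, c=0.2, A=10.0, alpha=0.602, gamma=0.101):
    """Maximize a noisy objective with simultaneous-perturbation gradient estimates."""
    theta = np.asarray(theta0, dtype=float).copy()
    for k in range(n_iter):
        a_k = a / (k + 1 + A) ** alpha           # step size, decaying
        c_k = c / (k + 1) ** gamma               # perturbation size, decaying slowly
        delta = rng.choice([-1.0, 1.0], size=theta.shape)   # Rademacher vector
        f_plus = noisy_objective(theta + c_k * delta, rng)  # first noisy evaluation
        f_minus = noisy_objective(theta - c_k * delta, rng) # second noisy evaluation
        grad_hat = (f_plus - f_minus) / (2.0 * c_k * delta) # componentwise division
        theta = theta + a_k * grad_hat           # ascent step
    return theta

# Stand-in objective (assumption): a concave quadratic observed with noise.
theta_star = np.array([1.0, -2.0])

def toy_loglik(theta, rng):
    return -np.sum((theta - theta_star) ** 2) + 0.1 * rng.standard_normal()

rng = np.random.default_rng(0)
theta_hat = spsa_maximize(toy_loglik, theta0=[0.0, 0.0], n_iter=2000, rng=rng)
print("SPSA estimate:", theta_hat)
```

Each iteration costs two objective evaluations regardless of the dimension of $\theta$, which is why the approach suits settings where every evaluation is a full SMC run.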

4. Abundance-Based Constraints (ABC) Parametrization in Categorical Regression

For linear models involving categorical covariates and their interactions with continuous predictors (cat-modified models), the ABC-parametrization (abundance-based constraints) provides an alternative to reference group or sum-to-zero coding (Kowal, 2024). The core principle is to constrain each set of category-specific coefficients (main effects and interactions) to have weighted sums of zero, where weights are the empirical sample proportions of each category. Formally, for categorical variable $C$ with levels $c = 1, \dots, L$ and category-specific coefficients $\beta_c$,

$$\sum_{c=1}^{L} \hat{\pi}_c \, \beta_c = 0$$

where $\hat{\pi}_c = n_c / n$ is the proportion of observations in category $c$.

The ABC-parametrization ensures that main effects are interpreted as abundance-weighted averages across groups and that estimates and standard errors (SEs) of main effects are invariant when categorical modifiers are included, provided group variances are homogeneous. Furthermore, standard errors for main effects are non-increasing when cat-modifiers are added (Kowal, 2024).
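The abundance-weighted interpretation is easiest to see in the one-way special case with a single categorical covariate and no continuous predictor. The sketch below uses simulated data; the group probabilities, group means, and variable names are illustrative assumptions. It contrasts the abundance-based intercept with the sum-to-zero intercept.

```python
import numpy as np

rng = np.random.default_rng(1)
n, L = 200, 3
groups = rng.choice(L, size=n, p=[0.6, 0.3, 0.1])      # unbalanced categories
y = np.array([1.0, 2.0, 4.0])[groups] + rng.standard_normal(n)

counts = np.bincount(groups, minlength=L)
pi_hat = counts / n                                     # empirical proportions
group_means = np.bincount(groups, weights=y, minlength=L) / counts

# Abundance-based constraints: sum_c pi_hat[c] * effect[c] = 0.
intercept_abc = pi_hat @ group_means                    # abundance-weighted average
effects_abc = group_means - intercept_abc

# Sum-to-zero constraints: sum_c effect[c] = 0 (unweighted).
intercept_stz = group_means.mean()
effects_stz = group_means - intercept_stz

print("ABC intercept:", intercept_abc, " overall mean:", y.mean())
print("STZ intercept:", intercept_stz)
```

In this special case the abundance-based intercept coincides with the overall sample mean, whereas the sum-to-zero intercept gives every category equal weight regardless of how many observations it contains.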

5. Comparison with Other Parametrizations in Linear Regression

Classical reference-group encoding (RGE) and sum-to-zero (STZ) constraints suffer from changed interpretations, bias toward reference groups (under regularization), and increased SEs when interaction terms are introduced. In contrast, the ABC approach:

  • Removes reference-group arbitrariness,
  • Stabilizes main effect estimation under model enrichment with modifiers,
  • Maintains or improves the efficiency (lower or equal SEs) of main effects,
  • Facilitates transparent interpretation, as main effects represent global averages.

A structured approach involves constructing the design matrix, computing empirical proportions, forming the constraint matrix, applying QR-based reduction to project onto the constraint nullspace, fitting the unconstrained regression, and back-transforming to the original parameter space. ABC-based penalized estimation (e.g., lasso, ridge) proceeds analogously.
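The steps in this paragraph can be sketched with NumPy for a model with one centered continuous covariate, one categorical covariate, and their interaction. The simulated data and all variable names are assumptions made for illustration, and `numpy.linalg.qr` and `numpy.linalg.lstsq` stand in for whatever routines a production implementation would use.

```python
import numpy as np

rng = np.random.default_rng(2)
n, L = 300, 3
groups = rng.choice(L, size=n, p=[0.5, 0.3, 0.2])
x = rng.standard_normal(n)
x = x - x.mean()                                   # center the continuous covariate
slopes = np.array([1.0, 1.5, 0.5])
y = (2.0 + np.array([0.0, 1.0, -1.0])[groups]
     + slopes[groups] * x + 0.5 * rng.standard_normal(n))

# 1. Overparametrized design: intercept, x, category indicators, x-by-category terms.
D = np.eye(L)[groups]
X = np.column_stack([np.ones(n), x, D, D * x[:, None]])

# 2. Empirical proportions of each category.
pi_hat = D.mean(axis=0)

# 3. Constraint matrix A with A @ beta = 0: one row per constrained coefficient set.
p = X.shape[1]
A = np.zeros((2, p))
A[0, 2:2 + L] = pi_hat                             # category main effects
A[1, 2 + L:] = pi_hat                              # category-specific slope modifiers

# 4. QR of A^T: the trailing columns of Q form an orthonormal basis of null(A).
Q, _ = np.linalg.qr(A.T, mode="complete")
Z = Q[:, A.shape[0]:]

# 5. Fit the unconstrained regression in the reduced coordinates.
gamma, *_ = np.linalg.lstsq(X @ Z, y, rcond=None)

# 6. Back-transform to the original, constrained parameter space.
beta = Z @ gamma
print("intercept, main slope:", beta[:2])
print("constraint residuals:", A @ beta)
```

Because the constraints only resolve the redundancy in the overparametrized design, the fitted values are the same as those from any other full-rank coding. What changes is the meaning of the coefficients: here the main slope `beta[1]` equals the abundance-weighted average of the group-specific slopes.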

Parametrization        | Reference-group bias | Invariance w/ interactions | SE inflation
Reference-group (RGE)  | Yes                  | No                         | Yes
Sum-to-zero (STZ)      | No                   | No                         | Can increase
Abundance-based (ABC)  | No                   | Yes (under homogeneity)    | No

6. Application and Practical Guidelines

Empirical studies support the theoretical claims of ABC parametrization. In hidden Markov models, application to both linear-Gaussian and chaotic Lorenz systems demonstrates that bias is dominated by $\epsilon$, variance by the number of particles $N$, and that with appropriate particle numbers and kernel widths, consistent estimation is routine (Ehrlich et al., 2012). In regression, simulation and real data analyses show that OLS estimates for main effects are preserved under model enrichment (e.g., addition of categorical interactions), while SEs are typically reduced (Kowal, 2024).

Key recommendations include:

  • For ABC-HMMs, tune $\epsilon$ to balance bias and Monte Carlo variance, typically identifying a "sweet spot" where both are acceptable.
  • For ABC in regression, center continuous covariates, check for variance/covariance homogeneity across groups, and employ empirical proportions in constraint formation.

7. Theoretical and Methodological Conditions

The performance and interpretation guarantees of ABC-parametrization rely on specific assumptions:

  • In HMMs: Lipschitz and boundedness of transition/observation densities and their derivatives for bias control (Ehrlich et al., 2012).
  • In regression: centered covariates, variance/covariance homogeneity within groups, absence of perfect collinearity, and classical OLS error assumptions (Kowal, 2024).

Under these conditions, ABC parametrization provides a principled, data-driven mechanism for model specification, estimation, and inference in complex latent or heterogeneous effect settings. It ensures clear, global interpretations of effects, consistent estimation under model extension, and efficient utilization of data for both likelihood-free and regression settings.
