Language-Aided Particle Filter (LAPF)

Updated 17 November 2025
  • The Language-Aided Particle Filter (LAPF) is a probabilistic state-estimation framework that fuses human language reports with sensor data to improve tracking of dynamic systems.
  • It uses a pretrained Sentence-BERT encoder and a two-layer MLP to convert text reports into quantitative likelihoods for Bayesian fusion.
  • Empirical results demonstrate LAPF's reduced estimation error and increased robustness compared with conventional filtering methods.

The Language-Aided Particle Filter (LAPF) is a probabilistic state estimation framework that systematically incorporates human-generated natural language reports into particle filtering for dynamic physical systems. By quantizing human observations and leveraging pretrained natural language encoders, LAPF models humans as probabilistic sensing agents and structurally fuses text-based evidence alongside conventional sensor data during filtering and inference.

1. Formulation and Mathematical Foundations

Let $x_t \in \mathbb{R}^n$ denote the state of the physical system at time $t$ and $u_t \in \mathbb{R}^m$ the control input. The system evolves according to:

$$x_t = f(x_{t-1}, u_t, w_t)$$

where $w_t$ is process noise drawn from a known distribution. Observations take two forms:

  • Conventional sensor readings: $y_t \in \mathbb{R}^p$ with likelihood $p(y_t \mid x_t)$.
  • Human-generated text reports: $l_t \in \mathbb{T}$, treated as observations from a "human sensor."

The filtering objective is the joint posterior over states given all observations:

$$p(x_t \mid y_{1:t}, l_{1:t})$$

recursively computed via:

  • Prediction:

$$p(x_t \mid y_{1:t-1}, l_{1:t-1}) = \int p(x_t \mid x_{t-1}, u_t)\; p(x_{t-1} \mid y_{1:t-1}, l_{1:t-1})\; dx_{t-1}$$

  • Update:

$$p(x_t \mid y_{1:t}, l_{1:t}) \propto p(y_t, l_t \mid x_t)\; p(x_t \mid y_{1:t-1}, l_{1:t-1})$$

Assuming conditional independence of the observations given the state ($y_t \perp l_t \mid x_t$), the joint likelihood factorizes:

$$p(y_t, l_t \mid x_t) = p(y_t \mid x_t)\; p(l_t \mid x_t).$$
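To make the factorized update concrete, here is a minimal numpy sketch of one prediction/update cycle on a discretized scalar state. The transition kernel, sensor model, and language likelihood are toy assumptions for illustration, not taken from the paper:

```python
import numpy as np

# Toy discretized scalar state over [0, 5]; uniform prior.
grid = np.linspace(0.0, 5.0, 101)
prior = np.full(grid.size, 1.0 / grid.size)

# Prediction: assumed Gaussian random-walk transition kernel.
K = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / 0.3) ** 2)
K /= K.sum(axis=0, keepdims=True)      # column j: p(x_t | x_{t-1} = grid[j])
pred = K @ prior                       # p(x_t | y_{1:t-1}, l_{1:t-1})

# Update: factorized joint likelihood p(y_t, l_t | x_t) = p(y_t | x_t) p(l_t | x_t).
y_t = 2.4
lik_sensor = np.exp(-0.5 * ((y_t - grid) / 0.5) ** 2)   # assumed Gaussian sensor
lik_lang = np.where(grid > 2.0, 1.0, 0.2)               # toy p(l_t | x_t) for a report like "looks high"
post = pred * lik_sensor * lik_lang
post /= post.sum()
print("posterior mean:", (grid * post).sum())
```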

2. Particle Filter Weighting and Language Likelihood

LAPF maintains $N_p$ weighted particles $\{(x_t^{(i)}, w_t^{(i)})\}_{i=1}^{N_p}$:

  • Prediction: $x_t^{(i)} \sim p(x_t \mid x_{t-1}^{(i)}, u_t)$.
  • Weight update:

$$w_t^{(i)} \propto w_{t-1}^{(i)}\; p(y_t \mid x_t^{(i)})\; p(l_t \mid x_t^{(i)})$$

The critical innovation is the language likelihood $p(l_t \mid x_t^{(i)})$. Human-generated texts are mapped to quantized observation labels $q_t \in \{1, \ldots, m\}$ via a latent space. The likelihood expands as:

$$p(l_t \mid x_t) = \sum_{j=1}^m p(l_t, q_t = j \mid x_t) = \sum_{j=1}^m p(l_t \mid q_t = j)\; p(q_t = j \mid x_t)$$

Assuming a uniform prior $p(q_t = j)$ and applying Bayes' rule (Prop. 1):

$$p(l_t \mid x_t) \propto \sum_{j=1}^m p(q_t = j \mid l_t)\; p(q_t = j \mid x_t)$$

Here,

  • $p(q_t = j \mid l_t)$ is the probability the NLP module assigns to label $j$ given the text $l_t$.
  • $p(q_t = j \mid x_t)$ is the probability that the human's internal measurement falls within quantization bin $\Lambda_j$, given state $x_t$:

$$p(q_t = j \mid x_t) = \int_{y \in \Lambda_j} p(y \mid x_t)\; dy$$

where $p(y \mid x_t)$ is the distribution of the human observer's real-valued assessment $y_{H,t}$.
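When the human's assessment is Gaussian about an observed component of the state, as in the case study of Section 5, the bin integral reduces to a difference of Gaussian CDFs. A minimal sketch, assuming a scalar observed component $C_H x_t$ and ignoring the boundary mass introduced by the projection onto $[0,5]$:

```python
import numpy as np
from scipy.stats import norm

def language_likelihood(p_q_given_l, x_obs, edges, sigma=1.0):
    """Unnormalized p(l_t | x_t) = sum_j p(q_t = j | l_t) * p(q_t = j | x_t).

    p_q_given_l : (m,) NLP-module output p(q_t = j | l_t)
    x_obs       : scalar component of the state seen by the human (C_H x_t)
    edges       : (m + 1,) bin edges defining Λ_1, ..., Λ_m
    sigma       : std of the human's perceptual noise
    """
    cdf = norm.cdf(edges, loc=x_obs, scale=sigma)
    p_q_given_x = np.diff(cdf)   # Gaussian mass falling inside each bin Λ_j
    return float(p_q_given_l @ p_q_given_x)

# Example: m = 5 bins over [0, 5]; the classifier leans toward bin 4.
edges = np.linspace(0.0, 5.0, 6)
p_q_l = np.array([0.02, 0.05, 0.13, 0.60, 0.20])
print(language_likelihood(p_q_l, x_obs=3.2, edges=edges))
```

The returned value is unnormalized, which is harmless in the particle filter since weights are renormalized after each update.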

3. Natural Language Processing Pipeline

The NLP module computes $p(q \mid l)$ as follows:

  • Text encoding: a pretrained Sentence-BERT model maps the text $l$ to an embedding $e \in \mathbb{R}^d$ (e.g., "sentence-bert-base-ja", $d = 768$).
  • Classification: $e$ is fed to a two-layer MLP (128 and 64 hidden units, ReLU activations), producing logits $\psi \in \mathbb{R}^m$.
  • Softmax yields probabilities:

$$p(q = j \mid l) = \frac{\exp(\psi_j)}{\sum_{k=1}^m \exp(\psi_k)}$$

The classifier is trained with a cross-entropy loss on a dataset of text reports paired with their true quantized labels.
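A sketch of this pipeline in PyTorch with sentence-transformers. The checkpoint string below mirrors the name quoted in the paper but is not guaranteed to be a valid model-hub identifier, and the training loop is schematic:

```python
import torch
import torch.nn as nn
from sentence_transformers import SentenceTransformer

M_BINS, D_EMB = 5, 768

# Frozen text encoder. "sentence-bert-base-ja" is the name quoted in the
# paper; the exact hub identifier is an assumption and may need adjusting.
encoder = SentenceTransformer("sentence-bert-base-ja")

# Two-layer MLP head: 768 -> 128 -> 64 -> m, ReLU activations.
classifier = nn.Sequential(
    nn.Linear(D_EMB, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, M_BINS),
)

def p_q_given_l(texts):
    """Return p(q = j | l) for a batch of text reports."""
    emb = torch.from_numpy(encoder.encode(texts)).float()  # (B, 768)
    with torch.no_grad():
        logits = classifier(emb)                           # psi in R^m
    return torch.softmax(logits, dim=-1)

# Schematic training step: cross-entropy against true quantized labels.
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-5)
loss_fn = nn.CrossEntropyLoss()

def train_step(texts, labels):
    emb = torch.from_numpy(encoder.encode(texts)).float()
    loss = loss_fn(classifier(emb), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```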

4. Pseudocode and Workflow Summary

The procedural workflow for LAPF is:

```
Algorithm LAPF(N_p, m, {Λ_j}, π_0, T)
1. Initialize: x_0^(i) ~ π_0, w_0^(i) = 1/N_p for i = 1…N_p
2. For t = 1…T:
   a) Propagate: x_t^(i) ~ p(x_t | x_{t-1}^(i), u_t)
   b) For each particle i:
      L_num  = p(y_t | x_t^(i))
      L_lang = Σ_{j=1}^m p(q_t = j | l_t) · p(q_t = j | x_t^(i))
      w_t^(i) ← w_{t-1}^(i) · L_num · L_lang
   c) Normalize weights
   d) Resample {x_t^(i)} with probabilities {w_t^(i)}
3. Return weighted particle approximation of p(x_t | y_{1:t}, l_{1:t})
```
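A compact numpy/scipy rendering of one iteration of this loop, assuming linear-Gaussian dynamics and a Gaussian human-sensing model as in the case study below; `p_q_l` stands in for the NLP module's output $p(q_t = j \mid l_t)$ at time $t$:

```python
import numpy as np
from scipy.stats import norm

def lapf_step(particles, weights, u_t, y_t, p_q_l, edges,
              A, Q_diag, C, sensor_sigma, human_sigma, rng):
    """One LAPF iteration: propagate, reweight with both likelihoods, resample.

    particles : (N_p, n) particle states; weights : (N_p,) normalized weights
    p_q_l     : (m,) NLP-module output p(q_t = j | l_t)
    edges     : (m + 1,) quantization bin edges for Λ_1, ..., Λ_m
    """
    N_p = particles.shape[0]
    # a) Propagate through assumed linear dynamics with clipping to [0, 5],
    #    process noise w_t ~ N(u_t, diag(Q_diag)).
    noise = rng.normal(u_t, np.sqrt(Q_diag), size=particles.shape)
    particles = np.clip(particles @ A.T + noise, 0.0, 5.0)
    # b) Numeric likelihood p(y_t | x) and language likelihood p(l_t | x).
    obs = particles @ C                            # component seen by sensor/human
    L_num = norm.pdf(y_t, loc=obs, scale=sensor_sigma)
    cdf = norm.cdf(edges[None, :], loc=obs[:, None], scale=human_sigma)
    L_lang = np.diff(cdf, axis=1) @ p_q_l          # Σ_j p(q=j|l) p(q=j|x^(i))
    # c) Update and normalize weights.
    weights = weights * L_num * L_lang
    weights /= weights.sum()
    # d) Multinomial resampling.
    idx = rng.choice(N_p, size=N_p, p=weights)
    return particles[idx], np.full(N_p, 1.0 / N_p)
```

Multinomial resampling at every step is the simplest choice; adaptive resampling triggered by the effective sample size is a common refinement.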

5. Empirical Application: Irrigation Canal Water Level Estimation

A case study applies LAPF to estimating the water levels in five adjacent segments of an irrigation canal (a parameter sketch follows the list):

  • State: $x_t \in \mathbb{R}^5$, the water levels of the five segments.
  • Dynamics: $x_t = \mathrm{proj}_{[0,5]}(A x_{t-1} + w_t)$, with $A$ as in Eq. (19) and $w_t \sim \mathcal{N}(u_t, Q)$, where $u_t = [1, 0, 0, 0, 0]^T$ and $Q = \mathrm{diag}(1.0, 0.1, 0.1, 0.1, 0.1)$.
  • Sensing: the human observer perceives the state via $y_{H,t} = \mathrm{proj}_{[0,5]}(C_H x_t + v_t)$, with $C_H = [1, 0, 0, 0, 0]$ and $v_t \sim \mathcal{N}(0, 1)$; the language report $l_t$ is generated from $y_{H,t}$ via lookup.
  • Quantization: $m = 5$ bins over $[0, 5]$.
  • Dataset: 2,454 crowdsourced (text, ratio) pairs, split 1,882/205/289 into train/validation/test.
  • Text encoder: "sentence-bert-base-ja"; MLP: $768 \rightarrow 128 \rightarrow 64 \rightarrow 5$, trained for 100 epochs with learning rate $1 \times 10^{-5}$ and batch size 16.
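The parameters above can be wired into the `lapf_step` sketch from Section 4. Since Eq. (19)'s $A$ is not reproduced in this summary, a placeholder coupling matrix is assumed, and a conventional sensor reading `y_t` is added purely to exercise both likelihood terms:

```python
import numpy as np

rng = np.random.default_rng(0)
n, N_p, m = 5, 1000, 5

# Eq. (19)'s A is not given in this summary; a simple upstream-to-downstream
# coupling matrix is assumed purely for illustration.
A = 0.9 * np.eye(n) + 0.05 * np.eye(n, k=-1)
Q_diag = np.array([1.0, 0.1, 0.1, 0.1, 0.1])
u_t = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
C_H = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
edges = np.linspace(0.0, 5.0, m + 1)

particles = rng.uniform(0.0, 5.0, size=(N_p, n))
weights = np.full(N_p, 1.0 / N_p)

# One step with a synthetic sensor reading and a stand-in classifier output.
y_t = 1.8
p_q_l = np.array([0.05, 0.55, 0.30, 0.07, 0.03])
particles, weights = lapf_step(particles, weights, u_t, y_t, p_q_l, edges,
                               A, Q_diag, C_H, sensor_sigma=0.5,
                               human_sigma=1.0, rng=rng)
print("posterior-mean estimate:", particles.mean(axis=0))
```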

6. Comparative Performance and Robustness

Quantitative results (1,000 Monte Carlo trials, $T = 100$ steps, $N_p = 1000$):

| Method  | Avg. MSE    |
|---------|-------------|
| No obs. | 0.73 ± 0.13 |
| EDAPF   | 0.52 ± 0.08 |
| LAPF    | 0.49 ± 0.08 |

Out-of-domain robustness (dialectal text for $y_{H,t} < 0.2$):

| Method | Avg. MSE    |
|--------|-------------|
| EDAPF  | 0.75 ± 0.15 |
| LAPF   | 0.53 ± 0.08 |

Key findings are:

  • Incorporating language observations via LAPF reduces estimation error relative to an externally trained DNN-aided particle filter (EDAPF).
  • The probabilistic fusion of natural language through $p(q \mid l)$ provides robustness under out-of-domain language shifts, outperforming EDAPF.

This suggests the value of probabilistic language calibration for reliable human-in-the-loop sensing in practical settings.

7. Conceptual Significance and Connections

LAPF establishes a mathematically grounded approach for integrating human linguistic reports into Bayesian state estimation, leveraging neural NLP models as calibrated probabilistic sensors. Unlike generic DNN-based post-processors, LAPF structures the language likelihood via quantized latent representations and direct probability fusion with physical models. This preserves the interpretability and fusion rigor of the filtering process and facilitates robustness against linguistic variability.

While "Language-Aided Particle Filter" in (Miyoshi et al., 14 Nov 2025) is distinct from the "Localized Adaptive Particle Filter" (also abbreviated LAPF) of (Rojahn et al., 2022), both frameworks pursue efficient assimilation of heterogeneous and spatially distributed observations for large-scale dynamic systems. The LMCPF extension (Rojahn et al., 2022) further generalizes the particle filter using Gaussian uncertainty and localized mixtures, providing a framework for operational global forecasting with millions of variables.

A plausible implication is that future work may consider hybridizing these schemes—e.g., introducing language-derived observation models within localized Gaussian mixtures—to leverage human sensing in high-dimensional, operational contexts. This could address open challenges including observation quality control, adaptive resampling under linguistic uncertainty, and kernel selection strategies for robust ensemble spread and bias correction.
