
Sequential Diagnosis Benchmark (SDBench)

Updated 1 July 2025
  • Sequential Diagnosis Benchmark (SDBench) provides formalized frameworks, datasets, and methodologies for rigorously evaluating algorithms performing sequential diagnosis under uncertainty.
  • At its core, SDBench utilizes a Bayesian model to address the joint problem of detecting an unobservable change in a system's state and identifying the specific new regime or fault type.
  • SDBench serves as a unified benchmark that subsumes classic problems like change detection and multi-hypothesis testing, applicable across diverse domains such as industrial fault isolation.

The Sequential Diagnosis Benchmark (SDBench) refers to formalized frameworks, datasets, and methodologies for evaluating algorithms and systems that perform sequential diagnosis—a process where evidence is acquired and actions are taken in sequence to efficiently identify the true state (often the fault or cause) of a system under uncertainty. SDBench structures benchmark problems and evaluation criteria to unify and rigorously assess detection, isolation, and identification strategies across diverse domains, including industrial fault isolation, multi-class hypothesis testing, and large-scale combinatorial systems.

1. Bayesian Sequential Diagnosis Model

At the core of SDBench is a rigorous Bayesian formulation of the sequential diagnosis problem. Observations $(X_n)_{n \geq 1}$ are initially drawn from a known distribution $P_0$, but at an unobservable “disorder time” $\theta$ the distribution shifts to one of $M$ alternatives $P_\mu$ ($\mu \in \{1, \ldots, M\}$); both $\theta$ and $\mu$ are unknown. The sequential diagnosis problem is then to:

  1. Detect the change as soon as possible after it occurs,
  2. Identify which new regime $P_\mu$ the process has entered with maximal accuracy.

This setup is embedded in a Bayesian framework, with prior distributions for both the disorder time $\theta$ (often a geometric law) and the regime identity $\mu$. The expected loss of a strategy $\delta = (\tau, d)$, where $\tau$ is the stopping time and $d$ is the declared regime, includes terms for detection delay, false alarm, and misidentification cost:

$$R(\delta) = c\,\mathbb{E}[(\tau-\theta)^+] + \mathbb{E}\big[a_{0d}\mathbf{1}_{\{\tau<\theta\}} + a_{\mu d}\mathbf{1}_{\{\theta \leq \tau < \infty\}}\big]$$

where $c$ is the per-period delay cost and $a_{0j}$, $a_{ij}$ are the false-alarm and misidentification penalties.

The online belief state is the vector of posterior probabilities

$$\Pi_n^{(0)} = \mathbb{P}\{\theta > n \mid \mathcal{F}_n\},\qquad \Pi_n^{(i)} = \mathbb{P}\{\theta \leq n,\ \mu = i \mid \mathcal{F}_n\},$$

where $\mathcal{F}_n$ is the observation history; these probabilities are recursively updated and form a sufficient Markov statistic on the probability simplex.
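The posterior recursion can be sketched as follows; the geometric hazard rate, regime prior, and density callables in the example are illustrative assumptions, not part of the benchmark specification, and `posterior_update` is a hypothetical helper name.

```python
import numpy as np

def posterior_update(pi, x, p, nu, densities):
    """One Bayes update of the belief vector pi = (pi_0, pi_1, ..., pi_M).

    pi[0] = P(theta > n | F_n);  pi[i] = P(theta <= n, mu = i | F_n).
    p         -- geometric hazard rate of the disorder time theta
    nu        -- prior over regimes, nu[i-1] = P(mu = i)
    densities -- M+1 callables [f_0, f_1, ..., f_M] with f_k(x) >= 0
    """
    M = len(pi) - 1
    unnorm = np.empty(M + 1)
    # No change yet: theta remains in the future, observation drawn from f_0.
    unnorm[0] = (1 - p) * pi[0] * densities[0](x)
    # Regime i: either already active, or the change occurs at this step.
    for i in range(1, M + 1):
        unnorm[i] = (pi[i] + p * nu[i - 1] * pi[0]) * densities[i](x)
    return unnorm / unnorm.sum()
```

Feeding a stream of observations through this update yields the sufficient Markov statistic described above; the normalizing constant is the one-step predictive density of the new observation.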

2. Optimal Sequential Decision Strategy

Analyzing the Bayesian sequential change diagnosis problem reduces to an optimal stopping problem on the Markov process of posterior probabilities. The Bayes risk is expressed as

$$R(\delta) = \mathbb{E}\left[\sum_{n=0}^{\tau-1} c\,(1-\Pi_n^{(0)}) + \mathbf{1}_{\{\tau < \infty\}} \sum_{j=1}^{M}\mathbf{1}_{\{d=j\}} \sum_{i=0}^{M} a_{ij}\,\Pi_\tau^{(i)}\right].$$

The dynamic programming equation for the value function $V_0(\pi)$ is

$$V_0(\pi) = \min\{h(\pi),\; c(1-\pi_0) + (T V_0)(\pi)\},$$

where $h(\pi) = \min_{1 \leq j \leq M} \sum_{i=0}^{M} a_{ij}\,\pi_i$ is the minimal expected terminal loss and $(T V_0)(\pi)$ is the expected cost-to-go after a new observation. The optimal policy is to stop as soon as the posterior vector $\pi_n$ enters the stopping region $\Gamma$, defined by $V_0(\pi) = h(\pi)$, and to declare the regime attaining the minimum in $h(\pi_\tau)$.
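The terminal decision rule implied by $h(\pi)$ can be written as a small sketch; the penalty-matrix layout and the function name are assumptions for illustration.

```python
import numpy as np

def terminal_loss_and_decision(pi, a):
    """Minimal expected terminal loss h(pi) and the Bayes-optimal declaration.

    pi -- posterior vector (pi_0, pi_1, ..., pi_M)
    a  -- (M+1) x M penalty matrix: a[i][j-1] is the cost of declaring
          regime j when the true index is i (row 0 holds false-alarm costs)
    """
    a = np.asarray(a, dtype=float)
    expected = pi @ a             # expected loss of each declaration j = 1..M
    j = int(np.argmin(expected))  # 0-based column index -> regime j + 1
    return expected[j], j + 1
```

Under symmetric penalties this reduces to declaring the most likely regime; with asymmetric costs the two rules can differ, which is why the declaration is tied to $h$ rather than to the posterior mode.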

For practical implementation, the recursion and integration require discretization of the posterior simplex and precomputation of stopping boundaries, especially for larger $M$.

3. Numerical Algorithms and Computational Considerations

The numerical scheme for this framework involves the following steps:

  1. State Discretization: The $M$-simplex of posterior probabilities is discretized, typically on a fine grid.
  2. Value Iteration: The value function is iterated via Bellman recursions, typically truncated at a finite horizon $N$ for tractability:

$$V^0(\pi) = h(\pi),\qquad V^{N+1}(\pi) = \min\{h(\pi),\; c(1-\pi_0) + (T V^{N})(\pi)\};$$

the iteration stops once successive approximations converge.

  3. Integration: Expectations over future observations are computed with system-specific densities, using the transition functions $D(\pi, x)$.
  4. Boundary Representation: The computed stopping regions $\Gamma$ are encoded offline (e.g., via splines or regression) for efficient online evaluation.
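The steps above can be made concrete for the simplest case $M = 1$ (quickest change detection) with binary observations; the costs, hazard rate, grid size, and Bernoulli channels below are illustrative assumptions only.

```python
import numpy as np

# Illustrative parameters (assumptions, not benchmark values).
p, c, a0 = 0.05, 1.0, 10.0             # hazard rate, delay cost, false alarm
f = np.array([[0.8, 0.2],              # f_0(x) for x = 0, 1
              [0.2, 0.8]])             # f_1(x) for x = 0, 1

grid = np.linspace(0.0, 1.0, 501)      # step 1: discretize pi_0 = P(theta > n)

def h(pi0):
    # Terminal loss: stopping incurs a0 only on a false alarm (theta > n).
    return a0 * pi0

def step(pi0, x):
    # One Bayes update of pi_0 after observing x in {0, 1}.
    u0 = (1 - p) * pi0 * f[0, x]
    u1 = (1 - (1 - p) * pi0) * f[1, x]
    return u0 / (u0 + u1)

V = h(grid)                            # V^0 = h
for _ in range(200):                   # step 2: truncated Bellman recursions
    cont = c * (1 - grid)
    for x in (0, 1):                   # step 3: integrate over observations
        px = (1 - p) * grid * f[0, x] + (1 - (1 - p) * grid) * f[1, x]
        cont += px * np.interp(step(grid, x), grid, V)
    V_new = np.minimum(h(grid), cont)
    if np.max(np.abs(V_new - V)) < 1e-9:
        V = V_new
        break
    V = V_new

# Step 4: stopping region = grid points where continuing is no cheaper.
stop = grid[V >= h(grid) - 1e-9]
```

For $M = 1$ the stopping region is an interval of small $\pi_0$ values (change already likely), so the offline boundary representation reduces to a single threshold.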

For systems with a moderate number of alternatives, the grid-based approach is feasible. For larger $M$, high-fidelity function approximators or Monte Carlo techniques, as treated in standard references on dynamic programming and regression, are advised.

4. Geometric Properties of the Optimal Strategy

The diagnosis benchmark highlights the rich geometric structure of the optimal stopping region:

  • For $M$ alternatives, $\Gamma$ resides in the $M$-simplex and typically consists of $M$ non-empty, convex, closed, bounded sets $\Gamma^{(j)}$, each corresponding to a distinct hypothesized regime.
  • The geometry, notably whether the stopping and continuation regions are connected or disconnected, depends on the cost structure and prior parameters.
  • Visualization for $M = 2, 3$ (the simplex is a triangle in the plane for $M = 2$ and a tetrahedron in 3D for $M = 3$) offers insight into how optimal policies deform with problem parameters, as seen in detailed figures in the original paper.

This geometric insight underpins both theoretical understanding and practical implementation, as offline boundary computation allows for efficient, accurate online deployment.

5. Special Cases and Unified Framework

SDBench’s Bayesian formulation subsumes classic problems:

  • Bayesian Change Detection (Shiryaev’s Problem): When $M = 1$, the problem reduces to quickest change detection with the standard Bayes risk.
  • Bayesian Sequential Multi-hypothesis Testing: When $p_0 = 1$ (the change occurs at time 0), the detection component disappears, and the problem matches Wald’s sequential multi-hypothesis testing (Arrow–Blackwell–Girshick).
  • The model provides a unified lens connecting previously disparate strands in sequential analysis, allowing direct comparison and informed parameterization for diverse diagnosis contexts.
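In the first special case, writing $a = a_{01}$ for the single false-alarm penalty and taking the correct-identification cost $a_{11} = 0$, the general risk collapses term by term to the classical Shiryaev objective:

$$R(\tau) = c\,\mathbb{E}[(\tau - \theta)^+] + a\,\mathbb{P}\{\tau < \theta\}.$$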

6. Application: Component Failure Detection with Suspended Animation

The benchmark extends to diagnose systems with multiple concealed components, such as industrial or defense systems experiencing “suspended animation” upon component failure:

  • Each component fails independently (geometric hazard rate); the system switches from the nominal regime ($f_0$) to a new regime ($f_\mu$) corresponding to the failed component.
  • The diagnostic goal becomes joint detection (failure occurrence) and identification (localization of the failed component), updating posteriors as observations accrue.
  • The same Bayesian optimal stopping machinery applies, allowing for dynamically updating beliefs about both timing and type of fault and optimizing intervention strategies accordingly.

The approach generalizes to counting and identifying multiple failures by appropriately encoding the diagnosis variable $\mu$.
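As a minimal sketch of data generation in this setting, assuming a uniform prior over which component fails, a shared geometric hazard rate, and Bernoulli observation channels (all illustrative choices):

```python
import random

def simulate(M=3, p=0.05, n_steps=200, seed=0):
    """Draw one trajectory of the suspended-animation model: observations
    come from f_0 until the disorder time theta, then from f_mu for the
    failed component mu (Bernoulli channels are illustrative only)."""
    rng = random.Random(seed)
    theta = 1                          # geometric disorder time
    while rng.random() >= p:
        theta += 1
    mu = rng.randrange(1, M + 1)       # uniformly chosen failed component
    def emit(k):
        # f_0 emits 1 w.p. 0.1; f_i emits 1 w.p. 0.1 + 0.8 * i / M.
        q = 0.1 if k == 0 else 0.1 + 0.8 * k / M
        return 1 if rng.random() < q else 0
    xs = [emit(0 if n < theta else mu) for n in range(1, n_steps + 1)]
    return theta, mu, xs
```

Pairing such trajectories with the posterior recursion of Section 1 gives the joint detection-and-identification loop described above.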

Conclusion

SDBench, as formalized in Bayesian sequential change diagnosis, provides a rigorous, unified, and practically computable framework for the joint detection and identification of abrupt, unobservable changes in stochastic systems. The benchmark enables the systematic evaluation of sequential diagnosis strategies, accommodating classic and modern problems, illuminating geometric properties of optimal policies, and guiding efficient numerical implementation. This both advances theoretical understanding and provides a robust, actionable reference point for developing and benchmarking real-world diagnostic systems across fields such as process control, defense, and automated fault isolation.