APTBench: Parametric Timed Model Checking
- APTBench is a benchmark library for parametric timed automata, featuring 56 benchmarks and 119 models to evaluate model checking algorithms.
- It integrates rigorous metadata, formal structure definitions, and standardized workflows to ensure reproducible and comparative evaluations.
- Empirical results show that about 70% of property queries are solved within 60 s, with a median synthesis time of 2.82 s, highlighting both efficiency and current tool limitations.
APTBench is a structured, publicly available benchmark library of parametric timed automata (PTA) models and properties, designed for evaluating, comparing, and advancing algorithms for parametric timed model checking. It covers academic and industrial case studies as well as toy and unsolvable instances, and provides rigorous metadata, formal structure definitions, a standardized workflow, and an empirically grounded evaluation methodology (Étienne, 2018, André et al., 2019, André et al., 2021).
1. Formalism of Parametric Timed Automata
The foundation of APTBench is the formalism of PTA, an extension of the classic timed automata of Alur and Dill with undetermined, synthesizable parameters: symbolic quantities that stand for real-time constants or uncertain timing constraints. A PTA is formally defined as a tuple
$$\mathcal{A} = (L, \ell_0, X, P, \Sigma, I, E)$$
where:
- $L$ is a finite set of locations, with $\ell_0 \in L$ as the initial location.
- $X$ is a set of real-valued clocks.
- $P$ is a finite set of parameters, each ranging over a fixed numeric domain.
- $\Sigma$ is a set of action labels for synchronization.
- $I$ assigns to each location an invariant, typically a conjunction of linear constraints involving clocks and parameters.
- $E$ is the set of edges, where each edge has a guard (a conjunction of clock/parameter constraints), an action, a reset operation (possibly parametrized), and a destination.
- The semantics permits time to elapse in a location as long as the invariant holds, and discrete transitions when the guard constraints are satisfied, possibly resetting clocks to 0 or to linear-parametric expressions.
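The definition above can be illustrated with a minimal Python sketch of a PTA evaluated under one concrete parameter valuation. All class, field, and function names here are hypothetical, constraints are modeled as plain predicates, and only resets to 0 are covered; this is not how IMITATOR represents models internally.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Dict, List, Optional, Tuple

Valuation = Dict[str, Fraction]                       # name -> value
Constraint = Callable[[Valuation, Valuation], bool]   # (clocks, params) -> holds?

@dataclass
class Edge:
    source: str
    guard: Constraint
    action: str
    resets: List[str]   # clocks reset to 0 (parametric resets are omitted here)
    target: str

@dataclass
class PTA:
    locations: List[str]
    initial: str
    clocks: List[str]
    parameters: List[str]
    actions: List[str]
    invariants: Dict[str, Constraint]
    edges: List[Edge]

def delay(pta: PTA, loc: str, clocks: Valuation, params: Valuation,
          d: Fraction) -> Optional[Valuation]:
    """Let d time units elapse in loc; None if the invariant is violated.

    Checking only the end point is enough when the invariant is convex
    and already holds before the delay.
    """
    if d < 0:
        raise ValueError("delay must be non-negative")
    later = {x: v + d for x, v in clocks.items()}
    return later if pta.invariants[loc](later, params) else None

def fire(pta: PTA, edge: Edge, clocks: Valuation,
         params: Valuation) -> Optional[Tuple[str, Valuation]]:
    """Take a discrete edge if its guard and the target invariant hold."""
    if not edge.guard(clocks, params):
        return None
    after = {x: (Fraction(0) if x in edge.resets else v) for x, v in clocks.items()}
    if not pta.invariants[edge.target](after, params):
        return None
    return edge.target, after

# Toy model: invariant x <= p in l0, edge l0 -> l1 with guard x >= 1 and reset of x.
pta = PTA(
    locations=["l0", "l1"], initial="l0", clocks=["x"], parameters=["p"], actions=["a"],
    invariants={"l0": lambda c, p: c["x"] <= p["p"], "l1": lambda c, p: True},
    edges=[Edge("l0", lambda c, p: c["x"] >= 1, "a", ["x"], "l1")],
)
params = {"p": Fraction(2)}
clocks = {"x": Fraction(0)}
```

Parametric model checking works on all valuations symbolically; fixing `p = 2` as above only shows what a single instance of the model does.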
Supported constructions in APTBench (and the IMITATOR tool) include stopwatches (pauseable clocks), global rational discrete variables, parametric linear resets and invariants, and multi-rate clocks (Étienne, 2018, André et al., 2021).
The L/U-PTA restriction, in which each parameter occurs only as a lower bound or only as an upper bound, yields decidability and monotonicity properties that enable specialized decision procedures and complexity bounds (Étienne, 2018, André et al., 2019).
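As a rough illustration of the L/U restriction, the following Python sketch classifies parameters from a flat list of atomic constraints of the form "clock operator parameter". The triple encoding and the function name are assumptions made for this example; real models contain richer linear constraints that would need normalizing first.

```python
from typing import Dict, Iterable, Optional, Tuple

Atom = Tuple[str, str, str]  # (clock, operator, parameter), e.g. ("x", "<=", "p")

def classify_lu(atoms: Iterable[Atom]) -> Optional[Dict[str, str]]:
    """Map each parameter to "L" (lower bound) or "U" (upper bound).

    Returns None when some parameter is used in both roles, i.e. the
    constraints do not fit the L/U restriction.
    """
    roles: Dict[str, str] = {}
    for _clock, op, param in atoms:
        if op in ("<", "<="):
            role = "U"          # clock bounded from above by the parameter
        elif op in (">", ">="):
            role = "L"          # clock bounded from below by the parameter
        else:
            return None         # equality (or an unsupported operator) uses both roles
        if roles.setdefault(param, role) != role:
            return None
    return roles
```

For example, `classify_lu([("x", "<=", "p"), ("y", ">", "q")])` yields `{"p": "U", "q": "L"}`, whereas using `p` as both an upper and a lower bound yields `None`.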
2. Design and Taxonomy of the APTBench Suite
APTBench is designed to offer both breadth and depth in real-time and parametric verification benchmarks. It contains:
- 56 benchmarks, 119 PTA models, and 216 property queries in the extended suite (André et al., 2021).
- Each model is associated with rich metadata (name, number of locations, clocks, parameters, discrete variables, syntactic and semantic features, categories, properties, solvability status, parameter domains).
Table 1: Overview of benchmark categories (extracted from Étienne, 2018, and André et al., 2021)
| Origin/Category | Subclass/Features | Example Benchmarks |
|---|---|---|
| Academic | Protocols, scheduling, circuits, textbook patterns | Fischer mutual exclusion, job-shop, flip-flop |
| Industrial | Automotive, automation, remote protocols | FMTV challenge, SIMOP, RCP |
| Toy/Unsolvable | Minimal yet unsolvable for current tools | toy:1/n, toy:n |
| Liveness/Monitor | Cycle, deadlock, trace preservation | Bounded Retransmission, observer |
| Extended Constructs | Stopwatches, multi-rate, global variables | Present in 17% of models |
Salient metrics include the numbers of locations, clocks, and parameters, together with discrete variables, invariants, stopwatch usage, L/U subclass membership, and scalability (parameterizable by size). Properties cover reachability/safety (EF synthesis), optimal reachability, unavoidability, robustness, liveness (cycle detection), parameter minimization, trace preservation, and pattern matching (Étienne, 2018, André et al., 2021).
3. Repository Structure, Formats, and Interoperability
APTBench is organized as a structured repository with standardized directory and metadata conventions:
- Each benchmark resides in its own directory, with core files:
  - model.imi (IMITATOR PTA model)
  - model.prop (property queries)
  - meta.json (schema including the numbers of locations, clocks, and parameters, plus features and solvability)
- Implementations in additional formats (HyTech, UPPAAL) where available.
- JANI-formatted equivalents provided for cross-tool evaluation.
Metadata files explicitly record domain, scalability, known results, and recommended parameter domains, facilitating automated bench execution and result aggregation (André et al., 2021).
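A minimal sketch of how such metadata could drive automated runs is shown below. It assumes one `meta.json` per benchmark directory, as described above; the key name `solvability` and both function names are hypothetical, since the actual schema is not reproduced here.

```python
import json
from pathlib import Path
from typing import Any, Dict, List

def load_metadata(root: Path) -> Dict[str, dict]:
    """Read <root>/<benchmark>/meta.json for every benchmark directory."""
    out: Dict[str, dict] = {}
    for meta_path in sorted(root.glob("*/meta.json")):
        with meta_path.open(encoding="utf-8") as fh:
            out[meta_path.parent.name] = json.load(fh)
    return out

def select(meta: Dict[str, dict], key: str, value: Any) -> List[str]:
    """Names of benchmarks whose metadata has meta[key] == value."""
    return [name for name, m in meta.items() if m.get(key) == value]
```

A harness could, for instance, call `select(meta, "solvability", "unsolvable")` to skip instances that are known to be out of reach before launching a batch.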
The suite supports tool-neutral specification using PTA-XML or JSON (André et al., 2019). Models are also directly runnable in IMITATOR (across versions 2.x and 3.x) and can be ported, with possible information loss, into other TA model checkers.
4. Methodology for Parametric Model Checking and Properties
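A typical workflow for running APTBench uses IMITATOR's parameter synthesis commands:
- For EF-synthesis (safety/reachability): invoke IMITATOR on the model with the corresponding EF property query.
- For optimal reachability: invoke IMITATOR on the model with the corresponding optimal-reachability query.
- For opacity or robustness synthesis: analogous invocation with AFsynth or RobustSynth.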
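The invocation step can be wrapped in a small timeout harness. The Python sketch below is tool-agnostic: the caller supplies the full command line, and no IMITATOR flags are assumed. The function name and the status strings are hypothetical, and the measured time is wall-clock time rather than CPU time.

```python
import subprocess
import sys
import time
from typing import List, Optional, Tuple

def run_with_timeout(cmd: List[str],
                     timeout_s: float = 300.0) -> Tuple[str, float, Optional[str]]:
    """Run cmd and return (status, elapsed wall-clock seconds, stdout or None)."""
    start = time.perf_counter()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return "timeout", time.perf_counter() - start, None
    elapsed = time.perf_counter() - start
    status = "ok" if proc.returncode == 0 else "error"
    return status, elapsed, proc.stdout

# Placeholder command standing in for a real model checker invocation.
status, elapsed, out = run_with_timeout([sys.executable, "-c", "print('done')"])
```

Replacing the placeholder with the actual tool command for a given model and property keeps the timeout policy identical across tools, which is what makes the comparison fair.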
Timeout limits (typically 300 s) are imposed for fair tool comparison. The output consists of (possibly partial) symbolic solutions, given as zones or polyhedral constraints over the parameter space, that represent the feasible parameter valuations exactly (if the computation terminates) or as an under-approximation (if it is interrupted) (Étienne, 2018, André et al., 2021).
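To make the shape of such a symbolic solution concrete, the following Python sketch tests whether one parameter valuation lies in a disjunction of conjunctions of linear inequalities. The tuple encoding, the names, and the sample constraint are all invented for illustration and do not reflect any tool's output format.

```python
import operator
from fractions import Fraction
from typing import Dict, List, Tuple

# One inequality: sum(coeff * parameter) <op> bound
Ineq = Tuple[Dict[str, int], str, int]
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "=": operator.eq}

def satisfies(valuation: Dict[str, Fraction], disjuncts: List[List[Ineq]]) -> bool:
    """True if the valuation satisfies at least one conjunction of inequalities."""
    def holds(ineq: Ineq) -> bool:
        coeffs, op, bound = ineq
        lhs = sum(Fraction(c) * valuation[p] for p, c in coeffs.items())
        return OPS[op](lhs, bound)
    return any(all(holds(i) for i in conj) for conj in disjuncts)

# Hypothetical result: (p1 < p2 and p1 >= 0) or (p2 >= 5)
result = [
    [({"p1": 1, "p2": -1}, "<", 0), ({"p1": 1}, ">=", 0)],
    [({"p2": 1}, ">=", 5)],
]
```

When the result is only an under-approximation, a `False` answer is inconclusive: the valuation may still be feasible but lie outside the part computed before the interruption.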
Metrics collected per run include CPU time, the number of symbolic states (parametric zones), output constraint complexity (number of disjuncts), and memory usage (not logged by default) (André et al., 2021). Each model-property pair is annotated as "solvable," "unsolvable," or "partial," establishing baselines for tool evaluation.
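A per-run record and a labeling rule might look like the Python sketch below. The field names and, in particular, the rule mapping a run to one of the three labels are assumptions made for illustration; the library's own annotation criteria may differ.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RunRecord:
    benchmark: str
    prop: str
    cpu_time_s: Optional[float]   # None if the run did not finish
    symbolic_states: int
    disjuncts: int                # 0 if no constraint was produced
    terminated: bool

def annotate(rec: RunRecord) -> str:
    """Assumed rule: finished -> solvable; interrupted with output -> partial."""
    if rec.terminated:
        return "solvable"
    return "partial" if rec.disjuncts > 0 else "unsolvable"
```

Checking `terminated` first matters: a run that finishes with an empty result still counts as solved under this rule.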
5. Property and Evaluation Spectrum: Safety, Liveness, Opacity
Properties targeted in APTBench are both safety- and liveness-oriented, covering wide-ranging real-time and security objectives:
- Reachability/Safety (EF): Synthesize all parameter valuations for which a target location is (or is not) reachable (e.g., mutual exclusion not violated).
- Liveness/Cycle Synthesis: Parameter ranges for which infinite accepting traces exist (e.g., non-terminating protocols).
- Deadlock-freedom: Ensuring progress under all or specific parametric conditions.
- Timed opacity: Parameter synthesis for durations and parameter assignments so that adversaries cannot determine whether secret behavior occurred, with formal definitions for timed opacity at a given duration $d$ and for full opacity (André et al., 2019).
- Robustness: Preservation of behaviors under perturbation of the parameters.
- Quantitative objectives: Parameter minimization/maximization (e.g., minimal time to reach a state).
Example: For the Fischer 2-process model, reachability synthesis finds all parameter valuations for which mutual exclusion is maintained; for Bounded Retransmission, liveness synthesis identifies parameter zones admitting infinite cycles (André et al., 2021).
Opacity synthesis algorithms in the suite leverage parametric zone graphs, simultaneous reachability with and without secret passage, and return parametric constraint formulae as output (e.g., conjunctions of linear inequalities over parameters and durations) (André et al., 2019).
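The idea of comparing runs with and without secret passage can be shown on finite sets of durations. The Python sketch below follows one common reading of the definitions (opaque at a duration if that duration is achievable both ways; fully opaque if the two duration sets coincide). It is a concrete simplification with hypothetical names, whereas the actual algorithms work on symbolic, parametric sets.

```python
from fractions import Fraction
from typing import Set

def opaque_at(d: Fraction, with_secret: Set[Fraction],
              without_secret: Set[Fraction]) -> bool:
    """Duration d reveals nothing if it is reachable both with and without the secret."""
    return d in with_secret and d in without_secret

def fully_opaque(with_secret: Set[Fraction], without_secret: Set[Fraction]) -> bool:
    """No duration distinguishes secret from non-secret runs."""
    return with_secret == without_secret
```

With secret durations `{1, 2}` and non-secret durations `{2, 3}`, the system is opaque at duration 2 but not fully opaque, since observing 1 or 3 gives the secret away.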
6. Empirical Results and Community Impact
Empirical evaluation (e.g., with IMITATOR 3.0) demonstrates:
- Of 216 property queries across the extended suite, 70% are solvable within 60 s, with a median synthesis time of 2.82 s and a median of 580 explored states.
- 15% of benchmarks (including all "toy/unsolvable" cases) are intentionally constructed to be beyond current capabilities, highlighting algorithmic limits.
- Extended features (stopwatches, multi-rate) appear in 17% of models, with synthesis time increasing by a factor of four, indicating bottlenecks for further research.
- The remaining entries are partially solvable or time out at scale, delineating research frontiers (André et al., 2021).
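Summary figures of this kind can be derived from raw per-query timings. The Python sketch below is one way to do so; it assumes the median is taken over solved queries only, which the text above does not state, and the function name is hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional

def summarize(times_s: List[Optional[float]],
              limit_s: float = 60.0) -> Dict[str, Optional[float]]:
    """times_s holds one synthesis time per query, or None if unsolved."""
    solved = [t for t in times_s if t is not None and t <= limit_s]
    return {
        "solved_fraction": len(solved) / len(times_s),
        # Assumption: the median ignores unsolved and over-limit queries.
        "median_time_s": median(solved) if solved else None,
    }
```

Any timings passed in are the caller's own measurements; the function does not reproduce the reported results.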
APTBench’s persistent role is to define a reference point for assessing advances in symbolic techniques (e.g., DBM vs polyhedral zone representations, extrapolation, goal-directed search) and establishing regression baselines for future synthesis tools (Étienne, 2018, André et al., 2021).
7. Extensibility and Directions
APTBench is actively maintained and extensible: new benchmarks, model features, and property types are integrated via standardized directory and metadata conventions. Current and potential extensions include diagnosability, timed pattern matching, weighted/priced parametric automata, and probabilistic model checking. All models and associated data are CC-BY 4.0 licensed and can be extended by contribution via GitHub or direct library submission (Étienne, 2018, André et al., 2021).
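A contribution could be checked against the directory convention before submission. The Python sketch below tests only for the three core files named earlier (model.imi, model.prop, meta.json); the function name is hypothetical, and any further acceptance criteria used by the maintainers are not covered.

```python
from pathlib import Path
from typing import List

REQUIRED = ("model.imi", "model.prop", "meta.json")

def missing_files(bench_dir: Path) -> List[str]:
    """Core files that a benchmark directory is expected to contain but lacks."""
    return [name for name in REQUIRED if not (bench_dir / name).is_file()]
```

An empty list means the directory has the expected core files; it says nothing about whether their contents are valid.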
A plausible implication is that APTBench’s extensibility and rigorous organization will continue to drive comparative, reproducible research, expose the boundaries of algorithmic tractability, and serve as a public resource for both security (e.g., timed-opaque protocol design) and classic real-time system analysis (André et al., 2019, André et al., 2021).