Adaptive Decision Repair (ADReFT)

Updated 3 July 2025
  • Adaptive Decision Repair (ADReFT) is a family of formal, algorithmic, and data-driven techniques that dynamically identify and fix erroneous or suboptimal system states.
  • It applies sequential decision theory and Bayesian inference to select and execute repair actions that minimize costs and disruptions.
  • Its broad applications in robotics, autonomous driving, program repair, and cyber-physical systems enhance safety, fairness, and overall system resilience.

Adaptive Decision Repair (ADReFT) is a family of formal, algorithmic, and data-driven techniques for dynamically identifying, selecting, and executing repair actions in response to erroneous, unsafe, or suboptimal system states. ADReFT emphasizes adaptivity: repair decisions are informed by current context, evidence, and the evolving state of the system, with the goal of achieving correctness, safety, fairness, or compliance while minimizing intervention costs or disruptions. It originates in sequential decision theory but spans multiple domains, including diagnosis, cyber-physical systems, robotics, autonomous driving, program repair, and socio-technical decision-making.

1. Sequential Decision-Theoretic Foundations

ADReFT formalizes troubleshooting as a sequential decision problem, as described in "Decision-Theoretic Troubleshooting: A Framework for Repair and Experiment" (1302.3563). At each decision point, the system selects among three fundamental classes of actions: observation (gathering information), repair (direct intervention on a component), and configuration change (active modification of system parameters). The process is governed by Bayesian inference over a probabilistic model, typically a Bayesian network representing component dependencies and observed behaviors.

The repair strategy minimizes the expected cost of repair (ECR), defined by:

$$ECR(I) = \sum_{i=1}^{n} \left[ \prod_{j=1}^{i-1} (1 - p_j) \right] \left( C_i + p_i C_i^R \right)$$

where $p_i$ is the probability that repairing component $c_i$ restores normal operation, given current information $I$. After each action, beliefs and expected costs are updated, and the next action is dynamically chosen so as to minimize future cost. This iterative, adaptive process allows troubleshooting to respond efficiently to new evidence, leveraging the causal structure of the system.
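
As a minimal sketch of how the ECR formula above can be evaluated, the following Python computes the expected cost for a fixed repair order and brute-forces the cheapest order. Each component is represented as a tuple `(p, c, c_r)` standing for $p_i$, $C_i$, and $C_i^R$; the numeric values, the function names, and the brute-force search are illustrative assumptions, and the Bayesian belief update between actions is not modeled here.

```python
from itertools import permutations

def expected_cost_of_repair(sequence):
    """ECR for an ordered list of (p, c, c_r) tuples, term by term as in the formula."""
    total = 0.0
    reach_prob = 1.0  # probability that every earlier repair failed to fix the system
    for p, c, c_r in sequence:
        total += reach_prob * (c + p * c_r)
        reach_prob *= 1.0 - p
    return total

def best_order(components):
    """Brute-force the ordering with the lowest ECR (only viable for a few components)."""
    return min(permutations(components), key=expected_cost_of_repair)

# Hypothetical components: (p_i, C_i, C_i^R)
components = [(0.5, 1.0, 2.0), (0.5, 2.0, 2.0)]
order = best_order(components)
print(order, expected_cost_of_repair(order))
```

With equal success probabilities, the cheaper component is tried first, since its cost is paid with certainty while the later cost is discounted by the chance that the first repair already succeeded.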

2. Monitoring, Localization, and Hierarchical Repair

Run-time adaptation and repair require mechanisms to monitor system evolution and localize faults. The monitoring approach developed for Architectural Design Rewriting (ADR) (1310.4574) implements a tree-like structure that records the sequence of productions (rules) and reconfigurations applied to the system's architecture. Each vertex in the tracking forest is annotated with the production applied and the architectural element affected.

Given an erroneous system state, the monitoring structure enables rapid localization of affected subtrees, supporting localized and hierarchical repairs. Reparative actions are formally defined through graph transformation and rewriting rules, with correctness contracts enforced by weakest precondition computations:

$$wp(p, inv) = \psi$$

Here $\psi$ is the weakest condition that must hold before applying production $p$ for the invariant $inv$ to hold afterwards. This enables adaptive repair at appropriate levels of granularity, with algorithmic mechanisms for proposing and applying targeted reconfigurations that maintain system invariants and minimize global disruptions.
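
The localization step can be illustrated with a small sketch. It assumes a simplified tracking tree in which each node stores the production applied and the architectural element affected, and it returns the deepest node whose subtree covers all faulty elements. The `TrackNode` class, the example productions, and the element names are hypothetical and are not taken from the ADR monitoring work.

```python
from dataclasses import dataclass, field

@dataclass
class TrackNode:
    production: str   # rule applied at this step
    element: str      # architectural element it produced or rewrote
    children: list = field(default_factory=list)

def elements(node):
    """All architectural elements recorded in the subtree rooted at node."""
    found = {node.element}
    for child in node.children:
        found |= elements(child)
    return found

def localize(node, faulty):
    """Return the deepest node whose subtree contains every faulty element, or None."""
    if not faulty <= elements(node):
        return None
    for child in node.children:
        hit = localize(child, faulty)
        if hit is not None:
            return hit
    return node

# Hypothetical tracking tree for a small client/server architecture.
clients = TrackNode("split", "clients", [
    TrackNode("add_client", "client1"),
    TrackNode("add_client", "client2"),
])
servers = TrackNode("split", "servers", [TrackNode("add_server", "server1")])
root = TrackNode("init", "system", [clients, servers])

print(localize(root, {"client1", "client2"}).element)
```

A fault confined to the two clients is localized to the `clients` subtree, so a repair can be scoped there; a fault spanning clients and servers escalates to the root.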

3. Data-Driven and Model-Based Adaptive Repair

ADReFT extends to cyber-physical and hybrid systems by integrating model-based pattern transformation with automated synthesis. The REAFFIRM methodology (1902.04064) exemplifies this approach, comprising:

  • Model Transformation: Application of resilience patterns (HATL scripts) that encode adaptive decision logic (e.g., sensor trust selection, mode switching) onto a system model.
  • Parameter Synthesis: Automated search (using Breach falsification) for parameters that restore safety with respect to formal requirements (e.g., Signal Temporal Logic).

This enables rapid, automated adaptation of decision points in complex systems such as adaptive cruise control under sensor spoofing, stabilizing power systems under attack, and missile guidance systems with varying sensor reliability. The process aims to ensure that repaired models satisfy precise logical specifications; because parameter synthesis is driven by falsification, the resulting assurance is only as strong as the search that failed to find a counterexample.
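
The transform-then-synthesize idea can be sketched with a toy example. This is not REAFFIRM, HATL, or Breach: a plain grid search stands in for falsification-driven synthesis, the vehicle model is a hypothetical one-dimensional approach scenario, and the requirement "always (gap >= d_safe)" is scored with the usual quantitative (robustness) semantics of a temporal-logic "always" operator. All names and numbers are assumptions.

```python
def simulate(theta, d0, v0=4.0, decel=1.0, steps=20):
    """Toy approach scenario: start braking once the gap drops below theta."""
    gap, speed = d0, v0
    trace = [gap]
    for _ in range(steps):
        if gap < theta:
            speed = max(0.0, speed - decel)
        gap -= speed
        trace.append(gap)
    return trace

def robustness_always_ge(trace, d_safe):
    """Robustness of 'always (gap >= d_safe)': the worst-case margin over the trace."""
    return min(g - d_safe for g in trace)

def synthesize(candidates, scenarios, d_safe):
    """Return the first candidate whose worst-case robustness is positive, else None."""
    for theta in candidates:
        worst = min(robustness_always_ge(simulate(theta, d0), d_safe) for d0 in scenarios)
        if worst > 0:
            return theta
    return None

theta_star = synthesize(candidates=[8.0, 10.0, 12.0, 14.0],
                        scenarios=[30.0, 29.0], d_safe=2.0)
print(theta_star)
```

Checking every candidate against several initial conditions mirrors the role of the falsifier: a parameter is accepted only if no tested scenario violates the requirement.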

| System | Adaptive Decision Point | Sample Pattern | Parameter Synthesized |
|---|---|---|---|
| ACC | Sensor trust switch | Mode copy + threshold | $\theta^* = 7.08$ |
| SMIB | Chattering mitigation | Dwell time guard | $\theta^* = 0.12$ |
| Missile Guidance | Sensor fusion weight | Linear combination | task-dependent |

4. Adaptive Decision Repair in Learning and Control Systems

Uncertainty-aware online repair is vital for autonomous robots and driving systems operating in dynamic environments. In "A Decision Tree-based Monitoring and Recovery Framework..." (2308.00944), adaptive repair is achieved through:

  • Interpretable Failure Detection: Decision trees trained on discrepancies between model-predictive control (MPC) predictions and observed states classify probable failures.
  • Uncertainty Quantification: Perturbation-based distances define the set of plausible failure hypotheses.
  • Safe Recovery Selection: Reachability analysis and Bayesian updates guide selection of corrective controllers that guarantee safety with respect to all plausible fault modes.
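
The three steps above can be sketched as follows. This is an illustrative stand-in rather than the framework from the cited paper: the decision tree is hand-written instead of trained, the plausible set is obtained by perturbing the residual by a fixed `eps`, and a lookup table of assumed-safe fault modes replaces reachability analysis. Bayesian updating is omitted, and all thresholds, controller names, and costs are hypothetical.

```python
from itertools import product

def classify(residual):
    """Hand-written stand-in for a trained tree over (position, velocity) residuals."""
    r_pos, r_vel = residual
    if abs(r_vel) > 0.5:
        return "actuator_fault"
    if abs(r_pos) > 0.3:
        return "sensor_fault"
    return "nominal"

def plausible_hypotheses(residual, eps):
    """Labels reachable when each residual coordinate shifts by -eps, 0, or +eps."""
    r_pos, r_vel = residual
    return {classify((r_pos + dp, r_vel + dv))
            for dp, dv in product((-eps, 0.0, eps), repeat=2)}

# Hypothetical recovery controllers: (cost, fault modes under which each is assumed safe).
CONTROLLERS = {
    "continue": (0, {"nominal"}),
    "slow_down": (1, {"nominal", "sensor_fault"}),
    "safe_stop": (5, {"nominal", "sensor_fault", "actuator_fault"}),
}

def select_recovery(hypotheses):
    """Cheapest controller that is safe for every plausible hypothesis."""
    safe = [(cost, name) for name, (cost, modes) in CONTROLLERS.items()
            if hypotheses <= modes]
    return min(safe)[1] if safe else None

hyps = plausible_hypotheses((0.35, 0.1), eps=0.1)
print(sorted(hyps), select_recovery(hyps))
```

Selecting against the whole plausible set, rather than the single most likely label, is what makes the recovery conservative when the residual sits near a decision boundary.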

For autonomous driving, "ADReFT: Adaptive Decision Repair for Safe Autonomous Driving via Reinforcement Fine-Tuning" (2506.23960) presents a transformer-based runtime module with two heads: a State Monitor (identifying safety-critical states) and a Decision Adapter (generating repair actions). ADReFT is pre-trained via weak supervision (using coarse violation-to-label mapping), then fine-tuned by reinforcement learning to generate contextually appropriate, minimally disruptive repairs. Empirically, ADReFT increases successful incident repair to 85% on Roach ADS and 76% on Pylot ADS, outperforming specification-based and anomaly-based baselines, with significantly lower unnecessary intervention rates.
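
The monitor-then-adapt control flow can be illustrated with a small runtime wrapper. This sketch only shows how two such components might be composed at inference time; it does not reproduce the transformer architecture or the training procedure. The `RepairModule` class, the time-to-collision rule used as a monitor, and the braking rule used as an adapter are hypothetical placeholders for the learned heads.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class RepairModule:
    monitor: Callable[[dict], float]        # state -> estimated risk in [0, 1]
    adapter: Callable[[dict, dict], dict]   # (state, planned action) -> repaired action
    risk_threshold: float = 0.5

    def step(self, state, planned):
        """Pass the planned action through unless the monitor flags the state."""
        if self.monitor(state) < self.risk_threshold:
            return planned, False
        return self.adapter(state, planned), True

def toy_monitor(state):
    # Placeholder rule: flag states with under 2 s time-to-collision.
    ttc = state["gap_m"] / max(state["closing_speed_mps"], 1e-6)
    return 1.0 if ttc < 2.0 else 0.0

def toy_adapter(state, planned):
    # Placeholder repair: cut throttle, brake firmly, keep steering unchanged.
    return {**planned, "throttle": 0.0, "brake": max(planned["brake"], 0.6)}

module = RepairModule(toy_monitor, toy_adapter)
planned = {"throttle": 0.4, "brake": 0.0, "steer": 0.1}
print(module.step({"gap_m": 6.0, "closing_speed_mps": 5.0}, planned))
```

Returning the planned action untouched in non-critical states is what keeps the unnecessary intervention rate low in a design of this shape.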

5. Fairness-Guided and Minimal Repair in Decision Systems

In data-driven decision pipelines, ADReFT also encompasses fairness-guided repair, as operationalized in FairRepair (2011.11001). Here, adaptive repair is cast as an SMT (Satisfiability Modulo Theories) optimization over decision tree outputs, with the objective of achieving group fairness:

$$\forall i, j: \quad c \cdot r_i \leq r_j \leq \frac{r_i}{c}$$

where $r_i$ is the positive classification rate of sensitive group $i$ and $c$ is a fairness threshold. The constraint keeps positive classification rates similar across sensitive groups, while the repair minimizes semantic difference from the original model. The repair process guarantees both soundness and completeness with respect to the fairness specification, scaling to random forests with tens of thousands of nodes and retaining high predictive accuracy after repair.
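
As a minimal sketch, the group fairness constraint above can be checked directly on a set of predictions. The code only evaluates the constraint; it does not perform the SMT-based repair. The function names and sample data are assumptions, and `c` is assumed to lie in (0, 1].

```python
from collections import defaultdict
from itertools import permutations

def positive_rates(groups, predictions):
    """Fraction of positive predictions within each sensitive group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for g, y in zip(groups, predictions):
        counts[g][0] += int(y == 1)
        counts[g][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}

def satisfies_fairness(rates, c):
    """Check c * r_i <= r_j <= r_i / c for every ordered pair of groups."""
    return all(c * rates[i] <= rates[j] <= rates[i] / c
               for i, j in permutations(rates, 2))

groups = ["a"] * 4 + ["b"] * 4
predictions = [1, 1, 0, 0, 1, 0, 0, 0]
rates = positive_rates(groups, predictions)
print(rates, satisfies_fairness(rates, 0.8), satisfies_fairness(rates, 0.5))
```

Here group `a` has a positive rate of 0.5 and group `b` a rate of 0.25, so the model violates the constraint at `c = 0.8` but satisfies it at the looser `c = 0.5`.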

Similarly, "Less is More: Adaptive Program Repair with Bug Localization and Preference Learning" (2503.06510) introduces adaptive program repair (AdaPR): generating code fixes that maximize test correctness with minimal code modifications. The AdaPatcher framework separates bug localization (via self-debug learning and dynamic trace analysis) from targeted patch synthesis (location-aware, hybrid, and preference-based learning), resulting in higher accuracy and greater preservation of original code structure compared to strong LLM baselines.

| Model/Method | Accuracy (%) | Consistency (%) | Failed Repair Rate (%) |
|---|---|---|---|
| AdaPatcher (CG) | 67.57 | 48.69 | 12.83 |
| GPT-4o (Instr.) | 62.77 | 27.40 | 34.65 |
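
The "fix the bug while changing as little as possible" objective can be illustrated with a small selection routine. This is not AdaPatcher: candidate patches are supplied by hand rather than generated, and `difflib.SequenceMatcher` similarity is an assumed stand-in for the paper's consistency measure. The example function and tests are hypothetical.

```python
from difflib import SequenceMatcher

def passes_tests(source, tests):
    """Execute candidate source and run each (function name, args, expected) check."""
    namespace = {}
    try:
        exec(source, namespace)
        return all(namespace[name](*args) == expected for name, args, expected in tests)
    except Exception:
        return False

def pick_minimal_patch(buggy, candidates, tests):
    """Among passing candidates, return the one most similar to the buggy source."""
    passing = [c for c in candidates if passes_tests(c, tests)]
    if not passing:
        return None
    return max(passing, key=lambda c: SequenceMatcher(None, buggy, c).ratio())

buggy = "def add(a, b):\n    return a - b\n"
candidates = [
    "def add(a, b):\n    return a + b\n",                              # one-character fix
    "def add(a, b):\n    total = a\n    total += b\n    return total\n",  # correct rewrite
    "def add(a, b):\n    return a * b\n",                              # still wrong
]
tests = [("add", (2, 3), 5), ("add", (0, 0), 0)]
print(pick_minimal_patch(buggy, candidates, tests))
```

Both of the first two candidates pass the tests, but the one-character fix is preferred because it preserves more of the original code.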

6. Collaborative, Equitable, and Agentic Decision Repair

ADReFT extends to collaborative decision-making under uncertainty, as exemplified by the agentic LLM framework for adaptive decision discourse (2502.10978). In this paradigm, multiple LLM-based agents, each embodying distinct stakeholder personas (e.g., mayor, scientist, advocate), engage in iterative dialogue to generate, challenge, and refine recommendations in scenarios such as disaster response to flooding. Adaptivity is realized through:

  • Dynamic Assembly: Agents summon additional expertise in response to new information or recognized gaps.
  • Breadth-First Exploration: The assembly evaluates diverse alternatives before converging, reducing risk of premature or narrow decisions.
  • Information-Theoretic Optimization: The assembly aims to maximize the sum of unique and synergistic knowledge among agents, minimizing redundancy.

The result is a decision repair process that is not only adaptive and context-aware but also reasons explicitly about equity, robustness, and stakeholder value trade-offs.
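
A rough sketch of dynamic assembly follows. It greedily summons the agent that contributes the most topics not yet covered, using set coverage as a crude proxy for "unique knowledge"; the cited framework's information-theoretic objective is not reproduced here. The agent personas, topic sets, and the `assemble` function are hypothetical.

```python
def assemble(agents, needed):
    """Greedily summon the agent that adds the most still-uncovered topics."""
    covered, team = set(), []
    remaining = dict(agents)
    while remaining and not needed <= covered:
        name = max(remaining, key=lambda n: len((remaining[n] & needed) - covered))
        gain = (remaining.pop(name) & needed) - covered
        if not gain:
            break  # nobody left contributes anything new
        covered |= gain
        team.append(name)
    return team, needed - covered

# Hypothetical personas for a flood-response scenario.
agents = {
    "mayor": {"budget", "evacuation"},
    "deputy_mayor": {"budget"},
    "scientist": {"hydrology", "forecast"},
    "advocate": {"equity", "evacuation"},
    "engineer": {"hydrology", "levees"},
}
needed = {"budget", "evacuation", "hydrology", "forecast", "equity", "levees"}
team, gaps = assemble(agents, needed)
print(team, gaps)
```

The redundant `deputy_mayor` is never summoned because everything that agent knows is already covered, which is the redundancy-minimizing behavior described in the bullets above.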

7. Significance and Future Prospects

ADReFT unifies a range of adaptive, evidence-informed, and context-sensitive repair methodologies across technical domains. Common properties include:

  • Integration of probabilistic inference, formal specification, model-based synthesis, and learning-based adaptation.
  • Localized, minimally invasive correction focused on safety, fairness, or correctness.
  • Iterative updating as new observations accrue, enabling continual improvement.
  • Soundness guarantees when built atop formal methods (e.g., contract-based repair, SMT-based fairness repair).

Ongoing and future research points to expansion in the following directions:

  • Broader violation types and richer action spaces (beyond collision avoidance or program repair).
  • Direct incorporation of uncertainty quantification and robustness to perception noise.
  • Application to large-scale, multi-agent, or socio-technical systems, using agentic and information-theoretic design principles.
  • Tighter integration with digital twin environments and online deployment for autonomous and human-in-the-loop systems.

The field continues to advance both in algorithmic sophistication and breadth of applicability, offering foundational techniques for resilient, trustworthy, and adaptive system maintenance in diverse real-world contexts.