
LTLCrit: Temporal Logic LLM Critic

Updated 25 October 2025
  • LTLCrit is a temporal logic-based critic for LLM planners that supervises high-level decision-making by enforcing formal safety and efficiency constraints.
  • It features a modular actor-critic architecture with an online loop for rapid action verification and an offline loop for inducing and refining logic-based constraints.
  • Empirical evaluations in Minecraft benchmarks show significant improvements, reducing unsafe actions from 23% to 4.5% and increasing task efficiency.

LTLCrit denotes a temporal logic-based LLM critic architecture in which LLM-planned trajectories are supervised and improved by logic-based critics for embodied, long-horizon decision-making tasks. LTLCrit provides formal guarantees of safety and efficiency, decoupling high-level planning from constraint refinement, and is designed for modular integration with any LLM-based planner (Gokhale et al., 4 Jul 2025).

1. Modular Actor-Critic Architecture

The LTLCrit framework is based on a hierarchical actor-critic paradigm, comprising distinct online actor and offline critic loops:

  • Online loop: The LLM actor receives a comprehensive natural language description of the current environment state, $\phi_\text{full}(s)$, and selects a high-level action from a fixed set. Actions are then checked against the current pool of linear temporal logic (LTL) constraints via a formal verification protocol. The verifier, instantiated as a Büchi automaton, examines whether the abstract state $\phi_\text{abstract}(s)$ together with the candidate action satisfy both safety and efficiency constraints. Invalid actions prompt replanning; valid actions are delegated to a low-level controller.
  • Offline loop: LTLCrit analyzes full observed trajectories to identify failures (unsafe episodes) or inefficient paths. It then induces new or refined LTL constraints which are injected back into the online system, updating the verifier's rule set for future action selection.

This modular separation allows rapid, reactive planning (online actor), while maintaining global property improvement and safety (offline critic).
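The online loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `online_step`, `toy_actor`, and `toy_verifier` are hypothetical, the verifier is a plain Python predicate standing in for the Büchi automaton check, and the replanning budget `max_replans` is an assumed parameter.

```python
from typing import Callable, FrozenSet, List, Optional

AbstractState = FrozenSet[str]                   # atomic propositions that currently hold
Verifier = Callable[[AbstractState, str], bool]  # stand-in for the automaton-based check
Actor = Callable[[str, List[str]], str]          # (full description, rejected actions) -> action


def online_step(
    full_description: str,
    abstract_state: AbstractState,
    actor: Actor,
    verifier: Verifier,
    max_replans: int = 3,  # assumed budget, not taken from the source
) -> Optional[str]:
    """Ask the actor for an action and replan until the verifier accepts it."""
    rejected: List[str] = []
    for _ in range(max_replans + 1):
        action = actor(full_description, rejected)
        if verifier(abstract_state, action):
            return action  # would be delegated to the low-level controller
        rejected.append(action)  # invalid action: prompt the actor to replan
    return None  # no valid action found within the budget


def toy_actor(description: str, rejected: List[str]) -> str:
    """Hypothetical actor with a fixed preference order over two actions."""
    for candidate in ["craft_wooden_pickaxe", "mine_stone"]:
        if candidate not in rejected:
            return candidate
    return "noop"


def toy_verifier(state: AbstractState, action: str) -> bool:
    """Hypothetical rule: do not craft a wooden pickaxe if one is already held."""
    return not ("agent_has_wooden_pickaxe" in state and action == "craft_wooden_pickaxe")


chosen = online_step(
    "You hold a wooden pickaxe.",
    frozenset({"agent_has_wooden_pickaxe"}),
    toy_actor,
    toy_verifier,
)
print(chosen)  # -> mine_stone
```

The actor sees the full description while the verifier sees only the abstract state, mirroring the split between $\phi_\text{full}(s)$ and $\phi_\text{abstract}(s)$.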

2. Temporal Logic Constraint Formulation

Communication and supervision between critic and actor occur via LTL constraints of the form:

$$G(\varphi_s \rightarrow X(\varphi_a))$$

where $G$ denotes the "globally" temporal operator, $X$ denotes the "next" operator, $\varphi_s$ is a Boolean condition over symbolic state features, and $\varphi_a$ is a Boolean condition over actions. Constraints can encode policy-level restrictions (e.g., resource nonduplication, subgoal ordering), representing hard safety rules or soft efficiency guidelines. For instance:

$$G(\text{agent\_has\_wooden\_pickaxe} \rightarrow X(\neg \text{action\_craft\_wooden\_pickaxe}))$$

prevents crafting duplicate wooden pickaxes by barring redundant tool production as soon as the agent possesses one.

All constraints are compiled into automata for machine-verification and, due to LTL's canonical structure, support human interpretability and editing.
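As a rough illustration of the constraint form, the sketch below evaluates a rule of this shape directly over a recorded trajectory instead of compiling it to an automaton. The class `NextActionConstraint` and its fields are hypothetical names, and the sketch assumes that the action chosen in a state occupies the "next" position relative to that state.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

State = FrozenSet[str]     # abstract state: the atomic propositions that hold
Step = Tuple[State, str]   # (abstract state, action chosen in that state)


@dataclass(frozen=True)
class NextActionConstraint:
    """Simplified stand-in for G(phi_s -> X(phi_a))."""
    name: str
    state_condition: Callable[[State], bool]   # phi_s over state features
    action_condition: Callable[[str], bool]    # phi_a over actions

    def holds_at(self, state: State, action: str) -> bool:
        # Implication: if phi_s holds in the state, the chosen action must satisfy phi_a.
        return (not self.state_condition(state)) or self.action_condition(action)

    def holds_globally(self, trajectory: List[Step]) -> bool:
        # "Globally": the implication must hold at every step of the trajectory.
        return all(self.holds_at(s, a) for s, a in trajectory)


no_duplicate_pickaxe = NextActionConstraint(
    name="no_duplicate_wooden_pickaxe",
    state_condition=lambda s: "agent_has_wooden_pickaxe" in s,
    action_condition=lambda a: a != "action_craft_wooden_pickaxe",
)

trajectory: List[Step] = [
    (frozenset(), "action_craft_wooden_pickaxe"),                             # allowed
    (frozenset({"agent_has_wooden_pickaxe"}), "action_craft_wooden_pickaxe"), # redundant
]

violations = [
    i for i, (s, a) in enumerate(trajectory)
    if not no_duplicate_pickaxe.holds_at(s, a)
]
print(violations)  # -> [1]
```

Only finite recorded traces are checked here; an automaton-based verifier is what allows the same rule to be enforced step by step at planning time.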

3. Safety and Efficiency Assurance

LTLCrit enforces two categories of constraints:

  • Safety (hard constraints): Hand-authored rules (e.g., requiring baseline equipment before engaging in risky actions) are injected by domain experts to prevent catastrophic failures. An example is: $G(\neg\text{obs\_iron\_pickaxe\_equipped} \rightarrow X(\neg\text{action\_mine\_diamond}))$, disallowing diamond mining without proper equipment.
  • Adaptive efficiency (soft constraints): The critic automatically analyzes trajectories for suboptimal behavior such as loops, redundant actions, or avoidable delays. Using graph traversal formalism (actions as edges, costs as unit steps), the critic induces new constraints to prune inefficient branches from the exploration tree. Over-constrained, deadlocked states are detected and resolved by relaxing the constraint set as needed.

Constraint induction is driven by failure feedback, environmental rewards, and detection of inefficient state transitions.
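One way to picture the efficiency critic is as a pass over a recorded trajectory that flags cycles in the state graph, followed by a check that no state has been left without any permitted action. The sketch below is a hand-written heuristic under strong simplifying assumptions, not the induction procedure used by LTLCrit: rules are keyed on exact abstract states, and `induce_loop_rules` and `relax_deadlocks` are hypothetical helper names.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

State = FrozenSet[str]
Step = Tuple[State, str]   # (abstract state, action taken in that state)
Rule = Tuple[State, str]   # (state, forbidden action): "in this state, do not take this action next"


def induce_loop_rules(trajectory: List[Step]) -> List[Rule]:
    """Forbid the action that started a cycle back to an already-visited state."""
    last_seen: Dict[State, int] = {}
    rules: List[Rule] = []
    for i, (state, _action) in enumerate(trajectory):
        if state in last_seen:
            # The action taken at the previous visit led back here: wasted unit-cost steps.
            rule = (state, trajectory[last_seen[state]][1])
            if rule not in rules:
                rules.append(rule)
        last_seen[state] = i
    return rules


def relax_deadlocks(rules: List[Rule], states: List[State], actions: Set[str]) -> List[Rule]:
    """Drop the newest rule for any state in which every action is forbidden."""
    kept = list(rules)
    for state in states:
        while actions and all((state, a) in kept for a in actions):
            for idx in range(len(kept) - 1, -1, -1):
                if kept[idx][0] == state:
                    del kept[idx]  # relax: remove the most recently added rule for this state
                    break
    return kept


s0: State = frozenset({"has_wood"})
s1: State = frozenset({"has_wood", "near_tree"})
trajectory: List[Step] = [(s0, "walk"), (s1, "walk_back"), (s0, "craft_planks")]

loop_rules = induce_loop_rules(trajectory)
print(loop_rules == [(s0, "walk")])  # -> True

over_constrained = [(s0, "walk"), (s0, "craft_planks")]
relaxed = relax_deadlocks(over_constrained, [s0], {"walk", "craft_planks"})
print(relaxed == [(s0, "walk")])  # -> True
```

Dropping the newest rule is only one possible relaxation policy; the source does not specify which constraints are relaxed when a deadlock is detected.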

4. Model-Agnostic Integration

LTLCrit is designed as a symbolic wrapper, agnostic to the underlying LLM actor. It has been demonstrated on planners such as SayCan and InnerMonologue, operating independently of their internal mechanics. The only requirement is that the actor expose symbolically tractable representations of state and action for verification. This modularity allows LTLCrit to generalize to various embodied agent architectures and planning domains.
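The integration requirement can be expressed as a small interface. The sketch below uses `typing.Protocol` to describe what a wrapped actor might need to expose; `SymbolicActor`, `KeywordActor`, and `first_safe_action` are hypothetical names, and the single hard-coded rule is the iron-pickaxe example from the safety section, used here in place of a full constraint set.

```python
from typing import FrozenSet, List, Protocol


class SymbolicActor(Protocol):
    """Assumed minimal interface: a symbolic state view plus action proposals."""

    def abstract_state(self, observation: str) -> FrozenSet[str]: ...

    def propose_action(self, observation: str, rejected: List[str]) -> str: ...


class KeywordActor:
    """Toy actor: derives propositions by keyword match, proposes in a fixed order."""

    ACTIONS = ["action_mine_diamond", "action_craft_iron_pickaxe"]

    def abstract_state(self, observation: str) -> FrozenSet[str]:
        props = set()
        if "iron pickaxe equipped" in observation:
            props.add("obs_iron_pickaxe_equipped")
        return frozenset(props)

    def propose_action(self, observation: str, rejected: List[str]) -> str:
        for action in self.ACTIONS:
            if action not in rejected:
                return action
        return "action_noop"


def first_safe_action(actor: SymbolicActor, observation: str, max_replans: int = 3) -> str:
    """Wrap any conforming actor; its internals are never inspected."""
    state = actor.abstract_state(observation)
    rejected: List[str] = []
    for _ in range(max_replans):
        action = actor.propose_action(observation, rejected)
        # Hard rule: no diamond mining without an equipped iron pickaxe.
        if action == "action_mine_diamond" and "obs_iron_pickaxe_equipped" not in state:
            rejected.append(action)
            continue
        return action
    return "action_noop"


print(first_safe_action(KeywordActor(), "You are in a cave."))  # -> action_craft_iron_pickaxe
```

Because the wrapper depends only on the two methods of the protocol, a different planner can be substituted without changing the checking code.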

5. Empirical Evaluation

Empirical assessment is conducted on the Minecraft diamond-mining benchmark. Key findings:

  • Task completion: Augmenting LLM planners with LTLCrit yields a 100% task completion rate, outperforming baselines wherein standard planners fail to reach the goal.
  • Efficiency: LTLCrit reduces the mean number of actions to reach key subgoals (e.g., average diamond-mining steps drop from approximately $45.5$ to $35.8$).
  • Safety: Introduction of logic-based supervision reduces failed (unsafe) actions from $23\%$ to $4.5\%$.

These results demonstrate that logic-guided constraint supervision substantially enhances both reliability and efficiency in long-horizon planning.
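For scale, the reported figures can be restated as relative reductions. The arithmetic below uses only the numbers quoted above and is a derived restatement, not an additional result from the paper:

$$\frac{45.5 - 35.8}{45.5} = \frac{9.7}{45.5} \approx 0.213, \qquad \frac{23 - 4.5}{23} = \frac{18.5}{23} \approx 0.804$$

That is, roughly 21% fewer steps to mine diamonds and roughly 80% fewer failed actions relative to the baseline.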

6. Broader Implications and Future Directions

LTLCrit provides a formal bridge between statistical LLM reasoning and symbolic control/verification, positioning LLMs for deployment in safety-critical systems such as robotics, autonomous vehicles, and healthcare. The interpretability and editability of constraints support regulatory requirements, operator supervision, and domain adaptation.

Future work is directed towards automating the discovery of atomic propositions (features for constraint construction) and expanding critic capabilities to multi-agent settings. Online versions of the critic loop—with continuous constraint refinement—promise tighter coupling of learning and supervision.

7. Summary Table: LTLCrit Components

| Component | Function | Communication Interface |
|---|---|---|
| LLM Actor | High-level action selection from natural language | Abstract symbolic state/action |
| LTLCrit Critic | Trajectory analysis and constraint induction | LTL constraint set |
| Verifier | Checks compliance of current action/state pair | Büchi automaton |

This architecture exemplifies how LLM reasoning can be systematically coupled with symbolic constraint-based guidance, leveraging the strengths of both paradigms for safe, robust, and efficient autonomous decision making.
