
EnCompass: Python Agent Framework

Updated 5 December 2025
  • EnCompass is a Python framework that formalizes agent programs via probabilistic angelic nondeterminism, separating core workflow from inference-time search strategies.
  • It compiles annotated Python functions into explicit search spaces, enabling integration with diverse algorithms like DFS, BFS, beam search, and MCTS.
  • The framework reduces code overhead and LLM calls, as evidenced by empirical gains in speed and scalability across code translation and hypothesis search tasks.

EnCompass is a Python framework for agent programming that operationalizes "probabilistic angelic nondeterminism" (PAN), enabling a principled separation of agent workflow logic from inference-time search strategies. By compiling annotated Python functions into explicit search spaces over execution paths, the framework lets researchers and practitioners develop, experiment with, and deploy LLM-based agents efficiently, combining straightforward workflow specification with flexible, modular integration of search algorithms (Li et al., 3 Dec 2025).

1. Foundations: Probabilistic Angelic Nondeterminism

Probabilistic angelic nondeterminism (PAN) formalizes the agent program as a nondeterministic, probabilistic process whose execution is partitioned at explicitly marked "branchpoints" defining the decision steps of the workflow. The program state is a pair of a location q ∈ Q = {branchpoint locations} ∪ {end-of-program} and a memory snapshot Γ (including locals and shared variables). Between marked points, transitions are deterministic except for LLM calls and other random oracles, which are modeled as one-step probabilistic transitions:

δ : (q, Γ) → Dist(Q × Γ)

A complete program run is a path τ = ((q₀, Γ₀) → (q₁, Γ₁) → … → (q*, Γ*)), with probability:

P(τ) = ∏ₖ δ((qₖ, Γₖ) → (qₖ₊₁, Γₖ₊₁))

Unlike demonic nondeterminism (worst-case branching), PAN enables "angelic" search: guided exploration over all possible execution paths to maximize a user-defined score S(Γ*), yielding the optimal trajectory

τ* = argmax_τ S(Γ*(τ))

The search process constructs and explores a tree over program states (q, Γ), with search algorithms selecting which leaf (checkpoint) to expand next. This approach presents a rigorous, modular foundation for agent search and evaluation.
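The path probability P(τ) and the angelic argmax can be made concrete on a toy state space. The transition table, scores, and state names below are illustrative stand-ins, not EnCompass API:

```python
# Toy transition function δ: maps a state to a list of
# (next_state, probability) pairs; terminal states map to [].
delta = {
    "q0": [("q1", 0.6), ("q2", 0.4)],
    "q1": [("end_a", 1.0)],
    "q2": [("end_b", 0.7), ("end_a", 0.3)],
}

# User-defined score S over terminal memory states.
score = {"end_a": 1.0, "end_b": 3.0}

def paths(state, prob=1.0, trail=()):
    """Enumerate every complete run τ together with its probability P(τ)."""
    trail = trail + (state,)
    if not delta.get(state):
        yield trail, prob
        return
    for nxt, p in delta[state]:
        yield from paths(nxt, prob * p, trail)

all_paths = list(paths("q0"))
# The run probabilities sum to 1 over all complete paths.
assert abs(sum(p for _, p in all_paths) - 1.0) < 1e-9

# Angelic search: select the run whose terminal state maximizes S.
best_trail, _ = max(all_paths, key=lambda tp: score[tp[0][-1]])
print(best_trail)  # ('q0', 'q2', 'end_b')
```

In practice the tree is never enumerated exhaustively; search algorithms expand checkpoints selectively, but the objective is the same argmax.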

2. Separation of Workflow Specification and Search Strategy

Traditional agent programming often entangles workflow logic—such as loops, accumulators, and scoring heuristics—with inference-time search (e.g., varying N in best-of-N sampling, implementing beam search). EnCompass, through PAN, enforces separation by providing:

  • Workflow layer: Authored as single-threaded code, with designated primitive calls marking unreliability (e.g., branchpoint) and evaluation (e.g., record_score).
  • Search layer: Realized as an external engine operating over the tree of checkpoints, parameterized independently of the workflow.

This separation avoids code refactoring when experimenting with alternative search algorithms; adding branchpoints or modifying search parameters suffices. Compilation via the EnCompass decorator transforms routines into explicit search spaces for exploration.

The following pseudocode excerpt illustrates the compilation process:

@encompass.compile
def agent(x):
    branchpoint()              # marks a decision step
    y = LLM(x)                 # unreliable (probabilistic) call
    record_score(evaluate(y))  # scores the current trajectory
    return y

# Sketch of the CPS-transformed form: each segment between
# checkpoints becomes a function taking the current frame
# (memory snapshot) and a continuation for resuming execution.
def cps_agent(frame, resume):
    frame['y'] = LLM(frame['x'])
    resume1 = lambda fr: resume(fr)
    return (frame, resume1)

3. Framework Implementation and API

The core of EnCompass is a Python decorator, @encompass.compile, which processes the abstract syntax tree (AST) of the agent function, applies continuation-passing style (CPS) transformation and tail-call optimization, then exposes enriched runtime objects supporting the search space API.
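The effect of a CPS transformation can be shown in miniature. This hand-written toy example illustrates the general idea of threading an explicit continuation between steps; it is not the compiler's actual output:

```python
# Direct style: two sequential steps fused into one function.
def agent_direct(x):
    y = x + 1      # step 1
    return y * 2   # step 2

# CPS: each step takes a continuation k representing "the rest of
# the program", so execution can be suspended and resumed at the
# boundary between steps -- the mechanism behind checkpoints.
def step1(x, k):
    return k(x + 1)

def step2(y, k):
    return k(y * 2)

def agent_cps(x, k):
    return step1(x, lambda y: step2(y, k))

# Both styles compute the same value; the identity continuation
# lambda z: z terminates the chain.
assert agent_direct(10) == agent_cps(10, lambda z: z) == 22
```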

After decoration, functions become SearchSpace objects supporting:

Method                 | Description                                          | Return type
start()                | Initializes the first program state (checkpoint)     | Checkpoint
search(algo, **kwargs) | Runs the specified search and returns the best value | best return value
search_multiple(...)   | Runs the search, returning (value, score) pairs      | list of tuples

Primitives within decorated functions include:

  • branchpoint(name=…), branchpoint_choose(choices)
  • record_score(v), record_score(evaluator, target, label=…)
  • record_costs(api_cost=…), early_stop_search(), optional_return(v)
  • protect(expr, Exception), searchover(fn(...)) (for nested searches)

Exemplary usage:

@encompass.compile
def my_agent(inp):
    branchpoint()
    out = LLM(inp)
    record_score(quality(out))
    return out

best = my_agent(x).search("dfs", default_branching=10)

For multi-level beam search:

@encompass.compile
def translate_file(repo):
    branchpoint(name="file_step")
    skeleton = llm_stub(repo)
    record_score(verify_stub(skeleton))
    for fn in skeleton.methods:
        branchpoint(name="method_step")
        code = llm_translate(fn)
        record_score(verify_translation(code))
        skeleton.add(code)
    return skeleton

result = translate_file(r).search("beam", beam_width=2, default_branching=3)

4. Supported Search Algorithms and Execution Strategies

All strategies within EnCompass operate by repeatedly invoking step() on chosen Checkpoint objects. Natively supported algorithms include depth-first search (DFS), breadth-first search (BFS), best-first search, beam search, and Monte Carlo tree search (MCTS).

Two novel variants are introduced:

  • Re-expand Best-First: Permits re-expansion of nodes to incorporate updated scores.
  • Explorative Re-expand Best-First: Augments node selection with an upper-confidence-bound bonus:

U(τ) = Q(τ) + c·√(ln N / n(τ))

where Q(τ) is the node's score estimate, N the total number of node expansions, and n(τ) the expansion count for node τ.
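The bonus is a direct transcription of the UCB formula; the exploration constant c below is an illustrative default, not a value from the paper:

```python
import math

def ucb(q, n_node, n_total, c=1.4):
    """U(τ) = Q(τ) + c·√(ln N / n(τ)); c trades exploration for exploitation."""
    return q + c * math.sqrt(math.log(n_total) / n_node)

# A rarely expanded node receives a larger bonus than a heavily
# expanded one with the same score estimate.
assert ucb(0.5, n_node=1, n_total=100) > ucb(0.5, n_node=50, n_total=100)
```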

Beam search is illustrated below:

initialize beam = [initial_checkpoint]
repeat until cost budget:
    candidates = []
    for ckpt in beam:
        for _ in range(branching_factor):
            candidates.append(ckpt.step())
    sort candidates by .score descending
    beam = candidates[:beam_width]
return beam[0].return_value
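The pseudocode above can be rendered as runnable Python over a toy checkpoint type. The Checkpoint class, its random score increments, and the fixed depth standing in for a cost budget are all illustrative, not EnCompass objects:

```python
import random

class Checkpoint:
    """Toy stand-in for a search-tree node: step() expands it into a
    child whose score is the parent's plus a random increment."""
    def __init__(self, score=0.0, depth=0):
        self.score = score
        self.depth = depth

    def step(self):
        return Checkpoint(self.score + random.random(), self.depth + 1)

    @property
    def return_value(self):
        return self.score

def beam_search(initial, beam_width=2, branching_factor=3, max_depth=4):
    beam = [initial]
    for _ in range(max_depth):  # stands in for "until cost budget"
        candidates = []
        for ckpt in beam:
            for _ in range(branching_factor):
                candidates.append(ckpt.step())
        candidates.sort(key=lambda c: c.score, reverse=True)
        beam = candidates[:beam_width]
    return beam[0].return_value

random.seed(0)
print(beam_search(Checkpoint()))  # best accumulated score found
```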

This catalogue of strategies enables granular cost-performance tuning and broad experimental flexibility without altering core workflow logic.

5. Empirical Evaluation and Case Studies

Performance and reliability gains are demonstrated in three agent domains:

5.1 Code-Repository Translation (Syzygy-style agent)

  • Five branchpoints added to a ∼600 LOC base agent enabled rapid experimentation across global BoN, local BoN, and hierarchical beam search.
  • Beam search (file=2, method=3) attains near-perfect self-validation as cost increases, outperforming simpler strategies.
  • Across additional assignments (ps1–ps4, 5,756 LOC), coarse+fine beam search consistently outperforms pure sampling at equal cost.
  • Code modifications required: ∼400 LOC in plain Python versus ∼80 with EnCompass (a 5× reduction).

5.2 Hypothesis Search (ARC-AGI)

  • Baseline: a two-step agent yields 4.3% accuracy with GPT-3.5.
  • One branchpoint + global BoN (N = 8): 11.7%.
  • Two branchpoints + parallel BFS (branch=8): 15%.
  • Matches or surpasses contemporary ADAS meta-search at comparable cost.
  • Code overhead: +21 LOC plain, +8 LOC with EnCompass.

5.3 Reflexion (Iterative Refinement)

  • Foundation: Reflexion on LeetCodeHard, ∼35% pass rate for n = 5 loops.
  • Enhancement: Branchpoints at initiation and per-loop, reexpand-BeFS strategy.
  • Result: ∼36% pass rate with reduced LLM cost; superior scaling over naive loop increments.
  • Code overhead: +27 LOC plain, +9 LOC EnCompass.

6. Performance Scaling Laws and Practical Implications

Empirical studies consistently indicate a "log-linear" relationship between performance and inference cost:

Performance ≈ a + b·log(Cost)

Structured search methods (beam, MCTS, reexpand-BeFS) yield improved scaling coefficients relative to simple random sampling or best-of-N selection. For example, in code translation tasks (ps0), the slope for hierarchical beam strategies is significantly greater (p < 0.03) than for basic search. On LeetCodeHard with Reflexion, reexpand-BeFS matches the top performance of unconstrained loop-based refinement at a 30–40% lower LLM budget.
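The functional form can be fit to (cost, performance) measurements by ordinary least squares in log-cost. The data below are synthetic and generated exactly from the model, purely to demonstrate the fit; the coefficients are not values from the paper:

```python
import math

def fit_log_linear(costs, perfs):
    """Least-squares fit of performance ≈ a + b·log(cost); returns (a, b)."""
    xs = [math.log(c) for c in costs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(perfs) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, perfs))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Synthetic measurements drawn from perf = 0.1 + 0.05·log(cost).
costs = [1, 2, 4, 8, 16, 32]
perfs = [0.1 + 0.05 * math.log(c) for c in costs]
a, b = fit_log_linear(costs, perfs)
assert abs(a - 0.1) < 1e-9 and abs(b - 0.05) < 1e-9
```

Comparing the fitted slope b across strategies is what underlies slope-significance claims of this kind.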

Summary of empirical gains:

  • 3–6× reduction in code overhead when implementing or switching strategies.
  • Over 2× decrease in LLM calls required to achieve target performance.
  • Explicit modular separation: branchpoints and .search(...) parameterization isolate search configuration from workflow.

This suggests that EnCompass offers a rigorous, scalable, and low-overhead pathway for LLM-based agent development, particularly where reliability, rapid prototyping, and systematic search experimentation are prioritized (Li et al., 3 Dec 2025).
