
Executing LLM Systems

Updated 3 November 2025
  • Executing LLMs are large language models that invoke actions, computations, or tool calls within unified, stateful workflows.
  • They integrate structured data management with code execution and artifact versioning, ensuring auditability and reproducibility.
  • Systems like TableVault apply ACID transactions and concurrency controls to manage hybrid AI-data workflows efficiently.

An Executing LLM is an LLM system (or agent) whose outputs are not limited to text but are directly invoked as actions, computations, or tool calls—often within structured or composable workflows where data, code, and LLM-generated content are interleaved, versioned, and managed as part of broader, stateful processes. Executing LLMs form the computational core of LLM-augmented data management, agentic automation, and hybrid AI-data system architectures, unifying traditional database management principles with LLM-driven code and artifact execution (Zhao et al., 23 Jun 2025). The progression from passive LLM inference to executing LLMs marks a fundamental shift in both architectural requirements and the interaction between human intent, automated reasoning, and concrete execution in real-world software systems.

1. System Architecture and Core Abstractions

An executing LLM system, exemplified by TableVault (Zhao et al., 23 Jun 2025), is architected to provide unified management of structured data, LLM-driven code execution, and workflow artifacts. The principal abstractions are:

  • Tables: Each logical output or process is stored as a directory ("table"), with multiple versioned instances tracking each complete state or snapshot.
  • Instances: A specific version of a table, parameterized by input, time, and configuration.
  • Builders: Declarative YAML specifications that encode the recipe for producing columns/rows via code or LLM APIs. These serve as both provenance record and executable instruction.
  • Artifacts: Externally generated or associated files (documents, images) attached at the table, column, or row level.

All persistent state—data, artifacts, builder specifications, metadata, and logs—is maintained as a file hierarchy, designed for auditability and direct human inspection.

Metadata is centralized: dataframes, artifacts, dependencies, and operational logs are all registered in a dedicated metadata directory. This enables lineage tracking, explicit dependency recording, and consistent locking (exclusive/shared, hierarchical) for cross-process safety and recovery (Zhao et al., 23 Jun 2025).
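The exclusive/shared, hierarchical locking mentioned above can be sketched as a minimal readers-writer lock in Python. This is an illustrative sketch, not TableVault's actual implementation; the class and the hierarchical usage pattern are assumptions:

```python
import threading

class SharedExclusiveLock:
    """Minimal readers-writer lock: many shared holders or one exclusive holder."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

# Hierarchical use: take the table lock shared, then the instance lock
# exclusive, so a write to one instance never blocks readers of siblings.
table_lock, instance_lock = SharedExclusiveLock(), SharedExclusiveLock()
table_lock.acquire_shared()
instance_lock.acquire_exclusive()
# ... mutate the instance ...
instance_lock.release_exclusive()
table_lock.release_shared()
```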

2. Transactionality and Concurrent Execution

Executing LLM workloads present highly concurrent, state-modifying task graphs, requiring rigorous guarantees for consistency and durability:

  • Concurrency Model: All core operations (instance generation, data processing, artifact generation) execute as background threads, supporting concurrent runs even in single-threaded kernels such as Jupyter Notebooks.
  • Locking and Isolation: Hierarchical locks are acquired on tables and instances, ensuring that multi-writer or overlapping workflows cannot yield conflicting or corrupt state.
  • Transactional Safety (ACID):
    • Atomicity: Operations complete in full, or the system is rolled back to a previous consistent state.
    • Consistency: Updates to metadata and to data/outputs are interlocked, preserving invariants.
    • Isolation: In-flight operations are protected by locks, blocking interference.
    • Durability: Write-ahead logging ensures that no logical state transition is made visible before the corresponding disk writes are committed.

Write Operation Algorithm:

def execute_write_op(operation):
    # Record intent in the write-ahead log before any state changes.
    log_id = begin_log(operation)
    acquire_locks(operation.targets)
    try:
        persist_metadata(log_id)               # durable pre-image / intent
        output = perform_operation(operation)  # the actual data/artifact write
        commit_metadata(log_id, output)        # make the new state visible
    except Exception:
        # Undo via the log, leaving the last consistent state intact.
        rollback_to_safe_state(log_id)
        raise
    finally:
        release_locks(operation.targets)

This transactional flow is robust to interruption/pausing and is directly inspired by log-based recovery protocols such as ARIES.

3. Reproducibility and Provenance

TableVault's executing LLM model enforces reproducibility through a combination of:

  • Builder Specifications: Each table instance is derived from an explicit (versioned) builder YAML that encodes:
    • Code or LLM APIs employed
    • Parameters and inputs (including references to other tables/instances via TableString)
    • Execution settings (threads, datatypes, savepoints)
  • Deterministic Workflow Snapshots: The tuple (builder specification, referenced data, artifact lineage) encapsulates the full provenance. Given the same configuration, workflows are re-executable, modulo LLM non-determinism.
  • Artifact Versioning: All supporting files are kept with explicit linkage to their producing data and builder state, ensuring complete reconstructability.

All transformations, user actions, and parameters are centrally logged—a full audit trail underpinning reproducibility guarantees (Zhao et al., 23 Jun 2025).
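As a rough illustration of how a builder specification can serve as a provenance record, a spec could be modeled as a plain dictionary and hashed canonically so that identical configurations map to the same provenance identifier. All field names, the model setting, and the fingerprint scheme below are hypothetical, not TableVault's actual YAML schema:

```python
import hashlib
import json

# Illustrative builder spec (field names are assumptions, not TableVault's schema).
builder_spec = {
    "table": "summaries",
    "builder_type": "code",
    "inputs": ["documents::base::text"],        # TableString references
    "llm": {"model": "gpt-4", "temperature": 0.0},
    "execution": {"threads": 4, "dtype": "string"},
}

def spec_fingerprint(spec: dict) -> str:
    """Deterministically hash a builder spec so that identical configurations
    yield identical provenance identifiers, regardless of key order."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]
```

Because the serialization is canonical, re-running a workflow from the same spec reproduces the same fingerprint, which is what makes the (builder specification, referenced data, artifact lineage) tuple usable as a reproducibility key.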

4. Data Versioning and Selective Materialization

Version control is enforced at multiple levels:

  • Instance-level versioning: Each table instance is timestamped and uniquely identified.
  • Per-column versioning: Individual columns may be independently regenerated, enabling partial retargeting of workflows and cache-efficient recomputation.
  • Data dependency tracking: Explicit reference syntax (TableString) encodes dependencies; thus both forward and backward lineage (what data was used, and what was derived) are fully navigable.
  • Retention and Re-generation: Old data instances may be pruned for storage efficiency, retaining sufficient logs and builder configs to enable on-demand re-materialization.
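The versioning levels above can be sketched with a timestamped instance identifier plus a bidirectional lineage graph. The identifier format and class names are illustrative assumptions, not TableVault's actual conventions:

```python
import time
import uuid
from collections import defaultdict

def new_instance_id(table: str) -> str:
    """Timestamped, unique instance identifier (naming scheme is illustrative)."""
    return f"{table}_{time.strftime('%Y%m%dT%H%M%S')}_{uuid.uuid4().hex[:8]}"

class LineageGraph:
    """Tracks both directions of lineage: what an instance was derived from
    (backward) and what was derived from it (forward)."""
    def __init__(self):
        self.downstream = defaultdict(set)  # instance -> instances derived from it
        self.upstream = defaultdict(set)    # instance -> instances it reads from

    def record(self, source: str, derived: str):
        self.downstream[source].add(derived)
        self.upstream[derived].add(source)
```

With both maps populated, pruning an old instance is safe as long as its upstream spec and logs are retained, since the derivation path needed for re-materialization stays navigable.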

5. Compositional Data Pipelines via TableString

The TableString abstraction is central to composing complex, multi-stage LLM workflows. It encodes references to:

<TABLE>::<INSTANCE>::<COLUMN>::<ROW_FILTER>

This allows:

  • Reduce/Aggregation: Summarize columns or rows across an instance.
  • One-to-one referencing: Direct alignment of rows for per-record dataflow (e.g., translating, summarizing).
  • Windowed/Convolutional access: Sliding or indexed ranges over prior outputs as in time-series or sequential modeling.
  • Filtered joins: Row selection via predicates, supporting conditional logic or class-based routing.

Builder specs may declaratively compose TableStrings as inputs to LLM function calls, code execution, or artifact generation, chaining arbitrary workflows with explicit dependency traceability.
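The reference syntax is simple enough to parse directly. The sketch below treats trailing components as optional, which is an interpretation rather than a documented rule:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TableString:
    table: str
    instance: Optional[str] = None
    column: Optional[str] = None
    row_filter: Optional[str] = None

def parse_table_string(ref: str) -> TableString:
    """Parse a <TABLE>::<INSTANCE>::<COLUMN>::<ROW_FILTER> reference.
    Trailing components may be omitted (an assumption of this sketch)."""
    parts = ref.split("::")
    if not 1 <= len(parts) <= 4 or not parts[0]:
        raise ValueError(f"malformed TableString: {ref!r}")
    return TableString(*parts)
```

For example, `parse_table_string("docs::v1::text")` resolves a specific column of a specific instance, while a bare `parse_table_string("docs")` would reference the table as a whole.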

Builder Types:

  • Code Builder: Executes programmatic logic over input rows/columns, optionally orchestrating LLM API invocations.
  • Generator Builder: Produces new records or rows, synthesizing data or ingesting new artifacts.
  • Artifact Columns: Associates file blobs of arbitrary type directly within the tabular model.

The integration of TableString enables declarative, reproducible, and trackable construction of complex dataflows with hybrid LLM and code steps, as in multi-step prompt chaining or transductive artifact pipelines.

6. Workflow Optimization and Execution Efficiency

TableVault employs several optimization techniques for executing LLM-augmented workflows:

  • Dependency-aware recomputation: Any regeneration operation computes diffs with prior builder specs and data, re-materializing only rows/columns that are new or impacted by modified dependencies (fine-grained cache).
  • Partial Materialization: Builders are resolved in topological order according to dependency graphs; only minimal necessary computation is performed.
  • Multithreaded execution: Explicit threading parameters allow horizontal acceleration for row- or column-wise data generation, including concurrent LLM calls.
  • Safe concurrent execution: All workflow invocations are protected transactionally, enabling distributed and collaborative development across data and code boundaries without version drift or loss.
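Dependency-aware recomputation amounts to diffing builder specs and propagating staleness through the dependency graph until a fixed point. A simplified sketch, with function and parameter names as assumptions:

```python
def stale_columns(old_specs: dict, new_specs: dict, deps: dict) -> set:
    """Return the columns that must be re-materialized: those whose builder
    spec changed, plus everything downstream of them. `deps` maps a column
    to the columns it reads from. Unchanged columns are served from cache."""
    changed = {c for c, spec in new_specs.items() if old_specs.get(c) != spec}
    stale = set(changed)
    # Propagate staleness to dependents until nothing new is added.
    grew = True
    while grew:
        grew = False
        for col, inputs in deps.items():
            if col not in stale and stale & set(inputs):
                stale.add(col)
                grew = True
    return stale
```

Everything outside the returned set keeps its prior materialization, which is the fine-grained cache behavior described above.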

7. Integration of Database Principles with LLM-Driven Requirements

TableVault formally unifies traditional database guarantees (transactionality, versioning, isolation) with the needs of LLM-augmented data workflows:

Database Principle       | TableVault/LLM Workflow Analog
-------------------------|---------------------------------------------------------------
ACID Transactions        | All LLM- or code-driven steps are atomic, isolated, and durable
Snapshot Isolation       | Versioned builder and data instances, re-materializable
Views/Dataflows          | Declarative builder specs, TableString references, provenance
Log-based Recovery       | Central operation log, rollbacks, write-ahead logging
Human/Programmatic Audit | File-based storage, explicit provenance for governance and reproducibility

This approach supports both traditional structured data and evolving unstructured/artifact workflows, including human-in-the-loop audit.


TableVault thus exemplifies the executing LLM paradigm: a robust, composable, and transparent data/artifact management plane that elevates LLM invocations to first-class, transactional, and provenance-tracked computational entities. Its design is immediately applicable to collaborative, multi-stage data engineering, human-AI curation, and any setting requiring scalable orchestration of LLM-based computation and artifact workflows (Zhao et al., 23 Jun 2025).
