- The paper presents a unified notebook interface and a multi-stage finite-state transducer for adaptive data science automation.
- It details flexible DFS-like planning, incremental execution, and self-debugging strategies to overcome code execution failures.
- Experimental results demonstrate robust performance and cost efficiency on benchmarks like DSBench, InfiAgent-DABench, and MatplotBench.
Overview of "DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation"
The paper proposes DatawiseAgent, a framework for automating data science tasks with large language models (LLMs). Its main contributions are a unified interaction representation built on computational notebooks and a multi-stage architecture based on finite-state transducers, which together support adaptive, robust task execution across diverse data scenarios.
Unified Interaction Representation
DatawiseAgent leverages computational notebooks as the central interface for performing data science tasks. This unified interaction representation expresses all agent interactions as sequences of markdown and executable code cells. This design choice aims to mimic human data scientists' workflows, facilitating long-horizon planning and progressive solution development.
Figure 1: DatawiseAgent performs diverse data science tasks across various models by operating entirely within a computational notebook.
The notebook-centric approach integrates environment details, tool descriptions, and user instructions into a coherent format, enabling the seamless execution of tasks with rich feedback and interaction.
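To make the cell-sequence idea concrete, here is a minimal Python sketch of how such a representation could be modeled. The names `Cell`, `Notebook`, and `render` are hypothetical and chosen for illustration; they are not taken from the paper's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Literal, Optional


@dataclass
class Cell:
    """One notebook cell: markdown text or executable code (hypothetical type)."""
    kind: Literal["markdown", "code"]
    source: str
    output: Optional[str] = None  # execution feedback, used for code cells only


@dataclass
class Notebook:
    """Ordered cell sequence that serves as the agent's entire context."""
    cells: List[Cell] = field(default_factory=list)

    def add_markdown(self, text: str) -> None:
        self.cells.append(Cell("markdown", text))

    def add_code(self, source: str, output: Optional[str] = None) -> None:
        self.cells.append(Cell("code", source, output))

    def render(self) -> str:
        """Flatten the notebook into text that an LLM prompt could consume."""
        parts = []
        for cell in self.cells:
            parts.append(f"[{cell.kind}]\n{cell.source}")
            if cell.output is not None:
                parts.append(f"[output]\n{cell.output}")
        return "\n\n".join(parts)


# User instructions, agent code, and execution feedback share one format.
nb = Notebook()
nb.add_markdown("## Task\nReport the mean of column `x`.")
nb.add_code("print(sum([1, 2, 3]) / 3)", output="2.0")
print(nb.render())
```

Keeping instructions, code, and outputs in a single ordered sequence is what would let every later stage read and extend the same context.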
FST-Based Multi-Stage Architecture
The framework introduces a multi-stage architecture modeled as a non-deterministic finite-state transducer (NFST) to govern the agent's behavior. This modular architecture facilitates transitions across four functional stages: DFS-like planning, incremental execution, self-debugging, and post-filtering.
Figure 2: State transition diagram of the FST-based multi-stage architecture.
The NFST enables adaptive exploration and robust recovery from execution failures, guiding the agent through complex tasks in a structured manner. Because the stages are modular, the architecture can be extended with new stages, and individual components can be ablated to measure their contribution.
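The following Python sketch shows one way a non-deterministic finite-state transducer over the four stages could be encoded. The transition table, the signal names (`start`, `ok`, `error`), and the emitted action names are assumptions made for illustration; they do not reproduce the paper's actual transition diagram. The random choice stands in for the decision an LLM would make.

```python
import random
from typing import Dict, List, Tuple

# (state, input signal) -> admissible (next state, emitted action) pairs.
# Illustrative table only; NOT the transition diagram from the paper.
TRANSITIONS: Dict[Tuple[str, str], List[Tuple[str, str]]] = {
    ("planning", "start"): [("execution", "write_plan")],
    ("execution", "ok"): [("execution", "write_code"), ("planning", "revise_plan")],
    ("execution", "error"): [("debugging", "repair_code")],
    ("debugging", "ok"): [("filtering", "summarize_fix")],
    ("debugging", "error"): [("debugging", "repair_code"),
                             ("filtering", "summarize_failure")],
    ("filtering", "ok"): [("execution", "write_code"), ("planning", "revise_plan")],
}


def step(state: str, signal: str, rng: random.Random) -> Tuple[str, str]:
    """Pick one admissible transition; the choice models non-determinism."""
    return rng.choice(TRANSITIONS[(state, signal)])


def run(signals: List[str], seed: int = 0) -> Tuple[str, List[str]]:
    """Feed a sequence of signals through the transducer, collecting outputs."""
    rng = random.Random(seed)
    state, actions = "planning", []
    for signal in signals:
        state, action = step(state, signal, rng)
        actions.append(action)
    return state, actions


final_state, actions = run(["start", "error", "ok", "ok"])
print(final_state, actions)
```

A table-driven encoding like this makes ablation straightforward in principle: removing a stage amounts to deleting its rows and redirecting the transitions that point to it.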
Stage Details
DFS-Like Planning and Incremental Execution
These stages allow flexible exploration and progressive task completion by dynamically selecting actions based on task progress and feedback. The agent constructs tree-structured task trajectories through non-linear planning and executes these incrementally using markdown and code cells.
Figure 3: Illustration of DatawiseAgent’s task-completion process through DFS-like planning and incremental execution.
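A tree-structured trajectory with backtracking can be sketched as below. `TaskTree`, `advance`, and `backtrack` are hypothetical names, and the action set is an assumption; in the actual agent, the choice between going deeper and backtracking would be made by the LLM from task progress and feedback.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One step in the task tree (hypothetical structure)."""
    label: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


class TaskTree:
    def __init__(self, goal: str) -> None:
        self.root = Node(goal)
        self.current = self.root

    def advance(self, label: str) -> None:
        """Go deeper: attach a new step under the current node and move to it."""
        child = Node(label, parent=self.current)
        self.current.children.append(child)
        self.current = child

    def backtrack(self, label: str) -> None:
        """Abandon the current step and try a sibling alternative instead."""
        if self.current.parent is None:
            raise ValueError("cannot backtrack from the root")
        self.current = self.current.parent
        self.advance(label)

    def active_path(self) -> List[str]:
        """Steps from the root to the current node, i.e. the live solution."""
        path, node = [], self.current
        while node is not None:
            path.append(node.label)
            node = node.parent
        return path[::-1]


tree = TaskTree("predict churn")
tree.advance("load data")
tree.advance("fit linear model")
tree.backtrack("fit gradient boosting")  # the linear model branch is abandoned
tree.advance("evaluate")
print(tree.active_path())
```

The abandoned branch stays in the tree, so the full exploration history is preserved while only the active path contributes to the final solution.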
Code Repair via Self-Debugging and Post-Filtering
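This module handles recovery from code execution failures. It pairs self-debugging, in which the agent diagnoses and repairs failing code, with post-filtering, which condenses the debugging episode into a concise diagnostic report. The report keeps erroneous intermediate attempts from accumulating in the context and guides later decisions.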
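As a rough illustration of this repair loop, the Python sketch below retries a failing cell and then keeps only the final code plus a short report. The function names, the `repair` callback (a stand-in for an LLM call), and the report wording are all assumptions rather than the paper's implementation. Note that `exec` runs arbitrary code, so a real system would need a sandboxed kernel.

```python
from typing import Callable, Dict, List, Tuple


def run_cell(source: str) -> Tuple[bool, str]:
    """Execute code in a fresh namespace; return (success, short error text)."""
    try:
        exec(source, {})  # unsafe outside a sandbox; for illustration only
        return True, ""
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"


def debug_and_filter(
    source: str,
    repair: Callable[[str, str], str],  # hypothetical stand-in for the LLM
    max_repairs: int = 3,
) -> Dict[str, object]:
    """Self-debug up to max_repairs times, then post-filter the episode."""
    errors: List[str] = []
    ok, err = run_cell(source)
    while not ok and len(errors) < max_repairs:
        errors.append(err)
        source = repair(source, err)
        ok, err = run_cell(source)
    if not ok:
        errors.append(err)

    # Post-filtering: drop the failed attempts, keep a concise report.
    if ok and not errors:
        report = "Ran without errors."
    elif ok:
        report = f"Fixed after {len(errors)} failed attempt(s); last error: {errors[-1]}"
    else:
        report = f"Unresolved after {len(errors)} attempt(s); last error: {errors[-1]}"
    return {"code": source, "ok": ok, "report": report}


broken = "values = [1, 2, 3]\ntotal = sum(valeus)"
result = debug_and_filter(broken, lambda src, err: src.replace("valeus", "values"))
print(result["report"])
```

Only `result["code"]` and `result["report"]` would be written back to the notebook, so the failed intermediate cells never reach later prompts.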
Experimental Results
The paper evaluates DatawiseAgent on benchmarks including DSBench, InfiAgent-DABench, and MatplotBench, demonstrating consistent state-of-the-art performance across tasks and models.
Figure 4: Performance comparison showing inference time across various tasks.
Figure 5: Performance across Qwen2.5 models showcasing robust results.
DatawiseAgent remains adaptable and robust across models of varying capacity, achieving high success rates and relative performance gap scores at competitive cost.
Conclusion
DatawiseAgent provides a novel framework for robust data science automation, enabling adaptive planning and recovery strategies through a notebook-centric, multi-stage architecture. This design offers scalability and robustness across diverse LLM configurations and data scenarios, setting a strong baseline for future adaptive data science agent frameworks.
The implementation highlights the potential for these frameworks to automate complex workflows efficiently, catering to resource-constrained environments and enhancing real-world applicability. Future work could explore tool integrations and human-in-the-loop collaboration in broader domain contexts.