MATATA: Weakly Supervised End-to-End MAthematical Tool-Augmented Reasoning for Tabular Applications (2411.18915v4)

Published 28 Nov 2024 in cs.LG and cs.CL

Abstract: Business documents often contain substantial tabular and textual information with numerical values, requiring mathematical reasoning for effective document understanding. While Small Language Models (SLMs) still struggle at this task, tool-augmented multi-step agents perform better, at the cost of relying on closed-source or larger models, external data, or extensive prompt-engineering. This work introduces MATATA, a novel weakly supervised end-to-end approach to train multi-step reasoning language agents for document tabular applications. MATATA presents an annotation-free paradigm for each agent to enhance 3.8B/8B SLMs. During its two-stage training, MATATA uses the final outcome of the multi-step reasoning chain as weak supervision. This approach avoids having to individually supervise each intermediate agent in the reasoning chain. By employing an adaptive planner and shared tools across different datasets, MATATA shows robust performance. Experiments demonstrate that MATATA achieves state-of-the-art on FinQA, and on TAT-QA among reasoning methods based on open-source SLMs. Although being SLM-based, MATATA closely matches GPT-4-based frameworks on TabMWP. This novel weakly supervised approach enables training an end-to-end multi-step reasoning agent without intermediate supervision, supporting future developments of cost-effective powerful agentic systems.

Summary

  • The paper presents MATATA, a framework enhancing small language models (SLMs) for mathematical reasoning on tabular data using weak supervision and tool assistance.
  • MATATA combines tool use for problem decomposition with weak supervision through iterative refinement, using SFT followed by KTO alignment.
  • MATATA demonstrates competitive performance on tabular data math problems using SLMs, offering a privacy-preserving and efficient solution for sensitive applications.

Overview of "MATATA: a Weak-Supervised MAthematical Tool-Assisted reasoning for Tabular Applications"

The paper presents a new methodology, termed MATATA, targeting the enhancement of mathematical reasoning capabilities in language models applied to tabular data. Recognizing the increasing potency of language models augmented by external tools, the authors propose a novel, cost-effective approach leveraging Small Language Models (SLMs) with an emphasis on data privacy, a crucial consideration in sensitive business environments.

Contributions and Methodology

The authors introduce the MATATA framework centered on two key mechanisms—tool utilization and weak supervision—to enable mathematical reasoning. This approach avoids the pitfalls of heavily relying on large, closed-source models such as GPT-4, thereby sidestepping privacy issues and significant computational overheads.

  1. Tool-Augmented Framework: MATATA employs a planner that uses predefined, reusable tools to decompose complex problems into simpler subtasks. Each subtask is then handled by a specific fine-tuned small model, an approach that maintains robust performance while improving scalability (a minimal sketch of this decomposition follows this list).
  2. Weakly Supervised Learning: MATATA follows a self-improvement paradigm with progressive, iterative fine-tuning stages. Few-shot prompts first establish a baseline; the reasoning trajectories generated by these baseline models then replace the prompts, reducing input size and improving inference speed (a sketch of the weak-supervision loop also follows this list).
  3. Alignment Process: The framework implements a two-stage training process, Supervised Fine-Tuning (SFT) followed by Kahneman-Tversky Optimization (KTO) for model alignment, a method suited to leveraging weak supervision through binary correctness signals without relying on multiple samplings or additional models.
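
To make the planner-and-tools idea concrete, the following is a minimal sketch of how a multi-step plan over shared tools might decompose a tabular question into lookup and calculation subtasks. The tool names, call format, and hand-written plan are illustrative assumptions made for this summary, not the paper's interface; in MATATA the planner and tools are themselves fine-tuned SLMs, and the plan is generated rather than hard-coded.

```python
# Hypothetical sketch of planner-plus-tools decomposition in the spirit of
# MATATA. Tool names, call format, and the hard-coded plan are illustrative
# assumptions; in MATATA the planner and tools are backed by fine-tuned SLMs
# and shared across datasets.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ToolCall:
    tool: str       # which shared tool to invoke
    argument: str   # argument passed to that tool


def lookup_tool(argument: str, context: Dict[str, str]) -> str:
    """Toy table/text extractor: fetches a named value from the document context."""
    return context.get(argument, "NOT_FOUND")


def calculator_tool(argument: str, context: Dict[str, str]) -> str:
    """Toy arithmetic tool: evaluates a restricted numeric expression."""
    allowed = set("0123456789.+-*/() ")
    if not set(argument) <= allowed:
        raise ValueError(f"unsupported expression: {argument!r}")
    return str(eval(argument))  # acceptable only because input is restricted above


TOOLS: Dict[str, Callable[[str, Dict[str, str]], str]] = {
    "lookup": lookup_tool,
    "calculate": calculator_tool,
}


def run_plan(plan: List[ToolCall], context: Dict[str, str]) -> str:
    """Execute a multi-step plan; intermediate results are written back so
    later steps (in a real system, the planner) can reference them."""
    result = ""
    for i, step in enumerate(plan):
        result = TOOLS[step.tool](step.argument, context)
        context[f"step_{i}"] = result
    return result


# Example question: "What is the change in revenue between 2022 and 2023?"
context = {"revenue_2022": "120", "revenue_2023": "150"}
plan = [
    ToolCall("lookup", "revenue_2023"),   # subtask 1: extract 2023 revenue
    ToolCall("lookup", "revenue_2022"),   # subtask 2: extract 2022 revenue
    ToolCall("calculate", "150 - 120"),   # subtask 3: compute the difference
]
print(run_plan(plan, context))  # -> "30"
```

Because the same small tool set is shared across datasets, the framework can reuse tools rather than training task-specific components for each benchmark.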

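The weak-supervision and alignment steps (items 2 and 3 above) can likewise be sketched as a simple data-construction loop, assuming each sampled reasoning chain is labelled solely by whether its final answer matches the gold answer. The helper names below (sample_trajectory, build_training_sets) are hypothetical stand-ins, not the paper's API; the point is that no intermediate step is annotated, only the outcome of the full chain.

```python
# Hypothetical sketch of weak-supervision data construction: only the final
# answer of each multi-step trajectory is compared with the gold label, and
# that single binary signal labels the whole chain. Helper names are
# illustrative stand-ins, not the paper's API.

from typing import Any, Callable, Dict, List, Tuple


def is_correct(prediction: str, gold: str, tol: float = 1e-4) -> bool:
    """Weak supervision signal: check only the final answer of the chain."""
    try:
        return abs(float(prediction) - float(gold)) <= tol
    except ValueError:
        return prediction.strip().lower() == gold.strip().lower()


def build_training_sets(
    dataset: List[Dict[str, Any]],
    sample_trajectory: Callable[[str], Tuple[str, str]],  # question -> (steps, answer)
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Label whole reasoning chains from their final outcome only:
    correct chains feed stage-1 SFT, all chains feed stage-2 KTO."""
    sft_data: List[Dict[str, Any]] = []
    kto_data: List[Dict[str, Any]] = []
    for example in dataset:
        steps, answer = sample_trajectory(example["question"])
        correct = is_correct(answer, example["gold_answer"])
        record = {"prompt": example["question"], "completion": steps}
        if correct:
            sft_data.append(record)                    # SFT uses correct chains only
        kto_data.append({**record, "label": correct})  # KTO uses the binary signal
    return sft_data, kto_data
```

Correct chains then serve as SFT targets, and the binary labels provide the desirability signal for KTO, matching the two-stage training described above.
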
Experimental Validation

The paper provides a comparative analysis on datasets such as FinQA, TAT-QA, and TabMWP, illustrating MATATA's competitive performance against other models, including frameworks that employ extensive prompt engineering or rely on larger models. MATATA-8B, for instance, surpasses certain fine-tuned models and approaches the performance of TAT-LLM-70B despite using roughly an order of magnitude fewer parameters.

Further, the results underscore MATATA's scalability through the sharing of tools across datasets, suggesting potential for improved breadth and depth of model capability when trained on diverse data.

Implications and Future Directions

The MATATA framework is presented as a promising avenue for developing high-performance, privacy-centric mathematical reasoning systems. Its ability to operate efficiently with SLMs and minimal manual prompt engineering effort makes it attractive for business applications where data sensitivity is paramount. The research posits that further developments could involve the scaling of SLMs to incorporate broader datasets and tasks, thereby enhancing reasoning capabilities further while maintaining the approach's cost-effectiveness.

In conclusion, this paper aligns itself with the broader trajectory in AI research seeking to optimize computational resources and ensure data privacy without compromising on performance, providing a viable path forward for real-world applications involving complex data interpretation and reasoning.
