Enhancing Structured Multi-Agent Reasoning through Quality-Guided Distillation
The paper "Less is More: Enhancing Structured Multi-Agent Reasoning via Quality-Guided Distillation" addresses the challenge of structured reasoning in LLMs under low-resource conditions. Participating in the XLLM@ACL2025 Shared Task-III, the authors propose a solution that uses only 24 labeled examples to generate high-quality, interpretable reasoning processes for multi-agent systems. The approach rests on a multi-agent framework incorporating reverse-prompt induction, retrieval-augmented reasoning synthesis, and dual-stage reward-guided filtering.
Methodology Overview
The proposed method operates on a modular framework intended for structured multi-agent reasoning, comprising several key stages:
- Reverse-Prompt Induction: Starting from the small set of labeled examples, this component reasons backward from outputs to inputs to induce task-specific instructions. The induced prompts supply the foundational guidance for extracting high-quality structured reasoning signals while preserving logical coherence.
- Retrieval-Augmented Reasoning Synthesis: GPT-4o is used to generate contextually grounded annotations at scale. For each unlabeled instance, an embedding-based retrieval step surfaces the most similar labeled examples, which anchor the synthesis of the structured annotations needed for in-depth reasoning.
- Dual-Stage Filtering: This stage combines structural pruning with reward-based selection to ensure semantic fidelity. A reward model scores each reasoning trace for fidelity and coherence, and only high-scoring traces are retained for model fine-tuning.
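As a concrete illustration of the first stage, reverse-prompt induction can be sketched as building a meta-prompt that shows the model labeled input/output pairs and asks it to recover the instruction that produced them. The prompt wording and the `input`/`output` field names below are illustrative assumptions, not taken from the paper:

```python
def build_induction_prompt(examples):
    """Format labeled examples into a reverse-induction meta-prompt.

    Reverse thinking: instead of applying an instruction to inputs, we show
    the model input/output pairs and ask it to infer the instruction that
    maps one to the other. The wording here is illustrative only.
    """
    shots = "\n\n".join(
        f"Input: {ex['input']}\nStructured output: {ex['output']}"
        for ex in examples
    )
    return (
        "Below are input/output pairs from a structured reasoning task.\n"
        "Infer the task instruction that maps each input to its output.\n\n"
        f"{shots}\n\nInduced instruction:"
    )

# The resulting string would be sent to the LLM, which replies with the
# induced task instruction used to guide later annotation synthesis.
prompt = build_induction_prompt(
    [{"input": "It rained overnight.", "output": "statement: it rained"}]
)
```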
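The retrieval step can likewise be sketched as embedding each instance and ranking labeled examples by cosine similarity. The paper's actual embedding model is not specified here, so a simple bag-of-words vector stands in for it; the example pool is invented for illustration:

```python
import numpy as np

def build_vocab(texts):
    # Vocabulary over all tokens; a stand-in for a real embedding model.
    tokens = sorted({tok for text in texts for tok in text.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}

def embed(text, vocab):
    # Bag-of-words vector, L2-normalized so a dot product is cosine similarity.
    vec = np.zeros(len(vocab))
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def retrieve_similar(query, labeled_pool, k=3):
    """Return the k labeled examples most similar to the unlabeled query."""
    vocab = build_vocab([query] + [ex["text"] for ex in labeled_pool])
    q = embed(query, vocab)
    ranked = sorted(
        labeled_pool,
        key=lambda ex: float(q @ embed(ex["text"], vocab)),
        reverse=True,
    )
    return ranked[:k]

labeled_pool = [
    {"text": "if it rains the ground is wet", "annotation": "..."},
    {"text": "all birds have feathers", "annotation": "..."},
    {"text": "wet ground implies rain fell recently", "annotation": "..."},
]
# The retrieved neighbors would anchor GPT-4o's annotation synthesis.
neighbors = retrieve_similar("the ground is wet after rain", labeled_pool, k=2)
```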
Experimental Results
The proposed system was evaluated against several data filtering strategies and showed notable improvements on the structured reasoning task, measured by standard metrics such as Question F1, Statement F1, Evidence F1, and Reasoning F1. Notably, reward filtering using an averaged score outperformed the structural filtering baseline, significantly improving the semantic precision of the fine-tuned models. This indicates that quality-centric data distillation offers a feasible path to performance gains, especially under low-resource conditions.
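The dual-stage filtering that produced these gains can be sketched as a structural validity check followed by an averaged-reward threshold. The required field names (mirroring the task's four F1 metrics), the score dimensions, and the threshold below are illustrative assumptions, not values from the paper:

```python
from statistics import mean

# Fields mirror the task's Question/Statement/Evidence/Reasoning F1 metrics.
REQUIRED_FIELDS = ("question", "statement", "evidence", "reasoning")

def structurally_valid(trace):
    # Stage 1: structural pruning — drop traces missing any required span.
    return all(trace.get(field) for field in REQUIRED_FIELDS)

def filter_traces(traces, reward_model, threshold=0.7):
    # Stage 2: reward-based selection — keep traces whose averaged
    # per-dimension reward clears the threshold (threshold is illustrative).
    kept = []
    for trace in traces:
        if not structurally_valid(trace):
            continue
        scores = reward_model(trace)  # e.g. {"fidelity": 0.9, "coherence": 0.8}
        if mean(scores.values()) >= threshold:
            kept.append(trace)
    return kept

# Stub reward model standing in for the paper's learned scorer.
def stub_reward(trace):
    if trace["question"] == "q1":
        return {"fidelity": 0.9, "coherence": 0.9}
    return {"fidelity": 0.5, "coherence": 0.4}

traces = [
    {"question": "q1", "statement": "s1", "evidence": "e1", "reasoning": "r1"},
    {"question": "q2", "statement": "", "evidence": "e2", "reasoning": "r2"},
    {"question": "q3", "statement": "s3", "evidence": "e3", "reasoning": "r3"},
]
kept = filter_traces(traces, stub_reward)
```

Only the first trace survives: the second fails the structural check and the third falls below the averaged-reward threshold.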
Theoretical and Practical Implications
The research presents crucial implications for the theoretical development and practical deployment of LLMs in real-world settings. By prioritizing the quality of training data over sheer quantity, the paper suggests that substantive performance gains in understanding complex logical scenarios can be achieved even with minimal supervision. This methodology could be especially beneficial in areas requiring precise reasoning, such as legal analysis, scientific research, and strategic decision-making.
Furthermore, the insights derived from this paper contribute to the understanding that modular, controllable distillation processes enable scalable reasoning capabilities, potentially impacting future advancements in AI research by providing a framework that can adapt to different domains with constrained resources.
Future Directions
Building upon these findings, future research could extend quality-guided distillation to further refine multi-agent systems, enabling richer interaction among reasoning agents. Additionally, integrating the approach with other emerging AI techniques that improve the interpretability and efficiency of agents could push the boundaries of structured reasoning.
Lastly, the robustness of quality-guided distillation could be tested across a broader range of logic-intensive tasks beyond the ACL shared-task setting, verifying its effectiveness on varied inputs. Such work would make structured reasoning systems more adaptable and reliable in diverse scenarios, reinforcing their utility for intricate inferential challenges.