AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models (2505.00147v2)

Published 30 Apr 2025 in cs.CL

Abstract: In-context learning (ICL) allows a language model to improve its problem-solving capability when provided with suitable information in context. Since the choice of in-context information can be determined based on the problem itself, in-context learning is analogous to human learning from teachers in a classroom. Recent works (Didolkar et al., 2024a; 2024b) show that ICL performance can be improved by leveraging a frontier large language model's (LLM) ability to predict required skills to solve a problem, popularly referred to as an LLM's metacognition, and using the recommended skills to construct necessary in-context examples. While this skill-based strategy boosts ICL performance in larger models, its gains on small language models (SLMs) have been minimal, highlighting a performance gap in ICL capabilities. We investigate this gap and show that skill-based prompting can hurt SLM performance on easy questions by introducing unnecessary information, akin to cognitive overload. To address this, we introduce AdaptMI, an adaptive approach to selecting skill-based in-context Math Instructions for SLMs. Inspired by cognitive load theory from human pedagogy, our method only introduces skill-based examples when the model performs poorly. We further propose AdaptMI+, which adds examples targeted to the specific skills missing from the model's responses. On 5-shot evaluations across popular math benchmarks and five SLMs (1B–7B; Qwen, Llama), AdaptMI+ improves accuracy by up to 6% over naive skill-based strategies.

Summary

  • The paper proposes AdaptMI and AdaptMI+, frameworks that dynamically select skill-based examples based on question difficulty.
  • It employs a two-stage approach with difficulty classification and adaptive example selection to enhance math performance in small language models.
  • Experimental results demonstrate up to a 6% improvement in accuracy for smaller models, with iterative refinement further boosting performance.

"AdaptMI: Adaptive Skill-based In-context Math Instruction for Small LLMs" (2505.00147)

Introduction

This paper addresses the challenge of improving in-context learning (ICL) for small language models (SLMs) through adaptive skill-based instruction. While larger LLMs exhibit robust ICL capabilities, SLMs face significant difficulties, especially when the selection of in-context examples is misaligned with question difficulty. The authors propose AdaptMI, a framework inspired by cognitive load theory, which aims to enhance ICL performance by dynamically selecting skill-based in-context examples only for difficult questions.

Problem Statement and Motivation

The disparity in ICL performance between large and small language models is well documented. SLMs often fail to capitalize on skill-based in-context examples, which can induce cognitive overload on simple tasks. The authors identify that skill-based prompting can degrade performance on easy questions by introducing unnecessary complexity. This observation motivates AdaptMI, which adapts the example-selection strategy to task difficulty.

AdaptMI Framework

Stage 1: Difficulty Classification

AdaptMI begins by classifying questions as either easy or difficult using a reward model that evaluates the SLM's responses. This classification does not rely on ground-truth labels, making it adaptable to different datasets and settings. Concretely, a process reward model scores the model's step-by-step responses, and questions whose scores fall below a predefined threshold are labeled difficult; a minimal sketch follows.
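The sketch below illustrates this thresholding logic and is not the authors' implementation: the per-step scores are assumed to come from a process reward model run over the SLM's solution, and both the 0.85 cutoff and the min-score aggregation are illustrative assumptions.

```python
# Minimal sketch of Stage 1 (difficulty classification). The step scores
# would come from a process reward model (PRM) scoring each step of the
# SLM's solution; the 0.85 cutoff and min-score aggregation are
# illustrative assumptions, not the paper's exact settings.
from typing import List

def classify_difficulty(step_scores: List[float], threshold: float = 0.85) -> str:
    """Label a question 'difficult' if any reasoning step scores below the
    threshold (i.e., the PRM suspects an error), and 'easy' otherwise."""
    return "difficult" if min(step_scores) < threshold else "easy"

# A response whose third step looks shaky gets routed to Stage 2.
print(classify_difficulty([0.97, 0.91, 0.62]))  # -> difficult
```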

Stage 2: Adaptive Example Selection

Once questions are classified, AdaptMI applies skill-based example selection exclusively to difficult questions, while using a fixed set of examples for easy ones. This stage involves two strategies (a minimal sketch follows the list):

  • AdaptMI: Uses skill-based examples for difficult questions identified in Stage 1.
  • AdaptMI+: Further refines the approach by incorporating specific skill examples that address missing skills identified in the model's responses to difficult questions.
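The following sketch shows one way this selection logic could be wired up, under stated assumptions: `EXAMPLE_POOL`, `FIXED_EXAMPLES`, and `detect_missing_skills` are hypothetical stand-ins for the paper's skill-tagged example pool and skill-gap detection, not the authors' API.

```python
# Minimal sketch of Stage 2 (adaptive example selection). EXAMPLE_POOL,
# FIXED_EXAMPLES, and detect_missing_skills are hypothetical stand-ins.
from typing import List, Optional

EXAMPLE_POOL = {  # skill tag -> worked examples exercising that skill
    "algebraic_manipulation": ["Q: Solve 2x + 3 = 11. A: x = 4."],
    "modular_arithmetic": ["Q: Compute 7^2 mod 5. A: 4."],
}
FIXED_EXAMPLES = [f"Q: ... A: ... (static shot {i})" for i in range(5)]

def detect_missing_skills(response: str) -> List[str]:
    """Stand-in for the AdaptMI+ step that flags skills the SLM's
    response failed to apply."""
    return ["modular_arithmetic"]  # placeholder output

def select_examples(difficulty: str, k: int = 5, plus: bool = False,
                    response: Optional[str] = None,
                    question_skills: Optional[List[str]] = None) -> List[str]:
    if difficulty == "easy":
        return FIXED_EXAMPLES[:k]  # fixed shots avoid overload on easy questions
    if plus and response is not None:
        skills = detect_missing_skills(response)  # AdaptMI+: target skill gaps
    else:
        skills = question_skills or []            # AdaptMI: generic skill match
    targeted = [ex for s in skills for ex in EXAMPLE_POOL.get(s, [])]
    return (targeted + FIXED_EXAMPLES)[:k]        # pad with fixed shots up to k
```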

Experimental Results

AdaptMI and AdaptMI+ were evaluated on several math-focused benchmarks using five SLMs from the Qwen and Llama families, ranging from 1B to 7B parameters. Key findings include:

  • AdaptMI+ improved ICL performance by up to 6% compared to naive skill-based strategies.
  • Improvements were more pronounced in smaller models, highlighting the approach's effectiveness where the gap in ICL capability is greatest.
  • Iterative application of AdaptMI+ (repeating skill-based example selection and reward-model evaluation) further enhanced accuracy, indicating its utility in progressively steering the model toward previously unsolved problems; a sketch of this loop follows Figure 1.

    Figure 1: Accuracy and average output length of Qwen2.5-3B-Instruct on questions of Level 1–5 defined in the MATH dataset.
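Building on the two sketches above, the iterative loop might look as follows; `slm_answer` and `prm_step_scores` are hypothetical stand-ins for querying the SLM and the process reward model, and the round count is an arbitrary choice.

```python
# Rough sketch of iterative AdaptMI+, reusing classify_difficulty and
# select_examples from the earlier sketches. slm_answer and
# prm_step_scores are hypothetical stand-ins.
from typing import List

def slm_answer(question: str, shots: List[str]) -> str:
    """Stand-in: prompt the SLM with the given in-context examples."""
    return "step 1 ... step n ... final answer"  # dummy response

def prm_step_scores(question: str, response: str) -> List[float]:
    """Stand-in: per-step correctness scores from the process reward model."""
    return [0.9, 0.9, 0.9]  # dummy scores

def iterative_adaptmi_plus(question: str, rounds: int = 2) -> str:
    response = slm_answer(question, FIXED_EXAMPLES)  # initial fixed 5-shot pass
    for _ in range(rounds):
        scores = prm_step_scores(question, response)
        if classify_difficulty(scores) == "easy":
            break                                    # accepted as solved
        shots = select_examples("difficult", plus=True, response=response)
        response = slm_answer(question, shots)       # retry with targeted shots
    return response
```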

Discussion

Why Adaptive Selection Works

The paper provides a detailed analysis showing that skill-based examples harm SLM performance on easy questions due to overthinking and extraneous cognitive load. On difficult problems, however, these examples improve performance because the added demonstrations match the skills the task actually demands.

Trade-offs and Considerations

  • Computational Efficiency: Reward model filtering introduces additional computational steps, emphasizing the need for efficient implementation.
  • Generalizability: The process reward model's ability to operate effectively across various datasets underscores AdaptMI's adaptability.

    Figure 2: AdaptMI and AdaptMI+ are 2-stage adaptive in-context example selection methods.

Conclusion

AdaptMI and its enhanced version, AdaptMI+, represent a meaningful advance in adaptive in-context instruction for SLMs. By tailoring in-context learning strategies to question difficulty, these frameworks mitigate cognitive overload and exploit metacognitive capabilities, improving performance on complex tasks. Future work may extend these methodologies to train better SLMs through adaptive instruction informed by frontier LLM outputs.
