Dynamic Cheatsheet (DC) Framework

Updated 13 October 2025
  • Dynamic Cheatsheet is a lightweight framework that equips language models with persistent, adaptive memory for test-time learning by curating problem-solving strategies.
  • The framework uses a dual-module design—Generator and Curator—and a retrieval-and-synthesis variant (DC-RS) that efficiently updates its external memory during inference.
  • Empirical results show that DC significantly boosts accuracy on arithmetic, reasoning, and code tasks by enabling cumulative learning without modifying model parameters.

Dynamic Cheatsheet (DC) is a lightweight framework that equips black-box LLMs with persistent, adaptive memory to achieve test-time learning—enabling models to incrementally store, curate, and apply distilled problem-solving strategies, code snippets, and heuristics across sequential inference queries. Unlike static prompting or parameter finetuning, DC supplies external, self-curated memory that evolves during inference, allowing models to recall and reuse high-impact insights and systematically improve performance over time, even without explicit labels or human supervision. This approach bridges the gap between isolated inference and cumulative, experience-driven learning characteristic of human cognition.

1. Architectural Composition

Dynamic Cheatsheet operates as a modular extension on top of standard LMs, introducing two principal components plus an optional retrieval-synthesis extension:

Generator Module (Gen):

  • Receives the current query $x_i$ and the curated memory $M_i$.
  • Produces candidate output:

$\tilde{y}_i = \text{Gen}(x_i, M_i)$

  • Integrates LM reasoning and stored prior insights.

Curator Module (Cur):

  • Evaluates each output $\tilde{y}_i$ post-inference.
  • Updates memory non-parametrically:

$M_{i+1} = \text{Cur}(M_i, x_i, \tilde{y}_i)$

  • Curation emphasizes correctness, generality, and brevity; only essential strategies, heuristics, and executable snippets are retained.

Retrieval Synthesis (DC-RS):

  • Augments the memory update by pre-selecting the top-$k$ most similar historical input–output pairs.
  • DC-RS feeds the retrieved pairs into the memory update, further biasing the Generator toward relevant historical solutions (a minimal retrieval sketch follows below).
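
A minimal sketch of the retrieval step. The hashed bag-of-words `embed` below is a toy stand-in for a real embedding model, and `retrieve_top_k` is a hypothetical helper name, not the paper's implementation:

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy stand-in for a real embedding model: hashed bag-of-words."""
    v = np.zeros(dim)
    for token in text.lower().split():
        v[hash(token) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def retrieve_top_k(query: str, history: list[tuple[str, str]], k: int = 3):
    """Return the k past (input, output) pairs most similar to the query."""
    q = embed(query)
    return sorted(history, key=lambda pair: float(q @ embed(pair[0])), reverse=True)[:k]
```

Any off-the-shelf sentence embedder could replace `embed`; the ranking logic is unchanged.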

This architecture is external to the LM core; it does not modify model parameters and operates solely at inference.
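
Putting the two modules together, the inference-time loop can be pictured as below. The `llm` callable and both prompt templates are illustrative assumptions, not the paper's exact prompts:

```python
def dc_inference_loop(queries, llm, memory=""):
    """Sequential test-time learning: generate with memory, then curate it.

    `llm(prompt) -> str` is any black-box completion callable.
    """
    answers = []
    for x in queries:
        # Generator: y_i = Gen(x_i, M_i)
        y = llm(f"Cheatsheet:\n{memory}\n\nProblem:\n{x}\n\nSolve step by step.")
        answers.append(y)
        # Curator: M_{i+1} = Cur(M_i, x_i, y_i) -- rewrite the cheatsheet so it
        # keeps only correct, general, concise strategies and code snippets.
        memory = llm(
            "Revise this cheatsheet using the solution below. Keep reusable "
            "strategies, heuristics, and code snippets; prune anything "
            f"redundant or wrong.\n\nCheatsheet:\n{memory}\n\n"
            f"Problem:\n{x}\n\nSolution:\n{y}"
        )
    return answers, memory
```

Note that the Curator rewrites the whole cheatsheet rather than appending transcripts, which is what keeps the memory compact across queries.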

2. Persistent, Evolving Memory

The hallmark of DC is a self-curated, persistent memory $M_i$ that accumulates problem-solving knowledge throughout a session:

  • Memory is maintained outside the LM's weights; the model itself remains a black box.
  • After each answer, Cur extracts transferable solution details—code routines, algebraic strategies, reference guides—pruning irrelevant or erroneous information.
  • DC avoids context bloat from naive transcript appending; the curated memory consists of concise, implementation-ready artifacts for rapid reuse.
  • Memory is dynamic: as better strategies emerge or errors are detected, content is updated; failed heuristics are discarded, correcting the LM’s behavior in subsequent inference.

This adaptive curation mechanism is analogous to human note-taking—selecting only generalizable, high-yield insights for retention.
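
The paper's cheatsheet is plain text rewritten by the Curator, but the retention policy can be illustrated with a structured store; all names and thresholds below (e.g., `max_entries`, the success/failure counters) are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryEntry:
    insight: str        # distilled strategy, heuristic, or code snippet
    successes: int = 0  # times the entry contributed to a validated answer
    failures: int = 0   # times it led the Generator astray

@dataclass
class Cheatsheet:
    entries: list[MemoryEntry] = field(default_factory=list)
    max_entries: int = 20  # keeps the prompt far smaller than a full transcript

    def prune(self) -> None:
        """Drop failed heuristics; keep the highest-yield insights."""
        kept = [e for e in self.entries if e.successes >= e.failures]
        kept.sort(key=lambda e: e.successes, reverse=True)
        self.entries = kept[: self.max_entries]

    def render(self) -> str:
        """Serialize for inclusion in the Generator's prompt."""
        return "\n".join(f"- {e.insight}" for e in self.entries)
```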

3. Quantitative Impact on Performance

Dynamic Cheatsheet demonstrably yields substantial accuracy improvements across diverse tasks:

Task                      Baseline Accuracy    Accuracy with DC
Game of 24 (GPT-4o)       10%                  99%
AIME 2024 (Claude 3.5)    23.3%                50%
Arithmetic Balancer       45–50%               98–100%
GPQA-Diamond              —                    +9% (absolute gain)
MMLU-Pro                  —                    +8% (absolute gain)
  • In Game of 24, DC enabled GPT-4o to discover and store a brute-force Python solution; reusing this code eliminated manual errors, raising accuracy from 10% to 99% (a solver of this kind is sketched after this list).
  • On AIME math exams, Claude retained algebraic insights and templates, more than doubling its baseline accuracy.
  • In error-prone numerical tasks (Equation Balancer), both models approached perfect accuracy by recalling validated computational snippets.
  • For knowledge-demanding tasks, performance gains were attributed to cumulative retention of reference tables and facts.
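
For concreteness, a brute-force Game of 24 solver of the kind a model might store looks roughly like this (an illustrative reconstruction, not the snippet GPT-4o actually produced):

```python
def solve24(nums):
    """Brute-force Game of 24: combine values pairwise with +, -, *, / until
    one value remains; return an expression that evaluates to 24, else None."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b if abs(b) > 1e-9 else None,
    }

    def search(vals, exprs):
        if len(vals) == 1:
            return exprs[0] if abs(vals[0] - 24) < 1e-6 else None
        # Try every ordered pair of remaining values with every operator.
        for i in range(len(vals)):
            for j in range(len(vals)):
                if i == j:
                    continue
                rest_v = [vals[k] for k in range(len(vals)) if k not in (i, j)]
                rest_e = [exprs[k] for k in range(len(exprs)) if k not in (i, j)]
                for sym, op in ops.items():
                    r = op(vals[i], vals[j])
                    if r is None:
                        continue
                    hit = search(rest_v + [r], rest_e + [f"({exprs[i]} {sym} {exprs[j]})"])
                    if hit:
                        return hit
        return None

    return search([float(n) for n in nums], [str(n) for n in nums])

print(solve24([4, 7, 8, 8]))  # e.g. ((7 - (8 / 8)) * 4)
```

Once such a routine sits in memory, the Generator can invoke it verbatim instead of re-deriving arithmetic by hand, which is where the error elimination comes from.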

4. Comparative Analysis to Baselines

DC’s persistent memory distinguishes it from prevalent alternatives:

  • Static Prompting: Baseline approaches concatenate predefined instructions; performance plateaus because there is no cumulative adaptation.
  • Full Transcript History: Carries all prior context into each query, flooding the context window with irrelevant or redundant detail and impairing both focus and efficiency.
  • DC-$\emptyset$ (Structured, Non-Evolving): Uses fixed, general prompts; lacks learning-by-retention, and performance remains comparable to static baselines.
  • Dynamic Retrieval (DR): Retrieves historical answers but does not curate them; it yields smaller accuracy gains than DC.

DC surpasses these by systematic retention and curation—fostering informed, error-corrected responses without internal model modification.

5. Application Domains and Use Cases

Dynamic Cheatsheet is broadly applicable across knowledge-intensive and error-prone domains:

  • Mathematical Reasoning: Models store algebraic and combinatorial strategies, reusing them across exam-style problems (e.g., AIME, MMLU-Pro).
  • Heuristic Puzzles/Coding: For problems reliant on algorithmic logic (Game of 24), DC propagates and reuses validated scripts.
  • Arithmetic/Equation Tasks: Recalling code routines prevents repeated calculation mistakes.
  • Domain Knowledge (Engineering, Physics): Retaining tables, formulas, and reference results boosts performance on general knowledge exams (GPQA).

In each use case, DC bridges isolated inferences, constructing a session-specific “cheatsheet” of distilled, actionable knowledge.

6. Self-Curation Mechanism and Error Correction

DC’s Curator module adopts a strategy of continual refinement:

  • Extracts correct and reusable solution fragments.
  • Prunes irrelevant, erroneous, or overly specific content.
  • Curation occurs without ground-truth labels—internal model validation or heuristic checks are used for correctness.
  • If a newly generated solution is found to supersede prior content (e.g., more general, less error-prone), memory is updated; if mistakes are detected, faulty heuristics are removed.
  • This active self-curation prevents error propagation and lets the model adapt and improve incrementally per task (a minimal validation sketch follows below).
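
One plausible shape for a label-free correctness check is executing a stored code snippet against self-generated test cases. The `solve` naming convention and the unsandboxed `exec` below are illustrative assumptions, not the paper's mechanism:

```python
def validate_snippet(snippet: str, test_cases) -> bool:
    """Retain a code heuristic only if it passes self-generated checks.

    `snippet` must define a function named `solve`; `test_cases` is a list of
    (args, expected) pairs the model proposed for its own routine.
    (Illustrative only -- a real deployment would sandbox this exec call.)
    """
    scope: dict = {}
    try:
        exec(snippet, scope)  # load the candidate routine
        solve = scope["solve"]
        return all(solve(*args) == want for args, want in test_cases)
    except Exception:
        return False  # faulty heuristics are discarded, not retained

candidate = "def solve(a, b): return a + b"
print(validate_snippet(candidate, [((2, 3), 5), ((10, -4), 6)]))  # True
```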

7. Implications and Future Directions

Dynamic Cheatsheet advances LM test-time learning without finetuning or supervision. The framework holds several implications and open directions:

  • Continuous Post-deployment Evolution: By maintaining an external, adaptively refined “cheatsheet,” models can incrementally adapt to new domains and user needs.
  • Wrappers for Black-Box APIs: Approaches like DC enable intelligent augmentation of proprietary or commercial LMs without access to internal parameters or large-scale retraining.
  • Tool Use and Automation: The frequent adoption of code routines and computational heuristics suggests future integration with external APIs or tool-chains for more robust reasoning.
  • Scalability: Hierarchical or domain-specialized memory architectures may facilitate efficient retention of diverse reasoning strategies.
  • AI Reliability and Robustness: DC’s paradigm of session-based cumulative learning raises the standard for adaptive, reliable AI systems in real-world deployments.

Summary

Dynamic Cheatsheet (DC) implements a dual-module framework that retrofits black-box LMs with a persistent, self-curating external memory. By systematizing session-specific retention of high-impact strategies and code artifacts, DC achieves substantial, label-free test-time learning and robustness. Performance improvements are robust across mathematical reasoning, algorithmic puzzles, and knowledge-intensive tasks, demonstrating that dynamic, self-curated memory marks a promising path in augmenting LLMs with human-like, cumulative reasoning capabilities (Suzgun et al., 10 Apr 2025).
