Self-Abstraction from Grounded Experience (SAGE)
- Self-Abstraction from Grounded Experience (SAGE) is a framework that abstracts grounded sensorimotor data into reusable symbolic or policy-level models to enhance AI generalization.
- SAGE frameworks span domains such as predicate logic, LLM-based agents, and robotics, providing structured plans, logical inference, and memory summarization.
- SAGE employs a multi-step process, from data collection to abstract representation generation, that improves performance and supports robust task execution across modalities.
Self-Abstraction from Grounded Experience (SAGE) refers to a suite of formal and algorithmic frameworks in artificial intelligence whereby agents abstract higher-level models, policies, or logical representations from their direct, low-level interactions with the world. This process leverages grounded experience—structured logs of actions, observations, and outcomes—to induce reusable, symbolic, or policy-level guidance. SAGE appears in multiple lines of research, including probabilistic predicate logic, plan-guided refinement for language-model agents, grounded memory for robotics, and affordance-based action interpretation in developmental robotics. These instantiations share the central tenet: self-derived abstractions, extracted from one's own sensorimotor history, serve as the substrate for generalization, reasoning, and “learning to learn.”
1. Formal Principles: Lifting from Grounded Data to Abstraction
SAGE frameworks implement a canonical pipeline (a minimal code sketch follows the list):
- collect empirical data from agent–environment interaction;
- induce an intermediate abstract representation (logical model, plan, memory summary);
- deploy this abstraction to improve inference, decision-making, or understanding.
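Concretely, the pipeline can be rendered as a thin functional skeleton. The sketch below is illustrative only; the `Experience` record and the stage names are ours, not drawn from any of the cited papers:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Experience:
    """One grounded interaction record: action taken, observation, outcome."""
    action: Any
    observation: Any
    outcome: Any

def sage_loop(
    collect: Callable[[], list[Experience]],
    abstract: Callable[[list[Experience]], Any],
    deploy: Callable[[Any], Any],
) -> Any:
    """Canonical SAGE pipeline: gather grounded data, induce an abstraction
    (logical model, plan, or memory summary), then act on it."""
    experiences = collect()              # agent-environment interaction
    abstraction = abstract(experiences)  # e.g., a plan or formula distribution
    return deploy(abstraction)           # abstraction-conditioned execution
```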
For predicate logic (Kido, 19 Feb 2025), let $D = (d_1, \dots, d_N)$ be grounded observations, each mapped via a grounding function $g$ to an interpretation (model) $m_i = g(d_i)$ over a fixed logical vocabulary (constants $C$, functions $F$, predicates $P$). The process “lifts” the empirical data distribution to a distribution over models, $p(m) = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}[g(d_i) = m]$, and further, via a controlled logical noise parameter $\mu \in (0.5, 1]$, to a distribution over formulas $\alpha$:

$$p(\alpha) = \sum_{m} p(\alpha \mid m)\, p(m), \qquad p(\alpha \mid m) = \begin{cases} \mu, & m \models \alpha \\ 1 - \mu, & m \not\models \alpha, \end{cases}$$

where $p(\alpha \mid m)$ upweights formulas true in $m$ ($m \models \alpha$) and downweights those that are false ($m \not\models \alpha$). This abstraction mechanism grounds all logical symbols in observed data and supports probabilistic inference over formulas and entailments.
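A minimal numerical sketch of this lifting, restricted to a propositional setting where a model is the set of atoms it makes true and a formula is a truth function over models; the notation ($g$, $\mu$, $p(\alpha)$) follows the reconstruction above rather than the paper's exact formulation:

```python
from collections import Counter
from typing import Callable

Model = frozenset                  # a model as the set of atoms it makes true
Formula = Callable[[Model], bool]  # the truth function of a formula

def lift_to_formula_prob(
    grounded_models: list[Model],  # g(d_1), ..., g(d_N)
    formula: Formula,
    mu: float = 0.9,               # logical noise parameter, mu in (0.5, 1]
) -> float:
    """Lift the empirical model distribution to P(alpha):
    P(alpha) = sum_m P(alpha|m) P(m), with P(alpha|m) = mu if m |= alpha,
    else 1 - mu."""
    counts = Counter(grounded_models)
    n = len(grounded_models)
    return sum(
        (mu if formula(m) else 1.0 - mu) * (k / n)
        for m, k in counts.items()
    )

# Example: three observations ground to three models; the formula is
# "rain -> wet", false only in the model where it rained but nothing is wet.
models = [frozenset({"rain", "wet"}), frozenset({"wet"}), frozenset({"rain"})]
implies = lambda m: ("rain" not in m) or ("wet" in m)
print(lift_to_formula_prob(models, implies, mu=0.95))  # (0.95*2 + 0.05)/3 = 0.65
```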
In plan-guided policy refinement (Hayashi et al., 8 Nov 2025), the agent operates in a Markov decision process where experience takes the form of trajectories $\tau = (s_0, a_0, s_1, a_1, \dots, s_T)$. SAGE induces abstractions as concise, structured plans $P = (\Sigma, \Delta, K)$ comprising high-level steps $\Sigma = \{\sigma_1, \dots, \sigma_k\}$, dependencies $\Delta \subseteq \Sigma \times \Sigma$, and constraints $K$. These abstractions serve as reusable, options-style guides in future executions.
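A plausible in-memory representation of such a plan, with $\Sigma$, $\Delta$, $K$ rendered as `steps`, `dependencies`, and `constraints`; the cited work carries plans as text inside the LLM prompt, so this typed structure is a sketch of the content, not the paper's implementation:

```python
from dataclasses import dataclass

@dataclass
class Step:
    """One high-level step abstracted from a raw trajectory."""
    id: str
    description: str

@dataclass
class Plan:
    """Options-style abstraction: steps, dependencies, constraints."""
    steps: list[Step]
    dependencies: dict[str, list[str]]  # step id -> prerequisite step ids
    constraints: list[str]              # invariants to hold during execution

    def executable_steps(self, done: set[str]) -> list[Step]:
        """Steps whose prerequisites are all satisfied, in plan order."""
        return [
            s for s in self.steps
            if s.id not in done
            and all(d in done for d in self.dependencies.get(s.id, []))
        ]
```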
In robotics (Lan et al., 22 Jul 2025), SAGE emerges as the process of summarizing closed-loop episodes into short natural-language memories $m$, indexed by scene descriptors $c$ and stored for retrieval-augmented planning.
2. Algorithmic Methodologies
Table: SAGE Instantiations Across Domains
| Domain | Abstraction Type | Induced From |
|---|---|---|
| Predicate Logic | Distribution over closed formulas | Empirical distribution over models |
| LLM-based Agents | High-level plan with dependencies | Raw execution trace (commands, logs) |
| Robotics/VLMs | Natural-language episode summaries | Action-observation-feedback tuples |
| Affordance-based Robotics | Affordance-language Bayesian network | Self-performed actions and effects |
In predicate logic (Kido, 19 Feb 2025), SAGE is realized by defining a joint probability $p(\alpha, m) = p(\alpha \mid m)\, p(m)$, marginalizing to $p(\alpha)$, and performing all formula-based inference (including conditional entailment and the handling of contradiction) by summing over the at-most-$N$ data-supported models. The computational core reduces to efficient $O(N)$-time summations, though searching the logical formula space remains combinatorially complex; practical implementations trim this search using heuristics or neural proposals.
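Continuing the propositional sketch from Section 1, conditional entailment reduces to one pass over the distinct grounded models. The conditional-independence assumption flagged in the comment is our simplification to keep the example compact:

```python
from collections import Counter
from typing import Callable

Model = frozenset
Formula = Callable[[Model], bool]

def conditional_entailment(
    grounded_models: list[Model],
    premise: Formula,
    conclusion: Formula,
    mu: float = 0.9,
) -> float:
    """P(conclusion | premise), computed by marginalizing the joint
    P(conclusion, premise, m) over the at-most-N distinct grounded models."""
    counts = Counter(grounded_models)
    n = len(grounded_models)
    p_joint = p_premise = 0.0
    for m, k in counts.items():
        w = k / n
        pa = mu if premise(m) else 1.0 - mu
        pb = mu if conclusion(m) else 1.0 - mu
        p_premise += pa * w
        p_joint += pa * pb * w  # assumes formulas independent given the model
    return p_joint / p_premise if p_premise > 0 else 0.0
```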
Plan abstraction in LLM-guided software agents (Hayashi et al., 8 Nov 2025) proceeds in three steps: (A) collect a grounded trajectory $\tau$ by executing in the environment; (B) induce a plan $P$ from $\tau$ via a planner LLM (potentially distinct from the rollout LLM to reduce self-bias), extracting steps, dependencies, and invariants; (C) re-execute conditioned on this plan. The architecture is agnostic to agent backend and LLM backbone, requiring only that actions and plans are textually rendered and inserted into the LLM’s prompt.
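A schematic of the (A)-(B)-(C) cycle, treating both the rollout agent and the planner as opaque text-to-text callables; the function names and the prompt wording here are hypothetical, not those of the cited system:

```python
from typing import Callable, Optional

def plan_guided_refinement(
    run_agent: Callable[[Optional[str]], str],  # runs the task, returns a text trace
    planner_llm: Callable[[str], str],          # maps a trace to a structured plan
) -> str:
    """One SAGE cycle for an LLM agent: grounded rollout, plan induction,
    plan-conditioned re-execution. Passing a planner distinct from the
    rollout model is one way to reduce self-bias."""
    trace = run_agent(None)  # (A) first rollout, no plan in the prompt
    plan = planner_llm(      # (B) abstract steps, dependencies, invariants
        "Summarize this execution trace into a structured plan with "
        "numbered steps, dependencies, and constraints:\n" + trace
    )
    return run_agent(plan)   # (C) re-execute with the plan in the prompt
```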
In vision-language robotics (Lan et al., 22 Jul 2025), SAGE is manifested as ExpTeach, which logs each episode as a sequence of (action, observation, feedback) tuples, summarizes this short-term memory sequence into a short natural-language memory $m$, and stores the pair $(c, m)$, keyed by scene descriptor $c$, in long-term memory. Retrieval for new tasks is realized via scene embedding and cosine similarity. Reflection and annotation (Algorithm 2 in the paper) serve to refine skill parameterization in real time.
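A compact sketch of retrieval-augmented long-term memory of this kind, assuming the caller supplies some text-embedding function; `LongTermMemory` and its methods are illustrative names, not ExpTeach's actual API:

```python
import numpy as np
from typing import Callable

class LongTermMemory:
    """Stores (scene descriptor, episode summary) pairs and retrieves the
    most similar past experiences by cosine similarity of scene embeddings."""

    def __init__(self, embed: Callable[[str], np.ndarray]):
        self.embed = embed
        self.keys: list[np.ndarray] = []
        self.entries: list[tuple[str, str]] = []

    def store(self, scene: str, summary: str) -> None:
        v = self.embed(scene)
        self.keys.append(v / np.linalg.norm(v))  # unit-normalize once at write
        self.entries.append((scene, summary))

    def retrieve(self, scene: str, k: int = 3) -> list[tuple[str, str]]:
        q = self.embed(scene)
        q = q / np.linalg.norm(q)
        sims = [float(key @ q) for key in self.keys]  # cosine similarity
        top = np.argsort(sims)[::-1][:k]
        return [self.entries[i] for i in top]
```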
Affordance-grounded action understanding (Saponaro et al., 2019) utilizes a Bayesian fusion of a self-learned affordance-language BN (from robot manipulation episodes) with an HMM-based mapping from observed human gestures to robot action labels, allowing SAGE-driven inference and explanation over others' behavior.
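An intentionally simplified view of that fusion, collapsing the HMM and the affordance Bayesian network to fixed probability tables; the three-action, two-effect example is hypothetical:

```python
import numpy as np

def fuse_affordance_and_gesture(
    p_action_given_gesture: np.ndarray,  # from gesture HMM: P(action | gesture)
    p_effect_given_action: np.ndarray,   # from affordance model: P(effect | action)
    observed_effect: int,
) -> np.ndarray:
    """Posterior over the other agent's action, fusing gesture recognition
    with self-learned affordance knowledge:
    P(a | gesture, effect) is proportional to P(a | gesture) * P(effect | a)."""
    posterior = p_action_given_gesture * p_effect_given_action[:, observed_effect]
    return posterior / posterior.sum()

# Example: actions (grasp, tap, touch), effects (moved, unmoved).
p_a_g = np.array([0.2, 0.7, 0.1])   # the gesture looks most like a tap
p_e_a = np.array([[0.3, 0.7],       # grasp sometimes moves the object
                  [0.9, 0.1],       # tap usually moves it
                  [0.05, 0.95]])    # touch almost never does
print(fuse_affordance_and_gesture(p_a_g, p_e_a, observed_effect=0))
```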
3. Generalization, Logical Inference, and Handling Contradiction
In logical SAGE (Kido, 19 Feb 2025), classical implication and undecidability are addressed concretely. When $\mu \to 1$, probabilistic formula entailment collapses to majority-vote over data-supported models (“empirical logical consequence”), yielding $\Gamma \models \beta$ iff every possible model consistent with the premises $\Gamma$ also satisfies $\beta$. For inconsistent $\Gamma$, SAGE marginalizes over maximal satisfiable subsets of the premises, generalizing paraconsistent inference and avoiding explosion. This treatment ensures every logical object (predicate, quantifier, connective) remains data-grounded, and computation never exceeds the $N$ observed models.
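A sketch of this consequence relation in the $\mu \to 1$ limit, with a naive fallback over maximal satisfiable premise subsets; the subset-search strategy below is our simplification of the paper's marginalization, exponential in the number of premises:

```python
from itertools import combinations
from typing import Callable

Model = frozenset
Formula = Callable[[Model], bool]

def empirical_consequence(
    grounded_models: list[Model],
    premises: list[Formula],
    conclusion: Formula,
) -> bool:
    """The conclusion follows iff every data-supported model satisfying all
    premises also satisfies it. If no model satisfies all premises
    (contradiction), fall back to the largest satisfiable premise subsets,
    avoiding explosion."""
    distinct = set(grounded_models)
    support = [m for m in distinct if all(p(m) for p in premises)]
    if not support:
        for size in range(len(premises) - 1, 0, -1):  # paraconsistent fallback
            support = [
                m for sub in combinations(premises, size)
                for m in distinct if all(p(m) for p in sub)
            ]
            if support:
                break
    return bool(support) and all(conclusion(m) for m in support)
```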
4. Empirical and Quantitative Performance
SAGE delivers robust improvements across tasks and agent types:
- In LLM-based software engineering agents [(Hayashi et al., 8 Nov 2025), Table 1], SAGE improves Pass@1 rates by 2.2–7.2% in relative terms across the Mini-SWE-Agent and OpenHands CodeAct frameworks, with absolute rates reaching 73.2–74.0% on SWE-Bench Verified. Cross-model plan induction further boosts resolution rates.
- In robotic manipulation (Lan et al., 22 Jul 2025), reflection via short-term memory increases average success rates from 36% to 84% across several manipulation tasks. Retrieval-augmented abstraction (long-term memory) raises single-trial generalization from 22% to 80% on 12 challenging scenarios. These gains are accompanied by emergent behaviors (tool use, occluder management) and persistent recall of task-specific knowledge.
No parametric policy updates or gradient-based learning are required—SAGE operates as a form of test-time or episodic self-improvement. All derived metrics, results, and workflow details are reported directly in the cited works, with performance averaged over multiple runs as indicated.
5. Applications Across Modalities
- In logic, SAGE enables empirical reasoning over propositional and predicate logics, handles undecidability by containing inference to data-supported models, and provides a basis for integrating probabilistic and symbolic AI.
- In code synthesis and repair, SAGE provides plan-level scaffolding that can be instantiated in existing agent frameworks, raising code reliability and test-pass rates; the approach is agnostic to the underlying LLM and execution engine.
- In robotics, SAGE-style memory (as in ExpTeach) allows vision-LLMs to overcome domain shift from internet data to embodied tasks, supporting robust task transfer, creative tool use, and rich feedback-driven adaptation.
- In developmental robotics, SAGE as probabilistic affordance fusion bridges sensorimotor experience to language and social interpretation, enabling robots to infer intent, anticipate effects, and verbalize observations about others, by repurposing self-induced models.
6. Limitations, Variants, and Research Directions
- Logical SAGE is limited by the necessity of searching potentially vast formula spaces and assumes the availability or learnability of grounding functions for mapping data to models.
- In LLM agent settings, current SAGE formulations perform a single abstraction and re-execution cycle; extending to multi-episode or online variants remains an open avenue. Long context windows and plan summarization bottlenecks constrain scalability. Cross-model plan abstraction (i.e., using a different agent for plan induction vs. execution) can offer improvements but introduces deployment complexity.
- Robotics applications may be hindered by errors in low-level perception (e.g., segmentation in annotation routines), semantic drift in memory retrieval, and absence of multimodal feedback (e.g., touch or force). The robustness of experience retrieval depends critically on choice of embedding and clustering strategies.
Potential extensions include:
- Iterated SAGE (multi-round abstraction and execution),
- Automated abstraction libraries for zero-shot transfer between tasks,
- Hybridization with static program analysis or formal specification,
- Broader application to domains like Web navigation, human–robot interaction, and multi-agent collaboration.
7. Context within AI, Comparison to Related Methods
SAGE contrasts with established paradigms such as end-to-end gradient-based policy improvement and is distinct from pure memory-augmented architectures (which often lack abstraction) or direct reflection methods (which summarize but do not structurally abstract task experience). Notably, SAGE’s use of self-generated abstractions, rather than externally provided rules or demonstrations, supports stronger domain generalization and more robust, grounded reasoning.
Prior techniques (e.g., REFLECT (Lan et al., 22 Jul 2025)) use only single-task summarization and lack persistent cross-task memory; frameworks such as ReplanVLM and ComeRobot replan but without persistent LTM. Approaches by Sarch et al. distill experience into “programs of thought” but without SAGE’s retrieval-augmented self-reflection pipeline. SAGE’s novelty lies in its closed-loop, data-grounded abstraction for interpretation, planning, and explanation.
In all instantiations, reproducibility is emphasized: all algorithms and updates reduce to simple, tractable computations over the collected experiences, and all necessary steps are documented for independent implementation.
SAGE thus provides a rigorously formulated methodology—spanning logic, LLMs, and embodied agents—for extracting generative abstractions from grounded, self-generated data, enabling flexible, probabilistic, and robust generalization across AI domains.