Obvious Record: Symbolic Memory for LLMs
- Obvious Record is a symbolic, non-parametric memory mechanism that stores explicit cause–result mappings for large language models.
- It employs semantic feature extraction and novelty assessment using EnEx-k and external group entropy to capture and refine rare or novel events.
- The approach enhances one-shot learning and generalization by balancing semantic diversity with high task utility through entropy-guided retrieval.
Obvious Record is a symbolic, non-parametric memory mechanism designed to augment LLMs with explicit, human-interpretable cause–result (e.g., question–solution) mappings. Unlike standard parametric memory within model weights, Obvious Record stores persistent, semantically compact representations of rarely observed or novel experiences as discrete entries, allowing LLMs to emulate deliberate, method-oriented human learning and better generalize to low-resource or previously unseen scenarios. This memory enables persistent acquisition, rapid recall, refinement, and deliberate exploration of methods distinct from inductive generalization by next-token prediction alone (Su, 14 Dec 2025).
1. Formal Definition and Representation
Each Obvious Record entry has the structure

$(\text{feature}_{cause},\ \text{feature}_{result})$

where:
- $\text{feature}_{cause}$: a compact, human-interpretable representation of the input scenario (e.g., question or failure report). Typically, the top-$k$ distinctive keywords are extracted using EnEx-k, emphasizing elements maximally distant in semantic space from existing records.
- $\text{feature}_{result}$: a single solution or a small set of alternative methods.
All records are stored as symbolic key-value pairs, external to the LLM, providing direct, readable mappings maintained independently of model weights.
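As a minimal sketch of such a symbolic store (the class, field names, and the keyword-tuple key format are illustrative assumptions, not the paper's data layout), the record can live in an ordinary key-value structure external to the model:

```python
from dataclasses import dataclass, field

@dataclass
class ObviousRecordEntry:
    # One symbolic cause-result mapping, kept outside the model weights.
    feature_cause: tuple          # top-k distinctive keywords of the scenario
    feature_results: list = field(default_factory=list)  # one or more methods

# The full record is a plain, human-readable key-value store.
record = {}

entry = ObviousRecordEntry(
    feature_cause=("sensor", "timeout", "cold-start"),   # hypothetical scenario
    feature_results=["increase watchdog interval before first heartbeat"],
)
record[entry.feature_cause] = entry
```

Because both keys and values are symbolic, every mapping remains directly inspectable and editable by a human, independently of any model parameters.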
2. Observation Recording and Continuous Improvement
When encountering a new scenario, the following pipeline is executed:
a. Feature Extraction (EnEx-k):
- From the raw input, select words/phrases whose embeddings maximize semantic distance from features in existing records. This forms the distilled representation $\text{feature}_{cause}$ (and, similarly, $\text{feature}_{result}$ for solutions).
b. Novelty Assessment (External Group Entropy):
- Compute $\text{EN}_{cos}(A,S) = \min_{s \in S} [1 - \text{cosine\_sim}(A, s)]$, where $A$ is the new feature and $S$ is the set of stored features.
- Accept $A$ as novel if $\text{EN}_{cos}(A,S) \geq \tau_{EN}$, where $\tau_{EN}$ is a user-specified threshold enforcing semantic novelty.
c. Storage and Update:
- If $\text{feature}_{cause}$ is novel, add the pair $(\text{feature}_{cause}, \text{feature}_{result})$ to the record.
- If $\text{feature}_{cause}$ is already present, compare the new $\text{feature}_{result}$ with the existing one:
  - If $\text{EN}_{cos}$ between the new and stored results is at least $\tau_{EN}$, both methods are retained.
  - If it falls below $\tau_{EN}$, select the method with higher utility as determined by task-specific evaluation (e.g., correctness, speed, robustness).
This combined mechanism balances the retention of semantically distinct solutions and the replacement/augmentation of similar ones based on effectiveness.
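The recording pipeline above can be sketched as follows. The function names, the list-of-tuples memory layout, and the default threshold `tau_en=0.4` are illustrative assumptions, with semantic features reduced to plain embedding vectors:

```python
import math

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def en_cos(new_feat, stored_feats):
    # External group entropy: 1 minus similarity to the nearest stored feature.
    if not stored_feats:
        return 1.0  # empty memory: everything counts as novel
    return min(1.0 - cosine_sim(new_feat, s) for s in stored_feats)

def record_observation(record, cause, result, utility, tau_en=0.4):
    # record : list of (cause_vec, results) where results is a list of
    #          (result_vec, utility) pairs; tau_en is a hypothetical default.
    if en_cos(cause, [c for c, _ in record]) >= tau_en:
        record.append((cause, [(result, utility)]))  # novel cause: new entry
        return
    # Otherwise attach to the nearest existing cause.
    _, results = max(record, key=lambda e: cosine_sim(cause, e[0]))
    if en_cos(result, [r for r, _ in results]) >= tau_en:
        results.append((result, utility))            # distinct method: keep both
    else:
        # Near-duplicate method: keep whichever has the higher utility.
        i = max(range(len(results)),
                key=lambda j: cosine_sim(result, results[j][0]))
        if utility > results[i][1]:
            results[i] = (result, utility)
```

Note how the same threshold governs both branches: causes are deduplicated by semantic distance, while near-duplicate results compete on task utility.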
3. Objective Functions and Memory Constraints
Memory expansion and refinement are guided by a trade-off between diversity and task utility:
- Novelty Threshold ($\tau_{EN}$): Only entries with semantic distance above $\tau_{EN}$ are considered for addition, preventing memory saturation with nearly redundant mappings.
- Continuous Improvement: When updating, maximally diverse methods with high utility are preferred, ensuring the record set collectively spans the semantic space relevant to rare problems.
- Budgeted Coverage Objective: The memory set $S$ evolves under the objective
$\max_{S,\ |S|=n}\ \min_{q \in Q_{\text{unseen}}}\ \max_{s \in S} \text{cosine\_sim}(q, s)$
subject to semantic-diversity constraints and a budgeted number of entries $n$. The implementation uses a greedy, entropy-maximizing heuristic to select which methods to retain (Su, 14 Dec 2025).
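One plausible reading of a greedy, entropy-maximizing heuristic is farthest-point selection over candidate embeddings: repeatedly add the candidate whose minimum $1 - \text{cosine\_sim}$ distance to the current selection is largest, until the budget is spent. The function below is a hypothetical sketch under that assumption, not the paper's implementation:

```python
import math

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def greedy_max_entropy_select(candidates, n):
    # Farthest-point-style greedy heuristic: at each step, add the candidate
    # maximizing its minimum (1 - cosine_sim) distance to the current
    # selection, until the budget n (or the candidate pool) is exhausted.
    selected = [candidates[0]]  # arbitrary seed
    while len(selected) < min(n, len(candidates)):
        remaining = [c for c in candidates if c not in selected]
        best = max(remaining,
                   key=lambda c: min(1.0 - cosine_sim(c, s) for s in selected))
        selected.append(best)
    return selected
```

With two nearly parallel vectors and one orthogonal vector, a budget of two keeps one of the near-duplicates and the orthogonal outlier, which is exactly the redundancy-avoiding behavior the objective rewards.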
4. Retrieval Mechanisms
Query-time retrieval comprises two modes: routine similarity-based and exploratory entropy-based.
A. Routine (Similarity-Based) Retrieval:
- Extract $\text{feature}_{cause,new}$ for the query.
- Compute $\text{cosine\_sim}(\text{feature}_{cause,new},\ \text{feature}_{cause,i})$ for every stored record $i$.
- Identify $i^* = \arg\max_i \text{cosine\_sim}(\text{feature}_{cause,new},\ \text{feature}_{cause,i})$.
- Return $\text{feature}_{result,i^*}$ as the corresponding method.
B. Exploration (Entropy-Based) Retrieval:
- If the routine method fails or an alternative is requested, enumerate all candidate results for the given cause.
- Compute
$r^* = \arg\max_{r \in \text{candidates}}\ \min_{r' \in \text{tried}} \left[1 - \text{cosine\_sim}(r, r')\right],$
selecting the method maximally semantically dissimilar from those previously tried.
Pseudocode Sketch:
```text
input: new_query
c_new ← extract_top-k_features(new_query)
if exists c in Record with cosine_sim(c_new, c) ≥ τ:
    r ← Record[argmax_c cosine_sim(c_new, c)]
    if apply(r) succeeds:
        return r
methods ← all Record entries for c_new (or global set if none match)
r ← argmax_{r ∈ methods} min_{r' ∈ tried} [1 − cosine_sim(embed(r), embed(r'))]
return r
```
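A runnable version of the two retrieval modes might look as follows; the record layout, the matching-threshold default `tau=0.8`, and the function names are assumptions for illustration:

```python
import math

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def routine_retrieve(record, query, tau=0.8):
    # Mode A: return the first result attached to the most similar stored
    # cause, provided its similarity clears the matching threshold tau.
    if not record:
        return None
    cause, results = max(record, key=lambda e: cosine_sim(query, e[0]))
    return results[0] if cosine_sim(query, cause) >= tau else None

def explore_retrieve(candidates, tried):
    # Mode B: among candidate result embeddings, pick the one maximally
    # dissimilar from everything already tried.
    if not tried:
        return candidates[0]
    return max(candidates,
               key=lambda r: min(1.0 - cosine_sim(r, t) for t in tried))
```

Mode A covers the common case cheaply; Mode B only engages when the routine method fails, steering retries toward unexplored regions of the solution space.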
5. Empirical Evaluation and Performance
Performance of Obvious Record, in conjunction with Maximum-Entropy Method Discovery, has been demonstrated on the QSS60 benchmark comprising 60 semantically diverse question–solution pairs. Notable empirical outcomes include:
| Metric | MaxEn (n=10) | RanCho (n=10) | Δ |
|---|---|---|---|
| Average max similarity to unseen question (external coverage) | 0.5566 | 0.5099 | +0.0467 |
| Internal diversity (sum of pairwise similarities; lower is more diverse) | 12.23 | 14.36 | –2.13 |
Across varying memory sizes $n$, the entropy-guided (MaxEn) method consistently outperforms the random baseline (RanCho) in both external coverage and internal diversity. This indicates that the combined mechanism produces a more semantically distributed and generalizable set of solutions, equipping LLMs to address rare or unseen problems more reliably (Su, 14 Dec 2025).
6. Significance and Applications
Obvious Record equips LLMs with persistent, explicit, and interpretable memory structures akin to human episodic memory for rare events. This enables one-shot learning, deliberate method refinement, and purposeful exploration of alternatives—capabilities not natively provided by conventional next-token predictors with only parametric memory. Its integration is crucial for domains characterized by rare-event learning, low-resource anomaly resolution, niche hardware deployment, and infrequent IoT behaviors, where conventional data-driven learning is insufficient. A plausible implication is that such mechanisms can systematically enhance robustness and coverage in real-world deployments by explicitly capturing and recalling rare but high-impact scenarios (Su, 14 Dec 2025).