Mem4Teeth: Prototype Memory Completion

Updated 5 March 2026

Mem4Teeth is a machine learning framework that uses a learned prototype memory to complete incomplete, noisy data for robust reconstruction.
The system integrates encoder–decoder architectures with prototype retrieval and fusion, improving quality in tasks like point-cloud dental completion and episodic memory.
Empirical results demonstrate significant performance gains using methods like BCPNN for prototype extraction, enhancing reconstruction accuracy and mitigating data variability.

Mem4Teeth (Prototype Memory Completion) refers to a class of machine learning architectures in which a system encodes incoming (possibly incomplete) data into a compressed or partial representation, then leverages a learned bank of prototypical features to reconstruct or complete the original object. The “prototype memory” mechanism enables robust inference in situations with missing information, restricted supervision, or high data variability, by reusing learned structural regularities across samples. Mem4Teeth, as instantiated in recent point-cloud completion, episodic generative memory, low-shot generation, and large-scale classification, unifies approaches in which prototype retrieval and fusion form the backbone of memory-guided completion.

1. Architectural Principles and Motivation

Prototype memory completion arises in settings where direct instance memorization is insufficient for effective reconstruction, particularly when inputs are extremely partial or noisy. Canonical encoder–decoder networks tend to produce contextually implausible structures when faced with large missing regions, because the decoder must “hallucinate” the absent data based solely on the compressed code derived from an incomplete input. Mem4Teeth addresses this by learning a compact bank of prototypical representations (learned code vectors, manifold anchors, or class centroids), which provide structural priors that regularize the completion process. The partial input is mapped into a global descriptor, which retrieves the most relevant prototype(s) from memory. A fusion mechanism combines the query and prototype features before passing the result to a reconstruction module that synthesizes the final output. This paradigm has been applied to point-cloud dental completion (Sun et al., 3 Dec 2025), generative episodic memory (Fayyaz et al., 2021), few-shot table-to-text generation (Su et al., 2021), large-scale face representation (Smirnov et al., 2021), and continual learning with memory replay (Ho et al., 2021).

2. Core Frameworks and Algorithms

A. Point Cloud Completion (Dental Reconstruction)

Dual Encoders: Two encoders map the partial point cloud and (during training) the ground-truth point cloud into $d$-dimensional global descriptors.
Prototype Memory: A learnable memory bank $M = \{ m_k \in \mathbb{R}^d \}_{k=1}^{K}$ stores $K$ prototypes, with $K$ much smaller than the dataset size.
Retrieval: At test time, the prototype $m_i$ closest in Euclidean distance to the partial input’s descriptor $F_{\pi}$ is retrieved. During training, $F_{\rm gt}$ (from the complete data) is used as the query instead.
Fusion: Features are fused as $F' = (1-\alpha) F_{\rm query} + \alpha m_{\rm retrieved}$ with confidence $\alpha \in [0,1]$ estimated per input.
Decoder: A grid-based network (e.g., FoldingNet) reconstructs the final point cloud (Sun et al., 3 Dec 2025).
Losses: A Chamfer distance loss for geometry, a vector quantization loss for prototype tracking, and an alignment loss to bridge the partial–GT domain gap.
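The retrieval and fusion steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `retrieve_and_fuse` and the toy values are hypothetical, and the confidence $\alpha$ is passed in by hand, whereas the source estimates it per input.

```python
import numpy as np

def retrieve_and_fuse(query, memory, alpha):
    """Nearest-prototype retrieval followed by convex fusion.

    query:  (d,) global descriptor of the partial input
    memory: (K, d) prototype bank
    alpha:  fusion confidence in [0, 1]; supplied by the caller here
            (assumption: the source estimates it per input)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Euclidean distance from the query to every prototype.
    dists = np.linalg.norm(memory - query, axis=1)
    k = int(np.argmin(dists))
    # F' = (1 - alpha) * F_query + alpha * m_retrieved
    fused = (1.0 - alpha) * query + alpha * memory[k]
    return k, fused

# Toy bank of K = 3 prototypes in d = 2 dimensions.
memory = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]])
query = np.array([0.8, 1.4])
idx, fused = retrieve_and_fuse(query, memory, alpha=0.5)
print(idx, fused)
```

With $\alpha = 0$ the decoder sees only the query descriptor, and with $\alpha = 1$ it sees only the retrieved prototype, so $\alpha$ controls how strongly the structural prior overrides the observed evidence.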

B. Generative Episodic Memory

VQ-VAE: Inputs (e.g., images) are mapped through an encoder $f_\phi$ to a feature grid, then quantized onto a learned codebook of $K$ prototype vectors $\{e_k\}$; the quantized indices form an index matrix $Z$.
Attention: Only a subset $T$ of the spatial indices is attended and stored (the “gist”), representing the partial memory trace.
Semantic Completion: At recall, given $Z_T$, a PixelCNN autoregressively fills in missing indices using the distribution $p(Z_{\bar T} | Z_T)$.
Decoding: The completed index map is converted back to codebook vectors and fed to the decoder $g_\theta$ for reconstruction (Fayyaz et al., 2021).
Losses: Reconstruction, codebook, and commitment losses for VQ-VAE; negative log-likelihood for PixelCNN; possible joint fine-tuning.
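The quantization and gist-storage steps can be illustrated with a small NumPy sketch. It assumes nearest-neighbor assignment in Euclidean distance, which is the standard VQ-VAE choice; the helper names `quantize` and `keep_gist` and the `-1` marker for unattended positions are hypothetical. The PixelCNN that fills in the missing indices is not implemented here.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry.

    features: (H, W, d) encoder output grid
    codebook: (K, d) prototype vectors e_k
    Returns the (H, W) integer index matrix Z.
    """
    diffs = features[:, :, None, :] - codebook[None, None, :, :]
    return np.argmin(np.sum(diffs ** 2, axis=-1), axis=-1)

def keep_gist(Z, attended, missing=-1):
    """Keep indices at attended positions T; mark all others as missing."""
    return np.where(attended, Z, missing)

codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
features = np.array([[[0.1, 0.0], [0.9, 0.1]],
                     [[0.0, 0.8], [0.2, 0.1]]])
Z = quantize(features, codebook)

# Attend to the main diagonal only; the other two cells would be
# filled in by the autoregressive prior at recall time.
attended = np.array([[True, False], [False, True]])
Z_T = keep_gist(Z, attended)

# Index-to-vector lookup applied to a completed index map before decoding.
recalled = codebook[Z]
print(Z, Z_T, recalled.shape)
```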

C. Few-Shot Table-to-Text Generation

Prototype-to-Generate (P2G): Tables are linearized; the top-$n$ prototype sentences are retrieved and ranked (BERT-based similarity), forming the per-example prototype memory.
Sequence Generator: Conditioned jointly on the table and the selected prototypes.
Losses: ML cross-entropy and content-aware unlikelihood to avoid copying mismatched prototype content (Su et al., 2021).
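The top-$n$ selection step can be sketched as a cosine-similarity ranking over precomputed embeddings. This is only an illustration: the vectors below are made-up stand-ins for BERT-derived representations, and `top_n_prototypes` is a hypothetical helper, not the selector described by Su et al.

```python
import numpy as np

def top_n_prototypes(table_vec, candidate_vecs, n=3):
    """Rank candidate sentences by cosine similarity to the linearized table.

    table_vec:      (d,) embedding of the linearized table
    candidate_vecs: (C, d) embeddings of candidate prototype sentences
    Returns the indices of the n best candidates and their similarities.
    """
    t = table_vec / np.linalg.norm(table_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    sims = c @ t
    order = np.argsort(-sims)[:n]
    return order.tolist(), sims[order].tolist()

table_vec = np.array([1.0, 0.0])
candidates = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
order, sims = top_n_prototypes(table_vec, candidates, n=2)
print(order, sims)
```

Keeping `n` small matches the reported finding that more than three prototypes per input degrades generation quality.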

D. Large-scale Representation (Face Recognition)

Prototype Memory Bank: Only $M$ class prototypes ($M \ll N$, where $N$ is the total number of classes) are kept in device memory, avoiding “prototype obsolescence.”
On-the-Fly Update: Each batch generates or refreshes prototypes for the classes it contains; the oldest prototypes are dequeued.
Softmax Classification: Margin-based softmax (CosFace or D-Softmax) over the current memory.
Minimal RAM/GPU Load: Memory is maintained as an efficient queue, completely device-resident (Smirnov et al., 2021).
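A bounded, first-in-first-out prototype store along these lines can be sketched as follows. The class `PrototypeQueue` is hypothetical, and building each prototype as the normalized mean of the class embeddings in the current batch is an assumption made for illustration; the margin-based softmax itself is omitted and only the cosine logits over the current memory are computed.

```python
from collections import OrderedDict
import numpy as np

class PrototypeQueue:
    """Fixed-capacity store of class prototypes, oldest entries first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # class id -> unit-norm prototype

    def update(self, embeddings, labels):
        """Create or refresh prototypes for every class present in the batch."""
        labels = np.asarray(labels)
        for c in np.unique(labels):
            # Assumption: prototype = normalized mean embedding of the class.
            proto = embeddings[labels == c].mean(axis=0)
            proto = proto / np.linalg.norm(proto)
            self.store.pop(int(c), None)  # a refreshed class becomes the newest
            self.store[int(c)] = proto
        while len(self.store) > self.capacity:
            self.store.popitem(last=False)  # dequeue the oldest prototype

    def logits(self, embedding):
        """Cosine similarity of one embedding to each stored prototype."""
        classes = list(self.store)
        W = np.stack([self.store[c] for c in classes])
        return classes, W @ (embedding / np.linalg.norm(embedding))

q = PrototypeQueue(capacity=2)
q.update(np.array([[1.0, 0.0], [0.0, 1.0]]), [0, 1])
q.update(np.array([[1.0, 1.0]]), [2])  # class 0 is now the oldest and is evicted
classes, logits = q.logits(np.array([1.0, 1.0]))
print(classes, logits)
```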

E. Continual Learning

Prototype-guided Replay: For each class, up to $n$ candidate samples closest to its dynamic prototype are kept in episodic memory.
Buffer Management: Old and new instances ranked by prototype proximity ensure a compact, yet class-representative memory.
Replay Frequency: Past prototypes periodically revisited to mitigate forgetting (Ho et al., 2021).
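The buffer selection rule can be sketched as a per-class nearest-to-prototype filter. This is a simplified illustration: the dynamic prototype is taken to be the class mean of the current features (an assumption), and `select_replay` is a hypothetical name.

```python
import numpy as np

def select_replay(features, labels, n):
    """For each class, keep up to n sample indices closest to the class prototype.

    features: (S, d) feature vectors of candidate samples
    labels:   length-S class labels
    """
    labels = np.asarray(labels)
    chosen = {}
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        proto = features[idx].mean(axis=0)  # assumption: prototype = class mean
        dist = np.linalg.norm(features[idx] - proto, axis=1)
        chosen[int(c)] = idx[np.argsort(dist)[:n]].tolist()
    return chosen

features = np.array([[0.0], [1.0], [10.0], [5.0], [6.0], [5.2]])
labels = [0, 0, 0, 1, 1, 1]
chosen = select_replay(features, labels, n=2)
print(chosen)
```

In the toy data, the outlier at 10.0 is the farthest sample from its class prototype and is therefore the one left out of the buffer.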

3. Learning Rules and Prototype Extraction

In nonparametric associative memories, the prototype extraction problem involves distilling the most representative prototype for each class from a set of noisy, partial, or distorted instances.

A. Hebbian Learning Rules:

Standard Hebb, Hopfield, Willshaw, Covariance, Presynaptic Covariance, and Bayesian Confidence Propagation Neural Network (BCPNN) rules have been benchmarked (Lansner et al., 2023).
In the prototype extraction regime (e.g., $N \simeq 1024$, $n_{\rm inst} = 10$ distorted instances per prototype with 10% random resampling), BCPNN achieves the highest capacity (e.g., $P_{90\%}=140$ for modular, uncorrelated patterns), about 1.75$\times$ the runner-up in the table below (presynaptic covariance, 80), due to the normalization and log-odds weighting of coactivation statistics.
BCPNN’s bias terms, bits-per-weight scaling, and resilience to correlated patterns and noise allow it to combine evidence more effectively than additive rules.
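For reference, the log-odds weighting mentioned above is commonly written as follows. The notation is introduced here for illustration and is not taken from the benchmark paper, which may differ in details such as smoothing and normalization: $p_i$ is the estimated activation probability of unit $i$, $p_{ij}$ the estimated co-activation probability of units $i$ and $j$, $x_i$ the current activity, and $s_j$ the support of unit $j$.

$$
w_{ij} = \log \frac{p_{ij}}{p_i\, p_j}, \qquad b_j = \log p_j, \qquad s_j = b_j + \sum_i w_{ij}\, x_i
$$

A weight is positive when two units co-occur more often than independence would predict and negative when they co-occur less often, while the bias encodes the prior activation probability of each unit.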

B. Evaluation Protocol:

Training: One-pass weight/bias computation over the set of distorted instances (not the prototypes themselves).
Recall: Iterative winner-take-all activation; successful extraction if the recalled pattern exactly matches the original prototype.
Memory storage: Only the weights/biases, resulting from summary statistics over instances, are retained. No explicit raw samples are memorized.
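The protocol above can be reproduced at toy scale. The sketch below assumes the usual log-odds form of BCPNN (weights $\log(p_{ij}/(p_i p_j))$, biases $\log p_j$), a modular network of `H` hypercolumns with `M` units each, a probability floor `EPS`, and no connections within a hypercolumn; all of these are assumptions of the illustration. The distortion is also deterministic (each instance has exactly one hypercolumn changed) instead of the random resampling used in the benchmark.

```python
import numpy as np

H, M = 6, 4   # hypercolumns and units per hypercolumn (toy sizes)
EPS = 1e-3    # probability floor to keep the logarithms finite (assumption)

def one_hot(pattern):
    """Encode a length-H vector of unit indices as an (H*M,) binary vector."""
    x = np.zeros(H * M)
    x[np.arange(H) * M + pattern] = 1.0
    return x

def train_bcpnn(instances):
    """One-pass estimate of weights and biases from activity statistics."""
    X = np.stack([one_hot(p) for p in instances])
    p_i = np.maximum(X.mean(axis=0), EPS)
    p_ij = np.maximum(X.T @ X / len(X), EPS ** 2)
    W = np.log(p_ij / np.outer(p_i, p_i))
    hc = np.repeat(np.arange(H), M)
    W[hc[:, None] == hc[None, :]] = 0.0  # no within-hypercolumn connections
    return W, np.log(p_i)

def recall(cue, W, b, steps=5):
    """Iterate winner-take-all per hypercolumn until the pattern is stable."""
    pattern = np.asarray(cue)
    for _ in range(steps):
        support = b + W @ one_hot(pattern)
        new = support.reshape(H, M).argmax(axis=1)
        if np.array_equal(new, pattern):
            break
        pattern = new
    return pattern

def distorted(proto):
    """H instances of proto, each with one hypercolumn switched to another unit."""
    out = []
    for h in range(H):
        inst = proto.copy()
        inst[h] = (inst[h] + 1) % M
        out.append(inst)
    return out

proto_a = np.zeros(H, dtype=int)
proto_b = np.full(H, 2)
instances = distorted(proto_a) + distorted(proto_b)  # prototypes never shown
W, b = train_bcpnn(instances)

cue = proto_a.copy()
cue[0] = 1  # a distorted version of prototype A
recalled = recall(cue, W, b)
print(recalled)
```

Only `W` and `b` are kept after training, so the recalled prototype is reconstructed from summary statistics, not from any stored sample.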

Rule	Non-Modular (hrand)	Modular (hrand)	Silent (modular)	Correlated (modular)
WILLSHAW	50	65	60	50
HEBB	54	68	62	5
HOPFIELD	58	70	64	5
COVARIANCE	62	75	65	40
PRESYN COV	64	80	72	45
BCPNN	105	140	130	60

Table entries are $P_{90\%}$ capacities. BCPNN leads in every condition; its margin is smallest for correlated modular patterns (60 versus 50 for Willshaw), where the advantage is diminished but remains significant (Lansner et al., 2023).

4. Empirical Performance Across Domains

Dental Completion: On the Teeth3DS-derived benchmark, Mem4Teeth reduces Chamfer Distance by 15.7% compared to SVDFormer, and 58.5% compared to PCN; morphological accuracy is visible in sharper cusps, interproximal integrity, and correct interdental gaps. Prototypes self-organize along shape axes, forming a compact anchor cloud in embedding space (Sun et al., 3 Dec 2025).
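For readers unfamiliar with the metric, a symmetric Chamfer distance between two point sets can be computed as below. Conventions vary across papers (squared or unsquared distances, sums or means), and the variant used in the dental benchmark is not specified here; this sketch assumes the mean of squared nearest-neighbor distances in both directions.

```python
import numpy as np

def chamfer_distance(P, Q):
    """Symmetric Chamfer distance between point sets P (n, 3) and Q (m, 3).

    Assumed variant: mean squared nearest-neighbor distance from P to Q
    plus the same quantity from Q to P.
    """
    d2 = np.sum((P[:, None, :] - Q[None, :, :]) ** 2, axis=-1)  # (n, m)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

P = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
Q = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
cd = chamfer_distance(P, Q)
print(cd)
```

The pairwise distance matrix makes this quadratic in the number of points, which is fine for illustration but would normally be replaced by a nearest-neighbor structure or a GPU kernel for dense scans.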

Generative Episodic Memory: VQ-VAE achieves $\sim30\times$ compression. Semantic completion via PixelCNN approximately doubles classification-recall accuracy at low attention levels. VQ quantization confers superior denoising and out-of-domain generalization compared to standard VAEs (Fayyaz et al., 2021).

Few-Shot Generation: In P2G, adding three prototypes yields +6.7 BLEU and +7.2 ROUGE at the 50-shot regime compared to a strong T5 baseline. Human evaluations confirm boosts in factuality and fluency. The selector is critical; more than three prototypes per input degrades performance (Su et al., 2021).

Large-Scale Classification: Prototype Memory maintains representation quality as measured by rank-1 MegaFace accuracy, outperforming or matching sampled softmax methods with less overhead and eliminating stale prototypes. A proper refresh ratio ($r=0.2$) and batch size ($k=4$) are key (Smirnov et al., 2021).

Continual Learning: Dynamic prototype-guided replay improves final sequential accuracy by 1.9 percentage points (AGNews) and reduces the “accuracy drop” relative to i.i.d. training from 53.34% to 20.62%, reflecting strongly mitigated catastrophic forgetting (Ho et al., 2021).

5. Theoretical Analysis and Practical Considerations

A. Prototype Memory Efficiency

By constraining the memory size ($K$ prototypes), prototype banks induce a strong inductive bias towards cross-sample regularities, both compressing and denoising the representations supplied to the decoder.
Plug-and-play design: The memory module and fusion logic are easily inserted into a wide range of architectures, with no changes to downstream losses or training regimes (Sun et al., 3 Dec 2025, Su et al., 2021).

B. Selection and Fusion Mechanisms

Retrieval is based on Euclidean or cosine similarity; fusion confidence is derived from that similarity or from a learned predictor.
In few-shot and sequential settings, prototype selection is crucial: using more than three prototypes adds context “noise” that dilutes the useful patterns, while too few prototypes miss key structure (Su et al., 2021).
Updating prototype memory online, as in the dynamic refresh and queueing used for face recognition (Smirnov et al., 2021) and in continual learning (Ho et al., 2021), is essential to avoid staleness and prototype detachment.

C. Cross-Domain Generalization

VQ-based memory is able to reconstruct types not present in the training set, demonstrating the importance of prototype combinatorics and abstraction capacity (Fayyaz et al., 2021).
In face recognition, the dynamic memory eliminates prototype obsolescence endemic to sampled softmax (Smirnov et al., 2021).

6. Relationship to Associative Memory and Pattern Completion

The prototype completion principle realized in Mem4Teeth is fundamentally an instantiation of the pattern completion and prototype extraction behavior long studied in associative memory literature. Modern implementations operationalize these ideas with learned deep prototypes and end-to-end differentiable fusion or autoregressive completion, but the core insight—matching incomplete queries to stored prototypes to recover or denoise the latent structure—remains consonant with classic attractor networks and Bayesian retrieval.

Specifically, BCPNN learning (Lansner et al., 2023) delivers prototype extraction by log-odds accumulation of co-activation evidence, enabling the network to recover the true prototype from noisy instances. In VQ-VAE-based generative episodic memory, discrete codebook vectors serve as the prototypes, and PixelCNN completion serves as the pattern completion engine (Fayyaz et al., 2021). Point-cloud and face representation methods implement analogous memory-banked retrieval and efficient fusion within high-dimensional feature spaces (Sun et al., 3 Dec 2025, Smirnov et al., 2021).

7. Limitations and Open Directions

Prototype memory completion schemes require that the underlying data share enough regularity for prototypical anchors to act as effective structural priors. In few-shot generation, out-of-domain retrieval (e.g., rare table entities not represented in Wikipedia) may limit utility (Su et al., 2021). Overly large prototype banks can dilute retrieval quality by admitting noisy or irrelevant anchors. Static prototype selection models have no explicit forgetting mechanism beyond the unlikelihood loss or a queue size limit; online approaches address prototype obsolescence but may be sensitive to refresh heuristics.

A plausible implication is that further progress in prototype memory completion will depend on dynamic, context-sensitive prototype acquisition, finer-grained retrieval metrics, and hybridization with meta-learning to adapt prototype banks online as domain drift or task evolution occurs.

Key references:

"Memory-Guided Point Cloud Completion for Dental Reconstruction" (Sun et al., 3 Dec 2025)
"A model of semantic completion in generative episodic memory" (Fayyaz et al., 2021)
"Few-Shot Table-to-Text Generation with Prototype Memory" (Su et al., 2021)
"Prototype Memory for Large-scale Face Representation Learning" (Smirnov et al., 2021)
"Prototype-Guided Memory Replay for Continual Learning" (Ho et al., 2021)
"Benchmarking Hebbian learning rules for associative memory" (Lansner et al., 2023)