MemoryGAN: Memory-Augmented GAN

Updated 21 January 2026
  • MemoryGAN is a generative adversarial architecture augmented with explicit memory units that address catastrophic forgetting and improve mode diversity.
  • It integrates memory modules within both the generator and discriminator to partition latent space into discrete, learnable slots that guide conditional sample synthesis.
  • Empirical evaluations demonstrate that MemoryGAN achieves higher Inception Scores and better stability compared to traditional GANs, reducing mode collapse effectively.

MemoryGAN refers to a family of generative adversarial networks (GANs) that are augmented with explicit memory mechanisms, designed to overcome limitations such as catastrophic forgetting and structural discontinuity in unsupervised generative modeling. These architectures integrate memory units either as episodic buffers or as life-long slot networks, influencing generator conditioning and discriminator decision boundaries to enhance representation learning, sample diversity, and long-term stability.

1. Architecture and Memory Integration

MemoryGAN is characterized by the introduction of a learnable memory component accessible by both the generator and discriminator:

  • Generator (Memory-Conditional Generative Network, MCGN): Accepts a continuous latent vector $z \sim p(z)$ and a discrete memory index $c$, retrieving a key $K_c$ to synthesize samples as $\hat{x} = G(z, K_c)$.
  • Discriminator (Discriminative Memory Network, DMN): Encodes samples $x$ into queries $q = \mu(x)$ and computes a posterior $p(c \mid x)$ over $N$ slot keys $\{K_i\}$ using von Mises-Fisher (vMF) similarity. Real/fake discrimination is performed via $D(x) = \sum_i v_i\, p(c = i \mid x)$, where $v_i$ denotes the real/fake status of slot $i$.
  • Memory Module: Stores key vectors $K \in \mathbb{R}^{N \times M}$, slot values $v \in \{0, 1\}^N$, a usage histogram $h \in \mathbb{R}^N$, and an age vector $a \in \mathbb{R}^N$. Updates occur via least-recently-used (LRU) slot replacement or incremental EM steps that shift slot centroids and update histograms.

This explicit slot-based memory partitions the latent space, allocating each slot to an implicit cluster in data (class or semantic mode), thereby enabling both generators and discriminators to address and preserve diverse modes without traversing structurally invalid regions in latent space (Kim et al., 2018).
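The discriminator's slot addressing can be sketched in NumPy. The function names, the smoothing constant $\beta$, and the unit-norm convention for keys and queries are illustrative assumptions of this sketch, not the paper's reference implementation:

```python
import numpy as np

def vmf_posterior(q, K, h, kappa=1.0, beta=1e-8):
    """Posterior p(c=i|x) over N memory slots via von Mises-Fisher similarity.
    q: (M,) unit-norm query mu(x); K: (N, M) unit-norm keys; h: (N,) usage
    histogram acting as an (unnormalized) prior over slots."""
    logits = kappa * (K @ q)                         # vMF log-likelihoods up to a constant
    w = np.exp(logits - logits.max()) * (h + beta)   # likelihood weighted by usage prior
    return w / w.sum()

def discriminator_score(q, K, h, v, kappa=1.0):
    """D(x) = sum_i v_i p(c=i|x): posterior mass on slots marked real (v_i = 1)."""
    return float(np.sum(v * vmf_posterior(q, K, h, kappa)))
```

A query close to a heavily used real-labeled slot thus receives a score near 1, while a query landing among fake-labeled slots scores near 0.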

2. Mathematical Formulation and Loss Functions

MemoryGAN leverages a joint model:

$$p(x, z, c) = p(x \mid z, c)\, p(z)\, p(c)$$

where $p(z) = \mathcal{N}(0, I)$ is the latent prior and $p(c = i \mid v_i = 1) \propto h_i v_i$ reflects memory slot usage.

  • Generator mapping: $\hat{x} = G(z, K_c)$
  • Discriminator evaluation: $D(x) = \sum_i v_i\, p(c = i \mid x)$
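Sampling the generator's inputs under this joint model amounts to drawing $z$ from the Gaussian prior and a slot index from the usage-weighted prior over real-labeled slots. A minimal sketch, with all names hypothetical:

```python
import numpy as np

def sample_generator_inputs(h, v, latent_dim, rng=None):
    """Draw (z, c): z ~ N(0, I), and a slot index c with
    p(c = i | v_i = 1) proportional to h_i * v_i
    (usage-weighted, real-labeled slots only)."""
    if rng is None:
        rng = np.random.default_rng()
    p = h * v                              # zero out fake slots, weight by usage
    c = rng.choice(len(h), p=p / p.sum())  # discrete memory index
    z = rng.standard_normal(latent_dim)    # continuous latent vector
    return z, c
```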

The adversarial objective is augmented by a mutual information regularizer:

$$\min_G \max_D \Bigl\{ \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z, c}[\log(1 - D(G(z, K_c)))] + \lambda \hat{I} \Bigr\}$$

where $\hat{I} = -\mathbb{E}_{z, c}[\kappa K_c^{\top} \mu(G(z, K_c))]$ enforces high cosine similarity between sampled keys and the embeddings of the corresponding generated samples. This regularizer aligns each slot with the samples it conditions, substantially reducing mode collapse and improving the fidelity and diversity of outputs (Kim et al., 2018).
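Under the same unit-norm convention for keys and embeddings (an assumption of this sketch), the regularizer reduces to a negative mean inner product over a batch:

```python
import numpy as np

def mi_regularizer(K_c, mu_fake, kappa=1.0):
    """I_hat = -E[kappa * K_c^T mu(G(z, K_c))]: negative mean cosine alignment
    between sampled keys (B, M) and embeddings of the generated batch (B, M).
    Minimizing this drives generated samples toward their conditioning slot."""
    return -kappa * float(np.mean(np.sum(K_c * mu_fake, axis=1)))
```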

3. Memory Network Update Policies and Operational Algorithms

Updates to the slot-based memory network follow principles that blend reinforcement of active clusters with continual refreshment:

  • Addressing: For each query $q$, the vMF posterior selects the top-$k$ slots with the highest $\exp(\kappa K_i^{\top} q)\,(h_i + \beta)$ values.
  • LRU rotation: If no selected slot matches the sample's real/fake label, select the oldest slot $n^* = \arg\max_i a_i$, overwrite $K_{n^*} \leftarrow q$ and $v_{n^*} \leftarrow y$, and reset $h_{n^*}$ and $a_{n^*}$.
  • Incremental EM: Otherwise, update the matching slots $S_y$ via EM steps on soft assignments $\gamma_i^{(t)} = p(c = i \mid x)$, shifting $K_i$ and $h_i$ toward modal centroids over the observed data.

Training alternates memory updates, discriminator learning on real and generated samples, and generator updates with latent pairs $(z, c)$ sampled from recent slot statistics (Kim et al., 2018).
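The three policies above can be combined into a single update step. This is a simplified sketch; the learning rate, top-$k$ matching rule, and key renormalization are assumptions of the sketch, not the paper's exact procedure:

```python
import numpy as np

def update_memory(q, y, K, v, h, a, kappa=1.0, lr=0.5, top_k=4):
    """One in-place memory update for a unit-norm query q with real/fake
    label y. Matching top-k slots are refined by an incremental EM step;
    if none matches, the oldest slot is overwritten (LRU rotation).
    K: (N, M) keys; v, h, a: (N,) float arrays (values, usage, ages)."""
    a += 1                                   # age every slot
    logits = kappa * (K @ q)
    top = np.argsort(-logits)[:top_k]        # vMF addressing: top-k slots
    match = [i for i in top if v[i] == y]
    if not match:                            # LRU rotation: recycle oldest slot
        n = int(np.argmax(a))
        K[n], v[n], h[n], a[n] = q, y, 1.0, 0.0
        return
    gamma = np.exp(logits[match] - logits[match].max())
    gamma /= gamma.sum()                     # soft assignments over matching slots
    for g, i in zip(gamma, match):
        K[i] += lr * g * (q - K[i])          # shift centroid toward the query
        K[i] /= np.linalg.norm(K[i])         # keep keys unit-norm for vMF
        h[i] += g                            # reinforce the usage histogram
        a[i] = 0.0                           # mark slot as recently used
```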

4. Impact on GAN Training Dynamics: Structural Discontinuity and Forgetting

The slot-based memory mitigates GAN limitations by:

  • Structural discontinuity: Discrete latent indices $c$ partition the latent space into $N$ cluster regions, avoiding the unnatural interpolations and invalid transitions characteristic of unimodal latent distributions. Each key $K_i$ acts as an anchor for a data mode: generator traversal along $z$ within one slot yields intra-class variations, while inter-slot transitions produce inter-class changes (Kim et al., 2018).
  • Discriminator forgetting: Persistent slot centroids and nonzero priors $p(c)$ ensure continued representation of rare or previously generated samples, stabilizing adversarial learning and reducing catastrophic forgetting of generator modes. This persistence is not achievable with purely continuous or memoryless GAN architectures.

5. Empirical Evaluation and Results

MemoryGAN demonstrates quantitative and qualitative superiority in unsupervised image generation:

  • Unsupervised Inception Scores (CIFAR-10):
    • MemoryGAN: $8.04 \pm 0.13$
    • Comparators: e.g., WGAN-GP: $7.86 \pm 0.07$; Fisher GAN: $7.90 \pm 0.05$
  • Qualitative results: On datasets such as Fashion-MNIST and affine-MNIST, discrete slots each encapsulate distinct object categories or transformation clusters. Interpolations along $z$ within a slot yield smooth stylistic changes; transitions between slots effect semantic shifts.
  • Ablation studies: Removing the memory network (“–Memory”) causes a sharp drop in Inception Score from $8.04$ to $5.35$. Eliminating memory-based sampling or using a simple moving average for slot centroids also degrades performance.
  • Failure cases: Occur when slots mix visually similar but semantically diverse samples, suggesting future research in adaptive slot granularity (Kim et al., 2018).

6. Extensions, Trade-offs, and Limitations

  • Scalability: MemoryGAN’s slot memory scales to thousands of clusters ($N = 4096$ for MNIST, $N = 16384$ for CIFAR-10), but memory management overhead and EM computation may pose challenges for extremely large or fine-grained datasets.
  • Mode granularity: Slot collapse or mixing can arise when clusters in the data are poorly separated or visually ambiguous. Modulating the key dimension $M$, the slot count $N$, or the update rules could improve allocation fidelity.
  • Integrability: MemoryGAN can be combined with other GAN models without requiring optimization tricks or weaker divergence metrics, and it remains fully unsupervised, highlighting its flexibility across application domains.

MemoryGAN is conceptually distinct from GANs using simple episodic buffers (e.g., in continual learning contexts such as CloGAN (Rios et al., 2018)). While CloGAN maintains a small buffer of real samples to regularize continual class learning, MemoryGAN applies a life-long unsupervised memory slot architecture tuned for structural latent representation and adversarial stability. Both approaches address forgetting and diversity, but employ orthogonal mechanisms—episodic replay versus slot-based mixture models—to condition and regularize generation.

In summary, MemoryGAN’s discrete–continuous latent decomposition and persistent slot memory fundamentally alleviate mode collapse, improve interpretability, and stabilize adversarial training in unsupervised settings. Its architecture extends the generative modeling frontier by leveraging memory as a high-dimensional, learnable scaffold for both representation and synthesis (Kim et al., 2018).
