
LOGAN: Latent Optimization for GAN Stability

Updated 27 February 2026
  • The paper introduces LOGAN, which applies a latent optimization step using both gradient descent and natural gradient techniques to stabilize GAN training.
  • Methodologically, LOGAN incorporates an intermediate latent update before the generator's forward pass, adding second-order terms that dampen adversarial cycling.
  • Empirical results on ImageNet demonstrate significant improvements in FID and IS metrics, validating the practical benefits of LOGAN in high-capacity GANs.

LOGAN (Latent Optimisation for Generative Adversarial Networks) is a technique for improving the stability and performance of generative adversarial network (GAN) training by applying a gradient-based optimisation step in latent space before each forward pass through the generator. The latent step can use either plain gradient descent or natural gradient descent, and it improves convergence behavior, adversarial dynamics, and sample quality, particularly on large-scale datasets such as ImageNet at $128 \times 128$ resolution (Wu et al., 2019).

1. Mathematical Foundations of Latent Optimisation

LOGAN modifies the canonical GAN training loop by incorporating an intermediate "latent optimisation" step before generating samples. Let $\theta_D$ and $\theta_G$ denote the discriminator and generator parameters, $z \sim p(z)$ a vector in latent space, $x = G(z; \theta_G)$ a generated sample, and $f(z; \theta_D, \theta_G) := D(G(z; \theta_G); \theta_D)$ the discriminator score.

  • Standard GAN Gradients: For a Wasserstein-type GAN loss, parameters are updated via $g_D = \partial f / \partial \theta_D$ and $g_G = -\partial f / \partial \theta_G$.
  • Latent Optimisation by Gradient Descent (GD): A single latent step is introduced:

$$\Delta z = \alpha\, \frac{\partial f(z)}{\partial z}, \qquad z' = \text{Clip}(z+\Delta z,\, -1, +1)$$

Afterward, GAN losses are computed from $x' = G(z'; \theta_G)$, and the resulting gradients retain second-order terms because $z' = z + \Delta z$ depends on both $\theta_D$ and $\theta_G$ through $\Delta z$.

  • Latent Optimisation by Natural Gradient Descent (NGD): The latent gradient is preconditioned by the damped matrix $F'$, which gives a closed-form step:

$$g = \frac{\partial f(z)}{\partial z}, \quad F' = gg^T + \beta I; \qquad \Delta z = \alpha F'^{-1} g = \alpha \frac{g}{\beta + \|g\|^2}, \qquad z' = \text{Clip}(z+\Delta z,\, -1, +1)$$

A regularization term $R_z = w_r \|\Delta z\|^2$ is added to both losses.

This modification introduces additional cross-derivative terms, fundamentally changing the adversarial dynamics during training.
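The two latent update rules can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function names `gd_latent_step`, `ngd_latent_step`, and `regulariser` are hypothetical, the default values mirror the hyperparameters reported later in this article, and the gradient `g` is assumed to be supplied by whatever autodiff framework computes $\partial f/\partial z$. The last lines check numerically that the closed-form NGD step matches an explicit solve against $F' = gg^T + \beta I$.

```python
import numpy as np

def gd_latent_step(z, g, alpha=0.9):
    """Plain gradient step: dz = alpha * g, then clip z + dz to [-1, 1]."""
    dz = alpha * g
    return np.clip(z + dz, -1.0, 1.0), dz

def ngd_latent_step(z, g, alpha=0.9, beta=5.0):
    """Natural-gradient step with F' = g g^T + beta * I.

    Uses the closed form dz = alpha * g / (beta + ||g||^2).
    """
    dz = alpha * g / (beta + np.dot(g, g))
    return np.clip(z + dz, -1.0, 1.0), dz

def regulariser(dz, w_r=300.0):
    """R_z = w_r * ||dz||^2."""
    return w_r * np.dot(dz, dz)

# Illustrative inputs (random values, not real GAN gradients).
rng = np.random.default_rng(0)
z = rng.uniform(-1.0, 1.0, size=4)
g = rng.normal(size=4)

z_new, dz = ngd_latent_step(z, g)

# Explicit version of the same step: alpha * F'^{-1} g.
F_damped = np.outer(g, g) + 5.0 * np.eye(4)
dz_explicit = 0.9 * np.linalg.solve(F_damped, g)
print(np.allclose(dz, dz_explicit))
```

The closed form avoids building or inverting the $d \times d$ matrix, which is why the natural-gradient step costs little more than the plain one.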

2. Integration into the GAN Training Loop

LOGAN inserts latent optimisation into the standard GAN mini-batch training cycle as follows:

  1. For each batch sample, sample $z_i$ from $\mathrm{Uniform}([-1,1]^d)$.
  2. Forward propagate: $x = G(z_i; \theta_G)$, $f = D(x; \theta_D)$.
  3. Compute the gradient in latent space, $g_z = \frac{\partial f}{\partial z_i}$.
  4. Update $z_i \rightarrow z'_i$ via GD or NGD.
  5. Forward propagate $x' = G(z'_i; \theta_G)$ and obtain $f' = D(x'; \theta_D)$.
  6. Define per-sample generator and discriminator losses: $L_G = -f' + w_r\|\Delta z\|^2$ and $L_D = f' - D(x_i;\theta_D) + w_r\|\Delta z\|^2$, where $x_i$ denotes a real data sample.
  7. Batch-aggregate losses; update parameters via Adam.

Empirically, one latent optimisation step per iteration suffices for improved dynamics; additional steps can destabilize the training process.
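Steps 1 to 6 above can be traced on a toy model. The sketch below assumes a linear generator $G(z) = Az$ and a linear critic $D(x) = w \cdot x$, chosen only so that $\partial f/\partial z = A^T w$ is available analytically; `logan_losses`, `A`, `w`, and `x_real` are hypothetical names, and the NGD rule is used for step 4. Parameter updates (step 7) are omitted: a real implementation would differentiate these losses through $z'$ with an autodiff framework and then apply Adam.

```python
import numpy as np

def logan_losses(A, w, x_real, rng, alpha=0.9, beta=5.0, w_r=300.0):
    """Per-sample LOGAN losses for a toy linear generator and critic.

    Toy assumptions: G(z) = A @ z and D(x) = w @ x, so f(z) = w @ A @ z
    and df/dz = A.T @ w in closed form.
    """
    d = A.shape[1]
    z = rng.uniform(-1.0, 1.0, size=d)       # step 1: sample latent
    f = w @ (A @ z)                          # step 2: initial critic score
    g = A.T @ w                              # step 3: latent gradient
    dz = alpha * g / (beta + g @ g)          # step 4: NGD update
    z_new = np.clip(z + dz, -1.0, 1.0)
    f_new = w @ (A @ z_new)                  # step 5: score at optimised latent
    reg = w_r * (dz @ dz)                    # step 6: regulariser and losses
    loss_g = -f_new + reg
    loss_d = f_new - w @ x_real + reg
    return loss_g, loss_d, f, f_new

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))      # toy generator weights
w = rng.normal(size=3)           # toy critic weights
x_real = rng.normal(size=3)      # stand-in for a real data sample

loss_g, loss_d, f, f_new = logan_losses(A, w, x_real, rng)
print(f_new >= f)
```

For this linear toy the critic score never decreases after the latent step, because clipping can shorten the move along each coordinate but never reverse it.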

3. Theoretical Rationale for Improved Stability

Several theoretical principles underlie LOGAN’s stabilization effects:

  • Symplectic Gradient Adjustment (SGA) Connection: In adversarial games, the joint gradient field has antisymmetric (rotational) components that induce cycling. LOGAN’s backpropagation through $z'$ yields second-order terms that couple discriminator and generator updates in a manner analogous to SGA, dampening rotation and reducing cycling.
  • Analogy to Unrolling: LOGAN’s single-step latent update is comparable to unrolling optimization in generator parameter space but operates on the substantially lower-dimensional $z$, thus being computationally inexpensive.
  • Two-Time-Scale Update Rule (TTUR) Perspective: By pre-optimizing $z$ in the discriminator’s favor, LOGAN effectively accelerates the discriminator’s impact and/or retards the generator’s updates, increasing the effective time-scale separation between players, which is conducive to stable convergence.

These mechanisms act jointly to suppress divergence, oscillation, and other pathologies typical in GAN training.
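The cycling that the SGA argument refers to can be seen in the simplest adversarial game, $\min_x \max_y xy$. The sketch below illustrates the SGA idea on that toy game and is not LOGAN itself; `simulate`, `eta`, and `lam` are hypothetical names. With the gradient field $\xi = (y, -x)$, the Jacobian is purely antisymmetric, and the adjustment term $\lambda A^T \xi$ works out to $\lambda (x, y)$ for this game.

```python
import math

def simulate(steps=100, eta=0.1, lam=0.0):
    """Simultaneous gradient play on min_x max_y x*y.

    lam = 0 gives plain simultaneous updates; lam > 0 adds the SGA-style
    term lam * A^T xi, which equals lam * (x, y) for this bilinear game.
    Returns the distance from the equilibrium (0, 0) after `steps` updates.
    """
    x, y = 1.0, 1.0
    for _ in range(steps):
        xi_x, xi_y = y, -x                  # each player's own loss gradient
        adj_x, adj_y = lam * x, lam * y     # adjustment term for this game
        x, y = x - eta * (xi_x + adj_x), y - eta * (xi_y + adj_y)
    return math.hypot(x, y)

plain = simulate(lam=0.0)      # spirals away from the equilibrium
adjusted = simulate(lam=1.0)   # spirals into the equilibrium
print(plain > math.sqrt(2), adjusted < math.sqrt(2))
```

Each plain step multiplies the squared distance from the equilibrium by $1 + \eta^2$, so the iterates rotate outward; the adjusted step multiplies it by $(1 - \eta\lambda)^2 + \eta^2$, which is below one for the values used here.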

4. Empirical Results and Experimental Protocol

LOGAN achieves state-of-the-art results on ImageNet ($128 \times 128$) with established architectures:

  • Architecture: BigGAN-deep with class conditioning, spectral-normalized ResNets, self-attention, latent dimension expanded $128 \to 256$, latent prior switched to $\mathrm{Uniform}([-1,1])$, LeakyReLU (slope $0.2$) in generator final layers.
  • Optimizers and Hyperparameters:
    • Adam: $\beta_1=0$, $\beta_2=0.999$, learning rate $2\times10^{-4}$
    • Batch size: $2048$
    • Latent optimiser: $\alpha=0.9$, $\beta=5.0$, $w_r=300.0$
    • Fraction of latent dimensions optimised per iteration: $c=50\%$ (chosen randomly)
    • Training duration: Up to 600k steps (LOGAN delays collapse compared to vanilla BigGAN-deep)
    • No latent optimisation used at evaluation time
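The latent-optimiser settings listed above can be gathered into a small configuration object, together with one way of restricting the update to a random fraction of latent dimensions. This is a sketch under stated assumptions: `LatentOptConfig`, `random_dim_mask`, and `masked_ngd_delta` are hypothetical names, and zeroing the gradient on unselected dimensions is an assumed reading of "fraction of latent dimensions optimised", since the text above does not spell out the mechanism.

```python
from dataclasses import dataclass

import numpy as np

@dataclass(frozen=True)
class LatentOptConfig:
    alpha: float = 0.9       # latent step size
    beta: float = 5.0        # damping in F' = g g^T + beta * I
    w_r: float = 300.0       # weight of the ||dz||^2 regulariser
    fraction: float = 0.5    # share of latent dimensions optimised (c)
    latent_dim: int = 256

def random_dim_mask(cfg, rng):
    """0/1 mask selecting a random `fraction` of the latent dimensions."""
    k = int(round(cfg.fraction * cfg.latent_dim))
    mask = np.zeros(cfg.latent_dim)
    mask[rng.choice(cfg.latent_dim, size=k, replace=False)] = 1.0
    return mask

def masked_ngd_delta(g, mask, cfg):
    """NGD step computed from the masked gradient only (assumed mechanism)."""
    g_masked = g * mask
    return cfg.alpha * g_masked / (cfg.beta + g_masked @ g_masked)

cfg = LatentOptConfig()
rng = np.random.default_rng(0)
mask = random_dim_mask(cfg, rng)
g = rng.normal(size=cfg.latent_dim)   # placeholder gradient, not from a real model
dz = masked_ngd_delta(g, mask, cfg)
print(int(mask.sum()))
```

Under this reading, unselected dimensions receive a zero update and keep their sampled values.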

A direct comparison on the ImageNet benchmark yielded significant improvements:

| Model | FID (↓) | IS (↑) |
| --- | --- | --- |
| BigGAN-deep (orig.) | 5.7 ± 0.3 | 124.5 ± 2.0 |
| Baseline (ours) | 4.92 ± 0.05 | 126.6 ± 1.3 |
| LOGAN (GD) | 4.86 ± 0.09 | 127.7 ± 3.5 |
| LOGAN (NGD) | 3.36 ± 0.14 | 148.2 ± 3.1 |

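For the NGD variant, this reflects a 32% improvement in FID and a 17% boost in IS compared to the authors' re-implemented BigGAN-deep baseline, whereas the GD variant stays within the reported error bars of that baseline. LOGAN also demonstrates a superior FID/IS trade-off when varying the truncation parameter (Wu et al., 2019).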
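The quoted percentages follow directly from the table's mean values for the re-implemented baseline and LOGAN (NGD). A short check, with `relative_change` as a hypothetical helper name:

```python
def relative_change(baseline, new):
    """Signed change of `new` relative to `baseline`."""
    return (new - baseline) / baseline

fid_gain = -relative_change(4.92, 3.36)   # FID: lower is better, so flip the sign
is_gain = relative_change(126.6, 148.2)   # IS: higher is better

print(f"FID improvement: {fid_gain:.1%}, IS improvement: {is_gain:.1%}")
# prints "FID improvement: 31.7%, IS improvement: 17.1%"
```

These round to the 32% and 17% figures; the calculation uses the means only and ignores the reported standard deviations.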

5. Implementation Considerations and Hyperparameter Regimes

LOGAN introduces some computational and practical implications:

  • Overhead: Training time per iteration increases by approximately $3\times$ due to the extra forward/backward pass for $z'$. There is no additional cost at evaluation, as latent optimisation is not used at inference.
  • Recommended Hyperparameter Ranges (from grid search):
    • $\alpha$: $0.7$–$1.0$ (best: $0.9$)
    • $\beta$: $0.1$–$10$ (best: $\sim 5$ for BigGAN)
    • $w_r$: $0.1$–$500$ (best: $300$ for deep models)
    • Fraction $c$: $30\%$–$90\%$ (best: $50\%$–$80\%$)
  • Failure Modes:
    • Excessively large $\alpha$ or too small $\beta$ causes latent overshoot and destabilization.
    • LOGAN substantially delays (but does not eliminate) mode collapse; divergence may occur with very prolonged training due to higher-order adversarial effects.
    • Very small batch sizes can compromise the intended two-time-scale beneficial effects.

A plausible implication is that the choice of hyperparameters such as $\alpha$, $\beta$, and $w_r$ should scale with model capacity and dataset size to balance stability and expressiveness.
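The overshoot failure mode for large $\alpha$ or small $\beta$ can be made concrete for the NGD rule. The following short derivation is a worked consequence of the closed-form step $\Delta z = \alpha g / (\beta + \|g\|^2)$ and is not taken from the paper; it ignores clipping and writes $s = \|g\|$ for the gradient norm.

```latex
\|\Delta z\| = h(s) = \frac{\alpha s}{\beta + s^2}, \qquad
h'(s) = \frac{\alpha\,(\beta - s^2)}{(\beta + s^2)^2} = 0
\;\Longleftrightarrow\; s = \sqrt{\beta}, \qquad
\max_{s \ge 0} h(s) = \frac{\alpha}{2\sqrt{\beta}}
```

With $\alpha = 0.9$ and $\beta = 5$ the step norm is therefore at most about $0.20$ whatever the gradient magnitude, whereas with $\beta = 0.1$ the bound rises to about $1.42$, which exceeds the half-width of the $[-1, 1]$ latent range.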

6. Relationship to Broader Adversarial Optimisation Advances

LOGAN sits conceptually at the intersection of game-theoretic GAN optimization and efficient second-order stabilisation methods:

  • By exploiting low-dimensional latent optimisations and closed-form natural gradients, LOGAN approximates theoretical stabilisation terms found in SGA and unrolled GANs, both of which are computationally intensive at large scale.
  • The approach generalises to other GAN architectures requiring sophisticated dynamic management, suggesting broader applicability in stabilising adversarial learning scenarios.
  • The methodology aligns with TTUR principles for two-player minimax optimization, offering an actionable strategy to enforce discriminator dominance via the latent channel.

LOGAN thus provides a tractable means to enhance the stability, efficiency, and quality of high-capacity GANs in large-scale settings, as supported by empirical evidence on ImageNet benchmarks (Wu et al., 2019).

References (1)
