Rewriting a Deep Generative Model (2007.15646v1)

Published 30 Jul 2020 in cs.CV, cs.GR, and cs.LG

Abstract: A deep generative model such as a GAN learns to model a rich set of semantic and physical rules about the target distribution, but up to now, it has been obscure how such rules are encoded in the network, or how a rule could be changed. In this paper, we introduce a new problem setting: manipulation of specific rules encoded by a deep generative model. To address the problem, we propose a formulation in which the desired rule is changed by manipulating a layer of a deep network as a linear associative memory. We derive an algorithm for modifying one entry of the associative memory, and we demonstrate that several interesting structural rules can be located and modified within the layers of state-of-the-art generative models. We present a user interface to enable users to interactively change the rules of a generative model to achieve desired effects, and we show several proof-of-concept applications. Finally, results on multiple datasets demonstrate the advantage of our method against standard fine-tuning methods and edit transfer algorithms.

Citations (126)
Summary

Overview of "Rewriting a Deep Generative Model"

The focus of this paper is the manipulation of learned semantic and physical rules within deep generative models, specifically GANs. The authors present a novel problem setting in which the internal rules of a deep network are modified directly to produce desired changes in its generated output. This task, termed "model rewriting," applies a change across the entire distribution of generated images, in contrast to traditional editing methods that modify individual output images one at a time.

Methodology

The paper introduces a method for rewriting deep generative models by focusing on manipulating a layer in the network as a linear associative memory. The key idea is to alter the weights of this layer to selectively change specific semantic rules while maintaining the integrity of existing rules. The authors provide an algorithm that modifies one entry of this associative memory, using a combination of constrained optimization and associative memory theory. This includes:

  1. Objective Design: The aim is to modify a specific rule encoded within the network by updating the weights so that specific input conditions produce new, desired output conditions, while minimizing collateral effects on other outputs.
  2. Interpretation as Associative Memory: The authors describe a convolutional layer's weights as an associative memory storing key-value pairs, linking input features (keys) to output features (values). This perspective allows the application of concepts from associative memory to manage and constrain changes made to the model.
  3. Optimization Strategy: A rank-one update approach is employed, where the optimization is constrained to a specific directional change in the weights of a layer. This ensures targeted and minimally invasive changes to the generative model.
  4. User Interface for Model Editing: A three-step process (Copy-Paste-Context) in a user interface allows users to specify changes interactively, facilitating intuitive manipulation of the model by non-experts.
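The associative-memory view in steps 1-3 admits a simple closed-form special case: to make a layer's weights map a single target key to a new value while minimally disturbing other key directions (weighted by the keys' second-moment matrix), one can apply a rank-one update along the direction C⁻¹k*. The paper's full method optimizes the update magnitude against the generator's output rather than solving this least-squares case directly, so the following NumPy sketch (with a hypothetical `rank_one_rewrite` helper) only illustrates the underlying idea, not the authors' exact algorithm:

```python
import numpy as np

def rank_one_rewrite(W, k_star, v_star, C):
    """Rank-one update so that W maps key k_star to value v_star.

    W:      (d_out, d_in) layer weights, viewed as a linear associative memory.
    k_star: (d_in,)  target input key.
    v_star: (d_out,) desired output value for that key.
    C:      (d_in, d_in) second-moment matrix of the layer's keys;
            constraining the update to the direction C^{-1} k_star
            minimizes interference with the other stored associations.
    """
    d = np.linalg.solve(C, k_star)       # update direction C^{-1} k*
    residual = v_star - W @ k_star       # current error on the target key
    # Outer product confines the change to rank one along d.
    return W + np.outer(residual, d) / (k_star @ d)

# Toy usage: rewrite one association in a random "memory".
W = np.random.randn(3, 4)
C = np.eye(4)                            # assume whitened keys for simplicity
k = np.random.randn(4)
v = np.random.randn(3)
W_new = rank_one_rewrite(W, k, v, C)
print(np.allclose(W_new @ k, v))         # the target key now maps to v
```

With whitened keys (C = I), the update leaves any key orthogonal to k* entirely untouched, which is the "minimally invasive" property the optimization strategy relies on.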

Results

The paper demonstrates the efficacy of the proposed method across various tasks: adding new objects into generated scenes, removing undesired features, and altering contextual graphical rules in the generation process. For instance, changes like replacing architectural elements, modifying facial expressions, and inverting lighting effects in scenes illustrate the method's capability to generalize modifications across a wide range of outputs. The results show advantages in photorealism and adherence to intended changes, outperforming certain baseline methods like fine-tuning and traditional edit transfer techniques.

Implications and Future Work

The implications of this research are substantial for the fields of computer vision and graphics. By enabling selective rule changes in generative models without retraining from scratch, this work provides a tool for efficient model customization and content generation, potentially reducing computational costs and the need for vast datasets.

From a theoretical perspective, this exploration of internal model semantics opens pathways for greater interpretability and understanding of deep generative networks. Practically, it can enhance creative applications in media production and virtual environments.

Future work could explore extending this technique to other types of generative models, such as those used in language and audio synthesis. Refining the method to handle more complex rule manipulations and improving the user interaction interface are also promising directions.

In conclusion, this paper offers a sophisticated yet intuitive approach to altering the internal mechanics of deep generative models, providing both a deepened understanding and enhanced utility of these tools in artificial intelligence.
