Controlling the image generation process with parametric activation functions
Published 17 Oct 2025 in cs.CV and cs.AI | (2510.15778v1)
Abstract: As image generative models continue to increase not only in their fidelity but also in their ubiquity, the development of tools that leverage direct interaction with their internal mechanisms in an interpretable way has received little attention. In this work, we introduce a system that allows users to develop a better understanding of the model through interaction and experimentation. By giving users the ability to replace activation functions of a generative network with parametric ones, along with a way to set the parameters of these functions, we introduce an alternative approach to controlling the network's output. We demonstrate the use of our method on StyleGAN2 and BigGAN networks trained on FFHQ and ImageNet, respectively.
The paper presents a system that replaces static activation functions with parametric versions to allow real-time control over GAN outputs.
The methodology employs a GUI for layer selection and parameter adjustment, demonstrating clear effects on image structure and style in StyleGAN2 and BigGAN.
Empirical results reveal that subtle parameter tweaks yield minor image modifications while larger changes can produce dramatic, sometimes unpredictable, effects.
Parametric Activation Functions for Controlling Image Generation in GANs
Introduction
This paper presents a system for interactive control over image generative models by enabling users to replace static activation functions with parametric alternatives and adjust their parameters in real time. The approach is motivated by the need for interpretable and direct manipulation of generative neural networks, particularly in the context of Explainable AI (xAI) and computational creativity. The system is demonstrated on two canonical architectures: StyleGAN2 (trained on FFHQ) and BigGAN (trained on ImageNet), providing empirical evidence of the impact of parametric activation functions on generated outputs.
Figure 1: The GUI enables layer selection, activation function replacement, parameter adjustment, and real-time visualization of results.
System Architecture and Control Method
The core contribution is a graphical user interface (GUI) that exposes the internal structure of a generative model, allowing users to:
Select neural layers for intervention
Replace activation functions with parametric variants (SinLU, ReLUN, ShiLU, and polynomial functions)
Adjust parameters of these functions interactively
Visualize the resulting output images in real time
This design facilitates a direct mapping between architectural changes and output variations, supporting both educational and creative exploration. The GUI also allows for latent vector editing and selective layer disabling, further expanding the control space.
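The paper describes this mechanism at the GUI level rather than in code, but the core operation, swapping an activation module inside a trained network and re-running the forward pass, can be sketched in PyTorch. The helper `replace_activations`, the toy `mapping` network, and the use of `nn.LeakyReLU` as a parametric stand-in are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

def replace_activations(module: nn.Module, target: type, factory) -> int:
    """Recursively swap every submodule of type `target` for a fresh module
    built by `factory()`. Returns the number of replacements made."""
    count = 0
    for name, child in module.named_children():
        if isinstance(child, target):
            setattr(module, name, factory())
            count += 1
        else:
            count += replace_activations(child, target, factory)
    return count

# Toy stand-in for a pretrained generator's mapping network.
mapping = nn.Sequential(
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
)

# Swap the static ReLUs for a parametric alternative; the slope plays the
# role of a user-controlled parameter exposed through the GUI.
n_swapped = replace_activations(mapping, nn.ReLU,
                                lambda: nn.LeakyReLU(negative_slope=0.3))

with torch.no_grad():
    out = mapping(torch.randn(1, 512))  # re-run the forward pass to see the effect
print(f"replaced {n_swapped} activations; output norm = {out.norm().item():.3f}")
```

In the actual system, the factory would presumably produce one of the parametric functions described in the next subsection, with its parameters bound to GUI sliders.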
Parametric Activation Functions
The paper investigates several parametric activation functions:
SinLU: SinLU(x) = (x + a·sin(bx)) · σ(x), where σ is the sigmoid function, a controls amplitude, and b controls frequency. High parameter values induce unpredictable, abstract outputs.
ReLUN: ReLUN(x) = min(max(0, x), n), with n as a tunable upper bound.
ShiLU: ShiLU(x) = a·ReLU(x) + b, with a controlling the slope and b the vertical shift.
Polynomial Activation: f_n(x) = (2/n) · Σ_{i=1}^{n} w_i · σ(x)^i, where the w_i are user-controlled weights.
These functions are injected into selected layers, modifying the intermediate feature maps and, consequently, the final output. The cascading effect of parameter changes across multiple layers is highlighted as a source of significant output diversity.
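For concreteness, the four functions can be written as small PyTorch modules whose parameters are set from the GUI. This is a minimal sketch based on the definitions above, not the authors' code; parameters are kept as plain floats so a slider can set them directly:

```python
import torch
import torch.nn as nn

class SinLU(nn.Module):
    """SinLU(x) = (x + a*sin(b*x)) * sigmoid(x); a = amplitude, b = frequency."""
    def __init__(self, a: float = 1.0, b: float = 1.0):
        super().__init__()
        self.a, self.b = a, b

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return (x + self.a * torch.sin(self.b * x)) * torch.sigmoid(x)

class ReLUN(nn.Module):
    """ReLUN(x) = min(max(0, x), n): a ReLU clipped at a tunable ceiling n."""
    def __init__(self, n: float = 6.0):
        super().__init__()
        self.n = n

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.clamp(x, min=0.0, max=self.n)

class ShiLU(nn.Module):
    """ShiLU(x) = a * ReLU(x) + b; a = slope, b = vertical shift."""
    def __init__(self, a: float = 1.0, b: float = 0.0):
        super().__init__()
        self.a, self.b = a, b

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.a * torch.relu(x) + self.b

class PolyAct(nn.Module):
    """f_n(x) = (2/n) * sum_{i=1..n} w_i * sigmoid(x)**i, w_i user-controlled."""
    def __init__(self, weights: list[float]):
        super().__init__()
        self.w = torch.tensor(weights, dtype=torch.float32)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = torch.sigmoid(x)
        n = self.w.numel()
        powers = torch.stack([s ** i for i in range(1, n + 1)], dim=-1)
        return (2.0 / n) * (powers * self.w).sum(dim=-1)

x = torch.linspace(-3.0, 3.0, 7)
for act in (SinLU(a=2.0, b=1.5), ReLUN(n=2.0), ShiLU(a=0.5, b=0.1),
            PolyAct([0.5, -0.2, 0.8])):
    print(type(act).__name__, act(x))
```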
Empirical Results: StyleGAN2
Experiments on StyleGAN2 reveal that:
Modifying activation functions in the mapping network primarily affects image structure, as the disentangled latent vector is altered.
Early mapping layers provide fine-grained control, while generator network modifications influence both style and structure.
Later generator layers tend to affect global properties such as coloration.
Figure 2: Progressive replacement of activation functions in the mapping network of StyleGAN2 yields incremental changes in facial structure and style.
Figure 3: Parametric activation functions in both mapping and generator networks of StyleGAN2 produce compounded effects on image structure and style.
The results demonstrate that small parameter adjustments lead to subtle output changes, while larger modifications can produce dramatic, often unpredictable, alterations. This supports the claim that parametric activation functions offer a viable mechanism for interactive control, albeit with a trade-off between precision and creative exploration.
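This trade-off can be illustrated with a small parameter sweep: holding the latent code fixed and varying a single activation parameter, the relative change in the network's output grows with the size of the tweak. The toy mapping network below stands in for StyleGAN2's eight-layer fully connected mapping network; loading a real pretrained generator is assumed away for brevity:

```python
import torch
import torch.nn as nn

class SinLU(nn.Module):
    """SinLU(x) = (x + a*sin(b*x)) * sigmoid(x)."""
    def __init__(self, a: float = 1.0, b: float = 1.0):
        super().__init__()
        self.a, self.b = a, b

    def forward(self, x):
        return (x + self.a * torch.sin(self.b * x)) * torch.sigmoid(x)

torch.manual_seed(0)
act = SinLU()  # one shared instance, so a single slider drives every layer
mapping = nn.Sequential(nn.Linear(512, 512), act, nn.Linear(512, 512), act)

z = torch.randn(1, 512)  # fixed latent code
with torch.no_grad():
    baseline = mapping(z)  # output at the default a = 1.0
    for a in (1.1, 1.5, 2.0, 8.0):  # small tweak vs. large change
        act.a = a
        drift = (mapping(z) - baseline).norm() / baseline.norm()
        print(f"a = {a:4.1f}  relative output change = {drift.item():.3f}")
```

In the full system, the same sweep would be driven interactively, with the intermediate image rendered at each slider position.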
Empirical Results: BigGAN
BigGAN experiments show that:
Applying SinLU or ReLUN to early layers can alter image content, though the effects are less pronounced than in StyleGAN2.
Polynomial activation functions are highly sensitive; using them in more than two layers often degrades output quality, and small parameter changes can have outsized effects.
Figure 4: BigGAN outputs with SinLU and ReLUN applied to the second layer demonstrate content variation with randomly chosen parameters.
Figure 5: BigGAN outputs with third-order polynomial activation functions; increasing the number of modified layers amplifies output divergence.
The instability of polynomial activations is attributed to gradient explosion and increased training complexity, consistent with prior literature. The authors restrict parameter ranges to mitigate these effects, but the method remains fragile.
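The mitigation of restricting admissible parameter ranges amounts to clamping slider values before they reach the activation module. A minimal sketch follows; the specific bounds are illustrative assumptions, as the paper does not report exact values:

```python
# Illustrative safe ranges per parameter; the paper does not publish exact bounds.
PARAM_BOUNDS = {
    "poly_w":  (-1.5, 1.5),   # polynomial weights are the most sensitive
    "sinlu_a": (0.0, 4.0),
    "sinlu_b": (0.0, 4.0),
}

def clamp_param(name: str, value: float) -> float:
    """Clamp a GUI slider value into its safe range before it reaches the
    activation module, so extreme settings cannot degrade the output."""
    lo, hi = PARAM_BOUNDS[name]
    return max(lo, min(hi, value))

print(clamp_param("poly_w", 7.3))  # a drag far past the safe range -> 1.5
```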
Practical and Theoretical Implications
The proposed system advances the state of interactive model manipulation by exposing activation function parameters as a control axis. This has several implications:
Educational Utility: Non-expert users can develop intuition about network internals and their influence on output, potentially increasing AI literacy.
Creative Exploration: Artists and designers gain a new tool for generative experimentation, enabling the production of both realistic and abstract imagery.
xAI Research: The approach complements existing xAI techniques by providing a direct, interpretable mechanism for output control.
However, the lack of feature-specific control and the reliance on trial-and-error exploration limit the precision of the method. The authors note that guided control, possibly via integration with text-to-image models or feature disentanglement techniques, could enhance usability.
Limitations and Future Directions
Key limitations include:
Imprecision: Unguided parameter adjustment does not guarantee targeted output changes.
Sensitivity: Some activation functions (notably polynomials) are highly sensitive to parameter values, risking output degradation.
Scalability: The approach may not generalize well to very deep or complex architectures without additional safeguards.
Proposed future directions include user studies to quantify the educational and creative benefits of the system.
Conclusion
This paper introduces a system for interactive control of image generative models via parametric activation functions, demonstrated on StyleGAN2 and BigGAN. The method enables real-time exploration of architectural modifications and their impact on output, supporting both educational and creative use cases. While the approach offers a novel axis of control, its precision is limited by the lack of feature-specific guidance and the sensitivity of certain activation functions. Future research should address these limitations and evaluate the system's utility in broader contexts.