Conditioned Normalizing Flows Overview

Updated 1 December 2025
  • Conditioned normalizing flows are invertible generative models that learn the conditional density p(y|x) through bijective mappings and tractable Jacobian determinants.
  • They integrate conditioning at every transformation layer, enabling structured prediction, uncertainty quantification, and efficient sampling in high-dimensional settings.
  • They are applied in inverse problems, time-series forecasting, scientific inference, and fairness, offering competitive performance compared to GANs and VAEs.

Conditioned normalizing flows are a class of invertible generative models designed to learn conditional probability densities of the form $p(y \mid x)$, where $x$ denotes side information or a conditioning variable and $y$ is the target or output variable. By parameterizing bijective mappings $f_\theta$ with efficient, tractable Jacobian determinants, these models generalize standard normalizing flows to the conditional setting, enabling efficient sampling and density estimation for complex, high-dimensional conditional distributions. Conditioned normalizing flows have become a foundational tool for structured prediction, probabilistic inference in inverse problems, scientific modeling, and real-world systems where uncertainty quantification and exact likelihood evaluation are critical.

1. Mathematical Formulation of Conditioned Normalizing Flows

Let $x \in \mathcal{X}$ be the conditioning variable and $y \in \mathcal{Y}$ the target variable. A conditioned normalizing flow (often abbreviated as CNF) posits an invertible map $f_\theta : \mathcal{Y} \times \mathcal{X} \to \mathcal{Z}$ parameterized by $\theta$. For each $x$, the map $f_\theta$ induces a bijection between $y$ and $z = f_\theta(y; x)$, where $\mathcal{Z}$ is typically chosen so that $p_Z(z)$ is a tractable base density (e.g., a standard multivariate Gaussian).

The conditional density is given by the change-of-variables formula:

$$p(y \mid x) = p_Z\big(f_\theta(y; x)\big) \cdot \left| \det \frac{\partial f_\theta(y; x)}{\partial y} \right|.$$

Equivalently, writing $g_\theta = f_\theta^{-1}$ for the inverse map, so that $y = g_\theta(z; x)$ with $z = f_\theta(y; x)$,

$$p(y \mid x) = p_Z(z) \cdot \left| \det \frac{\partial g_\theta(z; x)}{\partial z} \right|^{-1},$$

which is the form used when sampling: draw $z \sim p_Z$ and map it through the inverse under the same conditioning $x$.
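As a simple illustration (a standard special case, not drawn from any of the cited papers), consider a one-dimensional conditional affine flow $f_\theta(y; x) = \big(y - \mu_\theta(x)\big)/\sigma_\theta(x)$ with a standard normal base density. Here $\left|\det \partial f_\theta / \partial y\right| = 1/\sigma_\theta(x)$, and the change-of-variables formula recovers the familiar conditional Gaussian:

$$p(y \mid x) = \mathcal{N}\!\left(\frac{y - \mu_\theta(x)}{\sigma_\theta(x)}; 0, 1\right) \cdot \frac{1}{\sigma_\theta(x)} = \mathcal{N}\big(y; \mu_\theta(x), \sigma_\theta(x)^2\big).$$

Stacking several conditional bijections in place of the single affine map yields non-Gaussian conditionals.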

For continuous normalizing flows, the evolution of $z(t)$ is parameterized by an ODE, and the conditional log-density accumulates the negative trace of the Jacobian of the flow field (Abdal et al., 2020).
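For reference, this accumulation is the instantaneous change-of-variables relation for neural-ODE flows, written here with the conditioner $x$ entering the flow field as an additional input:

$$\frac{dz(t)}{dt} = f_\theta\big(z(t), t, x\big), \qquad \frac{d \log p\big(z(t) \mid x\big)}{dt} = -\operatorname{Tr}\!\left(\frac{\partial f_\theta}{\partial z(t)}\right).$$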

Conditioning is incorporated at every layer of the flow, ensuring that the mapping and the density both adapt to the auxiliary information $x$ (Winkler et al., 2019, Baranchuk et al., 2021).

2. Architectural Design and Conditioning Methods

Conditioned flows are composed of sequences of invertible transformations; common choices include RealNVP-style affine coupling blocks, rational-quadratic spline flows, autoregressive flows (MAF/IAF), and continuous-time ODE-based flows. In multi-resolution flows, split and hierarchical priors further increase expressivity.

Conditioning on $x$ (or a context variable $c$) may occur in several places:

  • Base distribution: the mean and variance of the base density $p_Z(z \mid x)$ are parameterized as functions of $x$ (e.g., by deep networks).
  • Coupling or transformation blocks: affine scale/shift and nonlinear transformation parameters are conditioned on $x$, typically via concatenation, FiLM layers, or conditional neural network embeddings (Winkler et al., 2019, Baranchuk et al., 2021, Klein et al., 2022); a minimal coupling-layer sketch is given after this list.
  • Amortization: in amortized variational inference settings, the conditioner may be a learned function of $x$ (e.g., an encoder), enabling zero-shot inference for arbitrary $x$ (Whang et al., 2020, Abdal et al., 2020).
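The following minimal sketch (written against the PyTorch API; the class name, hidden width, and partition scheme are illustrative rather than taken from any of the cited papers) shows conditioning by concatenation inside a single affine coupling block: the scale and shift applied to one partition of $y$ are predicted from the other partition together with $x$, and the block exposes both the forward map with its log-determinant and the exact inverse.

```python
# Minimal sketch of a conditional affine coupling block (assumed PyTorch API).
# Names such as `ConditionalAffineCoupling` and `hidden_dim` are illustrative.
import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    def __init__(self, y_dim, x_dim, hidden_dim=128):
        super().__init__()
        self.d = y_dim // 2  # size of the untouched partition y1
        self.net = nn.Sequential(
            nn.Linear(self.d + x_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2 * (y_dim - self.d)),  # outputs scale and shift
        )

    def forward(self, y, x):
        y1, y2 = y[:, :self.d], y[:, self.d:]
        # Conditioning by concatenation: coupling parameters depend on (y1, x).
        s, t = self.net(torch.cat([y1, x], dim=-1)).chunk(2, dim=-1)
        s = torch.tanh(s)                      # bound log-scales for stability
        z2 = y2 * torch.exp(s) + t             # affine transform of y2
        log_det = s.sum(dim=-1)                # log |det dz/dy| for this block
        return torch.cat([y1, z2], dim=-1), log_det

    def inverse(self, z, x):
        z1, z2 = z[:, :self.d], z[:, self.d:]
        s, t = self.net(torch.cat([z1, x], dim=-1)).chunk(2, dim=-1)
        s = torch.tanh(s)
        y2 = (z2 - t) * torch.exp(-s)          # exact inverse of the affine map
        return torch.cat([z1, y2], dim=-1)
```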

Exactly conditioning the flow’s output is computationally intractable in general (proven NP-hard for additive-coupling flows; see (Whang et al., 2020)), motivating variational and amortized approximations.

3. Training Objectives and Algorithms

The principal training objective is conditional maximum likelihood, i.e., minimization of the negative log-likelihood

$$\mathcal{L}(\theta) = -\mathbb{E}_{(x, y) \sim \mathcal{D}} \big[\log p(y \mid x; \theta)\big],$$

where

$$\log p(y \mid x; \theta) = \log p_Z\big(f_\theta(y; x)\big) + \log \left| \det \frac{\partial f_\theta(y; x)}{\partial y} \right|.$$

For inverse problems and incomplete observations, the target conditioning may be softened with Gaussian smoothing or handled with variational Bayes using ELBO surrogates. A particularly expressive approach constructs the conditional as a composition of two flows: one pre-trained unconditional flow as a prior, and a learned pre-generator flow mapping from noise to the posterior in latent space, optimizing a KL divergence objective under smoothing (Whang et al., 2020).

Training is typically performed with stochastic gradient methods (SGD, Adam), backpropagating through all invertible layers and conditioner networks. Batch size, learning-rate schedules, normalization within coupling nets, and reparameterization tricks follow domain- and architecture-specific choices (Winkler et al., 2019, Whang et al., 2020, Baranchuk et al., 2021). A minimal training sketch is given below.
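Below is a minimal training-step sketch under the same assumptions (PyTorch; it reuses the illustrative `ConditionalAffineCoupling` block from the earlier sketch, and `loader` stands in for any iterator over $(x, y)$ pairs). It implements the conditional negative log-likelihood above with a standard normal base density; a practical model would stack several coupling blocks with permutations between them.

```python
# Minimal conditional maximum-likelihood training sketch (assumed PyTorch API).
# `flow`, `loader`, and the dimensions below are placeholders for illustration.
import torch
from torch.distributions import Normal

flow = ConditionalAffineCoupling(y_dim=8, x_dim=4)   # from the sketch above
base = Normal(0.0, 1.0)                              # tractable base density p_Z
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)

for x, y in loader:                                  # pairs (x, y) ~ D
    z, log_det = flow(y, x)                          # z = f_theta(y; x)
    log_pz = base.log_prob(z).sum(dim=-1)            # log p_Z(z), elementwise Gaussian
    nll = -(log_pz + log_det).mean()                 # -E[log p(y | x; theta)]
    opt.zero_grad()
    nll.backward()
    opt.step()

# Sampling: draw z ~ p_Z and invert the flow under the same conditioning x, e.g.
# y_sample = flow.inverse(base.sample((batch_size, 8)), x)
```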

4. Methodological Variations and Inference

Several methodological variants address domain-specific needs:

  • Flows for flows: Both the transport map and the conditional base density are themselves parameterized by flows, enabling “flow-to-flow” mappings between arbitrary conditional distributions (Klein et al., 2022).
  • Boundary-conditioned flows: Hard constraints (e.g., antisymmetry of fermionic wavefunctions) are imposed by restricting the domain and using shape-adaptive priors (O-splines) and invertible I-spline bijections for analytic enforcement of boundary or symmetry conditions (Thiede et al., 2022).
  • Variational conditioning: For partially observed data, variational Schur complement approaches use inference networks to parameterize distributions over missing variables, optimizing ELBOs that involve sub-Jacobian determinants and equality-constrained solves in latent space (Moens et al., 2021).
  • Spatio-temporal flows: Coupling blocks, 3D convolutions, and ConvLSTM/backbone conditioners enable modeling of high-dimensional conditional time series or remote sensing data, with factorized priors and multi-scale latent parameterizations (Winkler et al., 2023, Rasul et al., 2020).
  • Distillation: knowledge from a heavyweight invertible CNF can be transferred to a feed-forward student that lacks invertibility and tractable likelihoods but enables rapid inference, by matching samples in latent space under the same conditioning (Baranchuk et al., 2021).
  • Decorrelation: Conditioning on protected attributes enables the construction of representations or classifiers whose output distributions are independent of nuisance or sensitive variables, while preserving ROC at each fixed attribute value (Klein et al., 2022).

5. Practical Applications and Empirical Insights

Conditioned normalizing flows are deployed in a range of structured prediction, uncertainty quantification, and scientific inference tasks:

  • Inverse problems: Posterior inference in imaging (inpainting, compressed sensing, super-resolution, colorization) using flow-compositions and amortized (zero-shot) inference (Whang et al., 2020, Xiao et al., 2019).
  • Time series forecasting: High-dimensional probabilistic prediction for thousands of interdependent sequences via autoregressive and transformer-conditioned flows (Rasul et al., 2020).
  • Scientific and physics applications: Urban sound propagation from 2D layouts for regulatory compliance and planning, with >2000× speedup over PDE solvers and increased NLoS accuracy (Eckerle et al., 6 Oct 2025); VQMC electronic structure with hard antisymmetry constraints in quantum chemistry (Thiede et al., 2022); climate variable prediction with spatio-temporal flows for stable, calibrated long-horizon rollouts (Winkler et al., 2023).
  • Vision and segmentation: super-resolution and binary segmentation, calibrated via novel dequantization strategies and outperforming factored baselines and adversarial models on bits/dim, PSNR, SSIM, and F-score (Winkler et al., 2019).
  • Generative editing and exploration: Attribute-controlled manipulation and disentangled sampling in StyleGAN’s latent space using continuous CNFs (Abdal et al., 2020).
  • Statistical fairness and decorrelation: Construction of classifiers invariant to nuisance variables in collider physics, fairness in ML, and anomaly detection, with provable retention of ROC at every attribute slice (Klein et al., 2022).

Empirical metrics include negative log-likelihood, FID, IS, PSNR, SSIM, LPIPS, MAE, MSE, CRPS, calibration scatter, and domain-specific measures such as Jensen–Shannon divergence for decorrelation. Conditioning enables uncertainty quantification, sample diversity, and smoothness in conditional sampling, with performance that is competitive with or superior to GANs, VAEs, PixelCNN++, and specialized baselines.

6. Theoretical Hardness and Limitations

Exact conditional sampling for complex invertible flow models is proven to be NP-hard in general (even up to small total variation error), by reduction from SAT to the inversion and conditioning of additive-coupling flows. Approximate inference and amortized or variational approaches are thus required for tractable deployment (Whang et al., 2020).

Conditioned flows require careful control of architecture (e.g., partitioning and bijectivity), stability regularization (choice of noise/smoothing parameter, normalization), and the design of conditioners for deep or high-dimensional conditioning variables.

Boundary-conditioned flows are sensitive to topological mismatch between the prior and the target; shape-adaptive priors and flexible parameterizations are needed to prevent failure on multimodal or constrained target distributions (Thiede et al., 2022, Dong et al., 2022).

Feed-forward distillation sacrifices invertibility and likelihood computation for inference speed and compactness, limiting utility for tasks needing exact density evaluation (Baranchuk et al., 2021).

Scalability is retained thanks to the coupling-layer structure and parallelizable conditioners; compute scales linearly with batch size and spatial/temporal extent for common CNF instantiations (Rasul et al., 2020, Eckerle et al., 6 Oct 2025).

7. Summary Table: Principal Conditioned Flow Methodologies

| Methodology | Conditioning Mechanism | Domain/Application |
| --- | --- | --- |
| Base RealNVP/GLOW CNF | Conditioner in each coupling/prior layer | Image super-resolution, segmentation (Winkler et al., 2019) |
| Composed flows (posterior) | Pretrained base + learned conditional flow | Inverse problems, uncertainty (Whang et al., 2020) |
| Flows-for-flows | Flow-parameterized conditional base | Optimal transport, structured generation (Klein et al., 2022) |
| Variational Schur (VISCOS) | Partitioned latent inference and ELBO | Partial observations, imputation (Moens et al., 2021) |
| O-spline/I-spline boundary CNF | Shape-adaptive, domain-restricted priors | Fermionic wavefunctions, hard constraints (Thiede et al., 2022) |
| Spatio-temporal CNF | ConvLSTM, 3D conv., split prior | Remote sensing, climate, time series (Winkler et al., 2023, Rasul et al., 2020) |
| Decorrelation flows | Conditioning on protected/nuisance attributes | Classifier fairness, decorrelation (Klein et al., 2022) |
| Knowledge distillation | Flow → feed-forward student; sample matching | Fast inference, speech, vision (Baranchuk et al., 2021) |

All methodologies are unified by the use of invertible mappings and conditioning architectures, exact or variational likelihood training, and tractable Jacobian determinants for efficient density computation and sampling.

References

  • "Composing Normalizing Flows for Inverse Problems" (Whang et al., 2020)
  • "Real-time Prediction of Urban Sound Propagation with Conditioned Normalizing Flows" (Eckerle et al., 6 Oct 2025)
  • "Flows for Flows: Training Normalizing Flows Between Arbitrary Distributions with Maximum Likelihood Estimation" (Klein et al., 2022)
  • "Learning Likelihoods with Conditional Normalizing Flows" (Winkler et al., 2019)
  • "Waveflow: boundary-conditioned normalizing flows applied to fermionic wavefunctions" (Thiede et al., 2022)
  • "Multivariate Probabilistic Time Series Forecasting via Conditioned Normalizing Flows" (Rasul et al., 2020)
  • "Distilling the Knowledge from Conditional Normalizing Flows" (Baranchuk et al., 2021)
  • "StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows" (Abdal et al., 2020)
  • "Decorrelation with conditional normalizing flows" (Klein et al., 2022)
  • "Normalizing Flow with Variational Latent Representation" (Dong et al., 2022)
  • "Viscos Flows: Variational Schur Conditional Sampling With Normalizing Flows" (Moens et al., 2021)
  • "Towards Climate Variable Prediction with Conditioned Spatio-Temporal Normalizing Flows" (Winkler et al., 2023)
  • "A Method to Model Conditional Distributions with Normalizing Flows" (Xiao et al., 2019)