Graph Conditional Flow Models

Updated 16 May 2026

Graph conditional flow is a deep generative modeling approach that maps simple noise to complex graph structures conditioned on additional attributes or relational schemas.
It integrates continuous ODE/SDE frameworks with graph neural networks to ensure permutation equivariance and capture both local and global graph dependencies.
Recent models such as GraphCFM, PIFM, and DeFoG report strong results in applications including relational data synthesis, graph reconstruction, and scene graph generation.

Graph conditional flow refers to a family of deep generative modeling frameworks in which the stochastic or deterministic transformation of random variables into structured graph data is conditioned on another graph, a set of node/edge attributes, a context, or a relational schema. These models combine flow-based approaches (invertible or denoising ODEs/SDEs, discrete or continuous) with message-passing or other neural architectures that exploit and encode the symmetries and dependencies inherent in graphs. Conditioning mechanisms allow the generation, imputation, or reconstruction of graph-structured outputs that respect both local attributes and global relational constraints. Recent advances include graph conditional flow matching (GraphCFM), prior-informed flow matching (PIFM), discrete flow matching (DeFoG), and variants tailored to tasks such as relational data synthesis, graph reconstruction, and scene graph generation.

1. Mathematical Foundations of Graph Conditional Flow

Graph conditional flow models center on learning transport maps or stochastic processes that evolve a simple noise distribution (e.g., Gaussian, masked, or uniform prior) to a target distribution over graphs, subject to conditioning information. The process may be governed by continuous-time ODEs or SDEs, or by discrete-time Markov chains.

Continuous-time flow models: The process is parameterized as an ODE,

$\frac{d\phi_t(\mathbf{X})}{dt} = v_t(\phi_t(\mathbf{X}), \mathcal{G}),$

where $\phi_t$ transports a sample from noise ($t=0$) to data ($t=1$), $v_t$ is a learnable velocity field, and $\mathcal{G}$ is the conditioning graph (e.g., a foreign-key schema or an observed partial adjacency matrix) (Scassola et al., 21 May 2025, Chen et al., 29 Jan 2026).

Discrete flow matching: For discrete-valued graphs, the flow is defined over the joint state of nodes and edges by a continuous-time Markov chain (CTMC), with a learnable rate matrix $R_t$ that reverses the noising kernel (Qin et al., 2024).
Hybrid continuous-discrete flows: Some models, such as those for scene graph generation, combine ODEs for continuous attributes (geometry) with discrete-state Markov flows for categorical node or edge labels (Hu et al., 18 Apr 2026).

Conditioning is incorporated through explicit graph structure (as in GraphCFM), embedding-based priors (as in PIFM), or context features (e.g., an image, class label, or subgraph).
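As a rough illustration of the continuous-time formulation above, the following sketch integrates $\frac{d\phi_t}{dt} = v_t(\phi_t, \mathcal{G})$ with fixed-step Euler updates. The velocity field `toy_velocity` is a hypothetical stand-in for a trained network and is not taken from any of the cited models: it simply points towards the neighbour-averaged conditioning features, so the effect of the conditioning graph is easy to inspect.

```python
import numpy as np

def toy_velocity(x, t, adj, cond_feats):
    """Hypothetical conditional velocity field v_t(x, G).

    Points from the current state x towards a graph-dependent target:
    the neighbour-averaged conditioning features. A trained GNN would
    replace this function in a real model.
    """
    deg = adj.sum(axis=1, keepdims=True).clip(min=1.0)
    target = (adj @ cond_feats) / deg
    return (target - x) / (1.0 - t)

def euler_flow(x0, adj, cond_feats, n_steps=50):
    """Integrate dx/dt = v_t(x, G) from t=0 (noise) to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # t stays strictly below 1, so 1 - t never vanishes
        x = x + dt * toy_velocity(x, t, adj, cond_feats)
    return x

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)  # conditioning graph
cond_feats = rng.normal(size=(3, 2))                            # conditioning features
x0 = rng.normal(size=(3, 2))                                    # noise sample at t=0
x1 = euler_flow(x0, adj, cond_feats)                            # state at t=1
```

With this particular toy field, the final state coincides with the graph-dependent target regardless of the initial noise; a learned field would instead produce a distribution of outputs.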

2. Conditioning Mechanisms and Graph Representation

Conditional flow models for graphs encode both the structure and attributes of graphs in forms suitable for neural processing, with explicit handling of types, relationships, and permutation invariances.

Relational schemas as graphs: Relational databases are encoded by mapping each record as a node labeled by its table, and foreign-key dependencies as typed edges. This enables message passing over arbitrary relational structures (Scassola et al., 21 May 2025).
Observed or prior graphs: When reconstructing graphs from partial observations, conditioning can be via a masked adjacency matrix, node embeddings from graphon models, or inductive GNN embeddings. These serve as priors for the unobserved parts (Chen et al., 29 Jan 2026).
Hybrid attribute graphs: Scene graphs associate each node and edge with both continuous (e.g., bounding box) and discrete (e.g., semantic label) attributes, using hybrid flow architectures (Hu et al., 18 Apr 2026).
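The relational-schema encoding described above can be sketched on a toy database. This is an illustrative assumption about the data layout, not the preprocessing used by GraphCFM: the function name `records_to_graph`, the `"id"` key convention, and the edge-type string format are all hypothetical.

```python
def records_to_graph(tables, foreign_keys):
    """Turn toy relational tables into a typed node/edge list.

    tables: {table_name: [row_dict, ...]} where each row has an "id" key.
    foreign_keys: [(child_table, fk_column, parent_table), ...].
    Returns (nodes, edges): nodes maps (table, id) -> row attributes, and
    edges is a list of (child_node, parent_node, edge_type) triples.
    """
    nodes = {}
    for table, rows in tables.items():
        for row in rows:
            # Each record becomes one node, labeled by its table name.
            nodes[(table, row["id"])] = {k: v for k, v in row.items() if k != "id"}
    edges = []
    for child, column, parent in foreign_keys:
        for row in tables[child]:
            parent_id = row.get(column)
            if (parent, parent_id) in nodes:
                # Each foreign-key reference becomes one typed edge.
                edge_type = f"{child}.{column}->{parent}"
                edges.append(((child, row["id"]), (parent, parent_id), edge_type))
    return nodes, edges

tables = {
    "customer": [{"id": 1, "age": 34}, {"id": 2, "age": 51}],
    "order": [
        {"id": 10, "customer_id": 1, "amount": 20.0},
        {"id": 11, "customer_id": 1, "amount": 5.5},
        {"id": 12, "customer_id": 2, "amount": 9.0},
    ],
}
foreign_keys = [("order", "customer_id", "customer")]
nodes, edges = records_to_graph(tables, foreign_keys)
```

Keeping the table name in the node key and the foreign-key column in the edge type is what lets a heterogeneous GNN apply type-specific message functions.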

All models enforce or respect the symmetries of the space of graphs, most notably permutation equivariance: relabeling the nodes of the input permutes the node- and edge-level outputs in the same way, so the generated graph does not depend on the arbitrary ordering of nodes.
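Permutation equivariance can be checked numerically. The sketch below uses a generic sum-aggregation message-passing layer with random, untrained weights (an assumption for illustration, not the architecture of any cited model) and verifies that permuting the input nodes permutes the output rows in the same way.

```python
import numpy as np

def message_passing_layer(adj, x, w_self, w_neigh):
    """One sum-aggregation message-passing layer: tanh(X W_self + A X W_neigh)."""
    return np.tanh(x @ w_self + adj @ x @ w_neigh)

rng = np.random.default_rng(0)
n, d = 5, 3
upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1)
adj = (upper + upper.T).astype(float)    # symmetric adjacency, no self-loops
x = rng.normal(size=(n, d))
w_self, w_neigh = rng.normal(size=(d, d)), rng.normal(size=(d, d))

perm = rng.permutation(n)
P = np.eye(n)[perm]                      # permutation matrix

out = message_passing_layer(adj, x, w_self, w_neigh)
out_perm = message_passing_layer(P @ adj @ P.T, P @ x, w_self, w_neigh)

# Equivariance: f(P A P^T, P X) equals P f(A, X).
equivariant = np.allclose(out_perm, P @ out)
```

Summing the output over nodes gives a permutation-invariant graph-level readout, which is the property needed for an order-independent density over graphs.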

3. Neural Architectures for Conditional Flow on Graphs

Graph conditional flow models rely on graph neural architectures, from heterogeneous GNNs to graph Transformers, to encode and process the input and conditioning structures.

GNN-Conditioned Denoisers: In GraphCFM, a heterogeneous GNN processes noisy node features along with the type- and edge-specific foreign-key graph to produce context embeddings. Subsequent table- and column-specific MLPs predict the clean record features for each node (Scassola et al., 21 May 2025).
Permutation-Equivariant GNNs: PIFM designs the main flow network $f_\theta$ to be permutation-equivariant with respect to node orderings via graph-convolutional message passing, sometimes with additional time injection using FiLM or positional encodings (Chen et al., 29 Jan 2026).
Graph Transformers: In scene graph generation (FlowSG), the denoiser is a stack of Transformer blocks operating on a hybrid discrete-continuous graph, exploiting attention over both edges and nodes with additional multi-modal context from images (Hu et al., 18 Apr 2026).
Discrete Flow Matching Models: DeFoG employs a graph transformer with node- and edge-level random-walk positional features. Conditioning information is injected via FiLM-style modulation or concatenation at the network input (Qin et al., 2024).
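FiLM-style modulation, mentioned above for injecting time or conditioning information, amounts to a feature-wise scale and shift. The following is a minimal sketch with hypothetical, untrained projection matrices; the offset of 1 on the scale is one common variant and is an assumption here, not a detail confirmed for PIFM or DeFoG.

```python
import numpy as np

def film(h, cond, w_gamma, w_beta):
    """FiLM: scale and shift hidden features h using a conditioning vector.

    h: (n_nodes, d) hidden node features.
    cond: (c,) conditioning vector (e.g. a time or label embedding).
    w_gamma, w_beta: (c, d) projection matrices (hypothetical, untrained).
    """
    gamma = 1.0 + cond @ w_gamma   # offset by 1 so zero conditioning is the identity
    beta = cond @ w_beta
    return gamma * h + beta        # broadcast the same modulation over all nodes

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))
cond = rng.normal(size=(3,))
w_gamma, w_beta = rng.normal(size=(3, 8)), rng.normal(size=(3, 8))
out = film(h, cond, w_gamma, w_beta)
```

Because the same scale and shift are applied to every node, this form of conditioning does not break permutation equivariance.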

4. Learning Objectives and Theoretical Guarantees

Objective functions in graph conditional flow models are tightly linked to the underlying continuous transport or discrete CTMC.

Variational flow-matching loss: In GraphCFM, the network is trained to minimize the negative log-likelihood for reconstructing the clean data from noisy observations, using cross-entropy for categorical and squared error for continuous features. The variational approach factorizes over nodes and features (Scassola et al., 21 May 2025).
Rectified flow-matching loss: PIFM uses mean-squared error to match the predicted velocity field to the ground-truth direction in the linear interpolation from prior to ground-truth adjacency (Chen et al., 29 Jan 2026).
Discrete flow-matching cross-entropy: DeFoG's objective is the expected cross-entropy between the predicted clean posteriors and the true clean labels, supporting both node and edge variables (Qin et al., 2024).
Hybrid loss for mixed-type graphs: Scene graph flows combine ODE-based flow-matching for continuous state (geometry) and cross-entropy for discrete CTMC slots (semantic tokens), jointly optimized (Hu et al., 18 Apr 2026).
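To make the rectified flow-matching objective concrete, the sketch below computes the mean-squared error between a predicted velocity and the straight-line target along the linear interpolation from a prior sample to the ground truth. The function names and the toy "oracle" model are assumptions for illustration; the actual PIFM network and prior are not reproduced here.

```python
import numpy as np

def rectified_fm_loss(velocity_fn, x0, x1, t):
    """MSE between a predicted velocity and the straight-line target x1 - x0.

    x0: sample from the prior, x1: ground-truth sample (e.g. an adjacency
    matrix), t: scalar in [0, 1). velocity_fn(x_t, t) is a hypothetical model.
    """
    x_t = (1.0 - t) * x0 + t * x1    # point on the linear interpolation path
    target = x1 - x0                 # constant velocity along that path
    pred = velocity_fn(x_t, t)
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
x1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # ground-truth adjacency
x0 = rng.uniform(size=(3, 3))                                  # toy prior estimate

def oracle(x_t, t):
    """Velocity of a model that already knows x1 (for sanity checking only)."""
    return (x1 - x_t) / (1.0 - t)

def zero(x_t, t):
    """Untrained baseline that predicts no movement."""
    return np.zeros_like(x_t)

loss_oracle = rectified_fm_loss(oracle, x0, x1, t=0.3)
loss_zero = rectified_fm_loss(zero, x0, x1, t=0.3)
```

In training, `t` and `x0` would be resampled for every example and the loss averaged over a batch; the oracle only confirms that the target velocity is consistent with the interpolation path.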

Theoretical results include:

Guarantees that Euler-discretized flows approach the ground-truth data distribution in total variation as the step size and network error decrease (Qin et al., 2024).
Proofs of permutation invariance for the joint density under equivariant architectures (Chen et al., 29 Jan 2026).

5. Inference and Sampling Procedures

Sampling reconstructs or generates graphs by integrating the learned flow from a simple prior ($t=0$) to the data distribution ($t=1$), conditioned throughout on the relevant graph or context.

ODE-based sampling: GraphCFM and PIFM solve the underlying ODE numerically (e.g., with Euler steps) to transport a sampled noise vector (or prior estimate) to a sample from the data distribution, re-evaluating the velocity field with the trained GNN at each step (Scassola et al., 21 May 2025, Chen et al., 29 Jan 2026).
Discrete Euler integration: DeFoG simulates the reverse CTMC with independent-dimension Euler steps, using conditional or unconditional rate matrices, with various options for guidance, stochasticity, and initial distribution (Qin et al., 2024).
Hybrid ODE–CTMC integration: In FlowSG, hybrid inference integrates geometry with an ODE solver and semantics with a CTMC step, handling both in a time-coupled fashion conditioned on the evolving graph state and outside context (Hu et al., 18 Apr 2026).
Loop-guided multi-flow SDEs: The Twigs framework generalizes to multiple co-evolving flows (trunk for structure, stems for properties), with loop guidance by which stem gradients guide the trunk SDE at each step (Mercatali et al., 2024).
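The independent-dimension Euler step for a CTMC can be sketched generically as follows. This is not DeFoG's sampler: in a trained model the rate matrices would be computed from the network's predictions at each step, whereas here a fixed toy rate matrix (an assumption) sends every variable towards state 0.

```python
import numpy as np

def ctmc_euler_step(states, rates, dt, rng):
    """One Euler step of a CTMC applied independently to each variable.

    states: (n,) integer state of each node or edge variable.
    rates: (n, S, S) per-variable rate matrices; off-diagonal entries are
        non-negative and every row sums to zero.
    Returns the next states, sampled from p(j | i) = 1[i == j] + dt * R[i, j].
    """
    n, n_states, _ = rates.shape
    idx = np.arange(n)
    probs = dt * rates[idx, states]             # (n, S) rows for the current states
    probs[idx, states] += 1.0                   # add the identity term
    probs = np.clip(probs, 0.0, None)           # guard against too-large dt
    probs /= probs.sum(axis=1, keepdims=True)   # renormalise after clipping
    return np.array([rng.choice(n_states, p=p) for p in probs])

rng = np.random.default_rng(0)
S = 3
R = np.zeros((S, S))
R[1:, 0] = 2.0                                  # jump to state 0 at rate 2
R[np.arange(1, S), np.arange(1, S)] = -2.0      # diagonal makes rows sum to zero
states = np.array([2, 1, 0, 2])
rates = np.tile(R, (len(states), 1, 1))
for _ in range(20):
    states = ctmc_euler_step(states, rates, dt=0.05, rng=rng)
```

Smaller `dt` values give a closer approximation of the continuous-time chain at the cost of more network evaluations, which is the trade-off behind the step-count comparisons reported for discrete flow models.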

6. Applications and Empirical Performance

Graph conditional flow models achieve strong results across diverse domains requiring faithful and flexible modeling of structured, relational, or partially observed graph data.

Relational data synthesis: GraphCFM demonstrates state-of-the-art fidelity in generating multi-table relational databases, capturing complex foreign-key dependencies, multi-parent, and multi-type relations (Scassola et al., 21 May 2025).
Graph reconstruction and inpainting: PIFM is effective in reconstructing graphs from partially observed adjacencies, outperforming classical embedding techniques and generative baselines in ROC-AUC and perceptual metrics, especially as observation sparsity increases (Chen et al., 29 Jan 2026).
Scene graph generation: FlowSG achieves improved predicate recall, mean recall, and graph-level consistency, validating flow-based progressive denoising of hybrid-attribute graphs (Hu et al., 18 Apr 2026).
Molecular and pathology graph generation: DeFoG attains state-of-the-art performance in validity, uniqueness, novelty, and MMD on synthetic, molecular, and digital pathology graphs while using a fraction of the sampling steps required by earlier diffusion methods. It also reports high accuracy in conditional generation (e.g., conditioning on graph attributes) (Qin et al., 2024).
Time series with conditional dependencies: Graph Neural Flows unveil Bayesian-network style interactions among system components, yielding improved accuracy and likelihood in forecasting and classification under irregular sampling (Mercatali et al., 2024).

7. Limitations, Extensions, and Open Problems

While graph conditional flow frameworks are flexible and accurate, open limitations remain: scalability to extremely large graphs, support for heterogeneous and discrete-state settings (missing in some models), and a lack of richer multimodal and hierarchical conditioning.

Scalability: Subgraph sampling partially addresses scalability, but end-to-end generation for million-node graphs remains challenging (Chen et al., 29 Jan 2026).
Type and state space generality: Some methods are formulated only for binary, undirected, or homogeneous graphs; extension to arbitrary edge/vertex types and categorical flows is a subject of further work (Qin et al., 2024, Chen et al., 29 Jan 2026).
Conditional sequence and hybrid setups: Advanced conditioning mechanisms (such as in Twigs/loop guidance or classifier-free guidance in DeFoG) are promising for richer inverse design and controlled generation tasks (Qin et al., 2024, Mercatali et al., 2024).
Theoretical identifiability and optimal transport: Existing approaches rely on certain conditional independence or optimal transport assumptions, which may not always hold for real data; this is a source of open technical questions (Chen et al., 29 Jan 2026).

Graph conditional flow thus represents a unifying and extensible paradigm for conditional generative modeling over graph domains, with a spectrum of architectures and theoretical underpinnings enabling high-fidelity, structure-aware synthesis and reconstruction.