PackFlow: Multi-Domain Flow Modeling
- PackFlow is a multi-domain framework that models flows—ranging from packet streams and atomic arrangements to network queues—as fundamental units for analysis.
- It employs specialized methods such as set-tree gradient boosting for rapid DDoS detection, ODE-based generative sampling for crystal structures, and stochastic ODE models for network performance.
- Empirical results show high detection accuracy with minimal overhead, improved energy predictions in crystals, and provably stable performance in network traffic scenarios.
PackFlow refers to multiple distinct frameworks in network modeling, data-driven DDoS detection, and molecular crystal structure prediction. The term captures modeling paradigms that view “flows” as fundamental composition units—either streams of network packets or atomic arrangements—while emphasizing structured data, optimal control, or generative learning. This entry presents the major PackFlow models spanning these domains, focusing on their mathematical structures, algorithmic methods, and empirical performance.
1. Stream-Structured DDoS Detection with PackFlow
PackFlow is a tree-based intrusion detection system that models each network flow as an ordered, variable-length stream of packet-header vectors, departing from fixed-size, statistics-aggregated records. Each flow is represented as a sequence of $d$-dimensional vectors, capturing features such as directionality (Is_forward), protocol metadata (src_port, dst_port, protocol), position in the stream (index), timing (ts, IAT variants), and TCP flags (SYN, ACK, RST) (Giryes et al., 2024).
Model Architecture: Set-Tree Decision Trees
The detection architecture employs the Set-Tree model, a gradient-boosted ensemble of decision trees of maximum depth 10. Nodes split using "set-compatible" tests of the form

$$\frac{\sum_{x \in S} x_j^{\alpha}}{|S|^{\beta}} \le \theta,$$

where $j$ is a feature index, $\alpha, \beta$ are exponents, and $\theta$ is a threshold. Special choices of $(\alpha, \beta)$ recover mean, sum, and harmonic-mean splits. Each node computes an "attention set" and may focus subsequent splits recursively on these significant subsets. Learning seeks node splits that maximize Gini reduction, and predictions are combined via gradient boosting. Attention history along each tree path is limited to the five most recent node outputs.
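A minimal sketch of such a generalized set-compatible test, assuming the $\left(\sum_{x \in S} x_j^{\alpha}\right) / |S|^{\beta}$ form above (the symbol names and the exact aggregation used by the real Set-Tree implementation may differ):

```python
import numpy as np

def set_split(flow, j, alpha, beta, theta):
    """Generalized set-compatible split test on a set of packets.

    flow  : (n_packets, n_features) array of packet-header vectors
    j     : feature index aggregated over the set
    alpha : exponent applied to each packet's feature value
    beta  : exponent applied to the set size |S|
    theta : decision threshold
    """
    agg = np.sum(flow[:, j] ** alpha) / (len(flow) ** beta)
    return bool(agg <= theta)

# Illustrative special cases of the aggregation:
#   alpha=1, beta=1 -> mean of feature j over the flow
#   alpha=1, beta=0 -> sum of feature j over the flow
flow = np.array([[1.0, 2.0],
                 [3.0, 4.0]])
is_left = set_split(flow, j=0, alpha=1, beta=1, theta=2.0)  # mean([1, 3]) = 2.0
```

A boosted ensemble would route each flow down its trees by evaluating such tests at every node, restricting the set $S$ to the node's attention set where applicable.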
Training and Detection Protocol
The model is trained as a binary classifier (Benign vs. Attack) using the logistic loss

$$\mathcal{L}(y, p) = -\left[\, y \log p + (1 - y) \log (1 - p) \,\right],$$

where $y \in \{0, 1\}$ is the true label and $p$ the predicted attack probability. Early detection is evaluated by presenting the model with flow prefixes consisting of the first $n$ packets for various $n$, without retraining separate models per prefix size.
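The prefix-evaluation protocol can be sketched as follows; the toy classifier and mean-pooled prefix feature are stand-ins for the Set-Tree ensemble and its set-compatible aggregations, chosen only to show that one model is trained once and scored on prefixes of every length:

```python
import numpy as np

class MeanThresholdClassifier:
    """Toy stand-in for the trained ensemble: thresholds the mean of
    the first feature halfway between the per-class training means."""
    def fit(self, X, y):
        self.thr = 0.5 * (X[y == 0].mean() + X[y == 1].mean())
        return self

    def predict(self, X):
        return (X > self.thr).astype(int)

def prefix_feature(flow, n):
    """Summarize the first n packets of a flow as a scalar
    (mean-pooling stands in for set-compatible aggregation)."""
    return flow[:n, 0].mean()

# Synthetic flows: benign centered at 0, attack at 1 (purely illustrative).
rng = np.random.default_rng(0)
flows = [rng.normal(loc=y, scale=0.1, size=(16, 4)) for y in (0, 1) * 50]
labels = np.array([0, 1] * 50)

# Train once on full flows ...
X_full = np.array([prefix_feature(f, 16) for f in flows])
clf = MeanThresholdClassifier().fit(X_full, labels)

# ... then score the SAME model on n-packet prefixes, no retraining.
acc = {n: (clf.predict(np.array([prefix_feature(f, n) for f in flows]))
           == labels).mean() for n in (2, 4, 16)}
```

With well-separated classes, accuracy stays high even at $n = 2$, mirroring the early-detection results reported below.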
Evaluation and Benchmarking
PackFlow is evaluated on the CICDDoS2019 (50M attack flows, 0.1M benign flows, 248M packets) and CICIDS2017 (2.8M flows, 55.6M packets) datasets. Performance metrics include Recall, Precision, $F_1$, Accuracy, time-saving (TS), and traffic-volume overhead. Table 1 summarizes key empirical findings:
| Dataset | $n$-pkts | Acc | TS (%) |
|---|---|---|---|
| CICDDoS2019 | full | 0.99980 | 0.0 |
| CICDDoS2019 | 2 | 0.99975 | 99.79 |
| CICIDS2017 | full | 0.99651 | 0.0 |
| CICIDS2017 | 2 | 0.80984 | 97.2 |
| CICIDS2017 | 4 | 0.96756 | 75.3 |
| CICIDS2017 | 14 | 0.99252 | 19.8 |
PackFlow matches or outperforms prior ML/DL methods (CyDDoS, DDoSNet, 1D-CNN+LSTM) in accuracy and recall, while supporting detection after only $2$ packets on median flows and operating with a small raw-header traffic overhead (Giryes et al., 2024).
Feature Importance and Scaling
Feature contributions (Gini index) identify "length" and SYN/RST flag indicators as dominant DDoS cues; inter-packet arrival times and TCP window sizes contribute little. The computational overhead of Set-Tree over standard trees is modest, with a typical runtime of a few seconds on 80K flows.
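The Gini-based ranking underlying this analysis can be illustrated with a small self-contained computation; the feature names and synthetic data here are hypothetical, constructed so that only "length" and "syn_flag" carry signal:

```python
import numpy as np

def gini(y):
    """Gini impurity of a binary label vector."""
    if len(y) == 0:
        return 0.0
    p = np.bincount(y, minlength=2) / len(y)
    return 1.0 - np.sum(p ** 2)

def best_gini_reduction(x, y):
    """Largest impurity reduction achievable by thresholding feature x --
    the per-feature signal behind Gini-based importance rankings."""
    base, best = gini(y), 0.0
    for thr in np.unique(x)[:-1]:
        left, right = y[x <= thr], y[x > thr]
        w = len(left) / len(y)
        best = max(best, base - w * gini(left) - (1 - w) * gini(right))
    return best

# Toy flows: only "length" and "syn_flag" determine the attack label.
rng = np.random.default_rng(1)
n = 500
length = rng.normal(0, 1, n)
syn = rng.integers(0, 2, n).astype(float)
iat = rng.normal(0, 1, n)                      # uninformative by construction
y = ((length > 0) & (syn > 0)).astype(int)

imp = {name: best_gini_reduction(x, y)
       for name, x in [("length", length), ("syn_flag", syn), ("iat", iat)]}
```

In this construction the uninformative timing feature receives a much smaller impurity reduction than the two signal-bearing features, mirroring the reported dominance of length and flag cues.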
2. PackFlow for Generative Molecular Crystal Structure Prediction
PackFlow is also the name of a generative framework for molecular crystal structure prediction (CSP), based on flow-matching generative models aligned with physics-informed reinforcement learning (RL). Given a molecular graph, PackFlow jointly samples all heavy-atom Cartesian coordinates within a unit cell and the six lattice parameters (Subramanian et al., 23 Feb 2026).
Generative Architecture and Flow Matching
Atoms are embedded as tokens

$$h_i = \big(\, x_i(t),\; f_i,\; L(t) \,\big),$$

where $x_i(t)$ are the noisy Cartesian coordinates of atom $i$, $f_i$ is an RDKit feature vector, and $L(t)$ is the lattice at annealing time $t$. A Transformer encoder integrates covalent bond-graph biases, and the network predicts vector fields for coordinates and lattice, enabling ODE-based sampling from a Gaussian prior.
Flow matching is achieved by interpolating between data and noise distributions for both coordinates and lattice, with a regression loss of the standard conditional flow-matching form

$$\mathcal{L}_{\mathrm{FM}} = \mathbb{E}_{t,\, x_0,\, x_1} \left[\, \big\| v_\theta(x_t, t) - (x_1 - x_0) \big\|^2 \,\right],$$

applied jointly to coordinates and lattice parameters. PackFlow samples are then generated deterministically via ODE integration from noise.
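The two ingredients above, the flow-matching regression target and deterministic ODE sampling, can be sketched minimally; the straight-line interpolation target and the Euler integrator are generic assumptions, not the paper's exact training recipe:

```python
import numpy as np

def cfm_loss(v_pred, x0, x1):
    """Conditional flow-matching target: the learned field should
    regress onto the straight-line displacement x1 - x0."""
    return float(np.mean((v_pred - (x1 - x0)) ** 2))

def sample_ode(v_field, x0, n_steps=100):
    """Deterministic Euler integration of dx/dt = v(x, t), carrying a
    Gaussian prior sample at t=0 to a data sample at t=1."""
    x, dt = x0.copy(), 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * v_field(x, k * dt)
    return x

# With the (idealized) exact field v(x, t) = x1 - x0 for one fixed pair,
# integration transports the prior sample x0 exactly onto x1.
x0 = np.zeros(3)
x1 = np.array([1.0, -2.0, 0.5])
x_hat = sample_ode(lambda x, t: x1 - x0, x0)
```

In PackFlow the same machinery runs over atom coordinates and the six lattice parameters jointly, with the vector field supplied by the Transformer.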
Physics Alignment via Reinforcement Learning
A post-training “physics alignment” (PA) step applies RL to encourage low-energy, clash-free structures, leveraging ML-interatomic-potential (MLIP) proxies:
- Rewards: negative lattice energy and average force norm, standardised and mixed via a tunable trade-off weight.
- Updates: policy improvement via Group Relative PPO (GRPO) loss with KL divergence, using flow-matching loss as a negative log-probability proxy.
This physically aligns the generative model without modifying the deterministic inference procedure, improving energy ranking and physical plausibility.
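The reward construction can be sketched as follows; the mixing weight `lam` and the standardisation statistics are illustrative names for quantities the text leaves unnamed, and the MLIP outputs are mocked as plain arrays:

```python
import numpy as np

def physics_reward(energy, forces, lam, e_stats, f_stats):
    """Standardise the MLIP lattice energy and mean per-atom force norm,
    then mix them. lam (illustrative name) interpolates between an
    energy-driven (lam=0) and a force-driven (lam=1) reward."""
    e_mean, e_std = e_stats
    f_mean, f_std = f_stats
    e_z = (energy - e_mean) / e_std
    f_z = (np.linalg.norm(forces, axis=1).mean() - f_mean) / f_std
    return -(1.0 - lam) * e_z - lam * f_z   # lower energy/forces -> higher reward

# Two candidate structures with identical (zero) forces: the lower-energy
# one receives the higher reward.
forces = np.zeros((4, 3))
r_low = physics_reward(-10.0, forces, lam=0.5, e_stats=(0.0, 1.0), f_stats=(1.0, 1.0))
r_high = physics_reward(10.0, forces, lam=0.5, e_stats=(0.0, 1.0), f_stats=(1.0, 1.0))
```

GRPO-style policy improvement would then rank groups of sampled structures by this reward and update the flow model under a KL constraint, with the flow-matching loss standing in for the log-probability.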
Integration in CSP Pipelines
PackFlow fits within standard CSP workflows:
- Heavy-atom proposal generation for a given composition.
- Hydrogen placement (RDKit 3D builder).
- H-only relaxation (frozen heavy atoms, MLIP relaxation).
- Full relaxation (all atoms, FIRE optimizer, MLIP).
- Lattice energy computation and ranking.
- Polymorph selection within target energy windows.
Pseudocode for the complete protocol is directly outlined in (Subramanian et al., 23 Feb 2026).
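A hedged sketch of how the steps above compose; every callable here (`propose`, `place_h`, `relax_h_only`, `relax_all`, `lattice_energy`) is a hypothetical stand-in for the corresponding pipeline stage (PackFlow sampling, RDKit hydrogen placement, MLIP relaxations, MLIP energy), injected as parameters so the wiring stays generic:

```python
def csp_pipeline(graph, n_samples, energy_window,
                 propose, place_h, relax_h_only, relax_all, lattice_energy):
    """Illustrative CSP workflow: sample, complete, relax, rank, and
    select polymorphs within a lattice-energy window of the minimum."""
    candidates = []
    for _ in range(n_samples):
        struct = propose(graph)        # heavy-atom + lattice proposal (PackFlow)
        struct = place_h(struct)       # hydrogen placement (RDKit 3D builder)
        struct = relax_h_only(struct)  # H-only relaxation, frozen heavy atoms
        struct = relax_all(struct)     # full relaxation (FIRE optimizer, MLIP)
        candidates.append((lattice_energy(struct), struct))
    candidates.sort(key=lambda c: c[0])
    e_min = candidates[0][0]
    # polymorph selection within the target energy window
    return [s for e, s in candidates if e - e_min <= energy_window]

# Tiny demo with trivial stand-in callables (structure ids 0..2, fixed energies):
ids = iter(range(3))
energies = {0: 3.0, 1: 1.0, 2: 2.5}
chosen = csp_pipeline(None, 3, 1.0,
                      propose=lambda g: next(ids),
                      place_h=lambda s: s,
                      relax_h_only=lambda s: s,
                      relax_all=lambda s: s,
                      lattice_energy=lambda s: energies[s])
```

The demo keeps only the structure whose energy lies within 1.0 of the minimum; the published pseudocode should be consulted for the authoritative protocol.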
Empirical Performance
On 37K unseen crystals, PackFlow outperforms Genarris baselines with lower density error (4.7% vs. 27.9%/19.8%), improved AMD (0.286 vs. 0.903/0.541), lower RDF Wasserstein distances, and comparable inference speed (0.13 s/sample). Physics alignment further reduces clash rates and controls force/energy trade-offs continuously via the reward-mixing weight. In CSP blind tests, PackFlow finds candidates within a few kJ/mol of experimental polymorphs within 100 samples (Subramanian et al., 23 Feb 2026).
Implementation and Limitations
Key features include independent coordinate/lattice flows, dense bond-graph attention, canonical unwrapping, and reliance on MLIP for relaxation and reward calculation. The approach is so far limited to homomolecular crystals, does not enforce explicit space-group symmetry, and scales quadratically with atom count due to the dense attention mechanism.
3. PackFlow in Stochastic Packet/Flow Network Modeling
In the context of network performance analysis, "PackFlow" (an editor's term summarizing the framework of (Moallemi et al., 2010)) unifies packet-level queueing and flow-level utility-based control in a two-time-scale stochastic model of packet-switched networks.
Modeling Framework
- Nodes: a finite set of directed links $\mathcal{L}$ and end-to-end flow types $\mathcal{F}$.
- Queues: one FIFO queue per link $l \in \mathcal{L}$.
- Flows: Poisson flow-level arrivals (one rate per flow type), random packet generation, and finite lifetimes (a finite mean number of packets per flow).
- Packets: generated by flows and injected into their ingress queues.
Scheduling and Dynamics
- Per-slot scheduling enforces link activation (interference) constraints.
- Queue evolution follows the recursion

$$q(t+1) = q(t) - s(t) + u(t) + R\,\big[\, s(t) - u(t) \,\big] + a(t),$$

with routing matrix $R$, schedule counts $s(t)$, idling process $u(t)$, and flow packet arrivals $a(t)$.
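One slot of this recursion can be simulated directly; the two-link tandem topology and the derivation of the idling process from empty queues are illustrative assumptions:

```python
import numpy as np

def step(q, s, a, R):
    """One slot of the schematic queue recursion: served packets leave
    their queue, the routing matrix R forwards them downstream, new flow
    arrivals a join; u records service scheduled on empty queues (idling)."""
    served = np.minimum(q, s)            # cannot serve more than is queued
    u = s - served                       # idling process
    q_next = q - served + R @ served + a
    return q_next, u

# Two-link tandem: everything served at link 0 is routed to link 1.
R = np.array([[0.0, 0.0],
              [1.0, 0.0]])
q0 = np.array([5.0, 0.0])
q1, u = step(q0, s=np.array([2.0, 2.0]), a=np.array([1.0, 0.0]), R=R)
```

Here link 0 drains 2 packets (which join link 1's queue) and admits 1 arrival, while link 1 idles because its queue is empty.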
Congestion Control and Control Policy
- Flows maximize $\alpha$-fair utilities subject to link capacity constraints.
- Dual variables represent link shadow prices.
- At each slot, max-weight scheduling is used, maximizing a queue-length-weighted service sum.
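The max-weight rule admits a very short sketch; the explicit enumeration of feasible schedules is an illustrative simplification (real schedulers exploit structure rather than enumerating):

```python
import numpy as np

def max_weight_schedule(q, feasible):
    """Among feasible 0/1 activation vectors (those satisfying the
    interference constraints), pick the one maximizing the
    queue-length-weighted service sum sum_l q_l * s_l."""
    return max(feasible, key=lambda s: float(np.dot(q, s)))

# Two mutually interfering links (only one may transmit per slot):
feasible = [np.array([1, 0]), np.array([0, 1])]
sched = max_weight_schedule(np.array([3.0, 7.0]), feasible)  # serve the longer queue
```

Coupling this per-slot rule with the $\alpha$-fair rate control above is what yields the throughput-optimality results discussed next.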
Fluid Limit and Stability
In the large-scale limit, packet/flow dynamics converge to a deterministic ODE system whose fixed points solve a convex program minimising a Lyapunov function. The framework is shown to be throughput-optimal (positive recurrent under admissible load), to approach cost optimality within a balance factor, and to globally attract trajectories to the invariant manifold corresponding to optimal resource allocation.
4. Comparative Analysis and Empirical Results
PackFlow in DDoS detection and molecular crystal prediction both leverage stream or flow-structured data, but differ fundamentally:
| Domain | Input Structure | Learning Method | Evaluation Data | Empirical Strengths |
|---|---|---|---|---|
| DDoS Detection (Giryes et al., 2024) | Stream of packet headers | Set-Tree Gradient Boost | CICDDoS2019, CICIDS2017 | Very early accurate detection, low overhead |
| Molecular CSP (Subramanian et al., 23 Feb 2026) | Atom + lattice tensors | Flow Matching + RL | Unseen crystals, CSP blind tests | Physically-plausible samples, lower energy proposals |
| Packet/Flow Net Modeling (Moallemi et al., 2010) | Queues/flows (stochastic) | Stochastic/ODE modeling | Analytical, simulative | Proven throughput and stability guarantees |
In their respective domains, the PackFlow instantiations show performance superior or equivalent to deep learning and heuristic contemporaries, with provable or measured gains in detection speed, candidate quality, and efficiency.
5. Limitations, Extensions, and Prospective Directions
PackFlow’s instantiations highlight several limitations:
- DDoS model coverage bounded by header field diversity and static tree designs.
- Molecular CSP variant currently restricted to homomolecular crystals, omitting explicit space-group symmetry and detailed thermodynamic modeling.
- Stochastic network model omits detailed traffic and protocol heterogeneities.
Potential future directions include extension to co-crystals, equivariant or symmetry-enforcing generative models, large-scale corpus training, sparse attention in atom-rich scenarios, and inclusion of explicit free-energy and kinetic effects in crystal structure ranking. In DDoS detection, incorporating deeper temporal or multi-modal representations and adapting model structures for higher traffic diversity are prospective avenues.
PackFlow, across its major incarnations, demonstrates the power of flow (stream) structure modeling for high-precision detection, efficient generative tasks, and mathematically rigorous network analysis (Giryes et al., 2024, Subramanian et al., 23 Feb 2026, Moallemi et al., 2010).