Data-Driven Networks Approach
- Data-driven networking covers strategies that use operational data and ML to infer latent structures, forecast dynamic conditions, and overcome limitations of traditional models.
- They employ techniques such as unsupervised clustering, graph neural networks, and reinforcement learning to automate resource allocation and optimize control policies.
- This approach improves network resilience and performance in complex environments, supporting applications from 5G core networks to large-scale cyber-physical systems.
A data-driven networks approach refers to the systematic use of operational measurements and ML techniques to analyze, optimize, and control complex networked systems—ranging from communication infrastructures (such as 5G core networks) to distributed cyber-physical systems and large-scale dynamical networks—without reliance on a priori analytic or simulation models. This paradigm stands in contrast to classical model-based methods by directly leveraging observed traces, logs, and performance counters to infer latent structures, forecast network states, and enact adaptive policies. Across domains, the data-driven approach enables robust network management under highly dynamic, uncertain, or nonlinear conditions, and generalizes to both architectural design (e.g., topologies, resource allocation) and operational control (e.g., traffic engineering, consensus, anomaly detection) (Manias et al., 2022, Hope et al., 2021, Samari et al., 2024, Caro-Ruiz et al., 2017).
1. Principles: Data-Driven vs. Model-Based Methodologies
Traditional model-based approaches require the formulation of accurate closed-form models describing all relevant network phenomena, such as queuing, channel propagation, or protocol behaviors. These models are often brittle under non-stationarity and may fail to capture the full range of real-world operating conditions, especially in mobile, large-scale, or highly heterogeneous networks.
In contrast, the data-driven approach dispenses with explicit modeling:
- Direct use of measurements: Operational logs, packet and event traces, and temporal performance indicators are ingested and form the empirical basis for analysis and optimization.
- Statistical and ML inference: Modern learning techniques (e.g., unsupervised clustering, deep learning, graph neural networks, reinforcement learning) are employed to extract patterns, detect anomalies, and anticipate future conditions.
- Robustness and adaptivity: Data-driven algorithms adapt to sudden traffic surges, topology changes, and device heterogeneity, accommodating effects that are impractical to model comprehensively.
This paradigm enables the discovery of latent network structure (e.g., clustering of function-to-function traffic in a 5G core), the learning of control policies in environments where node behaviors are unknown, and proactive optimization grounded in observed, rather than hypothesized, dynamics (Manias et al., 2022, Samari et al., 2024).
2. Architectures and Systems Implementing Data-Driven Networks
Network Data Analytics Function (NWDAF) in 5G Core Networks
- The 3GPP-specified NWDAF is a logical network function aggregating events, KPIs, and traffic data from across core network functions (NFs). Key components include:
  - NWDAF analytics engine: Hosts clustering/forecasting algorithms.
  - Data repository: Retains event logs and measurements (e.g., MongoDB).
  - Kafka pipeline: Ingests real-time packet-level data for subsequent ML analysis.
  - Interface Nnwdaf: Exposes standardized APIs (e.g., Nnwdaf_AnalyticsInfo) for consuming analytics results.
- Operational packet capture is implemented via hypervisor-based port mirroring, so that both real-time and historical data are available for online and retrospective analytics.
Graph-Driven Policy Architectures
- In domains such as data-driven traffic engineering, Graph Neural Networks (GNNs) provide a natural substrate for learning control policies that respect network topology and can generalize across topological changes (Hope et al., 2021).
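Why a message-passing policy can follow the topology and transfer to new graphs can be illustrated with a minimal sketch. This is not the architecture of Hope et al.: real GNNs use learned weight matrices over vector features, whereas the scalar weights `w_self` and `w_nbr`, the mean aggregation, and the toy line graph below are all assumptions chosen for readability. The point it shows is that the same update rule applies to any adjacency structure, and that information travels one hop per round.

```python
import math

def message_passing(adjacency, features, rounds, w_self=0.6, w_nbr=0.4):
    """Run `rounds` of mean-aggregation message passing with shared scalar weights.

    adjacency: dict mapping node -> list of neighbour nodes
    features:  dict mapping node -> float (initial node state)
    The weights are shared by every node, so the same function works on any graph.
    """
    h = dict(features)
    for _ in range(rounds):
        nxt = {}
        for node, nbrs in adjacency.items():
            # Aggregate neighbour states (mean); isolated nodes receive 0.
            agg = sum(h[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
            nxt[node] = math.tanh(w_self * h[node] + w_nbr * agg)
        h = nxt
    return h

# Toy line graph 0 - 1 - 2 - 3 with a signal injected only at node 0.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
signal = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}

after_two = message_passing(line, signal, rounds=2)
after_three = message_passing(line, signal, rounds=3)

# The identical update rule applied to a different topology (a 4-node ring).
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
ring_out = message_passing(ring, signal, rounds=1)
```

In the line graph, node 3 is three hops from the signal, so its state is still zero after two rounds and becomes non-zero only after the third. This is one intuition for why several rounds are needed before distant route state influences a local decision.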
Symbolic and Divide-and-Conquer Compositional Architectures
- For large-scale or infinite networks with unknown agents and topologies, data-driven compositional methods build symbolic discrete-domain models of each subsystem using data from local trajectories, sidestepping monolithic modeling and enabling tractable synthesis and safety guarantees (Samari et al., 2024, Zaker et al., 2025).
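The idea of building a discrete symbolic model of one subsystem from sampled trajectories can be sketched as follows. This is a simplified illustration, not the scenario-based construction of the cited works: it omits the confidence bounds and the compositional (sub-bisimulation) argument that provide the formal guarantees. The one-dimensional `black_box` dynamics, the grid resolution, and the sample count are hypothetical choices.

```python
import random
from collections import defaultdict

def cell_of(x, lo, hi, n_cells):
    """Map a continuous state to the index of its grid cell (clamped to the domain)."""
    idx = int((x - lo) / (hi - lo) * n_cells)
    return min(max(idx, 0), n_cells - 1)

def build_abstraction(step, inputs, lo, hi, n_cells, samples_per_cell=20, seed=0):
    """Estimate a finite transition relation (cell, input) -> set of successor cells.

    `step` is treated as a black box: it is only queried, never inspected.
    """
    rng = random.Random(seed)
    width = (hi - lo) / n_cells
    transitions = defaultdict(set)
    for c in range(n_cells):
        for u in inputs:
            for _ in range(samples_per_cell):
                x = lo + (c + rng.random()) * width  # random state inside cell c
                transitions[(c, u)].add(cell_of(step(x, u), lo, hi, n_cells))
    return transitions

def black_box(x, u):
    """Hypothetical unknown subsystem, used here only to generate data."""
    return 0.5 * x + u

# Four cells of width 0.5 on [0, 2), two admissible inputs.
abstraction = build_abstraction(black_box, inputs=[0.0, 0.5], lo=0.0, hi=2.0, n_cells=4)
```

Each entry of `abstraction` over-approximates, up to sampling error, where the subsystem can move from a given cell under a given input. A controller can then be synthesized on this finite model instead of on the unknown continuous dynamics.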
3. Data Acquisition, Feature Engineering, and Preprocessing
Measurement Collection
- Continuous packet and event capture: E.g., over 170,000 packets in 138 minutes, filtered to only NF-to-NF interactions (Manias et al., 2022).
- Real-time mirroring for live inference and retrospective querying.
Feature Engineering
- For each source-destination (NF_i, NF_j) pair, extraction of:
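  - $n$: Total number of packets exchanged.
  - $L_1, L_2, \dots, L_n$: Individual packet lengths.
  - $\bar{L} = \frac{1}{n}\sum_{k=1}^{n} L_k$: Average packet length; $L_{\max} = \max_k L_k$: Maximum packet size; $\sigma_L$: Standard deviation of packet length.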
- Features are min-max scaled or standardized (zero mean, unit variance) for ML suitability.
- Similar pipelines extend to joint feature construction in multi-input systems, e.g., traffic history matrices in GNN-based routing (Hope et al., 2021).
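The per-pair feature extraction and scaling steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the packet records are made-up `(source NF, destination NF, length)` tuples with placeholder NF names, and the population standard deviation is used because the text does not say whether the population or sample form applies.

```python
from collections import defaultdict
from statistics import mean, pstdev

def pair_features(packets):
    """Group packet lengths by (source NF, destination NF) and summarize each pair."""
    lengths = defaultdict(list)
    for src, dst, length in packets:
        lengths[(src, dst)].append(length)
    return {
        pair: {
            "n": len(ls),             # total packets exchanged
            "mean_len": mean(ls),     # average packet length
            "max_len": max(ls),       # maximum packet size
            "std_len": pstdev(ls),    # population standard deviation (assumption)
        }
        for pair, ls in lengths.items()
    }

def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift and scale values to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Illustrative records only; NF names and lengths are placeholders.
packets = [
    ("NF_1", "NF_2", 120), ("NF_1", "NF_2", 180),
    ("NF_2", "NF_3", 60),
    ("NF_1", "NF_4", 400), ("NF_1", "NF_4", 400), ("NF_1", "NF_4", 100),
]
features = pair_features(packets)
counts = [f["n"] for f in features.values()]
scaled_counts = min_max_scale(counts)
standardized_counts = standardize(counts)
```

Each feature column would be scaled independently in this way before clustering, so that packet counts (which can be large) do not dominate packet-length statistics in Euclidean distance computations.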
4. Machine Learning and Inference Techniques
Unsupervised Structure Discovery
- k-means clustering groups NF-to-NF feature vectors by minimizing the within-cluster sum of squared Euclidean distances to the cluster centroids. The number of clusters is chosen using interpretability and the silhouette score, which makes it possible to identify both high-traffic ("heavy") and idle ("null") NF pairs (Manias et al., 2022).
- Heatmaps and cluster analysis extract operational insights (e.g., asymmetric traffic profiles due to protocol payloads, concentration of registration traffic).
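The clustering step and the silhouette-based choice of the number of clusters can be sketched in plain Python. Several assumptions apply: the points are synthetic two-dimensional scaled feature vectors, and a deterministic farthest-first initialization is used in place of the random or k-means++ seeding that a library implementation would typically use, purely so the example is reproducible. The source does not specify these details.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the farthest point."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=50):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)  # mean intra-cluster distance
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)

# Synthetic scaled features: three visibly separated groups of NF pairs.
points = [(0.0, 0.0), (0.05, 0.02), (0.02, 0.04),
          (0.5, 0.5), (0.52, 0.48), (0.48, 0.53),
          (1.0, 0.9), (0.95, 1.0), (0.98, 0.97)]

scores = {k: silhouette(points, kmeans(points, k)[0]) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
labels, centroids = kmeans(points, best_k)
```

On this toy data the silhouette score peaks at three clusters. In practice the selected value would still be checked against operational interpretability, as the text notes.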
Policy Learning and Control
- In routing, GNN architectures parameterize per-edge (or per-node) policies that are optimized via deep reinforcement learning (PPO), with rewards based on congestion metrics measured relative to the linear-programming optimum (Hope et al., 2021).
- Iterative message-passing architectures require at least three rounds to encode route state effectively and to generalize.
- Divide-and-conquer strategies build local symbolic models, aggregate them with scenario-based data-driven functions (e.g., alternating sub-bisimulation), and synthesize compositional controllers with statistical guarantees, eliminating the need for small-gain computation or explicit knowledge of interconnection topology (Samari et al., 2024).
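A reward of the kind described for routing, a congestion metric measured against the linear-programming optimum, might look like the sketch below. The exact reward used by Hope et al. is not given in this text. The choice of maximum link utilization as the congestion metric, the negative-ratio form, and the precomputed `optimal_mlu` value (which would come from an LP solver) are assumptions for illustration.

```python
def max_link_utilization(loads, capacities):
    """Congestion metric: the largest load/capacity ratio over all links."""
    return max(loads.get(link, 0.0) / cap for link, cap in capacities.items())

def routing_reward(loads, capacities, optimal_mlu):
    """Negative ratio of achieved to optimal congestion.

    A value of -1.0 means the learned routing matches the LP optimum;
    more negative values mean more congestion than the optimum.
    """
    return -max_link_utilization(loads, capacities) / optimal_mlu

# Toy three-link network; all numbers are illustrative.
capacities = {("a", "b"): 10.0, ("b", "c"): 10.0, ("a", "c"): 10.0}
loads = {("a", "b"): 8.0, ("b", "c"): 4.0, ("a", "c"): 2.0}

reward = routing_reward(loads, capacities, optimal_mlu=0.5)
```

Normalizing by the optimum keeps rewards comparable across traffic matrices and topologies of different scale, which matters when a single policy is trained on many of them.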
Proactive and Autonomous Optimization
- Automated mapping of heavy-cluster NF pairs to co-location or bandwidth-priority measures.
- Predictive scaling of highly variable NFs avoids future bottlenecks.
- Dynamic migration decisions for under-utilized NF instances to optimize host resource utilization.
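The three optimizations listed above can be expressed as simple rules over the clustering output. This is a hypothetical sketch: the profile fields, the `std_threshold` value, the action strings, and the NF names are invented for illustration and do not correspond to any standardized NWDAF interface.

```python
def recommend_actions(pair_profiles, std_threshold=100.0):
    """Map each NF pair's cluster label and length variability to a suggested action."""
    actions = {}
    for pair, profile in pair_profiles.items():
        if profile["cluster"] == "heavy":
            # Heavy-traffic pairs benefit from proximity or guaranteed bandwidth.
            actions[pair] = "co-locate or prioritize bandwidth"
        elif profile["cluster"] == "null":
            # Idle pairs point to under-utilized instances.
            actions[pair] = "review for migration or consolidation"
        elif profile["std_len"] > std_threshold:
            # Highly variable traffic is a candidate for scaling ahead of demand.
            actions[pair] = "schedule predictive scaling"
        else:
            actions[pair] = "no action"
    return actions

# Illustrative profiles; labels would come from the clustering step.
profiles = {
    ("NF_1", "NF_2"): {"cluster": "heavy", "std_len": 40.0},
    ("NF_2", "NF_3"): {"cluster": "null", "std_len": 0.0},
    ("NF_1", "NF_4"): {"cluster": "normal", "std_len": 140.0},
    ("NF_3", "NF_4"): {"cluster": "normal", "std_len": 10.0},
}
actions = recommend_actions(profiles)
```

A production system would replace the fixed threshold with a forecast of load, but the rule table makes explicit how cluster membership translates into a management decision.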
5. Applications and Case Studies
| Application Domain | Data-Driven Approach Elements | Key Outcome |
|---|---|---|
| 5G Core (NWDAF) (Manias et al., 2022) | Packet capture, feature extraction, unsupervised clustering, Kafka-ML stack | Automated resource balancing |
| Data-Driven Routing (Hope et al., 2021) | GNN policy, RL optimization, transfer to unseen topologies | Near-optimal congestion minimization |
| Symbolic Control in Unknown Networks (Samari et al., 2024) | Local scenario-based symbolic models, compositional bisimulation | Formal correctness, scalability |
| Event-driven Consensus (Renganathan et al., 2022) | Probabilistic trust estimation, history-driven weighting | Consensus, attack robustness |
This approach accommodates evolving traffic, topology, and device heterogeneity found in contemporary and future networks, offering both adaptive intelligence and formal performance/robustness properties.
6. Generalization, Limitations, and Future Perspectives
Data-driven network analysis and control generalize naturally across technological domains. For example, the NWDAF ingestion/analytics pipeline can be extended to radio-access/C-RAN domains, containerized edge environments, and IoT protocols by modifying event schemas and feature sets (Manias et al., 2022). Anomaly detection, drift analysis, and advanced time-series forecasting (RNNs/transformers) are identified as future enhancements.
Limitations:
- Ground truth and feature availability constrain the granularity and domain of some analyses.
- ML methods require robust, representative operational data for accuracy.
- Feature selection and scaling critically impact clustering and policy learning results.
- Some pipelines stop short of applying more advanced techniques; for example, DBSCAN, which can recover nonconvex clusters, is not always implemented or evaluated empirically.
Future Directions:
- Deep integration with edge/cloud-native telemetry.
- Online anomaly and drift detection for enhanced resilience.
- Cross-technology deployment by reuse of event ingestion and analytics stacks in new domains.
- Time-series based proactive slice reconfiguration and self-optimization.
- Hierarchical and explainable learning models to improve scalability and interpretability in massive-scale networks (Manias et al., 2022, Samari et al., 2024).
7. Significance and Impact
Data-driven network approaches provide a scalable, flexible alternative to brittle model-based methods for analyzing, optimizing, and controlling both communication and generalized dynamical networks. By accommodating operational complexity and enabling automation, these approaches underpin autonomy in modern network management, as seen in 5G core systems and in prospective edge, RAN, and cross-domain deployments (Manias et al., 2022, Hope et al., 2021, Samari et al., 2024). Because the approach extends to both structure discovery and policy learning, it remains central to ongoing network systems research.