FitNets: An Adaptive Framework to Learn Accurate Traffic Distributions (2405.10931v2)
Abstract: Learning precise distributions of traffic features (e.g., burst sizes, packet inter-arrival times) is still a largely unsolved problem, despite being critical for management tasks such as capacity planning or anomaly detection. A key limitation today is the lack of feedback between the control plane and the data plane. Programmable data planes offer the opportunity to create systems in which the data plane and the control plane work together, each compensating for the other's shortcomings. We present FitNets, an adaptive network monitoring system that leverages feedback between the data plane and the control plane to learn accurate traffic distributions. In the control plane, FitNets relies on Kernel Density Estimators, which can provably learn distributions of any shape. In the data plane, FitNets tests the accuracy of the learned distributions while dynamically adapting data collection to the observed distribution fitness, prioritizing under-fitted features. We have implemented FitNets in Python and P4 (including on commercially available programmable switches) and tested it on real and synthetic traffic traces. FitNets is practical: it is able to estimate hundreds of distributions from up to 60 million samples per second, while providing accurate error estimates and adapting to complex traffic patterns.
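To make the control-plane idea concrete, the following is a minimal sketch of a Gaussian kernel density estimate together with a simple fitness check on fresh samples. It is not FitNets' implementation: the function names (`rule_of_thumb_bandwidth`, `kde_pdf`, `mean_log_score`) are hypothetical, the normal-reference bandwidth rule and the mean log-likelihood score are assumptions chosen for illustration, and the traffic feature is synthetic.

```python
import math
import random
import statistics


def rule_of_thumb_bandwidth(samples):
    """Normal-reference rule of thumb for a Gaussian kernel (an assumption,
    not necessarily the bandwidth selection used by FitNets)."""
    n = len(samples)
    return 1.06 * statistics.stdev(samples) * n ** (-1 / 5)


def kde_pdf(x, samples, bandwidth):
    """Gaussian kernel density estimate evaluated at a single point x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
    )


def mean_log_score(points, samples, bandwidth):
    """Average log-density of fresh points under the learned estimate.
    Higher means the estimate explains the new points better."""
    tiny = 1e-300  # guards against log(0) far away from all samples
    return statistics.fmean(
        math.log(kde_pdf(x, samples, bandwidth) + tiny) for x in points
    )


def draw_feature(rng, n, shift=0.0):
    """Synthetic bimodal feature, e.g. inter-arrival times in milliseconds."""
    return [
        (rng.gauss(1.0, 0.1) if rng.random() < 0.5 else rng.gauss(5.0, 0.5))
        + shift
        for _ in range(n)
    ]


rng = random.Random(0)
train = draw_feature(rng, 200)            # samples used to learn the estimate
h = rule_of_thumb_bandwidth(train)

same_traffic = draw_feature(rng, 100)     # fresh samples, same distribution
changed_traffic = draw_feature(rng, 100, shift=10.0)  # distribution has moved

print("score on matching traffic:", mean_log_score(same_traffic, train, h))
print("score on changed traffic: ", mean_log_score(changed_traffic, train, h))
```

A low score on fresh samples signals that the learned distribution no longer fits, which is the kind of signal an adaptive system could use to collect more data for that feature.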