CATO: End-to-End Optimization of ML-Based Traffic Analysis Pipelines (2402.06099v2)

Published 8 Feb 2024 in cs.NI

Abstract: Machine learning has shown tremendous potential for improving the capabilities of network traffic analysis applications, often outperforming simpler rule-based heuristics. However, ML-based solutions remain difficult to deploy in practice. Many existing approaches only optimize the predictive performance of their models, overlooking the practical challenges of running them against network traffic in real time. This is especially problematic in the domain of traffic analysis, where the efficiency of the serving pipeline is a critical factor in determining the usability of a model. In this work, we introduce CATO, a framework that addresses this problem by jointly optimizing the predictive performance and the associated systems costs of the serving pipeline. CATO leverages recent advances in multi-objective Bayesian optimization to efficiently identify Pareto-optimal configurations, and automatically compiles end-to-end optimized serving pipelines that can be deployed in real networks. Our evaluations show that compared to popular feature optimization techniques, CATO can provide up to 3600x lower inference latency and 3.7x higher zero-loss throughput while simultaneously achieving better model performance.
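
To make the term "Pareto-optimal configurations" concrete, the following is a minimal sketch of a dominance filter over two objectives: maximize a model-quality score and minimize inference latency. It is not CATO's implementation and does not perform Bayesian optimization; it only shows how a Pareto front is selected once candidate configurations have been measured. The configuration names, F1 scores, and latencies are hypothetical values chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical candidate: (name, f1_score, latency_ms). All values are made up.
Config = Tuple[str, float, float]


def dominates(a: Config, b: Config) -> bool:
    """Return True if a is no worse than b on both objectives
    (higher F1, lower latency) and strictly better on at least one."""
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep every configuration that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


configs = [
    ("A", 0.95, 40.0),
    ("B", 0.90, 5.0),
    ("C", 0.89, 8.0),    # dominated by B: lower F1 and higher latency
    ("D", 0.97, 120.0),
    ("E", 0.93, 45.0),   # dominated by A: lower F1 and higher latency
]

front = pareto_front(configs)
print([name for name, _, _ in front])  # ['A', 'B', 'D']
```

Each surviving configuration represents a different trade-off: B is the fastest, D is the most accurate, and A sits between them. A multi-objective optimizer aims to find such a front without exhaustively measuring every candidate.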

