Peregrine: ML-based Malicious Traffic Detection for Terabit Networks
Abstract: Malicious traffic detectors leveraging ML, namely those incorporating deep learning techniques, exhibit impressive detection capabilities across multiple attacks. However, their effectiveness becomes compromised when deployed in networks handling Terabit-speed traffic. In practice, these systems require substantial traffic sampling to reconcile the high data plane packet rates with the comparatively slower processing speeds of ML detection. As sampling significantly reduces traffic observability, it fundamentally undermines their detection capability. We present Peregrine, an ML-based malicious traffic detector for Terabit networks. The key idea is to run the detection process partially in the network data plane. Specifically, we offload the detector's ML feature computation to a commodity switch. The Peregrine switch processes a diversity of features per-packet, at Tbps line rates - three orders of magnitude higher than the fastest detector - to feed the ML-based component in the control plane. Our offloading approach presents a distinct advantage. While, in practice, current systems sample raw traffic, in Peregrine sampling occurs after feature computation. This essential trait enables computing features over all traffic, significantly enhancing detection performance. The Peregrine detector is not only effective for Terabit networks, but it is also energy- and cost-efficient. Further, by shifting a compute-heavy component to the switch, it saves precious CPU cycles and improves detection throughput.
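To make the ordering argument concrete, the following is a minimal sketch, not the authors' implementation. It contrasts sampling raw packets before feature computation with sampling feature records after it. The function names, the single synthetic flow, the 1-in-100 sampling rate, and the use of a per-flow packet count as a stand-in feature are all assumptions made for illustration. In Peregrine, feature computation runs in the switch data plane, not in Python.

```python
from collections import defaultdict

def sample_then_compute(packets, n):
    """Baseline ordering: keep 1-in-n raw packets, update features on those only."""
    counts = defaultdict(int)
    records = []
    for i, (flow, _size) in enumerate(packets, start=1):
        if i % n == 0:                # packet survives sampling
            counts[flow] += 1         # feature sees only sampled packets
            records.append((flow, counts[flow]))
    return records

def compute_then_sample(packets, n):
    """Ordering described in the abstract: update features on every packet,
    then export 1-in-n feature records to the detector."""
    counts = defaultdict(int)
    records = []
    for i, (flow, _size) in enumerate(packets, start=1):
        counts[flow] += 1             # feature sees all traffic
        if i % n == 0:                # only the export is sampled
            records.append((flow, counts[flow]))
    return records

# Hypothetical trace: 1000 packets of 64 bytes from a single flow "a".
packets = [("a", 64)] * 1000

before = sample_then_compute(packets, 100)
after = compute_then_sample(packets, 100)

print(before[-1])  # ('a', 10): the feature reflects 1% of the traffic
print(after[-1])   # ('a', 1000): the feature reflects all traffic
```

Both orderings hand the detector the same number of records (10 here), so the load on the ML component is unchanged. Only the second ordering lets each record summarize every packet that crossed the switch.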