Realistic Website Fingerprinting By Augmenting Network Trace (2309.10147v1)
Abstract: Website Fingerprinting (WF) is considered a major threat to the anonymity of users of Tor and other anonymity systems. While state-of-the-art WF techniques have claimed high attack accuracies, e.g., by leveraging Deep Neural Networks (DNN), several recent works have questioned the practicality of such WF attacks in the real world due to the assumptions made in the design and evaluation of these attacks. In this work, we argue that such impracticality issues are mainly due to the attacker's inability to collect training data under comprehensive network conditions, e.g., a WF classifier may be trained only on samples collected on specific high-bandwidth network links but deployed on connections with different network conditions. We show that augmenting network traces can enhance the performance of WF classifiers in unobserved network conditions. Specifically, we introduce NetAugment, an augmentation technique tailored to the specifications of Tor traces. We instantiate NetAugment through semi-supervised and self-supervised learning techniques. Our extensive open-world and closed-world experiments demonstrate that under practical evaluation settings, our WF attacks provide superior performance compared to the state-of-the-art; this is due to their use of augmented network traces for training, which allows them to learn the features of target traffic in unobserved settings. For instance, with 5-shot learning in a closed-world scenario, our self-supervised WF attack (named NetCLR) reaches up to 80% accuracy when the traces for evaluation are collected in a setting unobserved by the WF adversary. This is compared to an accuracy of 64.4% achieved by the state-of-the-art Triplet Fingerprinting [35]. We believe that the promising results of our work can encourage the use of network trace augmentation in other types of network traffic analysis.
- P. Bachman, O. Alsharif and D. Precup “Learning with pseudo-ensembles” In NIPS, 2014
- “ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring” In ICLR, 2020
- “Var-CNN and DynaFlow: Improved Attacks and Defenses for Website Fingerprinting” In arXiv preprint arXiv:1802.10215, 2018
- “Touching from a distance: Website fingerprinting attacks and defenses” In ACM CCS, 2012
- “A simple framework for contrastive learning of visual representations” In ICML, 2020
- G. Cherubin, J. Hayes and M. Juarez “Website fingerprinting defenses at the application layer” In PETS, 2017
- G. Cherubin, R. Jansen and C. Troncoso “Online Website Fingerprinting: Evaluating Website Fingerprinting Attacks on Tor in the Real World” In USENIX Security, 2022
- M. Perry “A Critique of Website Fingerprinting Attacks” Tor Project Blog, 2013 URL: https://blog.torproject.org/blog/critique-website-traffic-fingerprinting-attacks
- “Autoaugment: Learning augmentation strategies from data” In CVPR, 2019
- “Randaugment: Practical automated data augmentation with a reduced search space” In CVPR, 2020
- R. Dingledine, N. Mathewson and P. Syverson “Tor: The second-generation onion router” In USENIX Security, 2004
- “A Survey on Concept Drift Adaptation” In ACM Comput. Surv. Association for Computing Machinery, 2014 URL: https://doi.org/10.1145/2523813
- “k-fingerprinting: A robust scalable website fingerprinting technique” In USENIX Security, 2016
- “Batch normalization: Accelerating deep network training by reducing internal covariate shift” In ICML, 2015
- “Inside Job: Applying Traffic Analysis to Measure Tor from Within” In NDSS, 2018
- “A critical evaluation of website fingerprinting attacks” In ACM CCS, 2014
- “Temporal ensembling for semi-supervised learning” In arXiv preprint arXiv:1610.02242, 2016
- D. Lee “Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks” In ICML 2013 Workshop: Challenges in Representation Learning (WREPL), 2013
- G. McLachlan “Iterative reclassification procedure for constructing an asymptotically optimal rule of allocation in discriminant analysis” In Journal of the American Statistical Association, 1975
- “A unifying view on dataset shift in classification” In Pattern Recognition, 2012 URL: https://sciencedirect.com/science/article/pii/S0031320311002901
- M. Nasr, A. Bahramali and A. Houmansadr “Defeating DNN-Based Traffic Analysis Systems in Real-Time With Blind Adversarial Perturbations” In USENIX Security, 2021
- “GANDaLF: GAN for Data-Limited Fingerprinting” In PETS, 2021
- S. Oh, S. Sunkam and N. Hopper “p-FP: Extraction, Classification, and Prediction of Website Fingerprints with Deep Learning” In PETS, 2019
- “Website Fingerprinting at Internet Scale” In NDSS, 2016
- “Website fingerprinting in onion routing based anonymization networks” In WPES, 2011
- “Website Fingerprinting with Website Oracles” In PETS, 2020
- “Tik-Tok: The utility of packet timing in website fingerprinting attacks” In PETS, 2020
- “Automated website fingerprinting through deep learning” In NDSS, 2018
- C. Rosenberg, M. Hebert and H. Schneiderman “Semi-supervised self-training of object detection models” In 2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV/MOTION’05) - Volume 1, 2005, pp. 29–36 DOI: 10.1109/ACVMOT.2005.107
- M. Sajjadi, M. Javanmardi and T. Tasdizen “Regularization with stochastic transformations and perturbations for deep semi-supervised learning” In Advances in neural information processing systems, 2016
- F. Schroff, D. Kalenichenko and J. Philbin “Facenet: A unified embedding for face recognition and clustering” In CVPR, 2015
- H. Scudder “Probability of error of some adaptive pattern-recognition machines” In IEEE Transactions on Information Theory 11.3, 1965, pp. 363–371 DOI: 10.1109/TIT.1965.1053799
- “Python Language Bindings for Selenium WebDriver”, 2020 URL: https://pypi.org/project/selenium/
- “Deep fingerprinting: Undermining website fingerprinting defenses with deep learning” In ACM CCS, 2018
- “Triplet Fingerprinting: More Practical and Portable Website Fingerprinting with N-shot Learning” In ACM CCS, 2019
- “Fixmatch: Simplifying semi-supervised learning with consistency and confidence” In NIPS, 2020
- “Stem”, 2022 URL: https://pypi.org/project/stem/1.8.1/
- “Tor-browser-selenium”, 2022 URL: https://pypi.org/project/tbselenium/0.6.3/
- “Tor Metrics Portal”, 2023 URL: https://metrics.torproject.org/
- “Tshark(1) Manual Page”, 2022 URL: https://wireshark.org/docs/man-pages/tshark.html
- “Dobbs: Towards a comprehensive dataset to study the browsing behavior of online users” In WI-IAT, 2013
- T. Wang “High Precision Open-World Website Fingerprinting” In IEEE S&P, 2020
- T. Wang “Website fingerprinting: Attacks and defenses” University of Waterloo, 2016
- “Effective Attacks and Provable Defenses for Website Fingerprinting” In USENIX Security, 2014
- “Improved website fingerprinting on tor” In WPES, 2013
- “On realistically attacking tor with website fingerprinting” In PETS, 2016
- “Self-training with noisy student improves imagenet classification” In CVPR, 2020
- “A multi-tab website fingerprinting attack” In ACSAC, 2018