A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards (2308.01074v1)

Published 2 Aug 2023 in cs.CR and cs.LG

Abstract: With recent developments in deep learning, the ubiquity of microphones and the rise in online services via personal devices, acoustic side channel attacks present a greater threat to keyboards than ever. This paper presents a practical implementation of a state-of-the-art deep learning model in order to classify laptop keystrokes, using a smartphone integrated microphone. When trained on keystrokes recorded by a nearby phone, the classifier achieved an accuracy of 95%, the highest accuracy seen without the use of an LLM. When trained on keystrokes recorded using the video-conferencing software Zoom, an accuracy of 93% was achieved, a new best for the medium. Our results prove the practicality of these side channel attacks via off-the-shelf equipment and algorithms. We discuss a series of mitigation methods to protect users against this series of attacks.

Summary

The paper develops a comprehensive ASCA pipeline that isolates keystrokes, extracts mel-spectrogram features, and employs a self-attention transformer for classification.
The study demonstrates attacks using both a nearby smartphone and remote Zoom recordings, achieving 95% and 93% accuracy respectively.
The implementation of the CoAtNet architecture highlights deep learning’s potential to elevate both on-site and remote acoustic keyboard attack methodologies.

Acoustic Side Channel Attacks on Keyboards: Deep Learning Approaches

This paper presents an exploration of acoustic side channel attacks (ASCAs) on keyboards, leveraging advancements in deep learning (DL) to enhance the accuracy and practicality of these attacks. The paper addresses key questions concerning the implementation of a fully automated ASCA pipeline and the applicability of deep learning models in both on-site and remote attacks involving contemporary laptops.

Key Contributions

Development of ASCA Pipeline: The paper proposes a comprehensive, automated ASCA pipeline tailored for keyboards. This pipeline includes essential stages of keystroke separation, feature extraction using mel-spectrograms, and final prediction with a deep learning model featuring self-attention transformer layers.
Experimental Setup: Experiments are conducted using two methods of data collection. The first involves capturing keystroke sounds with a smartphone placed near the laptop, while the second exploits Zoom's video-conferencing tool for remote data gathering. The latter approach is noteworthy for demonstrating the feasibility of remote ASCAs without physical access to the target device.
Deep Learning Model Implementation: The paper implements CoAtNet, an architecture that combines convolution with self-attention transformer layers, to classify acoustically captured keystrokes. It reports 95% accuracy on phone-recorded keystrokes, which the authors describe as the highest seen without the use of a language model, and 93% on Zoom recordings, which they describe as a new best for that medium.
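The feature-extraction stage relies on mel-spectrograms, which warp the frequency axis onto the perceptual mel scale before the classifier sees it. The sketch below shows one common way to build the triangular mel filterbank that performs this warping. It is an illustration only, not the authors' code: the function names, the HTK-style mel formula, and the parameter values in the usage note are assumptions.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Return n_mels triangular filters, each a list of n_fft // 2 + 1 weights."""
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 points equally spaced in mels: two outer edges plus the centers.
    mel_points = [max_mel * i / (n_mels + 1) for i in range(n_mels + 2)]
    hz_points = [mel_to_hz(m) for m in mel_points]
    # Center frequency of each FFT bin in Hz.
    bin_freqs = [k * sample_rate / n_fft for k in range(n_bins)]

    filters = []
    for i in range(1, n_mels + 1):
        lo, center, hi = hz_points[i - 1], hz_points[i], hz_points[i + 1]
        row = []
        for f in bin_freqs:
            if lo < f <= center:
                weight = (f - lo) / (center - lo)    # rising edge
            elif center < f < hi:
                weight = (hi - f) / (hi - center)    # falling edge
            else:
                weight = 0.0
            row.append(weight)
        filters.append(row)
    return filters


def apply_filterbank(filters, power_spectrum):
    """Collapse one frame's power spectrum into mel-band energies."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in filters]
```

Applying `apply_filterbank` to the power spectrum of each short-time frame, then taking a logarithm, would yield one column of a mel-spectrogram. A call such as `mel_filterbank(64, 1024, 44100)` uses placeholder values, not settings confirmed by this summary.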

Methodology Overview

The methodology centers on building a DL model that classifies keystrokes from their audio features. Key stages involve:

Data Parsing and Augmentation: Keystrokes are isolated through energy-based thresholding and processed into mel-spectrograms, incorporating SpecAugment techniques to bolster the model's robustness and prevent overfitting.
Model Configuration and Training: By tuning hyperparameters such as learning rate and number of training epochs, the paper arrives at a configuration of the CoAtNet model that achieves high classification accuracy on unseen test data.
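The two data-preparation ideas named above, energy-based thresholding and SpecAugment-style masking, can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the paper's implementation: the function names, the fixed non-overlapping frames, the hand-picked threshold, and the choice to mask with zeros are all hypothetical.

```python
import random


def frame_energies(samples, frame_len):
    """Sum of squared amplitudes for each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def isolate_keystrokes(samples, frame_len, threshold):
    """Return (start, end) sample indices of runs of frames whose energy exceeds threshold."""
    energies = frame_energies(samples, frame_len)
    segments = []
    start = None
    for idx, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = idx * frame_len                    # a loud run begins
        elif energy <= threshold and start is not None:
            segments.append((start, idx * frame_len))  # the run ends
            start = None
    if start is not None:                              # run reaches the end of the signal
        segments.append((start, len(energies) * frame_len))
    return segments


def mask_spectrogram(spec, max_freq_width, max_time_width, rng):
    """Zero one random band of rows (frequency) and one of columns (time); returns a copy."""
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]
    f_width = rng.randint(0, min(max_freq_width, n_freq))
    f_start = rng.randint(0, n_freq - f_width)
    t_width = rng.randint(0, min(max_time_width, n_time))
    t_start = rng.randint(0, n_time - t_width)
    for f in range(f_start, f_start + f_width):
        out[f] = [0.0] * n_time
    for row in out:
        for t in range(t_start, t_start + t_width):
            row[t] = 0.0
    return out


# Toy signal: silence, a loud burst, silence, a quieter burst, silence.
signal = [0.0] * 100 + [1.0] * 50 + [0.0] * 100 + [0.5] * 50 + [0.0] * 100
print(isolate_keystrokes(signal, frame_len=10, threshold=0.5))
# -> [(100, 150), (250, 300)]
```

In practice the threshold would need tuning per recording, since a keystroke captured through a phone microphone and one captured through a video call will differ in loudness and noise floor.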

Practical Implications and Future Directions

The research highlights the increasing vulnerability of laptop users to ASCAs, especially in environments populated with always-on microphone devices such as smart speakers and smartphones. Remote audio capture through common communication applications is a particular concern, because it removes the need for an attacker to be physically near the target.

Mitigation strategies suggested include altering typing habits, using randomized passwords with varied characters, and enhancing VoIP software with acoustic masking techniques to obfuscate keystroke sounds.
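As an illustration of the randomized-password suggestion, the sketch below generates a password that mixes character classes using Python's standard `secrets` module. The function name, the default length, and the rule of including at least one character from each class are assumptions made for this example, not recommendations taken from the paper.

```python
import secrets
import string


def random_password(length=16):
    """Generate a random password containing lowercase, uppercase, digit and punctuation characters."""
    classes = [
        string.ascii_lowercase,
        string.ascii_uppercase,
        string.digits,
        string.punctuation,
    ]
    if length < len(classes):
        raise ValueError("length must be at least 4 to cover every character class")
    # One guaranteed character from each class, then fill from the full alphabet.
    chars = [secrets.choice(cls) for cls in classes]
    alphabet = "".join(classes)
    chars += [secrets.choice(alphabet) for _ in range(length - len(classes))]
    # Shuffle with a cryptographically secure generator so class positions are not predictable.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)


print(random_password(20))
```

A password manager that fills credentials without typing would avoid producing keystroke sounds at all, though that is a general observation rather than a claim from this summary.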

Future work suggested by the authors includes improving keystroke isolation techniques and combining LLMs with DL classifiers to improve keystroke identification. Studying ASCAs under real-world conditions is a further direction that spans cybersecurity and machine learning.

In conclusion, the paper shows that deep learning makes acoustic side channel attacks on keyboards more accurate and more practical, and it lays groundwork for discussion of how to defend against them.
