Investigating Efficient Deep Learning Architectures For Side-Channel Attacks on AES (2309.13170v1)
Abstract: Over the past few years, deep learning has become progressively more popular for exploiting side-channel vulnerabilities in embedded cryptographic applications, as it reduces the number of attack traces required for effective key recovery. A number of effective attacks using neural networks have already been published, but reducing the computing resources and data they require is an ever-present goal, which we pursue in this work. We focus on the ANSSI Side-Channel Attack Database (ASCAD) and produce a JAX-based framework for deep-learning-based side-channel analysis (SCA), with which we reproduce a selection of previous results and build upon them in an attempt to improve their performance. We also investigate the effectiveness of various Transformer-based models.
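The paper's framework itself is not reproduced here, but as a rough illustration of how such a JAX-based profiling attack fits together, below is a minimal sketch using the Haiku and Optax libraries cited in the references. The MLP architecture, the 700-sample trace shape, the placeholder `SBOX` table, and the helper names (`net_fn`, `train_step`, `key_log_likelihoods`) are illustrative assumptions, not the paper's models: profiling minimises cross-entropy over an intermediate-byte label, and key recovery sums per-trace log-probabilities for each of the 256 key-byte guesses.

```python
import jax
import jax.numpy as jnp
import haiku as hk
import optax

# Classifier over raw trace samples: 256 output classes, one per possible
# value of the targeted intermediate byte, e.g. Sbox(plaintext ^ key_byte).
def net_fn(traces):
    return hk.nets.MLP([256, 256, 256])(traces)

net = hk.without_apply_rng(hk.transform(net_fn))
optimizer = optax.adam(1e-3)

def loss_fn(params, traces, labels):
    logits = net.apply(params, traces)
    return optax.softmax_cross_entropy_with_integer_labels(logits, labels).mean()

@jax.jit
def train_step(params, opt_state, traces, labels):
    # One profiling step: gradient of the cross-entropy loss, then Adam update.
    loss, grads = jax.value_and_grad(loss_fn)(params, traces, labels)
    updates, opt_state = optimizer.update(grads, opt_state)
    return optax.apply_updates(params, updates), opt_state, loss

# Placeholder table; a real attack would substitute the AES S-box here.
SBOX = jnp.arange(256, dtype=jnp.uint8)

def key_log_likelihoods(params, traces, plaintexts):
    # For each of the 256 key-byte guesses, look up the log-probability the
    # model assigns to the implied intermediate value, summed over all traces.
    log_probs = jax.nn.log_softmax(net.apply(params, traces))            # (N, 256)
    hyps = SBOX[plaintexts[:, None] ^ jnp.arange(256, dtype=jnp.uint8)]  # (N, 256)
    return jnp.take_along_axis(log_probs, hyps.astype(jnp.int32), axis=1).sum(axis=0)

# Initialise on ASCAD-shaped dummy data (700 samples per trace, batch of 32).
rng = jax.random.PRNGKey(0)
params = net.init(rng, jnp.zeros((32, 700)))
opt_state = optimizer.init(params)
```

The usual efficiency measure in this setting is the rank of the correct key byte in the sorted output of `key_log_likelihoods`: an attack is stronger if that rank drops to zero after fewer attack traces.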
- ANSSI and CEA “ASCAD: Side Channels Analysis and Deep Learning”, 2019-2021 URL: https://github.com/ANSSI-FR/ASCAD
- “Secure AES128 Encryption Implementation for ATmega8515”, 2018 URL: https://github.com/ANSSI-FR/secAES-ATmega8515
- “Deep learning for side-channel analysis and introduction to ASCAD database” In Journal of Cryptographic Engineering 10, 2020 DOI: 10.1007/s13389-019-00220-8
- “JAX: composable transformations of Python+NumPy programs”, 2018 URL: http://github.com/google/jax
- “Language Models are Few-Shot Learners”, 2020 arXiv:2005.14165 [cs.CL]
- “Rethinking Attention with Performers”, 2021 arXiv:2009.14794 [cs.LG]
- ASCAD Community “Issue 13 - Difference of Datasets: Sampling Frequency / EM & Power?” Accessed: 2022-02-01 URL: https://archive.is/CMSAL
- “Wavelet transform based pre-processing for side channel analysis” In Proceedings of the 2012 45th IEEE/ACM International Symposium on Microarchitecture Workshops (MICROW 2012), 2012, pp. 32–38 DOI: 10.1109/MICROW.2012.15
- “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”, 2019 arXiv:1810.04805 [cs.CL]
- Pierre Dusart, Gilles Letourneux and Olivier Vivolo “Differential fault analysis on AES” In International Conference on Applied Cryptography and Network Security, 2003, pp. 293–306 Springer
- “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale”, 2021 arXiv:2010.11929 [cs.CV]
- “Tiny Transformers for Environmental Sound Classification at the Edge”, 2021 arXiv:2103.12157 [cs.SD]
- “Affine Masking against Higher-Order Side Channel Analysis”, Cryptology ePrint Archive, Report 2010/523, 2010 URL: https://ia.cr/2010/523
- Yuan Gong, Yu-An Chung and James Glass “AST: Audio Spectrogram Transformer”, 2021 arXiv:2104.01778 [cs.SD]
- Daniel Genkin, Adi Shamir and Eran Tromer “Acoustic cryptanalysis” In Journal of Cryptology 30.2 Springer, 2017, pp. 392–443
- “Deep Residual Learning for Image Recognition”, 2015 arXiv:1512.03385 [cs.CV]
- “Haiku: Sonnet for JAX”, 2020 URL: http://github.com/deepmind/dm-haiku
- “Optax: composable gradient transformation and optimisation, in JAX!”, 2020 URL: http://github.com/deepmind/optax
- Jeremy Howard “The fastai deep learning library”, 2018 URL: https://github.com/fastai/fastai
- “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”, 2015 arXiv:1502.03167 [cs.LG]
- “Averaging Weights Leads to Wider Optima and Better Generalization”, 2019 arXiv:1803.05407 [cs.LG]
- “Perceiver IO: A General Architecture for Structured Inputs and Outputs”, 2021 arXiv:2107.14795 [cs.LG]
- “Perceiver: General Perception with Iterative Attention”, 2021 arXiv:2103.03206 [cs.CV]
- Diederik P. Kingma and Jimmy Ba “Adam: A Method for Stochastic Optimization”, 2017 arXiv:1412.6980 [cs.LG]
- Nikita Kitaev, Łukasz Kaiser and Anselm Levskaya “Reformer: The Efficient Transformer”, 2020 arXiv:2001.04451 [cs.LG]
- Paul C Kocher “Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems” In Annual International Cryptology Conference, 1996, pp. 104–113 Springer
- “Deep Learning for Symbolic Mathematics”, 2019 arXiv:1912.01412 [cs.SC]
- “Pay Attention to Raw Traces: A Deep Learning Architecture for End-to-End Profiling Attacks” In IACR Transactions on Cryptographic Hardware and Embedded Systems 2021.3, 2021, pp. 235–274 DOI: 10.46586/tches.v2021.i3.235-274
- Loïc Masure, Cécile Dumas and Emmanuel Prouff “Gradient Visualization for General Characterization in Profiling Attacks” In Constructive Side-Channel Analysis and Secure Design - 10th International Workshop, COSADE 2019, Darmstadt, Germany, April 3-5, 2019, Proceedings 11421, Lecture Notes in Computer Science Springer, 2019, pp. 145–167 DOI: 10.1007/978-3-030-16350-1_9
- “Side Channel Analysis against the ANSSI’s protected AES implementation on ARM”, Cryptology ePrint Archive, Report 2021/592, 2021 URL: https://ia.cr/2021/592
- “Training Tips for the Transformer Model” In The Prague Bulletin of Mathematical Linguistics 110.1 Charles University in Prague, Karolinum Press, 2018, pp. 43–70 DOI: 10.2478/pralin-2018-0002
- “Language models are unsupervised multitask learners” In OpenAI blog 1.8, 2019, pp. 9
- “Power side-channel attack analysis: A review of 20 years of study for the layman” In Cryptography 4.2 Multidisciplinary Digital Publishing Institute, 2020, pp. 15
- Ryad Benadjila, Emmanuel Prouff and Adrian Thillard “Hardened Library for AES-128 encryption/decryption on ARM Cortex M4 Architecture”, 2019 URL: https://github.com/ANSSI-FR/SecAESSTM32/
- “Audiomer: A Convolutional Transformer For Keyword Spotting”, 2022 arXiv:2109.10252 [cs.LG]
- Leslie N. Smith “Cyclical Learning Rates for Training Neural Networks”, 2017 arXiv:1506.01186 [cs.CV]
- Leslie N. Smith “A disciplined approach to neural network hyper-parameters: Part 1 – learning rate, batch size, momentum, and weight decay”, 2018 arXiv:1803.09820 [cs.LG]
- Leslie N. Smith and Nicholay Topin “Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates”, 2018 arXiv:1708.07120 [cs.LG]
- “Very Deep Convolutional Networks for Large-Scale Image Recognition”, 2015 arXiv:1409.1556 [cs.CV]
- “Rethinking the Inception Architecture for Computer Vision”, 2015 arXiv:1512.00567 [cs.CV]
- “Attention Is All You Need”, 2017 arXiv:1706.03762 [cs.CL]
- “CvT: Introducing Convolutions to Vision Transformers”, 2021 arXiv:2103.15808 [cs.CV]
- Yang You, Igor Gitman and Boris Ginsburg “Large Batch Training of Convolutional Networks”, 2017 arXiv:1708.03888 [cs.CV]
- “Large Batch Optimization for Deep Learning: Training BERT in 76 minutes”, 2020 arXiv:1904.00962 [cs.LG]
- “Methodology for Efficient CNN Architectures in Profiling Attacks” In IACR Transactions on Cryptographic Hardware and Embedded Systems 2020.1, 2019, pp. 1–36 DOI: 10.13154/tches.v2020.i1.1-36
- “Ranking Loss: Maximizing the Success Rate in Deep Learning Side-Channel Analysis” In IACR Transactions on Cryptographic Hardware and Embedded Systems 2021.1, 2020, pp. 25–55 DOI: 10.46586/tches.v2021.i1.25-55