Accurate Low-Degree Polynomial Approximation of Non-polynomial Operators for Fast Private Inference in Homomorphic Encryption (2404.03216v3)
Abstract: As ML permeates fields like healthcare, facial recognition, and blockchain, the need to protect sensitive data intensifies. Fully Homomorphic Encryption (FHE) allows inference on encrypted data, preserving the privacy of both the data and the ML model. However, it slows down inference by up to five orders of magnitude compared to non-secure inference, a root cause being the replacement of non-polynomial operators (ReLU and MaxPooling) with high-degree Polynomial Approximated Functions (PAFs). We propose SmartPAF, a framework that replaces non-polynomial operators with low-degree PAFs and then recovers the accuracy of the PAF-approximated model through four techniques: (1) Coefficient Tuning (CT) -- adjust PAF coefficients based on the input distributions before training; (2) Progressive Approximation (PA) -- progressively replace one non-polynomial operator at a time, each replacement followed by fine-tuning; (3) Alternate Training (AT) -- alternate training between the PAFs and the other linear operators in a decoupled manner; and (4) Dynamic Scale (DS) / Static Scale (SS) -- dynamically scale PAF input values into (-1, 1) during training, and fix the scale to the running maximum value for FHE deployment. The synergistic effect of CT, PA, AT, and DS/SS enables SmartPAF to improve the accuracy of various models approximated by PAFs of various low degrees across multiple datasets. For ResNet-18 on ImageNet-1k, the Pareto frontier found by SmartPAF in the latency-accuracy tradeoff space achieves 1.42x ~ 13.64x accuracy improvement and 6.79x ~ 14.9x speedup over prior works. Further, SmartPAF enables a 14-degree PAF (f1^2 g1^2) to achieve 7.81x speedup compared to the 27-degree PAF obtained by minimax approximation, at the same 69.4% post-replacement accuracy. Our code is available at https://github.com/EfficientFHE/SmartPAF.
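
As a rough illustration of the Dynamic Scale / Static Scale idea summarized above, the sketch below shows one way a low-degree PAF could stand in for ReLU in a PyTorch model. It is a minimal, hypothetical example (the module name, coefficient handling, and scale tracking are assumptions, not the authors' released implementation, which lives in the linked repository): it only demonstrates evaluating a polynomial on inputs rescaled into (-1, 1) during training and on a frozen running-max scale at deployment.

```python
# Minimal, hypothetical sketch of a PAF replacement for ReLU with
# Dynamic Scale (training) / Static Scale (deployment); not the paper's code.
import torch
import torch.nn as nn


class PAFReLU(nn.Module):
    def __init__(self, coeffs):
        super().__init__()
        # Low-degree polynomial coefficients c0..cd for p(x) = sum_i c_i * x^i.
        # Stored as a Parameter so fine-tuning (e.g., Coefficient Tuning or
        # Alternate Training) can adjust them; the values are placeholders.
        self.coeffs = nn.Parameter(torch.tensor(coeffs, dtype=torch.float32))
        # Largest |x| observed during training; frozen for FHE deployment.
        self.register_buffer("running_max", torch.tensor(0.0))

    def forward(self, x):
        if self.training:
            # Dynamic Scale: rescale by the current batch's max magnitude so the
            # polynomial is evaluated inside (-1, 1), where it approximates well.
            batch_max = x.detach().abs().max().clamp(min=1e-6)
            self.running_max.copy_(torch.maximum(self.running_max, batch_max))
            scale = batch_max
        else:
            # Static Scale: use the frozen running max, since an FHE backend
            # cannot cheaply compute a data-dependent maximum at inference time.
            scale = self.running_max.clamp(min=1e-6)
        z = x / scale
        # Horner evaluation: only additions and multiplications, so the same
        # computation maps directly onto FHE ciphertext operations.
        y = torch.zeros_like(z)
        for c in self.coeffs.flip(0):
            y = y * z + c
        # ReLU is positively homogeneous, so undoing the scale recovers an
        # approximation of ReLU(x) from the approximation on (-1, 1).
        return y * scale
```

Under the Progressive Approximation step described above, such a module would be swapped in for one ReLU at a time, with fine-tuning after each swap, rather than replacing all activations at once.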