
CiFHER: A Chiplet-Based FHE Accelerator with a Resizable Structure (2308.04890v3)

Published 9 Aug 2023 in cs.AR and cs.CR

Abstract: Fully homomorphic encryption (FHE) is in the spotlight as a definitive solution for privacy, but the high computational overhead of FHE poses a challenge to its practical adoption. Although prior studies have attempted to design ASIC accelerators to mitigate the overhead, their designs require excessive chip resources (e.g., areas) to contain and process massive data for FHE operations. We propose CiFHER, a chiplet-based FHE accelerator with a resizable structure, to tackle the challenge with a cost-effective multi-chip module (MCM) design. First, we devise a flexible core architecture whose configuration is adjustable to conform to the global organization of chiplets and design constraints. Its distinctive feature is a composable functional unit providing varying computational throughput for the number-theoretic transform, the most dominant function in FHE. Then, we establish generalized data mapping methodologies to minimize the interconnect overhead when organizing the chips into the MCM package in a tiled manner, which becomes a significant bottleneck due to the packaging constraints. This study demonstrates that a CiFHER package composed of a number of compact chiplets provides performance comparable to state-of-the-art monolithic ASIC accelerators while significantly reducing the package-wide power consumption and manufacturing cost.

