
Gazelle: A Low Latency Framework for Secure Neural Network Inference (1801.05507v1)

Published 16 Jan 2018 in cs.CR

Abstract: The growing popularity of cloud-based machine learning raises a natural question about the privacy guarantees that can be provided in such a setting. Our work tackles this problem in the context where a client wishes to classify private images using a convolutional neural network (CNN) trained by a server. Our goal is to build efficient protocols whereby the client can acquire the classification result without revealing their input to the server, while guaranteeing the privacy of the server's neural network. To this end, we design Gazelle, a scalable and low-latency system for secure neural network inference, using an intricate combination of homomorphic encryption and traditional two-party computation techniques (such as garbled circuits). Gazelle makes three contributions. First, we design the Gazelle homomorphic encryption library which provides fast algorithms for basic homomorphic operations such as SIMD (single instruction multiple data) addition, SIMD multiplication and ciphertext permutation. Second, we implement the Gazelle homomorphic linear algebra kernels which map neural network layers to optimized homomorphic matrix-vector multiplication and convolution routines. Third, we design optimized encryption switching protocols which seamlessly convert between homomorphic and garbled circuit encodings to enable implementation of complete neural network inference. We evaluate our protocols on benchmark neural networks trained on the MNIST and CIFAR-10 datasets and show that Gazelle outperforms the best existing systems such as MiniONN (ACM CCS 2017) by 20 times and Chameleon (Crypto Eprint 2017/1164) by 30 times in online runtime. Similarly when compared with fully homomorphic approaches like CryptoNets (ICML 2016) we demonstrate three orders of magnitude faster online run-time.

Citations (826)

Summary

  • The paper introduces a novel framework that securely performs CNN inference by leveraging homomorphic encryption and two-party computation.
  • The approach achieves significant speedups, up to 30x over existing secure inference systems, while keeping online latency low and protecting both the client's input and the server's model.
  • The methodology integrates efficient encryption switching protocols and tailored linear algebra kernels to streamline secure, practical neural network deployment.

Secure Neural Network Inference with GAZELLE: A Low Latency Approach

The paper, titled "GAZELLE: A Low Latency Framework for Secure Neural Network Inference," addresses the critical issue of privacy in cloud-based machine learning, particularly focusing on convolutional neural networks (CNNs). The computational paradigms offered by cloud infrastructures significantly enhance the accessibility and efficiency of machine learning but introduce potential vulnerabilities concerning data privacy. The authors, Chiraag Juvekar, Vinod Vaikuntanathan, and Anantha Chandrakasan, propose a solution that leverages homomorphic encryption and two-party computation to safeguard both user data and model integrity during inference.

Problem Statement and Goals

The primary objective of GAZELLE is to enable a client to classify private images using a pre-trained CNN hosted on a cloud server without disclosing the input to the server and without exposing the server-side model to the client. This dual-sided privacy is a fundamental requirement for applications in sensitive domains like medical diagnosis.

Methodological Contributions

GAZELLE introduces a scalable and low-latency solution through the clever amalgamation of homomorphic encryption techniques and two-party computation. The framework makes three notable contributions:

  1. GAZELLE Homomorphic Encryption Library: This component offers optimized algorithms for basic homomorphic operations, including SIMD addition, SIMD multiplication, and ciphertext permutation. These optimizations keep the cost of the homomorphic linear layers low, since every linear layer is built from these three primitives.
  2. GAZELLE Linear Algebra Kernels: These kernels facilitate the mapping of neural network layers to efficient homomorphic matrix-vector multiplication and convolution routines. The kernels are crucial for the practical deployment of neural networks under a blind inference setting.
  3. Optimized Encryption Switching Protocols: GAZELLE introduces protocols that efficiently transition between homomorphic encryption and garbled circuits, enabling seamless and secure neural network inference.
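In the clear, the three PAHE primitives and the way the linear algebra kernels compose them can be sketched as follows. This is an illustrative plaintext mock, not the Gazelle library API: the modulus P, the function names, and the use of the classic diagonal method for matrix-vector products are assumptions made for exposition.

```python
# Plaintext mock of the packed additive homomorphic encryption (PAHE)
# interface. Real ciphertexts are lattice-based; here a "ciphertext" is
# just a list of slots reduced mod a toy plaintext modulus P, so the
# slot semantics of the three core operations are visible.

P = 2**16 + 1  # toy plaintext modulus (hypothetical choice)

def simd_add(ct_a, ct_b):
    """Slot-wise addition of two packed ciphertexts."""
    return [(a + b) % P for a, b in zip(ct_a, ct_b)]

def simd_mul(ct, pt):
    """Slot-wise multiplication of a ciphertext by a packed plaintext."""
    return [(c * m) % P for c, m in zip(ct, pt)]

def rotate(ct, k):
    """Cyclic permutation of the ciphertext slots by k positions."""
    return ct[k:] + ct[:k]

def matvec_diagonal(matrix, ct_x):
    """n x n matrix-vector product built only from the three ops above,
    via the classic diagonal method: one rotation and one SIMD
    multiply-accumulate per diagonal of the matrix."""
    n = len(ct_x)
    acc = [0] * n
    for d in range(n):
        diag = [matrix[i][(i + d) % n] for i in range(n)]
        acc = simd_add(acc, simd_mul(rotate(ct_x, d), diag))
    return acc

M = [[1, 2], [3, 4]]
x = [5, 6]
print(matvec_diagonal(M, x))  # [17, 39]
```

The diagonal traversal is what makes such a matrix-vector product cost n rotations and n SIMD multiply-adds rather than n^2 scalar homomorphic operations.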

Performance and Evaluation

GAZELLE's performance was evaluated on benchmark neural networks trained on the MNIST and CIFAR-10 datasets. The system achieved significant improvements in online runtime: a 20x speedup over MiniONN, a 30x speedup over Chameleon, and three orders of magnitude over fully homomorphic approaches like CryptoNets. These benchmarks underscore GAZELLE's efficiency and scalability in practical applications.

Homomorphic Encryption and Two-Party Computation

GAZELLE's approach to combining homomorphic encryption with two-party computation addresses the inherent limitations of each method when used in isolation.

  • Homomorphic Encryption: The library is built on lattice-based packed additive homomorphic encryption (PAHE), in which a single ciphertext packs many plaintext slots. The pivotal operations are slot-wise (SIMD) addition, scalar multiplication, and ciphertext permutation, which together provide a robust basis for linear algebra over encrypted data.
  • Two-Party Computation: The use of Yao's garbled circuits for non-linear functions within the neural network ensures that non-linear layers such as ReLU and MaxPool can be computed securely with minimal communication overhead.
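Functionally, the garbled circuit for a non-linear layer receives additive secret shares of a value, reconstructs it inside the circuit, and emits fresh shares of the result, so neither party ever sees the value itself. A plaintext sketch of what that circuit computes for ReLU (the modulus and helper names are illustrative assumptions, not from the paper):

```python
import secrets

P = 2**16 + 1  # toy plaintext modulus (hypothetical)

def share(x):
    """Split x into two additive shares mod P."""
    r = secrets.randbelow(P)
    return r, (x - r) % P

def to_signed(v):
    """Interpret a residue mod P as a signed value in (-P/2, P/2]."""
    return v - P if v > P // 2 else v

def garbled_relu(server_share, client_share):
    """The function the garbled circuit evaluates, written in the clear:
    reconstruct the value, apply ReLU, and output fresh shares of the
    result. In the actual protocol the reconstruction happens inside
    the circuit, so neither party sees it."""
    x = to_signed((server_share + client_share) % P)
    return share(max(x, 0))

s, c = share((-7) % P)
out_s, out_c = garbled_relu(s, c)
print(to_signed((out_s + out_c) % P))  # 0
```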

The implementation leverages a rotated input method for homomorphic matrix-vector multiplication and convolutional layers, maintaining a balance between noise growth and computational efficiency. Furthermore, the switching protocols bridge the gap between homomorphic encrypted values and garbled circuits, ensuring consistent operation throughout the neural network's layers.
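One direction of that switch, from a homomorphic ciphertext to additive shares that a garbled circuit can consume, can be sketched in the clear as follows. This is a simplified illustration under assumed names and modulus; the real protocol operates on ciphertexts and must also manage their noise budget.

```python
import secrets

P = 2**16 + 1  # toy plaintext modulus (hypothetical)

def he_to_shares(y):
    """Sketch of the HE -> garbled-circuit switch for one slot value y
    that the server holds inside a ciphertext. The server homomorphically
    adds a fresh random mask r and sends the masked ciphertext to the
    client, who decrypts it; the server keeps the negated mask. Neither
    output reveals y on its own, yet the shares sum to y mod P."""
    r = secrets.randbelow(P)
    client_share = (y + r) % P   # what the client obtains by decrypting
    server_share = (-r) % P      # what the server retains
    return server_share, client_share

s, c = he_to_shares(1234)
print((s + c) % P)  # 1234
```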

Practical Implications and Future Directions

The practical implications of GAZELLE are significant for any domain requiring secure computation over sensitive data. The robust protection of both model parameters and user inputs aligns well with privacy regulations and ethical standards in data handling.

Future research may explore extending GAZELLE to more complex neural network structures used in domains like facial recognition and larger-scale natural language processing tasks. Another potential development is automating the secure computation pipeline, enabling easier integration into broader machine learning frameworks.

In conclusion, GAZELLE provides substantial advancements in the secure and efficient inference of neural networks. Its contribution lies in the optimization of homomorphic encryption operations and the innovative use of two-party computation, setting a foundation for secure machine learning in numerous applications.