XONN: XNOR-based Oblivious Deep Neural Network Inference (1902.07342v2)

Published 19 Feb 2019 in cs.CR

Abstract: Advancements in deep learning enable cloud servers to provide inference-as-a-service for clients. In this scenario, clients send their raw data to the server to run the deep learning model and send back the results. One standing challenge in this setting is to ensure the privacy of the clients' sensitive data. Oblivious inference is the task of running the neural network on the client's input without disclosing the input or the result to the server. This paper introduces XONN, a novel end-to-end framework based on Yao's Garbled Circuits (GC) protocol, that provides a paradigm shift in the conceptual and practical realization of oblivious inference. In XONN, the costly matrix-multiplication operations of the deep learning model are replaced with XNOR operations that are essentially free in GC. We further provide a novel algorithm that customizes the neural network such that the runtime of the GC protocol is minimized without sacrificing the inference accuracy. We design a user-friendly high-level API for XONN, allowing expression of the deep learning model architecture in an unprecedented level of abstraction. Extensive proof-of-concept evaluation on various neural network architectures demonstrates that XONN outperforms prior art such as Gazelle (USENIX Security'18) by up to 7x, MiniONN (ACM CCS'17) by 93x, and SecureML (IEEE S&P'17) by 37x. State-of-the-art frameworks require one round of interaction between the client and the server for each layer of the neural network, whereas, XONN requires a constant round of interactions for any number of layers in the model. XONN is first to perform oblivious inference on Fitnet architectures with up to 21 layers, suggesting a new level of scalability compared with state-of-the-art. Moreover, we evaluate XONN on four datasets to perform privacy-preserving medical diagnosis.

Citations (265)

Summary

  • The paper introduces XONN, a framework for oblivious deep neural network inference that combines binary neural networks with Yao's Garbled Circuits, replacing costly matrix multiplications with XNOR operations that are essentially free in GC.
  • XONN achieves significant speedups over state-of-the-art secure inference methods, running up to 7x faster than Gazelle.
  • The framework offers a high-level API for usability and constant round complexity, enabling practical privacy-preserving DL inference in sensitive domains such as healthcare.

Oblivious Deep Neural Network Inference with XONN: A Technical Examination

The proliferation of deep learning (DL) models deployed on cloud platforms introduces substantial privacy risks, because inference services require direct access to sensitive client data. The paper "XONN: XNOR-based Oblivious Deep Neural Network Inference" presents a practical framework that combines binary neural networks (BNNs) with Yao's Garbled Circuits (GC) for secure and efficient oblivious inference.

Core Contributions

The primary innovation of the XONN framework lies in the strategic adoption of BNNs within the GC protocol. XONN substitutes the integer matrix multiplications that dominate the cost of traditional cryptographic methods with XNOR operations, which are essentially free to evaluate within GC thanks to the free-XOR optimization. This approach contrasts sharply with methods that rely on Homomorphic Encryption (HE), which, while powerful, carries a significant computational burden.
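To make the substitution concrete, the following plain-Python sketch (no cryptography involved) shows the arithmetic identity XONN relies on: when weights and activations are restricted to {-1, +1} and packed as bits, a dot product reduces to an XNOR followed by a popcount, and XNOR gates cost no cryptographic work inside GC.

```python
# A minimal sketch of the arithmetic trick XONN exploits: with weights and
# activations in {-1, +1} encoded as bits {0, 1}, a vector dot product
# reduces to XNOR + popcount. Inside Yao's Garbled Circuits, XNOR gates are
# free via the free-XOR technique, which makes this substitution cheap.

def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two {-1,+1}^n vectors packed as n-bit integers.

    A set bit encodes +1, a clear bit encodes -1. Matching bits
    contribute +1 and differing bits contribute -1, so:
        dot = popcount(XNOR(a, b)) - popcount(XOR(a, b))
            = 2 * popcount(XNOR(a, b)) - n
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask  # 1 wherever the two bits agree
    return 2 * bin(xnor).count("1") - n

# Example: a = (+1, -1, +1, +1), b = (+1, +1, -1, +1)  ->  dot = 0
assert binary_dot(0b1011, 0b1101, 4) == 0
```

Within the garbled circuit, the non-free gates are concentrated in the popcount adder tree and the comparison that implements the binary activation, which is why this reformulation shrinks the circuit so dramatically.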

XONN further introduces a novel design step that customizes the neural network architecture to minimize the runtime of the GC protocol. This preserves inference accuracy while significantly curbing computational demands, removing a dominant bottleneck of previous secure-computation frameworks.
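The sketch below illustrates the flavor of this customization, not the paper's exact algorithm: binarization costs accuracy, so layer widths are scaled up by a factor s and the cheapest configuration that still meets an accuracy target is kept, with total XNOR count serving as an assumed proxy for GC runtime. The `train_and_eval` callable and the candidate scaling factors are illustrative placeholders.

```python
# Illustrative sketch of cost-aware architecture customization (not the
# paper's exact algorithm). Widening layers recovers accuracy lost to
# binarization; the total XNOR (multiply-accumulate) count stands in for
# garbled-circuit runtime, since GC cost grows with circuit size.

def layer_ops(widths):
    """Proxy for GC cost: total XNOR operations across dense layers."""
    return sum(w_in * w_out for w_in, w_out in zip(widths, widths[1:]))

def pick_scaling(base_widths, train_and_eval, target_acc,
                 candidates=(1.0, 1.5, 2.0, 3.0)):
    """Return (cost, s, widths) for the cheapest network meeting target_acc.

    `train_and_eval(widths) -> accuracy` is an assumed helper that trains a
    binary network with the given layer widths and reports validation
    accuracy. Input and output widths stay fixed; hidden widths scale.
    """
    feasible = []
    for s in candidates:
        widths = ([base_widths[0]]
                  + [int(round(w * s)) for w in base_widths[1:-1]]
                  + [base_widths[-1]])
        if train_and_eval(widths) >= target_acc:
            feasible.append((layer_ops(widths), s, widths))
    return min(feasible) if feasible else None
```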

Strong Numerical Claims and Comparative Analysis

Performance results illustrate XONN's computational efficiency. It surpasses Gazelle, the previous state of the art for secure DL inference, by up to 7x in execution time, and it outperforms MiniONN by 93x and SecureML by 37x. The framework also scales to deep architectures while keeping round complexity constant: unlike frameworks that require one round of client-server interaction per layer, XONN needs a constant number of rounds regardless of depth, a critical property for reducing network latency in real-world settings.

High-Level Implementation and Accessibility

From an engineering perspective, XONN provides a high-level API that significantly improves usability. The infrastructure translates models expressed in popular machine learning libraries such as Keras into the XONN framework, enabling adoption and integration without deep cryptographic expertise on the part of the end user.
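The intended workflow resembles the sketch below: the model is authored with standard Keras API calls, then handed to XONN's compiler for translation into a Boolean circuit. Only the Keras portion is real API; the `xonn.compile` step is a hypothetical stand-in for the paper's translation tooling.

```python
import tensorflow as tf
from tensorflow.keras import layers

# A small model authored in plain Keras; XONN's tooling would binarize the
# weights and replace ReLU with a sign-based binary activation downstream.
model = tf.keras.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(28, 28, 1)),
    layers.Flatten(),
    layers.Dense(10),
])

# Hypothetical conversion step, named here for illustration only:
# circuit = xonn.compile(model)  # emit a Boolean circuit for the GC engine
```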

Implications and Future Developments

The implications of XONN are manifold. Practically, it opens avenues for deploying DL models in privacy-sensitive environments, such as healthcare, where data protection is paramount. Its evaluation on four medical datasets, including breast cancer and malaria diagnosis, demonstrates its applicability to privacy-preserving diagnosis.

Theoretically, XONN's approach of co-designing the DL model with the secure-computation protocol charts a promising trajectory for research that combines efficiency with privacy, especially in settings with constrained computational resources.

In the broader context of AI development, XONN may inspire further exploration of binary model representations and their operational efficiency within cryptographic protocols. Future work could improve the accuracy of BNNs while retaining their cryptographic efficiency and extend support to more complex neural network architectures.

XONN sets a precedent for overcoming the communication and computational hurdles of secure ML, advocating novel integration of model-specific training techniques with cryptographic protocols. It offers a glimpse of a future in which secure interaction with machine learning models is standard practice, balancing utility and privacy.