The Capacity of Symmetric Private Information Retrieval (1606.08828v2)

Published 28 Jun 2016 in cs.IT, cs.CR, cs.IR, and math.IT

Abstract: Private information retrieval (PIR) is the problem of retrieving as efficiently as possible, one out of $K$ messages from $N$ non-communicating replicated databases (each holds all $K$ messages) while keeping the identity of the desired message index a secret from each individual database. Symmetric PIR (SPIR) is a generalization of PIR to include the requirement that beyond the desired message, the user learns nothing about the other $K-1$ messages. The information theoretic capacity of SPIR (equivalently, the reciprocal of minimum download cost) is the maximum number of bits of desired information that can be privately retrieved per bit of downloaded information. We show that the capacity of SPIR is $1-1/N$ regardless of the number of messages $K$, if the databases have access to common randomness (not available to the user) that is independent of the messages, in the amount that is at least $1/(N-1)$ bits per desired message bit, and zero otherwise. Extensions to the capacity region of SPIR and the capacity of finite length SPIR are provided.

Summary

The paper establishes that the SPIR capacity is $1-1/N$, so retrieval efficiency improves as the number of databases $N$ increases.
It shows that the databases must share at least $1/(N-1)$ bits of common randomness per desired message bit; with less than this amount, the capacity is zero.
The work highlights that SPIR capacity is independent of the number of messages, simplifying protocol design in diverse distributed data systems.

Capacity of Symmetric Private Information Retrieval (SPIR)

The paper, authored by Hua Sun and Syed A. Jafar, characterizes the capacity of Symmetric Private Information Retrieval (SPIR). This area of study sits at the intersection of information theory, cryptography, coding, and complexity theory, and addresses the task of retrieving information without compromising privacy.

Overview

Private Information Retrieval (PIR) addresses the problem of retrieving one out of $K$ messages from $N$ non-communicating replicated databases while keeping the index of the desired message hidden from each individual database. In SPIR, the privacy requirements are stricter: not only must the user's request remain confidential, but the user must also learn nothing about the other $K-1$ messages.

The work by Sun and Jafar determines the SPIR capacity, that is, the maximum number of bits of desired information that can be privately retrieved per bit of downloaded information from $N$ databases holding $K$ messages. The authors show that the SPIR capacity is $1-1/N$, irrespective of the number of messages $K$. Achieving this capacity requires common randomness that is shared among the databases, unavailable to the user, and independent of the messages, in the amount of at least $1/(N-1)$ bits per desired message bit. With less common randomness, the capacity drops to zero, which underscores its essential role.
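The capacity statement above can be written as a small piecewise function. The following is a minimal sketch for illustration only: the function names `min_common_randomness` and `spir_capacity` and the parameter `rho` (bits of common randomness per desired message bit) are assumptions introduced here, not notation from the paper.

```python
from fractions import Fraction


def min_common_randomness(n: int) -> Fraction:
    """Threshold 1/(N-1): bits of common randomness per desired message bit."""
    if n < 2:
        raise ValueError("need at least N = 2 databases")
    return Fraction(1, n - 1)


def spir_capacity(n: int, rho: Fraction) -> Fraction:
    """Capacity as stated in the abstract: 1 - 1/N if rho >= 1/(N-1), else 0.

    The number of messages K does not appear, since the capacity
    does not depend on it.
    """
    if rho >= min_common_randomness(n):
        return 1 - Fraction(1, n)
    return Fraction(0)
```

For example, `spir_capacity(4, Fraction(1, 3))` evaluates to `Fraction(3, 4)`, while any `rho` below `Fraction(1, 3)` gives `Fraction(0)` for $N=4$. Exact fractions are used so that the threshold comparison is not affected by floating-point rounding.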

Key Results

The paper offers several significant contributions to the literature:

SPIR Capacity: The capacity is $1-1/N$, which indicates that as the number of databases $N$ increases, the SPIR capacity approaches the ideal value of 1. This suggests improved efficiency in larger distributed systems.
Common Randomness: The critical threshold for the amount of common randomness shared among the databases is $1/(N-1)$ bits per desired message bit. Meeting this threshold is both necessary for any non-zero capacity and sufficient to reach the full capacity $1-1/N$.
Independence from $K$: The capacity does not depend on the number of messages $K$, so the same rate applies regardless of how many messages the databases hold, which simplifies design considerations.
Extensions: The paper extends these findings to the capacity region of SPIR and to the capacity of finite-length SPIR.
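To make the rate and randomness figures concrete, consider a toy instance with $N=2$ databases and one-bit messages. The sketch below uses a standard XOR-based construction chosen for illustration; it is not claimed to be the paper's general scheme, and all function names are hypothetical. The user downloads 2 bits to obtain 1 desired bit (rate $1/2 = 1-1/N$), and the databases share 1 random bit per message bit (which equals $1/(N-1)$ for $N=2$).

```python
import secrets


def user_queries(k: int, theta: int):
    """Build one query per database; each alone is a uniform K-bit vector."""
    q1 = [secrets.randbits(1) for _ in range(k)]
    q2 = q1.copy()
    q2[theta] ^= 1  # the two queries differ only at the desired index
    return q1, q2


def database_answer(messages, query, shared_bit: int) -> int:
    """XOR of the selected message bits, masked by the shared random bit."""
    acc = shared_bit
    for w, q in zip(messages, query):
        acc ^= w & q
    return acc


def retrieve(messages, theta: int) -> int:
    """Simulate one round; `messages` is a list of bits held by both databases."""
    # Common randomness known to both databases but not to the user.
    s = secrets.randbits(1)
    q1, q2 = user_queries(len(messages), theta)
    a1 = database_answer(messages, q1, s)
    a2 = database_answer(messages, q2, s)
    # The mask cancels, leaving only the desired message bit.
    return a1 ^ a2
```

Each database sees a uniformly random query, so it learns nothing about `theta`. Each answer on its own is masked by the shared bit, so the only information the user obtains about the messages is the XOR of the two answers, which is the desired bit.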

Implications and Speculations

The implications of this research are both theoretical and practical. From a theoretical perspective, understanding the bounds and requirements for SPIR contributes to optimizing privacy protocols for distributed databases. Practically, implementing SPIR according to the capacity and randomness findings can enhance security mechanisms for distributed data storage systems, particularly where data sensitivity is paramount.

Furthermore, future research avenues may explore the relationship between SPIR and cryptographic schemes such as oblivious transfer, potentially unlocking more efficient protocols and applications. These investigations could also lead to advancements in complexity theory, exploring the computational implications of efficient SPIR designs.

Overall, the work by Sun and Jafar provides the foundational knowledge required to push forward the boundaries of privacy-preserving data retrieval methods. As the field of AI continues to evolve, understanding how to maintain privacy within distributed systems will remain a key area of focus, with this paper offering crucial insights into how that can be achieved optimally.
