Private Information Retrieval from Coded Databases with Colluding Servers (1611.02062v3)

Published 7 Nov 2016 in cs.IT and math.IT

Abstract: We present a general framework for Private Information Retrieval (PIR) from arbitrary coded databases, that allows one to adjust the rate of the scheme according to the suspected number of colluding servers. If the storage code is a generalized Reed-Solomon code of length n and dimension k, we design PIR schemes which simultaneously protect against t colluding servers and provide PIR rate 1-(k+t-1)/n, for all t between 1 and n-k. This interpolates between the previously studied cases of t=1 and k=1 and asymptotically achieves the known capacity bounds in both of these cases, as the size of the database grows.



Summary

The paper introduces a novel PIR scheme that achieves a retrieval rate of (n-(k+t-1))/n for GRS-coded databases with colluding servers.
The construction pairs the GRS storage code with a second GRS retrieval code and uses their Hadamard (star) product to obtain both correctness and privacy in the presence of collusion.
Numerical examples, including a comparison with n=12 servers and a worked example with t=2, show how close the achieved rates come to the known capacity bounds.

An Analysis of Private Information Retrieval from Coded Databases with Colluding Servers

In the context of distributed storage systems, the paper "Private Information Retrieval from Coded Databases with Colluding Servers" presents a framework for Private Information Retrieval (PIR) when the data is stored in coded form and some of the servers may collude. The goal is to design PIR schemes in which the servers cannot discern which file a user retrieves, even if up to a chosen number of them pool their information.

Overview of Contributions

The framework established in this paper applies to distributed databases that store data in coded form. Specifically, if the storage code is a generalized Reed-Solomon (GRS) code, the authors construct PIR schemes that achieve a retrieval rate of $\frac{n-(k+t-1)}{n}$, where $n$ is the length and $k$ the dimension of the code, and $t$ is the number of colluding servers protected against, for all $t$ between $1$ and $n-k$. This interpolates between the previously studied special cases $t=1$ and $k=1$, and according to the abstract it asymptotically achieves the known capacity bounds in both of these cases as the size of the database grows.
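The rate expression can be tabulated directly. The following is a minimal sketch, not code from the paper: the function names `pir_rate` and `rate_table` are hypothetical, and the only input is the formula $\frac{n-(k+t-1)}{n}$ with the stated range $1 \le t \le n-k$.

```python
from fractions import Fraction


def pir_rate(n: int, k: int, t: int) -> Fraction:
    """Rate (n - (k + t - 1)) / n, as quoted in the abstract.

    n: code length (number of servers), k: code dimension,
    t: number of colluding servers protected against.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    if not 1 <= t <= n - k:
        raise ValueError("the formula is stated for 1 <= t <= n - k")
    return Fraction(n - (k + t - 1), n)


def rate_table(n: int, k: int) -> dict:
    """Map each admissible t to its rate, showing the rate/privacy trade-off."""
    return {t: pir_rate(n, k, t) for t in range(1, n - k + 1)}


if __name__ == "__main__":
    for t, rate in rate_table(5, 2).items():
        print(f"t={t}: rate={rate}")
```

Each additional colluding server protected against costs exactly $1/n$ in rate, and setting $t=1$ or $k=1$ reduces the expression to $1-k/n$ or $1-t/n$ respectively.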

Methodological Details

The methodology is centered on choosing a retrieval code alongside the storage code; the achievable PIR rate is governed by how the two codes combine. The authors use GRS codes because this combination is well behaved: the Hadamard (star) product of two GRS codes with the same evaluation points is again a GRS code, of dimension $\min(k_1+k_2-1, n)$, and the dual of a GRS code is again a GRS code. Minimum distances of the product and of the dual are therefore known exactly, which lets the scheme be tuned to the number of colluding servers.
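The behaviour of the star product can be checked numerically on a toy instance. This is a sketch under stated assumptions rather than the paper's construction: the field GF(13), the evaluation points, and all function names are hypothetical, and the column multipliers of the GRS codes are taken to be all ones (plain Reed-Solomon codes).

```python
P = 13  # assumed prime field size, chosen only for illustration


def grs_generator(alphas, k, p=P):
    """k x n generator matrix whose row i is (alpha_j ** i mod p)."""
    return [[pow(a, i, p) for a in alphas] for i in range(k)]


def rank_mod_p(rows, p=P):
    """Rank of a matrix over GF(p) by Gauss-Jordan elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse, Python 3.8+
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def star_product_span(G1, G2, p=P):
    """Spanning set of C * D: coordinatewise products of all row pairs."""
    return [[(a * b) % p for a, b in zip(r1, r2)] for r1 in G1 for r2 in G2]


n, k, t = 5, 2, 2
alphas = [1, 2, 3, 4, 5]          # distinct evaluation points in GF(13)
C = grs_generator(alphas, k)      # storage code, dimension k
D = grs_generator(alphas, t)      # retrieval code, dimension t
star_dim = rank_mod_p(star_product_span(C, D))

if __name__ == "__main__":
    print("dim(C * D) =", star_dim, "expected k + t - 1 =", k + t - 1)
```

The product of polynomials of degree below $k$ and below $t$ has degree below $k+t-1$, which is why the rank comes out as $k+t-1$ instead of $k \cdot t$.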

The scheme must satisfy two distinct conditions:

Correctness: the dimensions of the storage and retrieval codes are chosen so that the user can recover the requested file from the servers' responses, as stated in the paper's correctness theorem.
Privacy: the star product and the dual of the chosen retrieval code are used to ensure that any set of colluding servers, up to the design size, learns nothing about which file is requested.
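One concrete way to view the privacy condition, assuming the queries are masked by uniformly random codewords of a retrieval code of dimension $t$: if every $t$ columns of that code's generator matrix are linearly independent, then what any $t$ servers see is uniformly distributed and carries no information about the requested file. The sketch below checks this property on a toy Reed-Solomon code; the field GF(13), the evaluation points, and the function names are assumptions made for illustration.

```python
from itertools import combinations

P = 13  # assumed prime field size, chosen only for illustration


def rank_mod_p(rows, p=P):
    """Rank of a matrix over GF(p) by Gauss-Jordan elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse, Python 3.8+
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def every_s_columns_independent(G, s, p=P):
    """True if every choice of s columns of G has rank s over GF(p)."""
    n = len(G[0])
    for cols in combinations(range(n), s):
        sub = [[row[c] for c in cols] for row in G]
        if rank_mod_p(sub, p) < s:
            return False
    return True


t = 2
alphas = [1, 2, 3, 4, 5]  # distinct evaluation points, one per server
D = [[pow(a, i, P) for a in alphas] for i in range(t)]  # retrieval code

safe_against_t = every_s_columns_independent(D, t)
safe_against_t_plus_1 = every_s_columns_independent(D, t + 1)

if __name__ == "__main__":
    print("any", t, "servers see independent columns:", safe_against_t)
    print("any", t + 1, "servers see independent columns:", safe_against_t_plus_1)
```

The second check fails by construction, since a code of dimension $t$ cannot have $t+1$ independent columns; this matches the intuition that a retrieval code of dimension $t$ protects against $t$ colluding servers and no more.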

Numerical Results and Implications

The paper illustrates its theoretical claims with specific parameter choices. One comparison uses $n=12$ servers and $m=8$ files and shows how close the proposed PIR schemes come to the known capacity bounds across storage code rates. An illustrative example takes $t=2$, $k=2$, and $n=5$, making the resulting rate and privacy level concrete.
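For the illustrative parameters, substituting into the rate expression from the abstract gives the following value (a plain arithmetic check, not a figure copied from the paper):

$$
\frac{n-(k+t-1)}{n} = \frac{5-(2+2-1)}{5} = \frac{2}{5}.
$$

Under that formula, the user downloads five symbols for every two symbols of the desired file, while any $t=2$ servers may collude without learning which file was requested.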

Practical and Theoretical Implications

The implications of this research are multifaceted:

Practical: By reducing the storage and communication overheads and providing a systematic method of privacy-preserving retrieval, the solutions are particularly relevant for large-scale data repositories where server collusion is a realistic threat.
Theoretical: The use of GRS codes in a PIR context could potentially be leveraged in other cryptographic or information-theoretic settings where similar trade-offs between efficiency and privacy are vital.

Future Directions

While the paper focuses on the foundational case of retrieving a single file, future research could explore multi-file retrieval and tighten the capacity bounds that the numerical examples only approach. The capacity of coded storage with collusion remains conjectural in broader settings, and resolving it would likely require further study of the combinatorial properties of codes under such conditions.

Conclusion

This work serves as a pivotal contribution to the field of privacy-preserving data retrieval from coded databases, particularly under the realistic constraint of a fixed number of colluding servers. The judicious design of the PIR schemes utilizing the structures of GRS codes forms a bridge between theoretical interest and practical necessity, leaving significant room for future refinement and exploration.
