
Universally Utility-Maximizing Privacy Mechanisms (0811.2841v3)

Published 18 Nov 2008 in cs.DB and cs.GT

Abstract: A mechanism for releasing information about a statistical database with sensitive data must resolve a trade-off between utility and privacy. Privacy can be rigorously quantified using the framework of differential privacy, which requires that a mechanism's output distribution is nearly the same whether or not a given database row is included or excluded. The goal of this paper is strong and general utility guarantees, subject to differential privacy. We pursue mechanisms that guarantee near-optimal utility to every potential user, independent of its side information (modeled as a prior distribution over query results) and preferences (modeled via a loss function). Our main result is: for each fixed count query and differential privacy level, there is a geometric mechanism $M^*$ -- a discrete variant of the simple and well-studied Laplace mechanism -- that is simultaneously expected loss-minimizing for every possible user, subject to the differential privacy constraint. This is an extremely strong utility guarantee: every potential user $u$, no matter what its side information and preferences, derives as much utility from $M^*$ as from interacting with a differentially private mechanism $M_u$ that is optimally tailored to $u$.

Authors (3)
  1. Arpita Ghosh (24 papers)
  2. Tim Roughgarden (80 papers)
  3. Mukund Sundararajan (27 papers)
Citations (524)

Summary

Universally Utility-Maximizing Privacy Mechanisms

This paper examines mechanisms for releasing information from statistical databases with sensitive data, navigating the trade-off between utility and privacy. The central question is how to design mechanisms that maximize utility for their users while satisfying the formal constraint of differential privacy.

Main Contributions

The paper introduces the geometric mechanism as a universally utility-maximizing privacy mechanism for count queries. The geometric mechanism is a discrete analogue of the well-known Laplace mechanism: it releases the true query result perturbed by noise drawn from a two-sided geometric distribution, which places probability proportional to $\alpha^{|z|}$ on each integer offset $z$, for a parameter $\alpha \in (0,1)$ set by the privacy level. Because adding or removing one database row changes a count by at most one, the probability of any given output changes by at most a constant factor, which is precisely the differential privacy requirement.
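
A minimal sketch of such a release in Python (the function name geometric_mechanism and the use of NumPy are our own; the noise distribution is the paper's two-sided geometric, here parameterized by $\alpha = e^{-\epsilon}$):

```python
import numpy as np

def geometric_mechanism(true_count: int, epsilon: float, rng=None) -> int:
    """Release a count perturbed by two-sided geometric noise.

    The noise Z satisfies P(Z = z) proportional to alpha^|z| with
    alpha = exp(-epsilon), so shifting the true count by 1 (one row
    added or removed) changes each output's probability by at most
    a factor of exp(epsilon).
    """
    rng = rng or np.random.default_rng()
    alpha = np.exp(-epsilon)
    # The difference of two i.i.d. geometric variables on {0, 1, 2, ...}
    # with success probability 1 - alpha is two-sided geometric.
    x = rng.geometric(1 - alpha) - 1  # numpy's geometric has support {1, 2, ...}
    y = rng.geometric(1 - alpha) - 1
    return true_count + x - y

noisy = geometric_mechanism(true_count=42, epsilon=np.log(2))
```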

A key result is the proof that this mechanism is simultaneously optimal: for any differential privacy level, every user who appropriately post-processes the geometric mechanism's output derives as much utility as from a differentially private mechanism optimally tailored to that user's prior distribution and loss function. This finding is notable because it means a single privacy-preserving mechanism can serve diverse users without customization.
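
Stated symbolically (the notation below is ours, following the abstract): for every user $u$ with prior $p_u$ over query results and loss function $\ell_u$, and every differentially private mechanism $M_u$ tailored to $u$,

$$
\min_{Y} \; \mathbb{E}_{r \sim p_u,\, z \sim M^*(r)}\left[\ell_u\big(r, Y(z)\big)\right]
\;\le\;
\mathbb{E}_{r \sim p_u,\, z \sim M_u(r)}\left[\ell_u(r, z)\right],
$$

where $Y$ ranges over remaps (post-processings) of $M^*$'s output. Since post-processing cannot weaken differential privacy, the left-hand side is itself realized by a differentially private mechanism.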

Methodology

The paper formulates the optimal mechanism design problem as a linear program (a minimal sketch of this formulation follows the list below). The utility model is general, accounting for:

  • Loss Functions: user preferences are modeled by loss functions that are otherwise arbitrary but must be nonnegative and nondecreasing in the distance between the actual and released query results.
  • Priors: the user's side information is modeled as a prior distribution over query results, representing the user's beliefs before observing the released output.
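
As a concrete illustration, here is a minimal linear-programming sketch for a single count query (the variable layout and the use of scipy.optimize.linprog are our own; the paper's LP is over the mechanism's conditional output probabilities, subject to privacy and stochasticity constraints):

```python
import numpy as np
from scipy.optimize import linprog

def optimal_mechanism(prior, loss, epsilon):
    """Solve for a user-optimal eps-DP mechanism for one count query.

    prior[i] : user's prior probability that the true count is i
    loss(d)  : nonnegative, nondecreasing loss in d = |true - output|
    Returns x with x[i, r] = P(output r | true count i).
    """
    n = len(prior)                   # counts and outputs range over 0..n-1
    e_eps = np.exp(epsilon)
    idx = lambda i, r: i * n + r     # flatten (i, r) into one variable index

    # Objective: expected loss  sum_i prior[i] * sum_r x[i,r] * loss(|i - r|)
    c = np.array([prior[i] * loss(abs(i - r))
                  for i in range(n) for r in range(n)])

    # Equality constraints: each row of the mechanism is a distribution.
    A_eq = np.zeros((n, n * n))
    for i in range(n):
        A_eq[i, idx(i, 0):idx(i, n - 1) + 1] = 1.0
    b_eq = np.ones(n)

    # Privacy constraints for neighboring counts i, i+1 (sensitivity 1):
    #   x[i, r] <= e^eps * x[i+1, r]  and  x[i+1, r] <= e^eps * x[i, r]
    rows = []
    for i in range(n - 1):
        for r in range(n):
            a = np.zeros(n * n); a[idx(i, r)] = 1.0; a[idx(i + 1, r)] = -e_eps
            rows.append(a)
            a = np.zeros(n * n); a[idx(i + 1, r)] = 1.0; a[idx(i, r)] = -e_eps
            rows.append(a)
    A_ub, b_ub = np.array(rows), np.zeros(len(rows))

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
    return res.x.reshape(n, n)

x = optimal_mechanism(prior=[0.25, 0.25, 0.25, 0.25],
                      loss=lambda d: d, epsilon=0.5)
```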

Two ingredients underpin this result:

  1. Optimal Remappings: users can post-process (remap) the mechanism's outputs to maximize their own utility, which is what makes a single mechanism universally applicable; a brief sketch of such post-processing appears after this list.
  2. Proof Structure: the paper shows in detail that deterministic remaps of the geometric mechanism's range suffice to recover a loss-minimizing mechanism for every potential user profile.
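
A brief sketch of such post-processing (the function below is our own illustration): given the user's prior and loss, the utility-maximizing remap picks, for each observed output, the estimate minimizing posterior expected loss. The normalizing constant of the two-sided geometric cancels in the posterior, so only the $\alpha^{|z-i|}$ factors matter.

```python
import numpy as np

def bayes_optimal_remap(prior, loss, epsilon, z):
    """Deterministic Bayes-optimal remap of a geometric-mechanism output z.

    prior[i] : user's prior over true counts 0..n-1
    loss(d)  : nonnegative, nondecreasing loss in d = |true - estimate|
    z        : the observed (possibly out-of-range) noisy count
    """
    prior = np.asarray(prior, dtype=float)
    n = len(prior)
    alpha = np.exp(-epsilon)
    # Likelihood of observing z under each true count i is
    # proportional to alpha^|z - i|; the constant cancels below.
    likelihood = np.array([alpha ** abs(z - i) for i in range(n)])
    posterior = prior * likelihood
    posterior /= posterior.sum()
    # Pick the estimate r minimizing posterior expected loss.
    expected = [np.dot(posterior, [loss(abs(i - r)) for i in range(n)])
                for r in range(n)]
    return int(np.argmin(expected))

estimate = bayes_optimal_remap(prior=[0.1, 0.2, 0.4, 0.3],
                               loss=lambda d: d ** 2, epsilon=0.5, z=5)
```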

Implications and Future Directions

Practically, this work gives system designers a simple, ready-made tool: privacy can be enforced without tailoring mechanisms to specific users, streamlining deployment across applications such as public data releases by government agencies or private data handling by internet companies.

Theoretically, this research contributes to our understanding of differential privacy by presenting a mechanism that is both a theoretical construct and a practical tool, challenging the notion that privacy mechanisms need to be customized to user-specific data distributions and loss profiles.

Future investigations could extend the framework to query types beyond counts, or to interactive settings where multiple queries are posed sequentially. Extensions to alternative privacy definitions or to richer user models are another intriguing line of inquiry.

In conclusion, this paper meets the challenge of providing strong utility guarantees under privacy constraints, showing that the geometric mechanism delivers universally optimal performance while satisfying differential privacy. This is a significant advance in privacy-preserving data analysis.