Differentially Private Projected Histograms of Multi-Attribute Data for Classification

Published 22 Apr 2015 in cs.CR | (1504.05997v1)

Abstract: In this paper, we tackle the problem of constructing a differentially private synopsis for classification analysis. Several state-of-the-art methods follow the structure of existing classification algorithms and are all iterative, which is suboptimal because they make locally optimal choices and divide the privacy budget too finely among many sequentially composed steps. Instead, we propose PrivPfC, a new differentially private method for releasing data for classification. The key idea is to privately select an optimal partition of the underlying dataset in one step using the given privacy budget. Given a dataset and a privacy budget, PrivPfC constructs a pool of candidate grids in which the number of cells of each grid is under a data-aware and privacy-budget-aware threshold. PrivPfC then selects an optimal grid via the exponential mechanism, using a novel quality function that minimizes the expected number of records misclassified by a histogram classifier built from the published grid. Finally, PrivPfC injects noise into each cell of the selected grid and releases the noisy grid as the private synopsis of the data. If the size of the candidate grid pool is larger than the processing capability threshold set by the data curator, we add a step at the beginning of PrivPfC to privately prune the set of attributes. We introduce a modified $\chi^2$ quality function with low sensitivity and use it to evaluate an attribute's relevance to the classification label variable. Through extensive experiments on real datasets, we demonstrate PrivPfC's superiority over the state-of-the-art methods.
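
The select-then-perturb pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not spell out the PrivPfC quality function, so the sketch substitutes a simpler stand-in: the negated number of records misclassified by a majority-vote histogram classifier, which changes by at most 1 when one record is added or removed. The even split of the privacy budget between selection and noise, the grid representation (a cell count plus an assignment function), and all function names are likewise assumptions made for illustration.

```python
import math
import random
from collections import Counter


def exponential_mechanism(scores, epsilon, sensitivity, rng):
    """Sample an index with probability proportional to
    exp(epsilon * score / (2 * sensitivity))."""
    top = max(scores)  # shift scores so the largest exponent is 0
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two unit exponentials, scaled."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def cell_counts(records, assign):
    """Count records per (cell, label); `assign` maps attributes to a cell id."""
    return Counter((assign(x), y) for x, y in records)


def misclassified(counts):
    """Records a majority-vote histogram classifier gets wrong (stand-in quality)."""
    per_cell = {}
    for (cell, _label), n in counts.items():
        per_cell.setdefault(cell, []).append(n)
    return sum(sum(ns) - max(ns) for ns in per_cell.values())


def release(records, grids, labels, epsilon, rng):
    """Select one grid privately, then publish noisy counts for all of its cells.

    Each grid is a hypothetical (num_cells, assign) pair.
    """
    eps_select = eps_noise = epsilon / 2.0  # assumed even budget split
    scores = [-misclassified(cell_counts(records, assign)) for _n, assign in grids]
    chosen = exponential_mechanism(scores, eps_select, 1.0, rng)
    num_cells, assign = grids[chosen]
    counts = cell_counts(records, assign)
    # Noise is added to every (cell, label) pair, including empty ones,
    # so that the set of published cells does not depend on the data.
    noisy = {
        (cell, label): counts.get((cell, label), 0) + laplace_noise(1.0 / eps_noise, rng)
        for cell in range(num_cells)
        for label in labels
    }
    return chosen, noisy


# Toy usage: one numeric attribute, two candidate grids.
records = [((25,), 0), ((30,), 0), ((50,), 1), ((60,), 1)]
grid_single = (1, lambda x: 0)
grid_split = (2, lambda x: 0 if x[0] < 40 else 1)
chosen, noisy = release(records, [grid_single, grid_split], [0, 1], 1.0, random.Random(0))
```

Because the grid is chosen in a single exponential-mechanism draw, the budget is divided only once, which is the contrast with iterative methods that the abstract emphasizes.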

Authors (3)
