Linear Queries Estimation with Local Differential Privacy (1810.02810v1)

Published 5 Oct 2018 in cs.LG and stat.ML

Abstract: We study the problem of estimating a set of $d$ linear queries with respect to some unknown distribution $\mathbf{p}$ over a domain $\mathcal{J}=[J]$ based on a sensitive data set of $n$ individuals under the constraint of local differential privacy. This problem subsumes a wide range of estimation tasks, e.g., distribution estimation and $d$-dimensional mean estimation. We provide new algorithms for both the offline (non-adaptive) and adaptive versions of this problem. In the offline setting, the set of queries is fixed before the algorithm starts. In the regime where $n\lesssim d^2/\log(J)$, our algorithms attain $L_2$ estimation error that is independent of $d$ and is tight up to a factor of $\tilde{O}\left(\log^{1/4}(J)\right)$. For the special case of distribution estimation, we show that projecting the output estimate of an algorithm due to [Acharya et al. 2018] onto the probability simplex yields an $L_2$ error that depends only sub-logarithmically on $J$ in the regime where $n\lesssim J^2/\log(J)$. These results show the possibility of accurate estimation of linear queries in high-dimensional settings under the $L_2$ error criterion. In the adaptive setting, the queries are generated over $d$ rounds, one query at a time. In each round, a query can be chosen adaptively based on the full history of previous queries and answers. We give an algorithm for this problem with optimal $L_{\infty}$ estimation error (the worst error in the estimated values for the queries w.r.t. the data distribution). Our bound matches a lower bound on the $L_{\infty}$ error for the offline version of this problem [Duchi et al. 2013].
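The projection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the projection is computed, so the code below uses the standard sort-based Euclidean ($L_2$) projection onto the probability simplex as an assumption; the function name `project_to_simplex` and the example vector are hypothetical, and the input stands in for any raw estimate whose entries may be negative or fail to sum to 1.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_i >= 0, sum(p) = 1}.

    Standard sort-based method (an assumption; the abstract does not
    specify which projection routine is used).
    """
    u = sorted(v, reverse=True)  # entries in decreasing order
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        candidate = (cumulative - 1.0) / i
        # Keep the shift from the largest prefix whose smallest
        # entry stays positive after shifting.
        if u_i - candidate > 0:
            theta = candidate
    # Shift every coordinate by theta and clip negatives to zero.
    return [max(x - theta, 0.0) for x in v]


# Hypothetical noisy estimate: one negative entry, sum equal to 1.
noisy_estimate = [0.6, 0.5, -0.1]
projected = project_to_simplex(noisy_estimate)
```

The result is a valid probability vector (non-negative entries summing to 1) that is the closest such vector to the input in $L_2$ distance. Because the simplex is convex and contains the true distribution, this projection cannot increase the $L_2$ distance to it.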

Citations (40)
