
Parity Queries for Binary Classification (1809.00901v2)

Published 4 Sep 2018 in cs.IT, cs.HC, cs.LG, and math.IT

Abstract: Consider a query-based data acquisition problem that aims to recover the values of $k$ binary variables from parity (XOR) measurements of chosen subsets of the variables. Assume the response model where only a randomly selected subset of the measurements is received. We propose a method for designing a sequence of queries so that the variables can be identified with high probability using as few ($n$) measurements as possible. We define the query difficulty $\bar{d}$ as the average size of the query subsets and the sample complexity $n$ as the minimum number of measurements required to attain a given recovery accuracy. We obtain fundamental trade-offs between recovery accuracy, query difficulty, and sample complexity. In particular, the necessary and sufficient sample complexity required for recovering all $k$ variables with high probability is $n = c_0 \max\{k, (k \log k)/\bar{d}\}$ and the sample complexity for recovering a fixed proportion $(1-\delta)k$ of the variables for $\delta=o(1)$ is $n = c_1\max\{k, (k \log(1/\delta))/\bar{d}\}$, where $c_0, c_1>0$.
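
To make the measurement model in the abstract concrete, the following is a minimal Python sketch, not the paper's query design. It assumes noise-free answers, queries drawn as uniformly random subsets of a fixed size `d` (standing in for the average difficulty $\bar{d}$), responses erased independently with probability `erase_prob`, and recovery by Gaussian elimination over GF(2). The names `parity_query`, `collect_responses`, and `solve_gf2` are hypothetical helpers introduced only for illustration.

```python
import random

def parity_query(x, subset):
    """Return the XOR (parity) of the bits of x indexed by subset."""
    return sum(x[i] for i in subset) % 2

def collect_responses(x, queries, erase_prob, rng):
    """Keep each (subset, answer) pair independently with
    probability 1 - erase_prob (assumed erasure model)."""
    return [(s, parity_query(x, s)) for s in queries
            if rng.random() >= erase_prob]

def solve_gf2(responses, k):
    """Recover k bits from parity equations by Gaussian elimination
    over GF(2). Returns a list of bits, or None if the received
    equations do not have full rank k."""
    basis = {}  # highest set bit of the row -> (bitmask, right-hand side)
    for subset, answer in responses:
        mask = 0
        for i in set(subset):
            mask |= 1 << i
        rhs = answer & 1
        # Reduce the new row against the rows already in the basis.
        while mask:
            top = mask.bit_length() - 1
            if top not in basis:
                basis[top] = (mask, rhs)
                break
            bmask, brhs = basis[top]
            mask ^= bmask
            rhs ^= brhs
    if len(basis) < k:
        return None  # under-determined: some variable is not identifiable
    # Back-substitute from the lowest pivot upward.
    x = [0] * k
    for top in range(k):
        mask, rhs = basis[top]
        val = rhs
        for j in range(top):
            if (mask >> j) & 1:
                val ^= x[j]
        x[top] = val
    return x

if __name__ == "__main__":
    rng = random.Random(0)
    k, d, n, erase_prob = 20, 5, 80, 0.3  # illustrative values only
    x = [rng.randint(0, 1) for _ in range(k)]
    queries = [rng.sample(range(k), d) for _ in range(n)]
    received = collect_responses(x, queries, erase_prob, rng)
    estimate = solve_gf2(received, k)
    print("received:", len(received), "recovered:", estimate == x)
```

Varying `d` and `n` in the demo gives an informal feel for the trade-off between query difficulty and the number of measurements needed; the random fixed-size subsets used here are an assumption and need not achieve the bounds stated in the abstract.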
