
On a Near-Optimal & Efficient Algorithm for the Sparse Pooled Data Problem (2312.14588v1)

Published 22 Dec 2023 in math.PR, cs.DM, cs.IT, math.IT, and stat.ML

Abstract: The pooled data problem asks to identify the unknown labels of a set of items from condensed measurements. More precisely, given $n$ items, assume that each item has a label in $\{0,1,\ldots,d\}$, encoded via the ground truth $\boldsymbol{\sigma}$. We call the pooled data problem sparse if the number of non-zero entries of $\boldsymbol{\sigma}$ scales as $k \sim n^{\theta}$ for $\theta \in (0,1)$. The information that is revealed about $\boldsymbol{\sigma}$ comes from pooled measurements, each indicating how many items of each label are contained in the pool. The most basic question is to design a pooling scheme that uses as few pools as possible while reconstructing $\boldsymbol{\sigma}$ with high probability. Variants of the problem and its combinatorial ramifications have been studied for at least 35 years. However, the study of the modern question of efficient inference of the labels has suggested a statistical-to-computational gap of order $\log n$ in the minimum number of pools needed for theoretically possible versus efficient inference. In this article, we resolve the question of whether this $\log n$-gap is artificial or of a fundamental nature by designing an efficient algorithm, called \algoname, based upon a novel pooling scheme that uses a number of pools very close to the information-theoretic threshold.
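
To make the measurement model in the abstract concrete, here is a minimal Python sketch of a sparse ground truth and a few pooled measurements. It is only an illustration under stated assumptions: the function names, the parameter values, and the choice of putting each item into a pool independently with probability 1/2 are hypothetical and are not the pooling scheme or the algorithm of the paper.

```python
import random


def sample_ground_truth(n, d, theta, rng):
    """Return a label vector in {0, ..., d}^n with round(n**theta) non-zero entries."""
    k = max(1, round(n ** theta))
    sigma = [0] * n
    # Choose k distinct positions and give each a non-zero label.
    for i in rng.sample(range(n), k):
        sigma[i] = rng.randint(1, d)
    return sigma


def random_pool(n, rng, p=0.5):
    """Assumed toy design: include each item independently with probability p."""
    return [i for i in range(n) if rng.random() < p]


def measure(pool, sigma, d):
    """Pooled measurement: how many items of each label 0..d the pool contains."""
    counts = [0] * (d + 1)
    for i in pool:
        counts[sigma[i]] += 1
    return counts


rng = random.Random(0)
n, d, theta = 100, 2, 0.5  # illustrative values, not taken from the paper
sigma = sample_ground_truth(n, d, theta, rng)
pools = [random_pool(n, rng) for _ in range(5)]
results = [measure(pool, sigma, d) for pool in pools]

for pool, counts in zip(pools, results):
    print(len(pool), counts)
```

Each printed line shows a pool size followed by the per-label counts, which always sum to the pool size. The inference task described in the abstract is to recover `sigma` from the pools and their counts using as few pools as possible.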
