
Making Differential Privacy Work for Census Data Users (2305.07208v2)

Published 12 May 2023 in cs.CY

Abstract: The U.S. Census Bureau collects and publishes detailed demographic data about Americans, which are heavily used by researchers and policymakers. The Bureau has recently adopted the framework of differential privacy in an effort to improve the confidentiality of individual census responses. A key output of this privacy protection system is the Noisy Measurement File (NMF), which is produced by adding random noise to tabulated statistics. The NMF is critical to understanding any errors introduced in the data and to performing valid statistical inference on published census data. Unfortunately, the current release format of the NMF is difficult to access and work with. We describe the process we use to transform the NMF into a usable format, and provide recommendations to the Bureau for how to release future versions of the NMF. These changes are essential for ensuring transparency of privacy measures and reproducibility of scientific research built on census data.
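
To make the abstract's description of the NMF concrete, the following is a minimal sketch of what "adding random noise to tabulated statistics" can look like, and why knowing the noise scale matters for inference. It is an illustration only, not the Bureau's implementation: the function names `noisy_measurements` and `interval`, the example counts, and the value of `sigma` are all hypothetical, and a rounded continuous Gaussian stands in here for whatever integer-valued noise distribution the production system uses.

```python
import random


def noisy_measurements(counts, sigma, seed=0):
    """Add independent, zero-mean integer noise to each tabulated count.

    Assumption: a rounded continuous Gaussian is used as a simple stand-in
    for the noise distribution; it is not the production mechanism.
    """
    rng = random.Random(seed)
    return [c + round(rng.gauss(0.0, sigma)) for c in counts]


def interval(noisy_value, sigma, z=1.96):
    """Approximate 95% interval for the true count given one noisy value.

    This works only because the noise is zero-mean with a known scale,
    which is the kind of information the NMF is meant to expose.
    """
    half_width = z * sigma
    return (noisy_value - half_width, noisy_value + half_width)


# Hypothetical tabulated statistics (e.g., counts for four small areas).
true_counts = [120, 45, 0, 7]
sigma = 3.0  # hypothetical noise scale

noisy = noisy_measurements(true_counts, sigma, seed=42)
for true_value, noisy_value in zip(true_counts, noisy):
    low, high = interval(noisy_value, sigma)
    print(f"true={true_value:4d} noisy={noisy_value:4d} interval=({low:.1f}, {high:.1f})")
```

Note that noisy values can be negative or otherwise implausible for small counts; that is expected of raw noisy measurements, and it is one reason a user needs the noise scale alongside the values in order to reason about error.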
