Assumption-Lean and Data-Adaptive Post-Prediction Inference (2311.14220v4)

Published 23 Nov 2023 in stat.ME, cs.LG, and stat.ML

Abstract: A primary challenge facing modern scientific research is the limited availability of gold-standard data, which can be costly, labor-intensive, or invasive to obtain. With the rapid development of machine learning (ML), scientists can now employ ML algorithms to predict gold-standard outcomes from variables that are easier to obtain. However, these predicted outcomes are often used directly in subsequent statistical analyses, ignoring the imprecision and heterogeneity introduced by the prediction procedure. This will likely result in false positive findings and invalid scientific conclusions. In this work, we introduce PoSt-Prediction Adaptive inference (PSPA), which allows valid and powerful inference based on ML-predicted data. Its "assumption-lean" property guarantees reliable statistical inference without assumptions on the ML prediction. Its "data-adaptive" feature guarantees an efficiency gain over existing methods, regardless of the accuracy of the ML prediction. We demonstrate the statistical superiority and broad applicability of our method through simulations and real-data applications.
