Multicalibration for Modeling Censored Survival Data with Universal Adaptability
Abstract: Traditional statistical and machine learning methods typically assume that the training and test data follow the same distribution. However, this assumption is frequently violated in real-world applications, where the training data in the source domain may under-represent specific subpopulations in the test data of the target domain. This paper addresses target-independent learning under covariate shift, focusing on multicalibration for survival probability and restricted mean survival time. A black-box post-processing boosting algorithm specifically designed for censored survival data is introduced. By leveraging pseudo-observations, our method produces a multicalibrated predictor that is competitive with inverse propensity score weighting in predicting the survival outcome in an unlabeled target domain, ensuring not only overall accuracy but also fairness across diverse subpopulations. Our theoretical analysis of pseudo-observations builds upon the functional delta method and the $p$-variational norm. The algorithm's sample complexity, convergence properties, and multicalibration guarantees for post-processed predictors are provided. Our results establish a fundamental connection between multicalibration and universal adaptability, demonstrating that our calibrated function is comparable to, or outperforms, the inverse propensity score weighting estimator. Extensive numerical simulations and a real-world case study on cardiovascular disease risk prediction using two large prospective cohort studies validate the effectiveness of our approach.
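As an illustration of the pseudo-observation idea mentioned in the abstract, the sketch below computes standard jackknife pseudo-observations for the survival probability S(t) = P(T > t) from the Kaplan-Meier estimator: the i-th value is n times the full-sample estimate minus (n - 1) times the leave-one-out estimate. This is a minimal sketch of the textbook construction, not the paper's algorithm. It assumes censoring is independent of the covariates, and the function names `km_survival` and `pseudo_observations` are hypothetical, chosen only for this example.

```python
import numpy as np


def km_survival(time, event, t):
    """Kaplan-Meier estimate of S(t) = P(T > t) from right-censored data."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    # Multiply over the distinct observed event times up to and including t.
    for u in np.unique(time[event & (time <= t)]):
        at_risk = np.sum(time >= u)            # subjects still under observation at u
        deaths = np.sum((time == u) & event)   # events occurring exactly at u
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_observations(time, event, t):
    """Jackknife pseudo-observations for S(t): n * full - (n - 1) * leave-one-out."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    n = len(time)
    full = km_survival(time, event, t)
    keep = np.ones(n, dtype=bool)
    pseudo = np.empty(n)
    for i in range(n):
        keep[i] = False                        # drop subject i
        loo = km_survival(time[keep], event[keep], t)
        pseudo[i] = n * full - (n - 1) * loo
        keep[i] = True                         # restore subject i
    return pseudo


# Hypothetical toy data: follow-up times and event indicators (1 = event, 0 = censored).
toy_time = [2.0, 3.0, 3.0, 5.0, 7.0, 8.0]
toy_event = [1, 0, 1, 1, 0, 1]
toy_pseudo = pseudo_observations(toy_time, toy_event, t=4.0)
```

When no observation is censored, the Kaplan-Meier estimator reduces to the empirical survival function and each pseudo-observation equals the indicator that the subject survived past t. With censoring, the pseudo-observations act as complete-data stand-ins for that indicator and can serve as regression targets for a post-processing step.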