Sparse joint shift in multinomial classification (2303.16971v3)

Published 29 Mar 2023 in stat.ML, cs.LG, math.ST, and stat.TH

Abstract: Sparse joint shift (SJS) was recently proposed as a tractable model for general dataset shift which may cause changes to the marginal distributions of features and labels as well as the posterior probabilities and the class-conditional feature distributions. Fitting SJS for a target dataset without label observations may produce valid predictions of labels and estimates of class prior probabilities. We present new results on the transmission of SJS from sets of features to larger sets of features, a conditional correction formula for the class posterior probabilities under the target distribution, identifiability of SJS, and the relationship between SJS and covariate shift. In addition, we point out inconsistencies in the algorithms which were proposed for estimating the characteristics of SJS, as they could hamper the search for optimal solutions, and suggest potential improvements.
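
For orientation only (the notation below is chosen for this sketch and is not taken from the paper): writing $p_s$ and $p_t$ for the source and target distributions, SJS with respect to a small feature subset $x_S$ is usually read as the assumption that the joint law of $(x_S, y)$ may shift arbitrarily while the conditional law of the remaining features $x_{\bar S}$ given $(x_S, y)$ stays fixed. With the empty subset $S = \emptyset$ this collapses to ordinary label shift (prior probability shift), for which the classical posterior adjustment of Saerens et al. applies:

$$
p_t\!\left(x_{\bar S} \mid x_S, y\right) \;=\; p_s\!\left(x_{\bar S} \mid x_S, y\right)
\qquad\text{(SJS invariance)},
$$
$$
p_t(y \mid x) \;\propto\; p_s(y \mid x)\,\frac{p_t(y)}{p_s(y)}
\qquad\text{(label-shift special case, } S = \emptyset\text{)}.
$$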
