
Towards Optimization and Model Selection for Domain Generalization: A Mixup-guided Solution (2209.00652v2)

Published 1 Sep 2022 in cs.LG

Abstract: Distribution shifts between training and test data typically undermine model performance. In recent years, much work has focused on domain generalization (DG), where distribution shifts exist and the target data are unseen. Despite progress in algorithm design, two foundational factors have long been ignored: 1) the optimization of regularization-based objectives, and 2) model selection for DG, since no knowledge of the target domain is available. In this paper, we propose Mixup-guided optimization and selection techniques for DG. For optimization, we use an adapted Mixup to generate an out-of-distribution dataset that guides the preference direction, and we optimize with Pareto optimization. For model selection, we generate a validation dataset with a smaller distance to the target distribution, so that it better represents the target data. We also present theoretical insights behind our proposals. Comprehensive experiments demonstrate that our optimization and selection techniques can substantially improve the performance of existing domain generalization algorithms and even achieve new state-of-the-art results.
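The method builds on the standard Mixup operation of Zhang et al. (2018), which the paper adapts to generate out-of-distribution data. The abstract does not specify the adapted variant or the Pareto-optimization details, so the sketch below shows only vanilla Mixup: a convex combination of two examples and their labels, with the mixing weight drawn from a Beta distribution.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Vanilla Mixup: blend two examples and their (one-hot) labels
    with a weight lam ~ Beta(alpha, alpha)."""
    if rng is None:
        rng = np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)          # mixing coefficient in [0, 1]
    x = lam * x1 + (1 - lam) * x2         # interpolated input
    y = lam * y1 + (1 - lam) * y2         # interpolated soft label
    return x, y, lam

# Illustration: mix one sample from each of two source domains.
xa, ya = np.array([0.0, 0.0]), np.array([1.0, 0.0])
xb, yb = np.array([2.0, 2.0]), np.array([0.0, 1.0])
x_mix, y_mix, lam = mixup(xa, ya, xb, yb)
```

For DG, the paper mixes across source domains so the synthetic points fall off the training distribution; here the domain structure is only implied by taking `xa` and `xb` from different sources.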

