IM-Context: In-Context Learning for Imbalanced Regression Tasks (2405.18202v2)
Abstract: Regression models often fail to generalize effectively in regions with highly imbalanced label distributions. Previous methods for deep imbalanced regression rely on gradient-based weight updates, which tend to overfit in underrepresented regions. This paper proposes a paradigm shift towards in-context learning as an effective alternative to conventional in-weight learning, particularly for imbalanced regression. In-context learning refers to a model's ability to condition on a prompt sequence, composed of in-context samples (input-label pairs) together with a new query input, and generate predictions without any parameter updates. We study the impact of the prompt sequence on model performance from both theoretical and empirical perspectives, and emphasize the importance of localized context in reducing bias within highly imbalanced regions. Empirical evaluations across a variety of real-world datasets demonstrate that in-context learning substantially outperforms existing in-weight learning methods under high levels of imbalance.
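
To make the localized-context idea concrete, below is a minimal sketch, not the paper's implementation: `build_localized_prompt` retrieves the k nearest input-label pairs to a query and uses them as the localized prompt, and a simple distance-weighted average stands in for the frozen pretrained in-context model that would normally consume that prompt. The function names, the stand-in predictor, and the toy data are all illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def build_localized_prompt(X_train, y_train, x_query, k=32):
    # Retrieve the k training pairs closest to the query in input space;
    # these (input, label) pairs form the localized in-context prompt.
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    _, idx = nn.kneighbors(x_query.reshape(1, -1))
    return X_train[idx[0]], y_train[idx[0]]

def in_context_predict(X_ctx, y_ctx, x_query, bandwidth=1.0):
    # Stand-in for a frozen in-context regressor: a distance-weighted
    # average over the prompt. No parameters are updated at prediction time.
    d2 = np.sum((X_ctx - x_query) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return float(np.dot(w, y_ctx) / (w.sum() + 1e-12))

# Toy imbalanced regression data: labels dense near zero, sparse in the tail.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = X[:, 0] ** 3 + 0.1 * rng.normal(size=500)

x_q = np.array([2.5, 0.0, 0.0, 0.0])  # query from an underrepresented region
X_ctx, y_ctx = build_localized_prompt(X, y, x_q, k=32)
print(in_context_predict(X_ctx, y_ctx, x_q))
```

Because the prompt is re-selected per query and no gradients are taken, predictions in rare regions depend only on the most relevant nearby samples rather than on a global fit dominated by the majority of the label distribution, which is the intuition behind the localized-context argument above.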