Task Attribute Distance for Few-Shot Learning: Theoretical Analysis and Applications (2403.03535v1)
Abstract: Few-shot learning (FSL) aims to learn novel tasks with very few labeled samples by leveraging experience from \emph{related} training tasks. In this paper, we seek to understand FSL by delving into two key questions: (1) How can we quantify the relationship between \emph{training} and \emph{novel} tasks? (2) How does this relationship affect the \emph{adaptation difficulty} of novel tasks for different models? To answer these two questions, we introduce Task Attribute Distance (TAD), a metric built upon attributes that quantifies task relatedness. Unlike many existing metrics, TAD is model-agnostic, making it applicable to different FSL models. We then utilize the TAD metric to establish a theoretical connection between task relatedness and task adaptation difficulty. By deriving the generalization error bound on a novel task, we show how TAD measures the adaptation difficulty of novel tasks for FSL models. To validate the TAD metric and our theoretical findings, we conduct experiments on three benchmarks. Our experimental results confirm that the TAD metric effectively quantifies task relatedness and reflects the adaptation difficulty of novel tasks for various FSL methods, even when some of them do not learn attributes explicitly or when human-annotated attributes are unavailable. Finally, we present two applications of the proposed TAD metric: data augmentation and test-time intervention, which further verify its effectiveness and general applicability. The source code is available at https://github.com/hu-my/TaskAttributeDistance.
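To make the core idea concrete, below is a minimal sketch of an attribute-based task distance, not the authors' implementation. It assumes each N-way task is summarized by per-class attribute frequencies, scores class pairs by mean absolute difference of those frequencies (a stand-in for the per-attribute distribution distance used in the paper), and matches classes across tasks with the Hungarian algorithm; the function name `task_attribute_distance` and the toy data are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def task_attribute_distance(attrs_a: np.ndarray, attrs_b: np.ndarray) -> float:
    """Toy attribute-based distance between two N-way tasks.

    attrs_a, attrs_b: (n_classes, n_attributes) arrays holding, for each
    class, the empirical frequency of each binary attribute (values in [0, 1]).
    """
    # Pairwise class-to-class cost: mean absolute difference of attribute
    # frequencies across all attributes -> (n_classes, n_classes) matrix.
    cost = np.abs(attrs_a[:, None, :] - attrs_b[None, :, :]).mean(axis=-1)
    # Optimal one-to-one matching between the classes of the two tasks
    # (Hungarian algorithm), then average the matched class distances.
    row, col = linear_sum_assignment(cost)
    return float(cost[row, col].mean())

# Usage: two hypothetical 5-way tasks described by 10 attribute frequencies.
rng = np.random.default_rng(0)
task1 = rng.random((5, 10))
task2 = rng.random((5, 10))
print(task_attribute_distance(task1, task2))
```

Under this sketch, a small distance means the novel task's classes can be matched to attribute-similar training classes, which the paper's bound connects to easier adaptation.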