
KITE: A Kernel-based Improved Transferability Estimation Method (2405.01603v1)

Published 1 May 2024 in cs.LG

Abstract: Transferability estimation has emerged as an important problem in transfer learning. A transferability estimation method takes as input a set of pre-trained models and decides which pre-trained model can deliver the best transfer learning performance. Existing methods tackle this problem by analyzing the output of the pre-trained model or by comparing the pre-trained model with a probe model trained on the target dataset. However, neither is sufficient to provide reliable and efficient transferability estimates. In this paper, we present a novel perspective and introduce Kite, a Kernel-based Improved Transferability Estimation method. Kite is based on the key observations that the separability of the pre-trained features and the similarity of the pre-trained features to random features are two important factors for estimating transferability. Inspired by kernel methods, Kite adopts centered kernel alignment as an effective way to assess feature separability and feature similarity. Kite is easy to interpret, fast to compute, and robust to the target dataset size. We evaluate the performance of Kite on a recently introduced large-scale model selection benchmark. The benchmark contains 8 source datasets, 6 target datasets, and 4 architectures, for a total of 32 pre-trained models. Extensive results show that Kite outperforms existing methods by a large margin for transferability estimation.
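
To make the centered kernel alignment (CKA) quantities mentioned in the abstract concrete, below is a minimal NumPy sketch. The choice of linear kernels, the label kernel, the random-feature kernel, and all variable names are illustrative assumptions for exposition; this is not the authors' implementation or scoring rule.

```python
import numpy as np

def center(K):
    """Double-center a Gram (kernel) matrix."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def cka(K, L):
    """Centered kernel alignment between two Gram matrices."""
    Kc, Lc = center(K), center(L)
    hsic = np.sum(Kc * Lc)                       # Frobenius inner product <Kc, Lc>
    norm = np.sqrt(np.sum(Kc * Kc) * np.sum(Lc * Lc))
    return hsic / norm

# Toy data standing in for target-dataset features extracted by a pre-trained model.
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 512))              # hypothetical pre-trained features
labels = rng.integers(0, 10, size=100)

K_feat = feats @ feats.T                         # linear kernel on pre-trained features
Y = np.eye(10)[labels]
K_label = Y @ Y.T                                # label (class-membership) kernel
K_rand = (r := rng.normal(size=(100, 512))) @ r.T  # kernel from random features

separability = cka(K_feat, K_label)              # alignment with labels: feature separability
similarity_to_random = cka(K_feat, K_rand)       # alignment with random features
print(separability, similarity_to_random)
```

A higher alignment with the label kernel indicates more separable pre-trained features, while a higher alignment with the random-feature kernel indicates features closer to random ones; the abstract identifies these two factors as the basis of Kite's transferability estimate.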

