FREE: Faster and Better Data-Free Meta-Learning (2405.00984v2)

Published 2 May 2024 in cs.LG and cs.CV

Abstract: Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data, presenting practical benefits in contexts constrained by data privacy concerns. Current DFML methods primarily focus on recovering data from these pre-trained models. However, they suffer from slow recovery speed and overlook gaps inherent in heterogeneous pre-trained models. In response to these challenges, we introduce the Faster and Better Data-Free Meta-Learning (FREE) framework, which contains: (i) a meta-generator for rapidly recovering training tasks from pre-trained models; and (ii) a meta-learner for generalizing to new unseen tasks. Specifically, within the module Faster Inversion via Meta-Generator, each pre-trained model is perceived as a distinct task. The meta-generator can rapidly adapt to a specific task in just five steps, significantly accelerating data recovery. Furthermore, we propose Better Generalization via Meta-Learner and introduce an implicit gradient alignment algorithm to optimize the meta-learner. This is achieved because aligned gradient directions alleviate potential conflicts among tasks from heterogeneous pre-trained models. Empirical experiments on multiple benchmarks affirm the superiority of our approach, marking a notable speed-up (20$\times$) and performance enhancement (1.42%$\sim$4.78%) in comparison to the state-of-the-art.
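
The abstract outlines two technical pieces: a meta-generator that adapts to each pre-trained model in only five gradient steps to recover a pseudo-task, and a meta-learner optimized so that gradients from tasks of heterogeneous models stay aligned. The PyTorch sketch below is only meant to make that pipeline concrete; every identifier, architecture, loss, and hyperparameter (Generator, fast_inversion, aligned_meta_update, INNER_STEPS, etc.) is an assumption rather than the authors' released code, and the alignment step is approximated with a Reptile-style averaged update instead of the paper's own implicit gradient alignment algorithm.

```python
# Minimal, illustrative sketch of the two FREE modules described in the abstract.
# All names, network sizes, losses, and hyperparameters are assumptions.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

INNER_STEPS = 5          # the abstract's "just five steps" of generator adaptation
NUM_CLASSES = 5          # assumed N-way task size
IMG_DIM = 3 * 32 * 32    # assumed flattened image size


class Generator(nn.Module):
    """Meta-generator: maps noise plus a label embedding to a synthetic image."""
    def __init__(self, z_dim=64):
        super().__init__()
        self.embed = nn.Embedding(NUM_CLASSES, z_dim)
        self.net = nn.Sequential(
            nn.Linear(2 * z_dim, 256), nn.ReLU(),
            nn.Linear(256, IMG_DIM), nn.Tanh(),
        )

    def forward(self, z, y):
        return self.net(torch.cat([z, self.embed(y)], dim=1))


def fast_inversion(meta_gen, teacher, n_per_class=4, lr=1e-2, z_dim=64):
    """Faster Inversion via Meta-Generator (sketch): clone the shared meta-generator
    and take a handful of gradient steps so its samples are confidently classified
    by one frozen pre-trained model. A plain cross-entropy inversion loss is assumed."""
    for p in teacher.parameters():
        p.requires_grad_(False)
    gen = copy.deepcopy(meta_gen)
    opt = torch.optim.Adam(gen.parameters(), lr=lr)
    y = torch.arange(NUM_CLASSES).repeat_interleave(n_per_class)
    for _ in range(INNER_STEPS):
        z = torch.randn(len(y), z_dim)
        loss = F.cross_entropy(teacher(gen(z, y)), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():  # emit one recovered pseudo-task
        return gen(torch.randn(len(y), z_dim), y), y


def aligned_meta_update(meta_learner, tasks, inner_lr=1e-2, meta_lr=0.5, inner_steps=3):
    """Better Generalization via Meta-Learner (stand-in): a Reptile-style update that
    averages per-task inner-loop displacements, implicitly favoring update directions
    the heterogeneous tasks agree on. The paper's own algorithm may differ."""
    init = {k: v.detach().clone() for k, v in meta_learner.state_dict().items()}
    deltas = []
    for x, y in tasks:
        learner = copy.deepcopy(meta_learner)
        opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        for _ in range(inner_steps):
            loss = F.cross_entropy(learner(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        with torch.no_grad():
            deltas.append({k: learner.state_dict()[k] - init[k] for k in init})
    with torch.no_grad():
        meta_learner.load_state_dict(
            {k: init[k] + meta_lr * torch.stack([d[k] for d in deltas]).mean(0)
             for k in init})


if __name__ == "__main__":
    # Frozen stand-ins for a collection of heterogeneous pre-trained models.
    teachers = [nn.Linear(IMG_DIM, NUM_CLASSES).eval() for _ in range(3)]
    meta_gen = Generator()
    meta_learner = nn.Linear(IMG_DIM, NUM_CLASSES)
    tasks = [fast_inversion(meta_gen, t) for t in teachers]  # recover one pseudo-task per model
    aligned_meta_update(meta_learner, tasks)                 # one aligned meta-update
```

Averaging the per-task displacements is one simple way to realize the "aligned gradient directions" intuition from the abstract: directions on which the recovered tasks agree survive the average, while conflicting components tend to cancel.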
