Meta-Learning with Versatile Loss Geometries for Fast Adaptation Using Mirror Descent (2312.13486v2)

Published 20 Dec 2023 in cs.LG

Abstract: By utilizing task-invariant prior knowledge extracted from related tasks, meta-learning is a principled framework that enables learning a new task, especially when data records are limited. A fundamental challenge in meta-learning is how to quickly "adapt" the extracted prior so as to train a task-specific model within a few optimization steps. Existing approaches address this challenge with a preconditioner that enhances convergence of the per-task training process. Though effective at locally representing a quadratic training loss, such simple linear preconditioners can hardly capture complex loss geometries. The present contribution addresses this limitation by learning a nonlinear mirror map, which induces a versatile distance metric able to capture and optimize a wide range of loss geometries, thereby facilitating per-task training. Numerical tests on few-shot learning datasets demonstrate the superior expressiveness and convergence of the advocated approach.
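
The mechanism at the core of the abstract is the per-task (inner-loop) update: instead of a preconditioned gradient step, each step is taken in a dual space defined by a mirror map ψ, i.e., θ_{t+1} = (∇ψ)^{-1}(∇ψ(θ_t) − η ∇L(θ_t)), so a nonlinear ∇ψ encodes a non-quadratic distance geometry. The sketch below illustrates such an inner loop only; the elementwise sinh-based mirror map, its parameters c and dscale, the step size, and the toy regression task are hypothetical stand-ins chosen for a closed-form inverse, not the paper's learned mirror map or experimental setup.

```python
# Minimal sketch (not the authors' implementation): per-task adaptation via
# mirror descent with a hand-picked nonlinear mirror map.  In the paper the
# map is meta-learned; here it is a fixed, invertible elementwise function.
import numpy as np

rng = np.random.default_rng(0)

# Toy few-shot regression task: recover w_true from a handful of noisy samples.
d_dim, n_shots = 5, 10
w_true = rng.normal(size=d_dim)
X = rng.normal(size=(n_shots, d_dim))
y = X @ w_true + 0.01 * rng.normal(size=n_shots)

def loss_grad(w):
    """Gradient of the squared-error task loss (1/n) * ||Xw - y||^2."""
    return (2.0 / n_shots) * X.T @ (X @ w - y)

# Hypothetical stand-ins for meta-learned mirror-map parameters.
c = np.full(d_dim, 0.5)   # nonlinearity strength; c -> 0 recovers a linear map
dscale = np.ones(d_dim)   # diagonal scaling, as in a linear preconditioner

def grad_psi(theta):
    """Mirror map gradient: sends primal parameters to the dual space."""
    return dscale * np.sinh(c * theta) / c

def grad_psi_inv(u):
    """Closed-form inverse: maps dual-space iterates back to parameters."""
    return np.arcsinh(c * u / dscale) / c

# Inner-loop adaptation: a few mirror-descent steps from a (here, zero) prior.
theta = np.zeros(d_dim)
eta = 0.1
for _ in range(10):
    dual = grad_psi(theta) - eta * loss_grad(theta)   # gradient step in dual space
    theta = grad_psi_inv(dual)                        # map back to parameter space

print("task loss after adaptation:", float(np.mean((X @ theta - y) ** 2)))
```

Note that as c → 0 the map reduces to ∇ψ(θ) = dscale ⊙ θ, i.e., the diagonal linear preconditioner of prior work; a learned nonlinear map generalizes exactly this special case, which is the limitation the abstract points to.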

