Meta-Learning with Versatile Loss Geometries for Fast Adaptation Using Mirror Descent (2312.13486v2)
Abstract: By exploiting task-invariant prior knowledge extracted from related tasks, meta-learning offers a principled framework for learning a new task, especially when data are limited. A fundamental challenge in meta-learning is how to quickly "adapt" the extracted prior so as to train a task-specific model within a few optimization steps. Existing approaches address this challenge with a preconditioner that speeds up convergence of the per-task training process. Although effective at locally representing a quadratic training loss, such simple linear preconditioners can hardly capture complex loss geometries. The present contribution addresses this limitation by learning a nonlinear mirror map, which induces a versatile distance metric that can capture and optimize a wide range of loss geometries, thereby facilitating per-task training. Numerical tests on few-shot learning datasets demonstrate the superior expressiveness and convergence of the advocated approach.
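To make the mirror-descent adaptation concrete, below is a minimal sketch (not the authors' code) of the per-task inner loop: the gradient step is taken in the dual space induced by a mirror map phi, and then mapped back to the primal space. With phi(w) = 0.5 * ||w||^2 this reduces to plain gradient descent, and with a quadratic phi(w) = 0.5 * w^T P w it reduces to a linear preconditioner P^{-1}, which is the special case the abstract contrasts against. In the paper the mirror map is learned and nonlinear; here a fixed quadratic map stands in purely for illustration, and the names `mirror_descent_adapt`, `nabla_phi`, `nabla_phi_inv` are hypothetical.

```python
# Sketch of mirror-descent adaptation from a meta-learned initialization w0.
# Assumption: the mirror map phi is strictly convex, so nabla_phi is invertible.
import numpy as np


def mirror_descent_adapt(w0, grad_fn, nabla_phi, nabla_phi_inv, lr=0.1, steps=5):
    """Run a few mirror-descent steps starting from the prior w0.

    nabla_phi:     gradient of the mirror map (primal -> dual space).
    nabla_phi_inv: its inverse (dual -> primal space).
    """
    w = w0.copy()
    for _ in range(steps):
        z = nabla_phi(w) - lr * grad_fn(w)   # gradient step in the dual space
        w = nabla_phi_inv(z)                 # map back to the primal space
    return w


# Toy task: quadratic loss L(w) = 0.5 * ||A w - b||^2 with closed-form gradient.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))
b = rng.normal(size=20)
grad_fn = lambda w: A.T @ (A @ w - b)
w0 = np.zeros(5)

# 1) Identity mirror map -> ordinary (unpreconditioned) gradient descent.
gd = mirror_descent_adapt(w0, grad_fn, lambda w: w, lambda z: z, lr=0.01)

# 2) Quadratic mirror map phi(w) = 0.5 * w^T P w -> linear preconditioning by P^{-1}.
#    P here is a hand-picked stand-in for a (meta-)learned preconditioner.
P = A.T @ A + np.eye(5)
P_inv = np.linalg.inv(P)
pre = mirror_descent_adapt(w0, grad_fn, lambda w: P @ w, lambda z: P_inv @ z, lr=0.5)

print("GD loss:            ", 0.5 * np.sum((A @ gd - b) ** 2))
print("Preconditioned loss:", 0.5 * np.sum((A @ pre - b) ** 2))
```

Replacing the fixed quadratic map with a learned, nonlinear, invertible map is what lets the induced distance metric track non-quadratic loss geometries during the few adaptation steps.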