Hacking Task Confounder in Meta-Learning (2312.05771v5)

Published 10 Dec 2023 in cs.LG and stat.ML

Abstract: Meta-learning enables rapid generalization to new tasks by learning knowledge from various tasks. It is intuitively assumed that as the training progresses, a model will acquire richer knowledge, leading to better generalization performance. However, our experiments reveal an unexpected result: there is negative knowledge transfer between tasks, affecting generalization performance. To explain this phenomenon, we construct Structural Causal Models (SCMs) for causal analysis. Our investigation uncovers the presence of spurious correlations between task-specific causal factors and labels in meta-learning. Furthermore, the confounding factors differ across batches. We refer to these confounding factors as "Task Confounders". Based on these findings, we propose a plug-and-play Meta-learning Causal Representation Learner (MetaCRL) to eliminate task confounders. It encodes decoupled generating factors from multiple tasks and utilizes an invariant-based bi-level optimization mechanism to ensure their causality for meta-learning. Extensive experiments on various benchmark datasets demonstrate that our work achieves state-of-the-art (SOTA) performance.
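The bi-level optimization the abstract refers to is the standard meta-learning scaffold that MetaCRL plugs into: an inner loop adapts shared parameters to each task's support set, and an outer loop updates the meta-parameters against the adapted query losses. The following is a minimal first-order sketch of that scaffold on a toy linear-regression task family; the learning rates, task construction, and first-order approximation are illustrative assumptions, not details from the paper, and MetaCRL's causal decoupling module is not modeled here.

```python
import numpy as np

rng = np.random.default_rng(0)

def task_loss(w, X, y):
    # Mean-squared error of a linear model on one task's data.
    return float(np.mean((X @ w - y) ** 2))

def task_grad(w, X, y):
    # Gradient of the MSE above with respect to w.
    return 2.0 * X.T @ (X @ w - y) / len(y)

def meta_step(w, tasks, inner_lr=0.05, outer_lr=0.01):
    """One bi-level optimization step: adapt to each task on its
    support set (inner loop), then update the meta-parameters using
    the adapted query losses (outer loop). First-order approximation:
    the outer gradient is evaluated at the adapted parameters."""
    meta_grad = np.zeros_like(w)
    for X_s, y_s, X_q, y_q in tasks:
        w_task = w - inner_lr * task_grad(w, X_s, y_s)  # inner loop
        meta_grad += task_grad(w_task, X_q, y_q)        # outer loss
    return w - outer_lr * meta_grad / len(tasks)

def make_task():
    # Toy task family: y = X @ w_true, with w_true perturbed per task.
    w_true = np.array([1.0, -2.0]) + 0.1 * rng.standard_normal(2)
    X = rng.standard_normal((20, 2))
    y = X @ w_true
    return X[:10], y[:10], X[10:], y[10:]  # support / query split

tasks = [make_task() for _ in range(4)]
w = np.zeros(2)
for _ in range(200):
    w = meta_step(w, tasks)
print(task_loss(w, *tasks[0][2:]))  # query loss shrinks as w meta-converges
```

In this scaffold, a task-confounder mitigation such as MetaCRL would sit between the data and the loss, replacing raw features with decoupled causal factors before the inner-loop adaptation; since the tasks above share a genuinely invariant structure, no such correction is needed for the toy example to converge.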
