
Interventional Few-Shot Learning (2009.13000v2)

Published 28 Sep 2020 in cs.LG and cs.CV

Abstract: We uncover an ever-overlooked deficiency in the prevailing Few-Shot Learning (FSL) methods: the pre-trained knowledge is indeed a confounder that limits the performance. This finding is rooted from our causal assumption: a Structural Causal Model (SCM) for the causalities among the pre-trained knowledge, sample features, and labels. Thanks to it, we propose a novel FSL paradigm: Interventional Few-Shot Learning (IFSL). Specifically, we develop three effective IFSL algorithmic implementations based on the backdoor adjustment, which is essentially a causal intervention towards the SCM of many-shot learning: the upper-bound of FSL in a causal view. It is worth noting that the contribution of IFSL is orthogonal to existing fine-tuning and meta-learning based FSL methods, hence IFSL can improve all of them, achieving a new 1-/5-shot state-of-the-art on miniImageNet, tieredImageNet, and cross-domain CUB. Code is released at https://github.com/yue-zhongqi/ifsl.

Authors (4)
  1. Zhongqi Yue (17 papers)
  2. Hanwang Zhang (161 papers)
  3. Qianru Sun (65 papers)
  4. Xian-Sheng Hua (85 papers)
Citations (206)

Summary

Overview of "Interventional Few-Shot Learning"

The paper by Yue et al. introduces a novel paradigm named Interventional Few-Shot Learning (IFSL) to address deficiencies identified in traditional Few-Shot Learning (FSL) methods. The key problem highlighted is that the pre-trained knowledge used in standard FSL acts as a confounder, leading to performance limitations. The authors propose using causal inference, specifically through a Structural Causal Model (SCM) and backdoor adjustments, to mitigate these effects and improve FSL performance.

Key Contributions

  1. Causality in Few-Shot Learning: The authors challenge the traditional assumption in FSL that pre-trained knowledge is purely beneficial, arguing instead that it can introduce confounding bias. They model this insight with an SCM representing the relationships among the pre-trained knowledge, sample features, and labels.
  2. Interventional Few-Shot Learning (IFSL): IFSL introduces causal adjustments into the FSL framework. Utilizing backdoor adjustments, IFSL aims to neutralize the confounder role played by pre-trained knowledge, enhancing the correlation of sample features directly to class labels.
  3. Algorithmic Implementations: Three implementations of IFSL are provided—feature-wise adjustment, class-wise adjustment, and a combined approach. These methods stratify the confounder using model properties such as feature activations and base-class centroids to estimate causal effects more accurately.
  4. Evaluation and Results: Employing benchmarks such as miniImageNet, tieredImageNet, and a cross-domain experiment on CUB, the paper demonstrates substantial improvements in classification tasks. The application of IFSL results in a new state-of-the-art performance on 1-shot and 5-shot benchmarks.
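The backdoor adjustment underlying these implementations replaces the observational classifier P(Y|X) with P(Y|do(X)) = Σ_d P(Y|X, d)·P(d), averaging predictions over strata d of the confounding pre-trained knowledge. The sketch below illustrates a class-wise stratification with a uniform prior, using base-class centroids as strata and a simple prototype classifier per stratum; the gating used to condition on each stratum is an illustrative choice, not the paper's exact parameterization (see the released code for the authors' implementation).

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def backdoor_adjusted_probs(query, support, support_labels, strata, n_way):
    """Sketch of class-wise backdoor adjustment:
        P(Y | do(X)) = sum_d P(Y | X, d) * P(d)
    with a uniform prior P(d) over strata d (base-class centroids standing
    in for the pre-trained knowledge). Each per-stratum classifier
    conditions on d by gating features with sigmoid(d) before
    nearest-prototype matching -- an illustrative choice only.
    """
    probs = np.zeros(n_way)
    for d in strata:
        gate = 1.0 / (1.0 + np.exp(-d))            # condition features on stratum d
        q = query * gate
        s = support * gate
        # class prototypes from the gated support set
        protos = np.stack([s[support_labels == c].mean(axis=0)
                           for c in range(n_way)])
        logits = -((protos - q) ** 2).sum(axis=1)  # negative squared distance
        probs += softmax(logits)                    # accumulate P(Y | X, d)
    return probs / len(strata)                      # uniform P(d) = 1/|D|
```

Averaging over strata, rather than conditioning on the single pre-trained representation, is what removes the backdoor path from the pre-trained knowledge to the labels.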

Implications

The approach adds robustness to FSL methods by removing spurious correlations induced by pre-training, an issue largely overlooked in prior FSL work. Reinterpreting FSL through a causal lens also aligns it with many-shot learning, theoretically and practically improving the model's behavior across query hardness levels.

Future Developments

The paper suggests several potential research directions, the most intriguing of which is further exploration of meta-learning as a form of implicit causal intervention, owing to its episodic sampling. Extending causal reasoning to domain adaptation and applying counterfactual reasoning within FSL are also promising avenues for advancing the paradigm.

Conclusion

This paper highlights a long-overlooked aspect of FSL—causality—and provides a robust theoretical framework backed by empirically validated improvements. Employing causal inference to disentangle pre-training biases opens new possibilities for building adaptable learning models. The proposed IFSL and its implementations enrich the discourse in Few-Shot Learning and pave the way for future methodological advances in machine learning.

In summary, while the nuanced causal adjustments employed in IFSL may initially seem complex, their impact offers valuable insights into improving FSL mechanics and setting a strong foundation for future explorations in both AI theory and applications.
