Explainable Attention for Few-shot Learning and Beyond (2310.07800v2)
Abstract: Attention mechanisms have shown promise in enhancing learning models by identifying salient portions of input data. This is particularly valuable when few training samples are available because data collection and labeling are difficult. Drawing inspiration from human recognition processes, we posit that an AI baseline's performance could be more accurate and dependable if it is exposed to the essential segments of the raw data rather than the entire input, akin to human perception. However, selecting these informative data segments, referred to as hard attention finding, is a formidable challenge. With few training samples, existing methods struggle to locate such informative regions because their large number of trainable parameters cannot be learned effectively from the limited data. In this study, we introduce FewXAT, a novel and practical framework for explainable hard attention finding, specifically tailored to few-shot learning. Our approach employs deep reinforcement learning to implement hard attention; because it acts directly on the raw input data, the process is interpretable to humans. Through extensive experiments on various benchmark datasets, we demonstrate the efficacy of the proposed method.
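To make "hard attention acting directly on raw input" concrete, the following is a minimal sketch, not the FewXAT implementation. It assumes a grayscale image stored as a nested list, a square patch grid, and a hand-picked set of patches to keep; in the paper that selection would come from a reinforcement learning agent, which is omitted here. The names `apply_hard_attention`, `patch_size`, and `keep` are hypothetical.

```python
from typing import Iterable, List, Tuple

Image = List[List[float]]


def apply_hard_attention(
    image: Image, patch_size: int, keep: Iterable[Tuple[int, int]]
) -> Image:
    """Zero out every pixel whose patch is not listed in `keep`.

    `keep` holds (patch_row, patch_col) grid coordinates. In this sketch the
    selection is supplied by the caller; it stands in for the output of a
    learned selection policy (an assumption, not the paper's agent).
    """
    keep_set = set(keep)
    masked: Image = []
    for r, row in enumerate(image):
        masked_row = []
        for c, value in enumerate(row):
            patch = (r // patch_size, c // patch_size)
            # Hard attention: a pixel is either fully kept or fully dropped.
            masked_row.append(value if patch in keep_set else 0.0)
        masked.append(masked_row)
    return masked


# A 4x4 toy image split into 2x2 patches; keep only the top-right patch.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
masked = apply_hard_attention(image, patch_size=2, keep=[(0, 1)])
```

Because the mask is binary and applied to the pixels themselves, the retained region can be displayed directly, which is the sense in which such a selection is inspectable by a human.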