Active Learning with Contrastive Pre-training for Facial Expression Recognition (2307.02744v1)
Abstract: Deep learning has played a significant role in the success of facial expression recognition (FER), thanks to large models and vast amounts of labelled data. However, obtaining labelled data requires a tremendous amount of human effort, time, and financial resources. While some prior works have focused on reducing the need for large amounts of labelled data using various unsupervised methods, another promising approach, active learning, remains barely explored in the context of FER. This approach involves selecting and labelling the most representative samples from an unlabelled set to make the best use of a limited 'labelling budget'. In this paper, we implement and study 8 recent active learning methods on three public FER datasets, FER13, RAF-DB, and KDEF. Our findings show that existing active learning methods do not perform well in the context of FER, likely suffering from a phenomenon called 'Cold Start', which occurs when the initial set of labelled samples is not sufficiently representative of the entire dataset. To address this issue, we propose contrastive self-supervised pre-training, which first learns the underlying representations from the entire unlabelled dataset. We then follow this with the active learning methods and observe that our 2-step approach yields up to a 9.2% improvement over random sampling and up to a 6.7% improvement over the best existing active learning baseline without the pre-training. We will make the code for this study public upon publication at: github.com/ShuvenduRoy/ActiveFER.
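To make the 2-step pipeline concrete, here is a minimal sketch in PyTorch, assuming a SimCLR-style contrastive objective (NT-Xent) for the pre-training step and an entropy-based acquisition function for the active learning step; the encoder, data loaders, and hyper-parameters are illustrative placeholders rather than the authors' exact configuration.

```python
import torch
import torch.nn.functional as F


def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss over two augmented views of the same batch."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)            # (2N, d)
    sim = z @ z.t() / temperature                                 # pairwise cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))                         # exclude self-similarity
    # positive pair of sample i is its other view at index i+n (and vice versa)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)


def contrastive_pretrain(encoder, unlabelled_loader, epochs=100, lr=1e-3):
    """Step 1: learn representations from the entire unlabelled set."""
    opt = torch.optim.Adam(encoder.parameters(), lr=lr)
    for _ in range(epochs):
        for view1, view2 in unlabelled_loader:   # loader yields two augmentations per image
            loss = nt_xent_loss(encoder(view1), encoder(view2))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return encoder


@torch.no_grad()
def entropy_query(model, pool_loader, budget):
    """Step 2 (one round): select the `budget` most uncertain unlabelled samples."""
    scores, indices = [], []
    for idx, x in pool_loader:                   # loader yields (pool index, image)
        probs = F.softmax(model(x), dim=1)
        ent = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
        scores.append(ent)
        indices.append(idx)
    top = torch.cat(scores).topk(budget).indices
    return torch.cat(indices)[top].tolist()      # indices to send for labelling
```

In this sketch, the pre-trained encoder initialises the FER classifier before any active learning round, which is what mitigates the cold-start problem: the acquisition function scores the unlabelled pool with a model that already encodes the data distribution rather than a randomly initialised one.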