Improving Weakly-Supervised Object Localization Using Adversarial Erasing and Pseudo Label (2404.09475v1)
Abstract: Weakly-supervised learning approaches have gained significant attention due to their ability to reduce the effort required for human annotations in training neural networks. This paper investigates a framework for weakly-supervised object localization, which aims to train a neural network capable of predicting both the object class and its location using only images and their image-level class labels. The proposed framework consists of a shared feature extractor, a classifier, and a localizer. The localizer predicts pixel-level class probabilities, while the classifier predicts the object class at the image level. Since image-level class labels are insufficient for training the localizer, weakly-supervised object localization methods often encounter challenges in accurately localizing the entire object region. To address this issue, the proposed method incorporates adversarial erasing and pseudo labels to improve localization accuracy. Specifically, novel losses are designed to utilize adversarially erased foreground features and adversarially erased feature maps, reducing dependence on the most discriminative region. Additionally, the proposed method employs pseudo labels to suppress activation values in the background while increasing them in the foreground. The proposed method is applied to two backbone networks (MobileNetV1 and InceptionV3) and is evaluated on three publicly available datasets (ILSVRC-2012, CUB-200-2011, and PASCAL VOC 2012). The experimental results demonstrate that the proposed method outperforms previous state-of-the-art methods across all evaluated metrics.
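The two ideas named in the abstract, adversarial erasing and pseudo labels, can be illustrated with a minimal sketch. The abstract does not specify the losses, thresholds, or tensor layouts, so everything below is an assumption made for illustration: the function names (`erase_mask`, `apply_erasing`, `pseudo_labels`), the thresholds (`erase_thr`, `fg_thr`, `bg_thr`), and the use of plain thresholding are hypothetical and are not taken from the paper. The sketch shows only the generic pattern: mask out the most strongly activated locations of a feature map so that a model must rely on other regions, and turn a localizer's activation map into foreground, background, and ignored labels.

```python
from typing import List

Map2D = List[List[float]]  # H x W map of values


def erase_mask(activation: Map2D, erase_thr: float) -> List[List[int]]:
    """Return 0 where activation >= erase_thr (erased), 1 elsewhere (kept).

    Assumption: the most discriminative region is approximated by the
    locations with the highest activation values.
    """
    return [[0 if a >= erase_thr else 1 for a in row] for row in activation]


def apply_erasing(features: List[Map2D], mask: List[List[int]]) -> List[Map2D]:
    """Zero every channel of a C x H x W feature map at the erased locations."""
    return [
        [[v * m for v, m in zip(f_row, m_row)] for f_row, m_row in zip(channel, mask)]
        for channel in features
    ]


def pseudo_labels(activation: Map2D, fg_thr: float, bg_thr: float) -> List[List[int]]:
    """Label each location: 1 = foreground, 0 = background, -1 = ignored.

    Assumption: confident locations are selected with two fixed thresholds;
    uncertain locations in between contribute to no loss term.
    """
    labels = []
    for row in activation:
        labels.append([1 if a >= fg_thr else 0 if a <= bg_thr else -1 for a in row])
    return labels


# Toy 2 x 3 activation map and a 2-channel feature map of the same spatial size.
activation = [[0.9, 0.6, 0.1],
              [0.7, 0.4, 0.05]]
features = [
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    [[6.0, 5.0, 4.0], [3.0, 2.0, 1.0]],
]

mask = erase_mask(activation, erase_thr=0.8)
erased = apply_erasing(features, mask)
labels = pseudo_labels(activation, fg_thr=0.6, bg_thr=0.2)
```

In a training setup along these lines, the erased features would be passed back through the classifier or localizer, and the pseudo labels would supervise the localizer only at the confident locations. How the paper combines these signals into its losses is not stated in the abstract.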
Authors: Byeongkeun Kang, Sinhae Cha, Yeejin Lee