Vision-language Assisted Attribute Learning (2312.07009v2)

Published 12 Dec 2023 in cs.CV

Abstract: Attribute labeling at large scale is typically incomplete and partial, posing significant challenges to model optimization. Existing attribute learning methods often treat missing labels as negatives or simply ignore them all during training, either of which can substantially hamper model performance. To overcome these limitations, we leverage available vision-language knowledge to explicitly disclose the missing labels and thereby enhance model learning. Given an image, we predict the likelihood of each missing attribute label with the help of an off-the-shelf vision-language model, and randomly select among those with high scores which ones to ignore in training. Our strategy strikes a good balance between fully ignoring the missing labels and treating them all as negatives, as these high scores are found to be informative in revealing label ambiguity. Extensive experiments show that our proposed vision-language assisted loss achieves state-of-the-art performance on the newly cleaned VAW dataset. Qualitative evaluation demonstrates the ability of the proposed method to predict more complete attributes.
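The training strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_targets` and `masked_bce`, the `threshold` and `ignore_prob` parameters, and the use of plain binary cross-entropy are all assumptions made for illustration, and the vision-language scores are taken as given inputs rather than computed by a real model.

```python
import math
import random


def build_targets(labels, vlm_scores, threshold=0.5, ignore_prob=0.5, rng=None):
    """Turn partial labels into (targets, mask) for a masked loss.

    labels: 1 (positive), 0 (negative), or None (missing) per attribute.
    vlm_scores: assumed likelihoods in [0, 1] from a vision-language model.
    threshold, ignore_prob: hypothetical knobs, not taken from the paper.
    """
    if rng is None:
        rng = random.Random(0)
    targets, mask = [], []
    for label, score in zip(labels, vlm_scores):
        if label is not None:
            # Annotated label: always used as-is.
            targets.append(float(label))
            mask.append(1.0)
        elif score >= threshold and rng.random() < ignore_prob:
            # Missing label with a high score: randomly ignored.
            targets.append(0.0)
            mask.append(0.0)
        else:
            # Remaining missing labels: treated as negatives.
            targets.append(0.0)
            mask.append(1.0)
    return targets, mask


def masked_bce(probs, targets, mask, eps=1e-7):
    """Binary cross-entropy averaged over the non-ignored attributes."""
    total, count = 0.0, 0.0
    for p, t, m in zip(probs, targets, mask):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += m * -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
        count += m
    return total / count if count else 0.0


# Example: two annotated attributes and two missing ones.
labels = [1, 0, None, None]
vlm_scores = [0.9, 0.1, 0.8, 0.2]
targets, mask = build_targets(labels, vlm_scores, ignore_prob=1.0)
loss = masked_bce([0.5, 0.5, 0.5, 0.5], targets, mask)
```

Under these assumptions, setting `ignore_prob=0.0` recovers the "treat every missing label as negative" baseline, while `ignore_prob=1.0` with `threshold=0.0` recovers the "ignore all missing labels" baseline; intermediate values sit between the two extremes the abstract contrasts.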

