
Less but Better: Enabling Generalized Zero-shot Learning Towards Unseen Domains by Intrinsic Learning from Redundant LLM Semantics (2403.14362v4)

Published 21 Mar 2024 in cs.CV

Abstract: Generalized zero-shot learning (GZSL) focuses on recognizing both seen and unseen classes under the domain shift problem (DSP), where data of unseen classes may be misclassified as seen classes. However, existing GZSL is still limited to seen domains. In this work, we pioneer cross-domain GZSL (CDGZSL), which addresses GZSL towards unseen domains. Unlike existing GZSL methods, which alleviate the DSP by generating features of unseen classes from semantics, CDGZSL needs to construct a common feature space across domains and acquire the corresponding intrinsic semantics shared among domains in order to transfer from seen to unseen domains. Considering the information asymmetry problem caused by redundant class semantics annotated with LLMs, we present Meta Domain Alignment Semantic Refinement (MDASR). Technically, MDASR consists of two parts: Inter-class Similarity Alignment (ISA), which eliminates the non-intrinsic semantics not shared across all domains under the guidance of inter-class feature relationships, and Unseen-class Meta Generation (UMG), which preserves intrinsic semantics to maintain connectivity between seen and unseen classes by simulating feature generation. MDASR effectively aligns the redundant semantic space with the common feature space, mitigating the information asymmetry in CDGZSL. The effectiveness of MDASR is demonstrated on Office-Home and Mini-DomainNet, and we have released the LLM-based semantics for these datasets as a benchmark.
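The ISA component described above refines redundant LLM class semantics so that their inter-class similarity structure matches the inter-class relationships observed in a cross-domain feature space. The following is a minimal sketch of that idea, not the authors' implementation: all dimensions, the random placeholder data, and the single linear projection are illustrative assumptions.

```python
import torch

torch.manual_seed(0)

# Hypothetical setup: C classes, redundant LLM semantics (dim d_s),
# and class-mean prototypes in a common visual feature space (dim d_f).
C, d_s, d_f, d_r = 10, 64, 32, 16
sem = torch.randn(C, d_s)   # stand-in for LLM-annotated class semantics
feat = torch.randn(C, d_f)  # stand-in for class features shared across domains

def cosine_sim(x):
    """Pairwise cosine similarity matrix between the rows of x."""
    x = torch.nn.functional.normalize(x, dim=1)
    return x @ x.t()

target = cosine_sim(feat)  # inter-class feature relationships (the guidance)

# ISA-style refinement: learn a projection of the semantics whose
# inter-class similarities align with the feature similarities,
# discarding semantic directions that do not reflect the shared structure.
proj = torch.nn.Linear(d_s, d_r, bias=False)
opt = torch.optim.Adam(proj.parameters(), lr=1e-2)

with torch.no_grad():
    init_loss = torch.nn.functional.mse_loss(cosine_sim(proj(sem)), target).item()

for step in range(300):
    refined = proj(sem)
    loss = torch.nn.functional.mse_loss(cosine_sim(refined), target)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(init_loss, loss.item())
```

In the paper's full method this alignment is complemented by UMG, which checks that the refined semantics can still generate plausible unseen-class features; the sketch above covers only the similarity-alignment step.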

Authors (3)
  1. Jiaqi Yue (3 papers)
  2. Jiancheng Zhao (11 papers)
  3. Chunhui Zhao (16 papers)
