
Pseudo-Label Calibration Semi-supervised Multi-Modal Entity Alignment (2403.01203v1)

Published 2 Mar 2024 in cs.LG, cs.CL, and cs.DB

Abstract: Multi-modal entity alignment (MMEA) aims to identify equivalent entities between two multi-modal knowledge graphs for integration. Prior work has focused on improving the interaction and fusion of multi-modal information, but has overlooked the influence of modal-specific noise and the use of labeled and unlabeled data in semi-supervised settings. In this work, we introduce Pseudo-label Calibration Multi-modal Entity Alignment (PCMEA), a semi-supervised approach. Specifically, to generate holistic entity representations, we first devise various embedding modules and attention mechanisms to extract visual, structural, relational, and attribute features. Unlike prior direct fusion methods, we then exploit mutual information maximization to filter modal-specific noise and augment modal-invariant commonality. Next, we combine pseudo-label calibration with momentum-based contrastive learning to make full use of the labeled and unlabeled data, which improves the quality of the pseudo-labels and pulls aligned entities closer. Finally, extensive experiments on two MMEA datasets demonstrate the effectiveness of PCMEA, which yields state-of-the-art performance.
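
The abstract names momentum-based contrastive learning and pseudo-label selection but does not give their exact form. The following is a minimal sketch of the generic techniques only, not PCMEA's actual implementation: a MoCo-style exponential moving average update, an InfoNCE-style contrastive loss, and a mutual-nearest-neighbour filter standing in for pseudo-label calibration. All function names, the momentum `m`, the `temperature`, and the `threshold` are illustrative assumptions, and embeddings are assumed to be non-zero vectors.

```python
import math


def momentum_update(key_params, query_params, m=0.999):
    """EMA update of a momentum (key) encoder: key <- m * key + (1 - m) * query.

    Parameters are flat lists of floats here; `m` is an assumed value.
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: -log softmax of the positive among all candidates."""
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, n) / temperature for n in negatives]
    mx = max(logits)  # subtract the max for numerical stability
    log_sum = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return log_sum - logits[0]


def mutual_nearest_pseudo_labels(src, tgt, threshold=0.9):
    """Hypothetical pseudo-label filter (an assumption, not the paper's rule).

    Keeps a pair (i, j) only if source entity i and target entity j are each
    other's nearest neighbour and their similarity reaches `threshold`.
    """
    sim = [[cosine(s, t) for t in tgt] for s in src]
    pairs = []
    for i, row in enumerate(sim):
        j = max(range(len(tgt)), key=lambda c: row[c])
        i_back = max(range(len(src)), key=lambda r: sim[r][j])
        if i_back == i and row[j] >= threshold:
            pairs.append((i, j))
    return pairs
```

In this sketch, pairs that pass the filter would be treated as extra positives for the contrastive loss on unlabeled data, while the momentum encoder supplies slowly changing target embeddings; how PCMEA actually calibrates its pseudo-labels is described in the paper itself.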

Authors (5)
  1. Luyao Wang (10 papers)
  2. Pengnian Qi (2 papers)
  3. Xigang Bao (1 paper)
  4. Chunlai Zhou (8 papers)
  5. Biao Qin (7 papers)
Citations (5)
