Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning (2405.16869v4)

Published 27 May 2024 in cs.AI and cs.CL

Abstract: Learning high-quality multi-modal entity representations is an important goal of multi-modal knowledge graph (MMKG) representation learning, which can enhance reasoning tasks within the MMKGs, such as MMKG completion (MMKGC). The main challenge is to collaboratively model the structural information concealed in massive triples and the multi-modal features of the entities. Existing methods focus on crafting elegant entity-wise multi-modal fusion strategies, yet they overlook the utilization of multi-perspective features concealed within the modalities under diverse relational contexts. To address this issue, we introduce a novel framework with Mixture of Modality Knowledge experts (MoMoK for short) to learn adaptive multi-modal entity representations for better MMKGC. We design relation-guided modality knowledge experts to acquire relation-aware modality embeddings and integrate the predictions from multi-modalities to achieve joint decisions. Additionally, we disentangle the experts by minimizing their mutual information. Experiments on four public MMKG benchmarks demonstrate the outstanding performance of MoMoK under complex scenarios.
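To make the abstract's mechanism concrete, here is a minimal, stdlib-only sketch of relation-guided mixture-of-experts fusion: each modality owns a set of experts, a relation-conditioned gate mixes their outputs, and the modality-level scores are combined into a joint decision. This is an illustrative toy, not the paper's implementation; all names (`ModalityExperts`, the averaging step for joint decisions) and the random toy embeddings are assumptions, and the mutual-information disentanglement term is omitted.

```python
import math
import random

random.seed(0)

DIM, N_EXPERTS = 4, 3

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [x / s for x in e]

def rand_vec(dim=DIM):
    return [random.uniform(-1, 1) for _ in range(dim)]

def rand_mat(dim=DIM):
    return [rand_vec(dim) for _ in range(dim)]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class ModalityExperts:
    """One set of relation-guided experts for a single modality (hypothetical)."""
    def __init__(self):
        self.experts = [rand_mat() for _ in range(N_EXPERTS)]  # expert projections
        self.gate = [rand_vec() for _ in range(N_EXPERTS)]     # relation-conditioned gate

    def forward(self, entity_emb, relation_emb):
        # Gate weights depend on the relation, so the same entity is
        # represented differently under different relational contexts.
        weights = softmax([dot(g, relation_emb) for g in self.gate])
        outputs = [matvec(W, entity_emb) for W in self.experts]
        return [sum(w * o[i] for w, o in zip(weights, outputs)) for i in range(DIM)]

# Toy per-modality embeddings for one entity (structural / visual / textual).
modalities = {m: rand_vec() for m in ("structure", "visual", "textual")}
experts = {m: ModalityExperts() for m in modalities}
relation = rand_vec()

# Relation-aware embedding per modality, then a joint decision by averaging
# modality-level scores (a stand-in for the paper's learned integration).
fused = {m: experts[m].forward(e, relation) for m, e in modalities.items()}
tail = rand_vec()
scores = {m: dot(v, tail) for m, v in fused.items()}
joint_score = sum(scores.values()) / len(scores)
print(round(joint_score, 4))
```

Changing `relation` changes the gate weights and hence the fused entity representation, which is the multi-perspective behavior the abstract argues entity-wise fusion strategies miss.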

