Noise-powered Multi-modal Knowledge Graph Representation Framework (2403.06832v3)

Published 11 Mar 2024 in cs.CL and cs.AI

Abstract: The rise of Multi-modal Pre-training highlights the necessity for a unified Multi-Modal Knowledge Graph (MMKG) representation learning framework. Such a framework is essential for embedding structured knowledge into multi-modal LLMs effectively, alleviating issues like knowledge misconceptions and multi-modal hallucinations. In this work, we explore the efficacy of models in accurately embedding entities within MMKGs through two pivotal tasks: Multi-modal Knowledge Graph Completion (MKGC) and Multi-modal Entity Alignment (MMEA). Building on this foundation, we propose a novel SNAG method that utilizes a Transformer-based architecture equipped with modality-level noise masking to robustly integrate multi-modal entity features in KGs. By incorporating specific training objectives for both MKGC and MMEA, our approach achieves SOTA performance across a total of ten datasets, demonstrating its versatility. Moreover, SNAG can not only function as a standalone model but also enhance other existing methods, providing stable performance improvements. Code and data are available at https://github.com/zjukg/SNAG.
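The modality-level noise masking mentioned in the abstract can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the authors' implementation (see the linked repository for that): it assumes each entity carries one feature vector per modality, and during training a whole modality is occasionally replaced with Gaussian noise so the fusion model learns to tolerate corrupted or missing modalities.

```python
import numpy as np

def noise_mask_modalities(features, p=0.3, rng=None):
    """Randomly replace entire modality feature vectors with Gaussian noise.

    features: dict mapping modality name -> 1-D feature vector
    p: probability that a given modality is masked with noise
    Returns a new dict; unmasked modalities pass through unchanged.
    """
    rng = rng or np.random.default_rng()
    out = {}
    for modality, vec in features.items():
        if rng.random() < p:
            # Swap the whole modality embedding for random noise, forcing the
            # downstream fusion module to be robust to this modality.
            out[modality] = rng.standard_normal(vec.shape).astype(vec.dtype)
        else:
            out[modality] = vec
    return out

# Hypothetical entity with three modalities of dimension 4
entity = {
    "structure": np.ones(4, dtype=np.float32),
    "image": np.full(4, 2.0, dtype=np.float32),
    "text": np.full(4, 3.0, dtype=np.float32),
}
masked = noise_mask_modalities(entity, p=0.5, rng=np.random.default_rng(0))
```

In the actual method, the (possibly noised) modality features would then be fed to a Transformer encoder that fuses them into a single entity representation for the MKGC and MMEA objectives.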

Authors (8)
  1. Zhuo Chen (319 papers)
  2. Yin Fang (32 papers)
  3. Yichi Zhang (184 papers)
  4. Lingbing Guo (27 papers)
  5. Huajun Chen (198 papers)
  6. Wen Zhang (170 papers)
  7. Jiaoyan Chen (1 paper)
  8. Jeff Z. Pan (78 papers)
Citations (1)
