Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation (2404.09468v2)

Published 15 Apr 2024 in cs.AI

Abstract: Multi-modal knowledge graph completion (MMKGC) aims to discover unobserved knowledge from given knowledge graphs, collaboratively leveraging structural information from the triples and multi-modal information of the entities to overcome the inherent incompleteness. Existing MMKGC methods usually extract multi-modal features with pre-trained models, resulting in coarse handling of multi-modal entity information, overlooking the nuanced, fine-grained semantic details and their complex interactions. To tackle this shortfall, we introduce a novel framework MyGO to tokenize, fuse, and augment the fine-grained multi-modal representations of entities and enhance the MMKGC performance. Motivated by the tokenization technology, MyGO tokenizes multi-modal entity information as fine-grained discrete tokens and learns entity representations with a cross-modal entity encoder. To further augment the multi-modal representations, MyGO incorporates fine-grained contrastive learning to highlight the specificity of the entity representations. Experiments on standard MMKGC benchmarks reveal that our method surpasses 19 of the latest models, underlining its superior performance. Code and data can be found in https://github.com/zjukg/MyGO

Authors

Yichi Zhang
Zhuo Chen
Lingbing Guo
Yajing Xu
Binbin Hu
Ziqi Liu
Huajun Chen
Wen Zhang

Summary

The paper introduces MyGO, a framework that tokenizes, fuses, and augments fine-grained multi-modal entity representations for multi-modal knowledge graph completion (MMKGC).
It converts each entity's multi-modal information into fine-grained discrete tokens, learns entity representations with a cross-modal entity encoder, and applies fine-grained contrastive learning to make those representations more distinctive.
Experiments on standard MMKGC benchmarks show MyGO surpassing 19 recent models.
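
To make the contrastive learning mentioned in the abstract more concrete, one common form of a contrastive objective (InfoNCE) is shown below. This is a generic illustration and not necessarily the loss MyGO uses. Every symbol is an assumption introduced here: h_i is the representation of entity i, h_i^+ is a second (augmented) view of the same entity, sim is a similarity function such as cosine similarity, tau is a temperature, and N is the batch size.

```latex
% Generic InfoNCE-style contrastive loss (illustrative; not taken from the paper).
\[
\mathcal{L}_{\mathrm{con}}
  = -\frac{1}{N}\sum_{i=1}^{N}
    \log\frac{\exp\!\big(\mathrm{sim}(h_i, h_i^{+})/\tau\big)}
             {\sum_{j=1}^{N}\exp\!\big(\mathrm{sim}(h_i, h_j^{+})/\tau\big)}
\]
```

Minimizing this loss pulls each entity toward its own augmented view and pushes it away from the views of other entities in the batch, which is one way representations can be made more entity-specific.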

Comprehensive Overview of ACM's `acmart` LaTeX Document Class and Its Features

Introduction and Purpose of ACM's Template

The ACM consolidated article template, built on the acmart document class, provides a single LaTeX style for ACM publications. It incorporates the accessibility and metadata-extraction features that the ACM Digital Library relies on. The same class covers every stage of publication across ACM venues, which simplifies the process for both new and experienced authors.

Utilization across Publications

The acmart document class covers a range of document types, from dual-anonymous initial submissions to camera-ready journal articles. The template style and its parameters determine which type is produced:

Journal Styles: ACM journals use one of several styles, such as acmsmall, acmlarge, or acmtog, depending on the journal's format.
Conference Proceedings: Most proceedings use the sigconf style (the class has no acmconf option), with SIG-specific variants such as sigchi for SIGCHI articles or sigplan for SIGPLAN conferences.

The choice of template style dictates the formatting nuances of the publication, ensuring consistency and adherence to ACM's publication standards.
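
As a rough sketch of how a template style is selected, the style name is passed as an option to the document class. The example below uses acmsmall and shows sigplan as a commented alternative. The title, author, affiliation, and e-mail values are placeholders invented for illustration.

```latex
% Journal article in the small-format journal style.
\documentclass[acmsmall]{acmart}
% For a SIGPLAN conference paper, use this line instead:
% \documentclass[sigplan]{acmart}

\begin{document}

\title{Placeholder Title}

\author{Placeholder Author}
\affiliation{%
  \institution{Placeholder University}
  \city{Placeholder City}
  \country{Placeholder Country}}
\email{author@example.org}

% In acmart the abstract must appear before \maketitle.
\begin{abstract}
Placeholder abstract.
\end{abstract}

\maketitle

Body text.

\end{document}
```

Only the class option changes between styles. The rest of the source stays the same, which is the main benefit of a consolidated template.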

Key Features and Parameters

The template also accepts parameters that adjust the output for a specific purpose, such as anonymous and review together for double-anonymous review, or screen for colored hyperlinks. The class documentation describes each parameter and its effect.
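
These parameters are passed as class options alongside the template style. A minimal sketch follows, assuming a SIGPLAN-style paper; the choice of sigplan is only for illustration.

```latex
% Double-anonymous submission: "anonymous" hides author information,
% "review" adds line numbers for reviewers.
\documentclass[sigplan,anonymous,review]{acmart}

% On-screen version with colored hyperlinks (use instead of the line above):
% \documentclass[sigplan,screen]{acmart}
```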

Adhering to Formatting Standards

Documents prepared with acmart must follow its formatting as provided. Changes to template elements such as margins, typefaces, or line spacing are generally prohibited so that publications keep a uniform appearance. A document that violates these rules must be revised before it can be accepted.

Typeface and Presentation Requirements

Use of the “Libertine” typeface family is mandatory, which gives ACM publications a standardized look. Titles and subtitles must also be capitalized and presented correctly.
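
As a brief illustration, the title and an optional subtitle are set with the class's \title and \subtitle commands. The optional argument to \title gives a short form for page headers. The wording of the titles below is a placeholder.

```latex
% The optional argument is the short title used in running headers.
\title[Short Title for Headers]{A Full Title with Appropriate Capitalization}
\subtitle{An Optional Subtitle}
```

The class sets up its typefaces itself, so authors normally do not load a font package for this.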

Author and Affiliation Documentation

Each author and affiliation must be identified fully and separately so that metadata can be extracted accurately. This structure supports proper indexing and discovery of the paper in the ACM Digital Library.
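
A sketch of how one author is typically documented in acmart is shown below. Each author gets a separate \author command followed by that author's own affiliation and e-mail. All names, the ORCID, and the address are placeholders.

```latex
\author{First Placeholder Author}
\orcid{0000-0000-0000-0000}  % placeholder ORCID
\affiliation{%
  \institution{Placeholder University}
  \city{Placeholder City}
  \country{Placeholder Country}}
\email{first.author@example.org}

% Repeat the same pattern for each additional author.
\author{Second Placeholder Author}
\affiliation{%
  \institution{Placeholder Institute}
  \city{Placeholder City}
  \country{Placeholder Country}}
\email{second.author@example.org}
```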

Rights and Licensing

Authors must handle rights information carefully. After completing the rights form, they receive specific LaTeX commands from ACM and must add them to the document. This ensures legal compliance and proper attribution of the published work.
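
The rights commands usually go in the preamble and look roughly like the sketch below. Every value here is a placeholder. The real values come from ACM after the rights form is completed and must be copied exactly.

```latex
% Placeholder rights metadata; replace with the commands supplied by ACM.
\setcopyright{acmlicensed}
\copyrightyear{2024}
\acmYear{2024}
\acmDOI{XXXXXXX.XXXXXXX}

% Conference papers also carry venue information:
% short name, full name, dates, location.
\acmConference[CONF 'XX]{Placeholder Conference Name}{Placeholder Dates}{Placeholder Location}
\acmISBN{978-1-4503-XXXX-X/XX/XX}
```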

Taxonomic Tools and Classification

Authors are encouraged to classify their work with the ACM Computing Classification System (CCS) so that it is indexed consistently. In addition, author-defined keywords describe the research in more accessible terms and improve discoverability in searches.
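
In the source file, CCS concepts and keywords might look like the following sketch. The concept ID and description are deliberately left as placeholders, since the real entries should be generated for each paper with ACM's CCS tool. The keywords are illustrative and drawn from the paper's topic.

```latex
% Placeholder CCS entry; generate the real XML and \ccsdesc lines with the ACM CCS tool.
\begin{CCSXML}
<ccs2012>
 <concept>
  <concept_id>00000000.0000000.0000000</concept_id>
  <concept_desc>Placeholder category~Placeholder concept</concept_desc>
  <concept_significance>500</concept_significance>
 </concept>
</ccs2012>
\end{CCSXML}

% The optional argument is the significance weight of the concept.
\ccsdesc[500]{Placeholder category~Placeholder concept}

% Free-form, author-chosen keywords.
\keywords{multi-modal knowledge graphs, knowledge graph completion}
```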

Conclusion: Implications and Future Adaptations

The acmart LaTeX document class standardizes ACM's publication process and gives scholarly work a consistent, professional presentation. It addresses the needs of digital publishing with an emphasis on accessibility and metadata completeness. Future versions may extend the framework with more advanced digital publishing tools and more customization options, which could increase the reach of ACM publications.

The careful structure and detailed parameterization of acmart reflect ACM's commitment to high standards in disseminating scientific knowledge to a diverse academic audience.
