MoralCLIP: Contrastive Alignment of Vision-and-Language Representations with Moral Foundations Theory (2506.05696v1)

Published 6 Jun 2025 in cs.CV

Abstract: Recent advances in vision-language models have enabled rich semantic understanding across modalities. However, these encoding methods lack the ability to interpret or reason about the moral dimensions of content, a crucial aspect of human cognition. In this paper, we address this gap by introducing MoralCLIP, a novel embedding representation method that extends multimodal learning with explicit moral grounding based on Moral Foundations Theory (MFT). Our approach integrates visual and textual moral cues into a unified embedding space, enabling cross-modal moral alignment. MoralCLIP is grounded in the multi-label Social-Moral Image Database, which we use to identify co-occurring moral foundations in visual content. To train MoralCLIP, we design a moral data augmentation strategy that scales our annotated dataset to 15,000 image-text pairs labeled with MFT-aligned dimensions. Our results demonstrate that explicit moral supervision improves both unimodal and multimodal understanding of moral content, establishing a foundation for morally aware AI systems capable of recognizing and aligning with human moral values.
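The abstract does not spell out the training objective, so the following is only a minimal sketch of one way multi-label moral supervision could be added to a CLIP-style contrastive loss. Everything here is an assumption made for illustration, not the authors' method: the Jaccard overlap as the moral similarity target, the squared-error penalty, the weight `lam`, the temperature value, and all function names.

```python
import math

def jaccard(a, b):
    """Jaccard overlap between two sets of moral-foundation labels (assumed target)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two unlabeled items are treated as identical
    return len(a & b) / len(a | b)

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv)

def clip_loss(img, txt, temperature=0.07):
    """Symmetric InfoNCE loss; image i is assumed to be paired with text i."""
    n = len(img)
    logits = [[cosine(img[i], txt[j]) / temperature for j in range(n)]
              for i in range(n)]

    def cross_entropy_rows(matrix):
        total = 0.0
        for i, row in enumerate(matrix):
            mx = max(row)  # subtract the max for numerical stability
            lse = mx + math.log(sum(math.exp(x - mx) for x in row))
            total += lse - row[i]
        return total / len(matrix)

    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_rows(logits) + cross_entropy_rows(transposed))

def moral_loss(img, txt, labels):
    """Mean squared gap between cross-modal cosine similarity and label overlap."""
    n = len(img)
    total = 0.0
    for i in range(n):
        for j in range(n):
            gap = cosine(img[i], txt[j]) - jaccard(labels[i], labels[j])
            total += gap * gap
    return total / (n * n)

def combined_loss(img, txt, labels, lam=0.5):
    """Contrastive term plus a weighted moral-alignment term (lam is hypothetical)."""
    return clip_loss(img, txt) + lam * moral_loss(img, txt, labels)

# Toy batch: two pairs with orthogonal embeddings and disjoint moral labels.
images = [[1.0, 0.0], [0.0, 1.0]]
texts = [[1.0, 0.0], [0.0, 1.0]]
labels = [{"care"}, {"fairness"}]
loss = combined_loss(images, texts, labels)
```

In this toy batch the moral term is zero because the embedding similarities already match the label overlaps; giving both pairs the same label would penalize the orthogonal embeddings and pull them together.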