MMA-Diffusion: MultiModal Attack on Diffusion Models (2311.17516v4)
Abstract: In recent years, Text-to-Image (T2I) models have seen remarkable advancements and gained widespread adoption. However, this progress has inadvertently opened avenues for potential misuse, particularly the generation of inappropriate or Not-Safe-For-Work (NSFW) content. Our work introduces MMA-Diffusion, a framework that poses a significant and realistic threat to the security of T2I models by effectively circumventing current defensive measures in both open-source models and commercial online services. Unlike previous approaches, MMA-Diffusion leverages both the textual and visual modalities to bypass safeguards such as prompt filters and post-hoc safety checkers, thereby exposing vulnerabilities in existing defense mechanisms.
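To make the prompt-filter weakness concrete, here is a minimal toy sketch (not the paper's actual algorithm, and the blocklist, prompts, and function names are hypothetical): a naive keyword-matching filter of the kind a text-modality attack can evade, since a small character-level perturbation defeats exact string matching while the perturbed prompt can remain close to the original in a text encoder's embedding space.

```python
# Toy illustration only -- a hypothetical keyword-based prompt filter,
# not MMA-Diffusion's method or any deployed service's filter.

BLOCKLIST = {"nudity", "violence"}  # hypothetical banned terms


def prompt_filter(prompt: str) -> bool:
    """Return True if the prompt passes the filter (no banned tokens)."""
    tokens = prompt.lower().split()
    return not any(tok in BLOCKLIST for tok in tokens)


# A direct prompt is blocked, but a one-character substitution in the
# sensitive word slips past the exact-match check -- the kind of gap an
# adversarial prompt search exploits while preserving semantics for the
# downstream text encoder.
print(prompt_filter("a scene of violence"))   # blocked
print(prompt_filter("a scene of v1olence"))   # passes the naive filter
```

Real filters are more sophisticated (substring matching, learned classifiers), but the same mismatch applies: the filter operates on surface text while generation is driven by continuous embeddings, and adversarial prompts can decouple the two.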
Authors:
- Yijun Yang
- Ruiyuan Gao
- Xiaosen Wang
- Tsung-Yi Ho
- Nan Xu
- Qiang Xu