Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images (2304.13023v3)

Published 25 Apr 2023 in cs.AI and cs.CV

Abstract: Photos serve as a way for humans to record what they experience in their daily lives, and they are often regarded as trustworthy sources of information. However, there is a growing concern that the advancement of AI technology may produce fake photos, which can create confusion and diminish trust in photographs. This study aims to comprehensively evaluate agents for distinguishing state-of-the-art AI-generated visual content. Our study benchmarks both human capability and cutting-edge fake image detection AI algorithms, using a newly collected large-scale fake image dataset, Fake2M. In our human perception evaluation, titled HPBench, we discovered that humans struggle significantly to distinguish real photos from AI-generated ones, with a misclassification rate of 38.7%. Along with this, we evaluate model capability for detecting AI-generated images in a benchmark titled MPBench; the top-performing model from MPBench achieves a 13% failure rate under the same setting used in the human evaluation. We hope that our study can raise awareness of the potential risks of AI-generated images and facilitate further research to prevent the spread of false information. More information is available at https://github.com/Inf-imagine/Sentry.

Benchmarking Human and AI Perception of Synthetic Images

The paper "Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images" presents a comprehensive evaluation of human and machine abilities to discern AI-generated images from real photographs. This paper responds to escalating concerns regarding the fidelity of AI-generated imagery and its potential implications for society.

The authors introduce two benchmarks: HPBench and MPBench. HPBench evaluates human perception, revealing that humans frequently struggle to differentiate AI-generated images from authentic ones, achieving an average accuracy of 61.3%, which equates to a misclassification rate of 38.7%. This difficulty underscores the increasing sophistication of AI image synthesis methods, which have begun to erode the reliability of images as truth-bearing records.

In parallel, MPBench assesses the performance of current AI algorithms designed to detect synthetic images. The top-performing model misclassifies 13% of images under the same setting used in the human evaluation, roughly one third of the human error rate of 38.7%. This result suggests that automated detectors can outperform human judgment at detecting synthetic media.
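The relationship between the accuracy and misclassification figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the paper's evaluation code: the function name `misclassification_rate` and the toy labels are illustrative assumptions, and only the two rates (38.7% and 13%) come from the abstract.

```python
def misclassification_rate(labels, predictions):
    """Return the fraction of items whose prediction differs from the true label."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("need at least one item")
    errors = sum(1 for y, p in zip(labels, predictions) if y != p)
    return errors / len(labels)


# Rates reported in the paper's abstract.
HUMAN_ERROR = 0.387       # HPBench, human participants
BEST_MODEL_ERROR = 0.13   # MPBench, top-performing model, same setting

# Accuracy is the complement of the misclassification rate.
human_accuracy = 1.0 - HUMAN_ERROR
model_accuracy = 1.0 - BEST_MODEL_ERROR

# How many times more often humans err than the best model.
error_ratio = HUMAN_ERROR / BEST_MODEL_ERROR

# Toy data (hypothetical, not from the paper): 1 = AI-generated, 0 = real photo.
toy_labels = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
toy_predictions = [1, 0, 0, 1, 1, 0, 1, 1, 1, 0]
toy_error = misclassification_rate(toy_labels, toy_predictions)

print(f"human accuracy: {human_accuracy:.3f}")
print(f"model accuracy: {model_accuracy:.3f}")
print(f"human/model error ratio: {error_ratio:.2f}")
print(f"toy error rate: {toy_error:.2f}")
```

On these numbers, the human error rate is a little under three times that of the best model, which is the gap the summary describes.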

A significant contribution of this paper is the introduction of the Fake2M dataset, a large-scale collection of over two million AI-generated images, which serves to train and evaluate these detection algorithms. The dataset encompasses outputs from state-of-the-art models including Stable Diffusion and StyleGAN, presenting a diverse challenge reflective of contemporary synthesis capabilities.

The paper's implications are twofold. Practically, the findings advise caution in relying on images as sources of factual information, as AI-generated content can convincingly mimic reality. Theoretically, they push forward the conversation on the limits of AI and human perception, emphasizing the necessity for more robust detection systems capable of coping with the rapid advancements in AI.

Looking ahead, the paper opens several avenues for future research. Detection models need to maintain high performance across varied datasets, generalizing across differences in image style and synthesis method. The research also touches on the societal impacts of synthetic imagery, such as misinformation and erosion of trust, which suggests that broader work on ethical guidelines and detection methodologies is warranted to safeguard against malicious use.
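One way to see whether a detector holds up across synthesis methods is to break its error rate down by image source rather than reporting a single aggregate. The sketch below illustrates that idea under stated assumptions: the function `error_by_source`, the source names, and the records are all hypothetical and are not taken from Fake2M or MPBench.

```python
from collections import defaultdict


def error_by_source(records):
    """Compute the misclassification rate separately for each image source.

    `records` is an iterable of (source, true_label, predicted_label) tuples.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for source, true_label, predicted_label in records:
        totals[source] += 1
        if true_label != predicted_label:
            errors[source] += 1
    return {source: errors[source] / totals[source] for source in totals}


# Hypothetical records: 1 = AI-generated, 0 = real photo.
toy_records = [
    ("generator_a", 1, 1),
    ("generator_a", 1, 1),
    ("generator_a", 1, 0),
    ("generator_a", 1, 1),
    ("generator_b", 1, 0),
    ("generator_b", 1, 0),
    ("generator_b", 1, 1),
    ("generator_b", 1, 1),
    ("real_photos", 0, 0),
    ("real_photos", 0, 0),
]

rates = error_by_source(toy_records)
for source, rate in sorted(rates.items()):
    print(f"{source}: {rate:.2f}")
```

A large spread between sources in such a breakdown would indicate that a detector has fit one synthesis method and does not generalize to others, even when its overall error rate looks low.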

In summary, while AI capabilities in image generation continue to advance, they present new challenges that require equally sophisticated detection methods. The paper sets a foundation for future exploration and response to these increasingly blurred lines between reality and synthesis in digital imagery.

Authors (7)

Zeyu Lu
Di Huang
Lei Bai
Jingjing Qu
Chengyue Wu
Xihui Liu
Wanli Ouyang

GitHub

GitHub - Inf-imagine/Sentry: [NeurIPS 2023] Sentry-Image: Detect Any AI-generated Images (87 stars)
