Introduction
AI-generated media has evolved rapidly, posing significant challenges to digital society by enabling the creation of convincing forgeries. While numerous automated detection techniques have been proposed, the human capacity to discern AI-generated content from authentic media remains under-explored. To address this gap, a comprehensive survey examined human detection abilities across different countries and three media types: audio, image, and text.
Survey and Methodology
A large-scale study surveyed 3,002 participants from the USA, Germany, and China to assess their ability to differentiate between human- and machine-generated media. Participants encountered an equal mix of real and synthetic media and rated each piece's authenticity. The study aimed to answer three research questions: Can individuals identify advanced AI-generated media? Do demographic factors affect detection accuracy? And what cognitive factors impact accuracy?
To explore which factors influence detection ability, the survey included personal variables such as generalized trust, cognitive reflection, and familiarity with deepfakes, drawing on the literature on deepfakes and fake news. A regression analysis was performed to determine the impact of these variables on participants' decisions.
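As a rough illustration of this kind of analysis, the sketch below fits a logistic regression to simulated data. The source does not specify the regression model or its coding, so the logistic form, the two predictors, the coefficient values, and all names (`fit_logistic`, `TRUE_W`, and so on) are assumptions made for illustration only. None of the numbers come from the survey.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic-regression weights (intercept first) by batch gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi  # gradient of the log-loss w.r.t. the linear predictor
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Simulated data, NOT the survey's: standardized trust and cognitive-reflection
# scores, and whether a synthetic item was correctly flagged (1) or not (0).
rng = random.Random(0)
TRUE_W = (-0.2, -0.8, 0.6)  # made-up intercept, trust, and reflection effects
X, y = [], []
for _ in range(1000):
    trust, reflection = rng.gauss(0, 1), rng.gauss(0, 1)
    p_correct = sigmoid(TRUE_W[0] + TRUE_W[1] * trust + TRUE_W[2] * reflection)
    X.append((trust, reflection))
    y.append(1 if rng.random() < p_correct else 0)

intercept, b_trust, b_reflection = fit_logistic(X, y)
print(f"trust: {b_trust:+.2f}, cognitive reflection: {b_reflection:+.2f}")
```

A negative fitted coefficient means that higher values of the predictor lower the odds of correctly flagging a synthetic item, and a positive one raises them. In practice a statistics package would also report standard errors and significance, which this sketch omits.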
Results and Findings
The findings revealed that participants, on average, struggled to correctly identify AI-generated media. Across all media types and countries, accuracy was consistently low, and participants more often than not mistook synthetic content for human-generated content. The average detection rate for images was below 50% and never exceeded 60% for the other media types. Additionally, participants tended to rate the media presented to them as human-made, even though real and synthetic samples were split 50/50.
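The quantities discussed here can be separated in a small worked example: accuracy on synthetic items, accuracy on real items, and the overall share of items rated as human-made. The sketch assumes each rating is reduced to a binary "ai" or "human" judgment, which the source does not confirm, and the eight ratings are invented toy data, not survey results.

```python
def summarize(ratings):
    """Return per-class accuracy, overall accuracy, and the share of 'human' responses.

    `ratings` is a list of (truth, response) pairs, each either "ai" or "human".
    """
    def accuracy(label):
        subset = [resp for truth, resp in ratings if truth == label]
        return sum(resp == label for resp in subset) / len(subset)

    return {
        "accuracy_ai": accuracy("ai"),
        "accuracy_human": accuracy("human"),
        "accuracy_overall": sum(t == r for t, r in ratings) / len(ratings),
        "share_rated_human": sum(r == "human" for _, r in ratings) / len(ratings),
    }


# Toy data (hypothetical): a 50/50 mix of synthetic and real items.
ratings = [
    ("ai", "human"), ("ai", "human"), ("ai", "ai"), ("ai", "human"),
    ("human", "human"), ("human", "human"), ("human", "ai"), ("human", "human"),
]

summary = summarize(ratings)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the overall accuracy sits at chance, yet three quarters of all items are rated as human-made. A bias toward answering "human" therefore lowers accuracy on synthetic items while raising it on real ones, which is why the per-class rates are worth reporting separately.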
Regression analysis revealed significant influences of generalized trust, cognitive reflection, and self-reported familiarity with deepfakes across all media categories. Other variables had media-dependent effects. This suggests that although generative AI now produces highly convincing media, certain personal traits still affect an individual's detection ability.
Demographic and Cognitive Influences
Regarding demographic influences, the analysis showed that German participants were notably proficient at distinguishing AI-generated audio content, potentially due to the lower quality of the German synthetic audio samples used. Other demographic variables, such as age and education, had marginal effects on the detection rate.
Cognitive factors played a more definitive role. Trust negatively correlated with the ability to spot machine-generated media while having a positive correlation with identifying human-generated content. System 2 reasoning, measured through the Cognitive Reflection Test, correlated positively with discerning fakes and negatively with recognizing genuine media. Familiarity with deepfakes was found to assist in detecting authentic audio while complicating the identification of fake images and text.
Conclusion
The paper provides a clear depiction of the current challenges faced by humans in distinguishing AI-generated media from genuine content. The findings underscore the need for continued research into automated detection tools and legislative efforts to manage the societal impacts of generated media. The nuanced understanding drawn from the influence of cognitive and demographic factors on detection rates highlights the complex interplay between human psychology and technology.