The paper "J-Guard: Journalism Guided Adversarially Robust Detection of AI-generated News" addresses the problem of detecting AI-generated news, which poses a significant threat because of its potential to spread misinformation at scale. Recognizing that existing AI text detectors are often vulnerable to adversarial attacks and prone to false positives, particularly when faced with nuanced journalistic writing, the work introduces a detection framework named J-Guard.
J-Guard is developed through an interdisciplinary approach, combining insights from computer science and journalism to improve both the accuracy and the robustness of AI-generated text detection for news articles. The framework incorporates journalistic stylistic cues, distinctive attributes observed in professional news writing, to guide the training of the underlying detector. These stylistic cues help differentiate authentic journalistic content from AI-generated text.
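To make the idea concrete, the sketch below shows one plausible way such style guidance could be fused with a neural detector. It is an assumption-laden illustration, not the authors' exact architecture: the specific cues (quote density, attribution verbs, sentence length), the `roberta-base` encoder, and the class names are all illustrative choices.

```python
# Minimal sketch (not the paper's exact method): fuse hand-crafted journalistic
# style cues with a transformer encoder for binary human-vs-AI classification.
import re
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

ATTRIBUTION_VERBS = {"said", "told", "reported", "according", "stated"}

def style_features(text: str) -> torch.Tensor:
    """Toy journalistic-style cues for one article (illustrative only)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = text.split()
    quote_density = text.count('"') / max(len(words), 1)
    attribution_rate = sum(w.lower().strip(",.") in ATTRIBUTION_VERBS
                           for w in words) / max(len(words), 1)
    avg_sentence_len = len(words) / max(len(sentences), 1)
    return torch.tensor([quote_density, attribution_rate, avg_sentence_len],
                        dtype=torch.float)

class StyleGuidedDetector(nn.Module):
    def __init__(self, encoder_name: str = "roberta-base", n_style: int = 3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Classifier sees the pooled text representation plus the style cues.
        self.classifier = nn.Linear(hidden + n_style, 2)

    def forward(self, input_ids, attention_mask, style_feats):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]  # first-token ([CLS]-style) embedding
        fused = torch.cat([pooled, style_feats], dim=-1)
        return self.classifier(fused)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
article = 'The mayor said the budget "will balance," officials reported.'
enc = tokenizer(article, return_tensors="pt", truncation=True)
model = StyleGuidedDetector()
logits = model(enc["input_ids"], enc["attention_mask"],
               style_features(article).unsqueeze(0))
```

The design choice being illustrated is simply that stylistic signals are supplied as explicit features alongside the learned text representation, rather than leaving the detector to infer them implicitly.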
The authors conducted extensive experiments on news articles generated by a variety of AI models, including ChatGPT (GPT-3.5). The findings show that J-Guard substantially improves detection capabilities. Notably, its detection performance drops by only about 7% on average when the input is adversarially perturbed, indicating that J-Guard remains resilient to simple adversarial attacks without sacrificing detection accuracy on clean text.
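The following sketch shows one way such a robustness figure can be measured: score the detector on clean articles, then on the same articles after a simple perturbation, and report the gap. The character-swap attack and the `detector_predict` callable are hypothetical stand-ins, not the evaluation protocol or attacks used in the paper.

```python
# Illustrative robustness check under assumed conditions: compare detector
# accuracy on clean vs. lightly perturbed news articles.
import random
from typing import Callable, List, Tuple

def perturb(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Randomly swap adjacent letters in a small fraction of positions."""
    rng = random.Random(seed)
    chars = list(text)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def accuracy(detector_predict: Callable[[str], int],
             data: List[Tuple[str, int]]) -> float:
    """data: list of (article_text, label) with label 1 = AI-generated."""
    return sum(detector_predict(x) == y for x, y in data) / len(data)

def robustness_drop(detector_predict: Callable[[str], int],
                    data: List[Tuple[str, int]]):
    clean = accuracy(detector_predict, data)
    attacked = accuracy(detector_predict, [(perturb(x), y) for x, y in data])
    # A drop of ~0.07 would correspond to the roughly 7% decrease reported.
    return clean, attacked, clean - attacked
```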
In summary, J-Guard presents a promising direction for the detection of AI-generated news by embedding journalistic principles into the detection process, ultimately safeguarding the credibility of news organizations and mitigating the spread of misinformation online.