Leveraging Explainable AI for LLM Text Attribution: Differentiating Human-Written and Multiple LLMs-Generated Text (2501.03212v1)

Published 6 Jan 2025 in cs.CL and cs.CY

Abstract: The development of generative AI LLMs has raised alarm about identifying whether content was produced by generative AI or by humans. One concern arises when students rely heavily on such tools in a way that can hinder the development of their writing or coding skills; plagiarism is another. This study aims to support efforts to detect and identify textual content generated using LLM tools. We hypothesize that LLM-generated text is detectable by machine learning (ML), and we investigate ML models that can recognize and differentiate texts generated by multiple LLM tools. We leverage several ML and Deep Learning (DL) algorithms, such as Random Forest (RF) and Recurrent Neural Networks (RNN), and use Explainable Artificial Intelligence (XAI) to understand the important features in attribution. Our method is divided into 1) binary classification, to differentiate between human-written and AI-generated text, and 2) multi-class classification, to differentiate between human-written text and text generated by five different LLM tools (ChatGPT, LLaMA, Google Bard, Claude, and Perplexity). Results show high accuracy in both the multi-class and binary classification. Our model outperformed GPTZero, with 98.5% accuracy versus 78.3%. Notably, GPTZero was unable to recognize about 4.2% of the observations, whereas our model was able to recognize the complete test dataset. XAI results showed that understanding feature importance across different classes enables detailed author/source profiles, which aid attribution and support plagiarism detection by highlighting unique stylistic and structural elements, helping to verify content originality.

Authors (4)

Ayat Najjar
Huthaifa I. Ashqar
Omar Darwish
Eman Hammad

Summary

Analyzing Text Attribution Through Explainable AI in Differentiating Human and LLM-Generated Texts

The paper "Leveraging Explainable AI for LLM Text Attribution: Differentiating Human-Written and Multiple LLMs-Generated Text" examines text attribution in the era of LLMs. With the spread of LLM tools such as ChatGPT, LLaMA, Google Bard, Claude, and Perplexity, distinguishing human-written from AI-generated content has become important across many domains, notably in education, where the integrity of student submissions must be preserved.

The paper's objective is twofold: first, to use machine learning (ML) and deep learning (DL) algorithms to classify whether a piece of text is human-written or LLM-generated (binary classification); second, to attribute each text to its source, either a human or one of the five LLM tools, in a multi-class setting. The authors use models such as Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Recurrent Neural Networks (RNN) for these tasks. They also apply Explainable AI (XAI) to identify the features that drive classification, which makes the models' decisions more transparent.
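The relationship between the two tasks, and the idea of asking a trained model which features it depends on, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' pipeline: the feature names and values are invented, `llm_a` and `llm_b` are placeholder sources, a nearest-centroid classifier stands in for RF, XGBoost, and RNN, and permutation importance stands in for whichever XAI method the paper uses. The model is scored on the same rows it was fit on, so the accuracy itself carries no meaning.

```python
import random
from collections import defaultdict

# Made-up feature vectors for illustration only; names and values are not from the paper.
FEATURES = ["mean_sentence_length", "type_token_ratio", "constant_column"]
X = [
    [12.0, 0.80, 1.0], [14.0, 0.78, 1.0],   # "human"
    [22.0, 0.60, 1.0], [24.0, 0.62, 1.0],   # "llm_a" (placeholder source)
    [30.0, 0.55, 1.0], [32.0, 0.57, 1.0],   # "llm_b" (placeholder source)
]
y_multi = ["human", "human", "llm_a", "llm_a", "llm_b", "llm_b"]
# The binary task reuses the same rows with all LLM sources collapsed into one class.
y_binary = ["human" if label == "human" else "ai" for label in y_multi]


def fit_centroids(X, y):
    """Nearest-centroid stand-in for the paper's models: one mean vector per class."""
    grouped = defaultdict(list)
    for row, label in zip(X, y):
        grouped[label].append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(centroids[label], row)),
    )


def accuracy(centroids, X, y):
    """Fraction of rows whose predicted label matches the target."""
    return sum(predict(centroids, row) == target for row, target in zip(X, y)) / len(y)


def permutation_importance(centroids, X, y, feature, seed=0):
    """Accuracy lost when one feature column is shuffled across rows."""
    column = [row[feature] for row in X]
    random.Random(seed).shuffle(column)
    X_shuffled = [row[:feature] + [value] + row[feature + 1:] for row, value in zip(X, column)]
    return accuracy(centroids, X, y) - accuracy(centroids, X_shuffled, y)


for task, labels in (("binary", y_binary), ("multi-class", y_multi)):
    model = fit_centroids(X, labels)
    scores = {name: permutation_importance(model, X, labels, i) for i, name in enumerate(FEATURES)}
    print(task, "accuracy:", accuracy(model, X, labels), "importance:", scores)
```

A feature that never varies, such as `constant_column` here, always scores zero because shuffling it changes nothing. That is the sense in which an importance score indicates which stylistic or structural signals a classifier actually relies on.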

The paper reports strong quantitative results. According to the abstract, the authors' model reached 98.5% accuracy, compared with 78.3% for the existing GPTZero tool; the abstract does not say whether this comparison uses the binary or the multi-class task. The model was also able to recognize the complete test dataset, whereas GPTZero was unable to recognize about 4.2% of the observations. Given the 98.5% accuracy, this means the model returned a decision for every observation, not that every decision was correct. The XAI analysis shows how feature importance varies across classes; these per-class importance profiles support detailed author or source profiles, which aid attribution and plagiarism detection by highlighting distinctive stylistic and structural elements.
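How the 4.2% of observations that GPTZero could not recognize interact with its accuracy depends on a convention the source does not state. The sketch below is illustrative arithmetic only: the test-set size of 1000, the resulting counts, and the reading that 78.3% is measured over all observations are assumptions, and `accuracy_and_coverage` is a hypothetical helper.

```python
def accuracy_and_coverage(n_total, n_correct, n_unrecognized):
    """Return (accuracy over all rows, accuracy over answered rows, coverage)."""
    n_answered = n_total - n_unrecognized
    if n_total <= 0 or not 0 <= n_unrecognized < n_total or not 0 <= n_correct <= n_answered:
        raise ValueError("counts are inconsistent")
    return n_correct / n_total, n_correct / n_answered, n_answered / n_total


# Hypothetical counts: 1000 test observations, 42 unrecognized (4.2%), 783 correct.
# Assumption: unrecognized observations count as errors in the overall accuracy.
overall, answered_only, coverage = accuracy_and_coverage(1000, 783, 42)
print(f"overall={overall:.3f} answered_only={answered_only:.3f} coverage={coverage:.3f}")
```

Under these assumed counts, a detector that abstains looks somewhat better when scored only on the observations it answered, which is why coverage is worth reporting alongside accuracy.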

The research has both practical and theoretical implications. Practically, the model could help educational institutions guard against plagiarism and ensure that over-reliance on AI text generation does not undermine the development of students' writing skills. Theoretically, the work contributes to discussions of transparency and accountability around machine-generated content. The methodology also provides a foundation for future research, particularly on models that must operate in settings with multiple generative sources.

In future developments, expanding the scope to include more languages and domains could be beneficial, given the globalization of educational and professional environments where AI text generation tools are employed. Additionally, refining the models to handle more complex content and subtle linguistic nuances will be crucial as LLMs continue to grow in sophistication. This paper lays essential groundwork for continued exploration and development in AI-driven textual attribution and plagiarism detection.
