Comprehensive Evaluation of ChatGPT Reliability Through Multilingual Inquiries (2312.10524v1)
Abstract: ChatGPT is currently the most popular LLM, with over 100 million users, and it has a significant impact on people's lives. However, because it is vulnerable to jailbreaks, ChatGPT may also cause harm, potentially even facilitating criminal activities. Testing whether ChatGPT can be jailbroken is crucial because it can improve ChatGPT's security, reliability, and social responsibility. Inspired by previous research revealing that LLM performance varies across languages in translation, we suspected that wrapping prompts in multiple languages might jailbreak ChatGPT. To investigate this, we designed a study that uses a fuzz-testing approach to analyze ChatGPT's cross-linguistic proficiency. Our study comprises three strategies that automatically pose malicious questions to ChatGPT in different formats: (1) malicious questions written in a single language, (2) multilingual malicious questions, and (3) questions instructing ChatGPT to respond in a language different from that of the prompt. In addition, we combine these strategies with prompt injection templates that wrap the three aforementioned types of questions. We examined a total of 7,892 Q&A data points and found that multilingual wrapping can indeed jailbreak ChatGPT, with different wrapping methods having varying effects on jailbreak probability. Prompt injection can amplify the probability of a jailbreak caused by multilingual wrapping. This work provides insights for OpenAI developers to enhance ChatGPT's support for language diversity and inclusion.
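The abstract compares jailbreak probability across the three strategies, with and without prompt injection. As a minimal sketch of how such a comparison could be tabulated from labeled Q&A data points, the snippet below groups records by strategy and injection flag and computes the fraction judged as jailbroken. The record layout, the strategy labels, and the sample rows are hypothetical and are not taken from the paper; its actual data format and labeling procedure are not described in this abstract.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One labeled Q&A data point (hypothetical layout, not the paper's)."""
    strategy: str     # assumed labels: single_language, multilingual, response_language
    injected: bool    # True if the question was wrapped in a prompt injection template
    jailbroken: bool  # True if the response was judged to be a jailbreak


def jailbreak_rates(records):
    """Return {(strategy, injected): fraction of records judged jailbroken}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [jailbroken count, total count]
    for r in records:
        key = (r.strategy, r.injected)
        counts[key][0] += int(r.jailbroken)
        counts[key][1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


# Made-up placeholder rows, only to exercise the function.
sample = [
    Record("single_language", False, False),
    Record("single_language", False, True),
    Record("multilingual", False, True),
    Record("multilingual", True, True),
    Record("response_language", True, False),
]

rates = jailbreak_rates(sample)
for (strategy, injected), rate in sorted(rates.items()):
    print(f"{strategy:18s} injected={injected!s:5s} rate={rate:.2f}")
```

Keying on the pair (strategy, injected) keeps the effect of the wrapping method separate from the effect of the injection template, which is the distinction the abstract draws.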
Authors:
- Poorna Chander Reddy Puttaparthi
- Soham Sanjay Deo
- Hakan Gul
- Yiming Tang
- Weiyi Shang
- Zhe Yu