Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
139 tokens/sec
GPT-4o
47 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Impact of Non-Standard Unicode Characters on Security and Comprehension in Large Language Models (2405.14490v1)

Published 23 May 2024 in cs.CL

Abstract: The advancement of LLMs has significantly improved natural language processing. However, challenges such as jailbreaks (prompt injections that cause an LLM to follow instructions contrary to its intended use), hallucinations (generating incorrect or misleading information), and comprehension errors remain prevalent. In this report, we present a comparative analysis of the performance of fifteen distinct models, with each model undergoing a standardized test comprising 38 queries across three key metrics: jailbreaks, hallucinations, and comprehension errors. The models are assessed based on the total occurrences of jailbreaks, hallucinations, and comprehension errors. Our work exposes these models' inherent vulnerabilities and challenges the notion of human-level language comprehension of these models. We have empirically analysed the impact of non-standard Unicode characters on LLMs and their safeguarding mechanisms on the best-performing LLMs, including GPT-4, Gemini 1.5 Pro, LlaMA-3-70B, and Claude 3 Opus. By incorporating alphanumeric symbols from Unicode outside the standard Latin block and variants of characters in other languages, we observed a reduction in the efficacy of guardrails implemented through Reinforcement Learning Human Feedback (RLHF). Consequently, these models exhibit heightened vulnerability to content policy breaches and prompt leakage. Our study also suggests a need to incorporate non-standard Unicode text in LLM training data to enhance the capabilities of these models.

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com
Reddit Logo Streamline Icon: https://streamlinehq.com