ExtendAttack: Attacking Servers of LRMs via Extending Reasoning (2506.13737v1)

Published 16 Jun 2025 in cs.CR

Abstract: Large Reasoning Models (LRMs) have demonstrated promising performance in complex tasks. However, the resource-consuming reasoning processes may be exploited by attackers to maliciously occupy the resources of the servers, leading to a crash, like the DDoS attack in cyber. To this end, we propose a novel attack method on LRMs termed ExtendAttack to maliciously occupy the resources of servers by stealthily extending the reasoning processes of LRMs. Concretely, we systematically obfuscate characters within a benign prompt, transforming them into a complex, poly-base ASCII representation. This compels the model to perform a series of computationally intensive decoding sub-tasks that are deeply embedded within the semantic structure of the query itself. Extensive experiments demonstrate the effectiveness of our proposed ExtendAttack. Remarkably, it increases the length of the model's response by over 2.5 times for the o3 model on the HumanEval benchmark. Besides, it preserves the original meaning of the query and achieves comparable answer accuracy, showing the stealthiness.

Summary

  • The paper introduces ExtendAttack, a method that obfuscates benign prompts into complex multi-base encodings to force additional computation in large reasoning models.
  • The paper demonstrates that ExtendAttack increases response length by over 2.5 times while maintaining high accuracy on benchmarks such as HumanEval and AIME.
  • The paper highlights the need for resilient AI security measures, recommending adaptive preprocessing and perplexity-based filtering to defend against resource depletion vulnerabilities.

An Analysis of ExtendAttack: Exploiting Large Reasoning Models by Resource Depletion

The paper "ExtendAttack: Attacking Servers of LRMs via Extending Reasoning" presents a novel approach to adversarial attacks, specifically targeting the server resources of large reasoning models (LRMs) through extended reasoning processes. This technique introduces a new paradigm to resource depletion attacks, diverging from traditional adversarial methods that manipulate output content by directly embedding computationally intensive tasks within the prompt's semantic structure.

The fundamental construct of ExtendAttack is the obfuscation of benign prompts into a complex poly-base ASCII format. This forces LRMs to execute numerous decoding operations, amplifying computational overhead without compromising the accuracy of the intended query. Experiments across four datasets and several LRMs, including OpenAI's o3 and open-source models such as QwQ-32B, demonstrate ExtendAttack's efficacy in increasing response length by over 2.5 times while preserving answer accuracy. Because the attack is stealthy, it can impose significant economic costs by silently consuming computational resources, much as a distributed denial-of-service (DDoS) attack does.

Methodology and Results

The paper systematically contrasts ExtendAttack with prior resource-depletion attempts such as the OverThinking approach, which relies on rigid, detectable decoy tasks. ExtendAttack instead capitalizes on the tendency of LRMs to diligently follow computational paths laid out within the prompt. By converting individual prompt characters into multi-base encodings, it inflates inference latency and serving costs with little effort from the attacker, while its probabilistic character selection introduces variability that evades conventional detection mechanisms. A minimal sketch of the idea follows.
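To make the mechanism concrete, here is a minimal Python sketch of poly-base obfuscation. The token format `<(base)digits>`, the base pool, and the 30% selection ratio are illustrative assumptions, not the paper's exact specification.

```python
# A minimal sketch of poly-base obfuscation, assuming a token format
# "<(base)digits>", a fixed base pool, and a 30% selection ratio; the
# paper's exact notation and parameters may differ.
import random
import string

DIGITS = string.digits + string.ascii_uppercase  # digit symbols for bases up to 36

def to_base(value: int, base: int) -> str:
    """Render a non-negative integer in the given base (2-36)."""
    if value == 0:
        return "0"
    out = []
    while value:
        value, rem = divmod(value, base)
        out.append(DIGITS[rem])
    return "".join(reversed(out))

def obfuscate(prompt: str, ratio: float = 0.3, seed=None) -> str:
    """Replace a random subset of characters with poly-base ASCII tokens.

    Each selected character becomes '<(b)v>', where v is its ASCII code
    written in a randomly chosen base b. Decoding every token back into
    a character becomes an extra reasoning sub-task for the LRM.
    """
    rng = random.Random(seed)
    pieces = []
    for ch in prompt:
        if ch.isalnum() and rng.random() < ratio:
            base = rng.choice([2, 3, 5, 7, 8, 9, 11, 16])  # assumed base pool
            pieces.append(f"<({base}){to_base(ord(ch), base)}>")
        else:
            pieces.append(ch)
    return "".join(pieces)

print(obfuscate("Write a function that reverses a string.", seed=0))
```

The random base choice per character is what produces the variability noted above: the same prompt obfuscates differently on every run, frustrating signature-based filters.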

Experimental analyses underscore ExtendAttack's balanced efficacy: it maintains high accuracy on benchmarks such as HumanEval (code generation) and AIME (mathematics), substantiating its stealth. The attack amplifies output length most dramatically for smaller open-source models such as QwQ-32B and Qwen3-32B, and in some configurations, restricting which characters are obfuscated even guides models toward greater answer precision.

Implications and Future Directions

The emergence of ExtendAttack calls for a thorough examination of AI model vulnerabilities, particularly the integrity of the reasoning process itself. The implications of such attacks are manifold, potentially degrading the quality of AI services and causing substantial financial repercussions from inflated resource demands.

Future work in AI security must address these vulnerabilities with robust defenses that can discern obfuscatory patterns without incurring substantial computational costs of their own. Recommendations include adaptive preprocessing layers backed by secondary LLMs and perplexity-based filtering to detect anomalous prompts (sketched below), though each defensive strategy must weigh its operational overhead against the risk it mitigates.
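As a rough illustration of the perplexity-based option, the sketch below scores prompts with an off-the-shelf GPT-2 model and flags outliers. The threshold and the sample attacked prompt are assumptions for demonstration, not values from the paper.

```python
# A minimal sketch of perplexity-based prompt filtering, using GPT-2 as
# an off-the-shelf scoring model; the threshold and the sample attacked
# prompt are illustrative assumptions, not values from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Score text with the language model's perplexity."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token cross-entropy
    return torch.exp(loss).item()

PPL_THRESHOLD = 200.0  # assumed cutoff; calibrate on known-benign traffic

def looks_obfuscated(prompt: str) -> bool:
    """Flag prompts whose perplexity far exceeds natural language."""
    return perplexity(prompt) > PPL_THRESHOLD

benign = "Write a function that reverses a string."
attacked = "Wr<(7)210>te a f<(16)75>nction that re<(2)1110110>erses a string."
print(looks_obfuscated(benign), looks_obfuscated(attacked))
```

The trade-off noted above applies here as well: every incoming prompt now pays an extra forward pass through the filter model, so the defense's cost must be weighed against the attack's.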

Additionally, as AI capabilities advance, ExtendAttack-style methods may themselves evolve to become stealthier and more effective. Further research could explore obfuscation techniques that bypass even sophisticated reasoning models, reshaping how adversarial threats and defenses are understood.

In conclusion, the paper marks a significant advance in adversarial techniques, challenging researchers and developers to build security measures that can recognize and mitigate resource-exhausting tasks embedded within the queries that reasoning models are asked to process.
