Do Large Reasoning Models Naturally Resist Multi-Turn Adversarial Pressure?
Determine whether large reasoning models, which employ extended chain-of-thought reasoning at inference time, are as vulnerable to multi-turn adversarial follow-up attacks as instruction-tuned large language models, or whether extended chain-of-thought acts as a natural defense that yields greater robustness under adversarial pressure.
References
Whether large reasoning models exhibit similar vulnerabilities, or whether their extended reasoning provides a natural defense, remains an open question.
— Consistency of Large Reasoning Models Under Multi-Turn Attacks
(2602.13093 - Li et al., 13 Feb 2026) in Section 1, Introduction