Robust Steganography from Large Language Models

Published 11 Apr 2025 in cs.CR | (2504.08977v1)

Abstract: Recent steganographic schemes, starting with Meteor (CCS'21), leverage LLMs to address the historically challenging task of disguising covert communication as "innocent-looking" natural-language communication. However, existing methods are vulnerable to "re-randomization attacks," where slight changes to the communicated text, which might go unnoticed, completely destroy any hidden message. This is also a vulnerability in more traditional encryption-based stegosystems, where adversaries can modify the randomness of an encryption scheme to destroy the hidden message while preserving a covertext that remains acceptable to ordinary users. In this work, we study the problem of robust steganography. We introduce formal definitions of weak and strong robust LLM-based steganography, corresponding to two threat models in which natural language serves as a covertext channel resistant to realistic re-randomization attacks. We then propose two constructions satisfying these notions. We design and implement these steganographic schemes, which embed arbitrary secret messages into natural-language text generated by LLMs and ensure recoverability even under adversarial paraphrasing and rewording attacks. To support further research and real-world deployment, we release our implementation and datasets for public use.