Frustratingly Easy Edit-based Linguistic Steganography with a Masked Language Model (2104.09833v1)
Published 20 Apr 2021 in cs.CL
Abstract: With advances in neural language models, the focus of linguistic steganography has shifted from edit-based approaches to generation-based ones. While the latter's payload capacity is impressive, generating genuine-looking texts remains challenging. In this paper, we revisit edit-based linguistic steganography, with the idea that a masked language model offers an off-the-shelf solution. The proposed method eliminates painstaking rule construction and has a high payload capacity for an edit-based model. It is also shown to be more secure against automatic detection than a generation-based method while offering better control of the security/payload capacity trade-off.
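
The abstract does not spell out the encoding procedure, so the following is only a toy sketch of how an edit-based scheme of this kind might work, not the paper's actual algorithm. It assumes that sender and receiver agree on which token positions to mask and that both can obtain the same ranked candidate list for each masked position. Here those lists are hand-written stand-ins for masked language model predictions. The names `usable_candidates`, `encode`, `decode`, and the `threshold` parameter are hypothetical and chosen for illustration.

```python
# Toy sketch: hide bits by choosing among ranked substitution candidates.
# ASSUMPTION: the candidate lists below are hand-written placeholders for
# the predictions a masked language model would produce at each position.

def usable_candidates(ranked, threshold):
    """Keep tokens with probability >= threshold, cut to a power-of-two count."""
    kept = [tok for tok, prob in ranked if prob >= threshold]
    if len(kept) < 2:
        return []  # fewer than two choices: this position carries no bits
    size = 1 << (len(kept).bit_length() - 1)  # largest power of two <= len(kept)
    return kept[:size]

def encode(tokens, positions, ranked_by_pos, bits, threshold):
    """Replace tokens at the agreed positions so that their rank encodes bits."""
    out = list(tokens)
    i = 0
    for pos in positions:
        cands = usable_candidates(ranked_by_pos[pos], threshold)
        if not cands:
            continue
        n = len(cands).bit_length() - 1          # bits carried by this position
        chunk = bits[i:i + n].ljust(n, "0")      # pad with zeros if bits run out
        out[pos] = cands[int(chunk, 2)]
        i += n
    return out

def decode(stego_tokens, positions, ranked_by_pos, threshold):
    """Recover bits from the rank of each token at the agreed positions."""
    bits = ""
    for pos in positions:
        cands = usable_candidates(ranked_by_pos[pos], threshold)
        if not cands:
            continue
        n = len(cands).bit_length() - 1
        bits += format(cands.index(stego_tokens[pos]), "0{}b".format(n))
    return bits

# Hypothetical cover text and candidate lists (token, probability).
cover = ["the", "cat", "sat", "on", "the", "mat"]
positions = [1, 5]
ranked_by_pos = {
    1: [("cat", 0.5), ("dog", 0.3), ("fox", 0.1), ("cow", 0.05)],
    5: [("mat", 0.6), ("rug", 0.3), ("bed", 0.05)],
}

stego = encode(cover, positions, ranked_by_pos, "101", threshold=0.05)
recovered = decode(stego, positions, ranked_by_pos, threshold=0.05)
print(" ".join(stego), recovered)
```

In this sketch, raising `threshold` keeps only the more probable substitutions, so each position carries fewer bits but the edited text stays closer to what the model considers likely. That is one simple way to picture the security/payload capacity trade-off the abstract mentions.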