LLM-Text Watermarking based on Lagrange Interpolation

Published 9 May 2025 in cs.CR, cs.IT, and math.IT | (2505.05712v3)

Abstract: The rapid advancement of LLMs has established them as a foundational technology for many AI and ML-powered human computer interactions. A critical challenge in this context is the attribution of LLM-generated text -- either to the specific LLM that produced it or to the individual user who embedded their identity via a so-called multi-bit watermark. This capability is essential for combating misinformation, fake news, misinterpretation, and plagiarism. One of the key techniques for addressing this challenge is digital watermarking. This work presents a watermarking scheme for LLM-generated text based on Lagrange interpolation, enabling the recovery of a multi-bit author identity even when the text has been heavily redacted by an adversary. The core idea is to embed a continuous sequence of points $(x, f(x))$ that lie on a single straight line. The $x$-coordinates are computed pseudorandomly using a cryptographic hash function $H$ applied to the concatenation of the previous token's identity and a secret key $s_k$. Crucially, the $x$-coordinates do not need to be embedded into the text -- only the corresponding $f(x)$ values are embedded. During extraction, the algorithm recovers the original points along with many spurious ones, forming an instance of the Maximum Collinear Points (MCP) problem, which can be solved efficiently. Experimental results demonstrate that the proposed method is highly effective, allowing the recovery of the author identity even when as few as three genuine points remain after adversarial manipulation.


Authors (3)

Summary

LLM-Text Watermarking Based on Lagrange Interpolation

This paper addresses the attribution of text generated by Large Language Models (LLMs): tracing it either to the specific model that produced it or to the individual user whose identity was embedded as a multi-bit watermark. LLMs have advanced human-computer interaction, but they also raise risks such as misinformation and plagiarism. To counter these risks, the authors propose a watermarking scheme based on Lagrange interpolation that embeds a multi-bit author identity in LLM-generated text. The identity can be recovered even after an adversary has heavily redacted the text.

Core Methodology

The central idea is to embed a sequence of points $(x, f(x))$ that all lie on a single straight line $f$, which Lagrange interpolation can reconstruct from the recovered points. The $x$-coordinates are not embedded in the text; only the corresponding $f(x)$ values are. The $x$-coordinates are generated pseudorandomly, either (per the abstract) by a cryptographic hash function $H$ applied to the concatenation of the previous token's identity and a secret key $s_k$, or by a Linear Feedback Shift Register (LFSR) or a more secure Nonlinear Feedback Shift Register (NFSR), depending on security requirements. The watermark can be extracted even after substantial edits to the text: the authors report identity recovery when as few as three genuine points survive adversarial modification.
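To make the point construction concrete, here is a minimal Python sketch that follows the abstract's description: an $x$-coordinate is derived from a hash of the previous token's identity and the secret key, $f(x)$ is read off a straight line, and two-point Lagrange interpolation recovers the line. SHA-256 as $H$, the prime modulus `P`, the 4-byte token encoding, and all function names are assumptions made for illustration, not details taken from the paper. How the $f(x)$ values are actually encoded into token choices is not covered here.

```python
import hashlib

# Hypothetical prime modulus: the field the paper works in is not stated here.
P = 2**31 - 1


def x_coordinate(prev_token_id: int, secret_key: bytes) -> int:
    """Pseudorandom x = H(previous token id || secret key), reduced mod P.

    SHA-256 stands in for the hash function H (an assumption).
    """
    data = prev_token_id.to_bytes(4, "big") + secret_key
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % P


def line_value(a: int, b: int, x: int) -> int:
    """Evaluate the line f(x) = a*x + b over the integers mod P."""
    return (a * x + b) % P


def recover_line(p1, p2):
    """Two-point Lagrange interpolation: return (a, b) with f(x) = a*x + b."""
    (x1, y1), (x2, y2) = p1, p2
    dx = (x1 - x2) % P
    if dx == 0:
        raise ValueError("x-coordinates must be distinct")
    a = (y1 - y2) * pow(dx, -1, P) % P  # modular inverse needs Python 3.8+
    b = (y1 - a * x1) % P
    return a, b
```

In this sketch, an embedder would call `x_coordinate` for each position and embed only `line_value(a, b, x)`; an extractor holding the same key recomputes each `x` from the text and passes any two surviving points to `recover_line`.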

Security and Efficiency

The authors emphasize the scheme's efficiency and resistance to manipulation, and analyze it mathematically to demonstrate its resilience. Extraction recovers the genuine points together with many spurious ones, so finding the watermark line becomes an instance of the Maximum Collinear Points (MCP) problem: identify the line that passes through the most points. Efficient algorithms exist to solve MCP, which keeps watermark extraction practical. Experimental results indicate that the embedded identity can be reconstructed even when only a few genuine points survive.
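The MCP step can be illustrated with a simple quadratic-time approach: for each anchor point, bucket the remaining points by their slope to the anchor and keep the largest bucket. This is a sketch of one standard way to solve MCP, not necessarily the algorithm the authors use; the prime modulus `P` and the function name `max_collinear` are assumptions. Because any two points are trivially collinear, a caller would normally accept the result only if it contains at least three points, which matches the three-point threshold quoted from the paper.

```python
from collections import defaultdict

# Hypothetical prime modulus (assumption; see the lead-in).
P = 2**31 - 1


def max_collinear(points, p=P):
    """Return (a, b, members) for the line y = a*x + b (mod p) through the most points.

    For every anchor, the other points are grouped by their slope to the anchor;
    the largest group plus the anchor is the best line through that anchor.
    Points sharing the anchor's x-coordinate are skipped, since a function
    cannot take two values at the same x. Runs in O(n^2) time.
    """
    best = (None, None, [])
    pts = list(dict.fromkeys(points))  # drop exact duplicates, keep order
    for i, (x0, y0) in enumerate(pts):
        buckets = defaultdict(list)
        for j, (x1, y1) in enumerate(pts):
            dx = (x1 - x0) % p
            if i == j or dx == 0:
                continue
            slope = (y1 - y0) * pow(dx, -1, p) % p
            buckets[slope].append((x1, y1))
        for slope, members in buckets.items():
            if len(members) + 1 > len(best[2]):
                intercept = (y0 - slope * x0) % p
                best = (slope, intercept, [(x0, y0)] + members)
    return best
```

Given three points on $y = 3x + 5$ mixed with unrelated points, `max_collinear` returns slope 3, intercept 5, and the three genuine points, provided no spurious line happens to collect as many points.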

Extensions and Applications

Several extensions to the basic scheme are suggested, such as supporting multiple secrets by encoding them on different lines, or replacing the line with a higher-degree polynomial. These extensions require more complex extraction algorithms, but they open further options for securely encoding information in LLM-generated content. Further work could refine these methods and explore how they scale to larger texts and more complex watermarking paradigms.
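For the higher-degree extension, the same interpolation idea generalizes: a polynomial of degree $d$ is determined by any $d + 1$ of its points with distinct $x$-coordinates. The sketch below evaluates the interpolating polynomial at a chosen point using the general Lagrange formula over the integers mod a prime. The modulus `P` and the function name `lagrange_eval` are assumptions for illustration; the paper's own construction for this extension is not described in this summary.

```python
# Hypothetical prime modulus (assumption; see the lead-in).
P = 2**31 - 1


def lagrange_eval(points, x, p=P):
    """Evaluate at x the unique polynomial of degree < len(points) through points.

    All arithmetic is mod p. The x-coordinates must be distinct mod p;
    otherwise the modular inverse below raises ValueError.
    """
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        # Lagrange basis term: yi * prod(x - xj) / prod(xi - xj)
        total = (total + yi * num * pow(den, -1, p)) % p
    return total
```

With two points this reduces to the straight-line case; with three points it recovers a quadratic, and so on.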

Implications and Future Directions

The watermarking scheme has both theoretical and practical implications. Theoretically, it advances our understanding of embedding and extracting secure information in AI-generated content. Practically, it offers promising applications across domains like education, journalism, and code generation, where authorship identification is crucial. Future research may focus on optimizing multi-secret recovery, enhancing resistance to text manipulation attacks, and extending the scheme to watermarking paradigms beyond simple line encoding.

Overall, the paper contributes to the ongoing work on safeguarding the integrity and accountability of AI-generated content: it offers a watermarking method that tolerates heavy redaction and leaves room for refinement and application across various fields.