- The paper introduces PRIV-QA, a multi-stage framework that safeguards user privacy during cloud-based QA by sanitizing sensitive information.
- It utilizes fine-tuned models to detect, substitute, and preserve key words while ensuring the original query context remains intact.
- Experiments on the SensitiveQA dataset show high recall (89.40%, English) for sensitive-information detection and an 85.83% defense rate against extraction attacks, with moderate latency overhead.
This paper introduces PRIV-QA, a framework designed to protect user privacy when interacting with cloud-based LLMs for question-answering tasks (2502.13564). The core problem addressed is the risk of exposing sensitive personal information when user queries, often containing background context and specific questions, are sent to third-party LLM providers.
To facilitate research and evaluation in this area, the authors first construct SensitiveQA, a new bilingual (Chinese and English) dataset. It contains over 57,000 interactions, each comprising background text rich in personal sensitive information (names, dates, locations, personal details, sensitive numbers) and a related question (covering tasks like information extraction, open-ended QA, summarization).
The proposed PRIV-QA framework operates as a pipeline with two main modules:
- Hide Module (H): This module processes the user query (X) before sending it to the cloud LLM. It employs a multi-stage text sanitization strategy based on classifying words/tokens into three levels: High-Risk, Low-Risk, and Key-Words.
  - Sensitive Information Detection: A fine-tuned generative model (SenM, based on Qwen2-0.5B-Chat) identifies "High-Risk" words containing sensitive information according to GDPR guidelines. To handle long texts, the input query is split into chunks, processed individually, and the results are aggregated.
  - Sensitive Words Substitution: Another model (SubM, also Qwen2-0.5B-Chat) replaces each detected sensitive word (si) with a semantically similar but distinct placeholder word (pi). This creates a privacy-protected version of the query (Xs). The substitution pairs (si:pi) are stored.
  - Important Words Preservation: A third model (ImpM, Qwen2-0.5B-Chat) identifies "Key-Words" crucial for understanding the query's context and intent, ensuring they are not obfuscated.
  - (Optional) Non-Privacy Text Obfuscation: For enhanced protection, remaining "Low-Risk" tokens (excluding Key-Words and placeholders pi) can be further obfuscated using a token substitution method based on differential privacy principles (similar to InferDPT (2310.12214)), generating the final query X′ sent to the cloud.
- Recover Module (R): After the cloud LLM processes the sanitized query X′ and returns a response A′, this module restores the original meaning.
  - A generative model (RcvM, based on Qwen2-1.5B-Chat) takes the original query X, the sanitized query X′, and the LLM's response A′ as input.
  - It restores the original sensitive words (si) by reversing the substitution (pi→si) and corrects potential reasoning errors or inaccuracies introduced in A′ due to the sanitization process.
  - The output is the final, corrected response A presented to the user.
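The substitution bookkeeping shared by the Hide and Recover modules can be illustrated with a minimal sketch. The `hide`/`recover` functions and the rule-based `detect`/`substitute` stand-ins below are hypothetical simplifications, not the paper's fine-tuned SenM/SubM models:

```python
def hide(query, detect_sensitive, propose_placeholder):
    """Replace each detected sensitive word s_i with a placeholder p_i,
    returning the sanitized query and the stored substitution pairs."""
    pairs = {}
    sanitized = query
    for word in detect_sensitive(query):          # stand-in for SenM
        placeholder = propose_placeholder(word)   # stand-in for SubM
        pairs[word] = placeholder
        sanitized = sanitized.replace(word, placeholder)
    return sanitized, pairs

def recover(response, pairs):
    """Reverse the substitution (p_i -> s_i) in the cloud response."""
    for word, placeholder in pairs.items():
        response = response.replace(placeholder, word)
    return response

# Toy rule-based stand-ins for the fine-tuned models
detect = lambda text: [w for w in ("Alice", "Paris") if w in text]
substitute = {"Alice": "Carol", "Paris": "Lyon"}.get

x_s, pairs = hide("Alice lives in Paris.", detect, substitute)  # "Carol lives in Lyon."
a = recover("Carol's city is Lyon.", pairs)                     # "Alice's city is Paris."
```

Note that the paper's RcvM additionally corrects reasoning errors, which a pure string reversal like this cannot do.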
The workflow is depicted in Algorithm 1 and Figure 3 of the paper.
Algorithm PRIV-QA Workflow (Simplified):

```text
Input: user query X (background T + question Q)
# --- Hide Module ---
Split X into chunks x_i
Detect sensitive words S = Union(Sen_M(x_i)) over all chunks
Generate substitution pairs P = Sub_M(S) = {(s_i : p_i)}
Substitute sensitive words in X using P -> X_s
Identify important words I = Imp_M(X_s)
(Optional) Obfuscate non-private tokens in T_s (excluding I, P) -> T_{s,o}
Construct final sanitized query X' = T_{s,o} + Q_s (or T_s + Q_s if no obfuscation)
# --- Cloud Interaction ---
Send X' to the cloud LLM -> response A' = LLM(X')
# --- Recover Module ---
Recover original info and correct errors -> A = Rcv_M(X', X, A')
Output: final response A
```
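The steps above can be sketched in Python. Every callable here (`sen_m`, `sub_m`, `imp_m`, `rcv_m`, `cloud_llm`, `obfuscate`) is a hypothetical stand-in for the paper's fine-tuned models and APIs, and the recovery step is approximated by passing the stored pairs rather than running a generative model:

```python
def priv_qa(query, sen_m, sub_m, imp_m, rcv_m, cloud_llm,
            obfuscate=None, chunk_size=512):
    """Hedged sketch of the PRIV-QA pipeline; all model calls are stand-ins."""
    # --- Hide module ---
    # Chunk the query so a small detection model can handle long inputs.
    chunks = [query[i:i + chunk_size] for i in range(0, len(query), chunk_size)]
    sensitive = set()
    for chunk in chunks:                          # SenM: detect High-Risk words
        sensitive |= set(sen_m(chunk))
    pairs = {s: sub_m(s) for s in sensitive}      # SubM: s_i -> p_i
    x_s = query
    for s, p in pairs.items():
        x_s = x_s.replace(s, p)
    keep = set(imp_m(x_s)) | set(pairs.values())  # ImpM: Key-Words to preserve
    # Optional DP-style obfuscation of remaining Low-Risk tokens.
    x_prime = obfuscate(x_s, keep) if obfuscate else x_s
    # --- Cloud interaction ---
    a_prime = cloud_llm(x_prime)
    # --- Recover module ---
    # RcvM sees (X', X, A'); this simplification also hands it the pairs.
    return rcv_m(x_prime, query, a_prime, pairs)
```

In the paper the recovery model both reverses the substitution and repairs sanitization-induced errors; a deployment would plug the four fine-tuned checkpoints into these slots.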
Implementation and Evaluation:
- The SenM, SubM, and ImpM models were fine-tuned from Qwen2-0.5B-Chat, and RcvM from Qwen2-1.5B-Chat.
- Experiments were conducted using GPT-4-turbo and Qwen-Plus as cloud LLMs.
- Evaluation on the SensitiveQA dataset showed:
  - High performance in sensitive information detection (e.g., 89.40% recall for English).
  - Strong query protection, measured by Extraction Defense Rate (EDR): PRIV-QA resisted 85.83% of extraction attacks in English with obfuscation enabled.
  - High quality of recovered responses, outperforming baselines (CUSTEXT+, SANTEXT+, HaS) on BLEU, METEOR, ROUGE, and model-based evaluation with GPT-4o. For example, PRIV-QA achieved a BLEU score of 0.563 (English, GPT-4-turbo with obfuscation) and a 74.49% win+tie rate against the ground truth.
- The framework demonstrates a favorable trade-off between privacy protection (security) and the utility/quality of the final response.
- Time analysis indicated a latency overhead of roughly 30-60%, with the relative overhead shrinking for longer inputs/outputs.
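The EDR metric reported above can be made concrete with a small sketch. The definition used here (an attack is "defended" if its output recovers none of the original sensitive words) is an assumption for illustration, not necessarily the paper's exact scoring rule:

```python
def extraction_defense_rate(attack_outputs, sensitive_sets):
    """Assumed EDR: share of attacks whose output contains none of the
    original sensitive words from the corresponding query."""
    defended = sum(
        1 for out, sens in zip(attack_outputs, sensitive_sets)
        if not any(s in out for s in sens)
    )
    return defended / len(attack_outputs)

# Toy check: one of two simulated attacks leaks a sensitive word.
rate = extraction_defense_rate(
    ["no names recovered", "the user is Alice"],
    [{"Alice"}, {"Alice"}],
)  # 0.5
```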
Contributions:
- The SensitiveQA dataset for privacy-preserving QA research.
- The PRIV-QA framework, offering a practical multi-stage approach to balance privacy and response quality for cloud LLMs.
- Demonstrated effectiveness through comprehensive experiments.