Effect of expanding the target-prompt set on attack effectiveness

Investigate how increasing the size and diversity of the target-topic prompt set beyond the 100 money-laundering-related questions affects Whisper Leak classifier performance. The classifiers are trained on sequences of encrypted packet sizes and inter-arrival times, and the goal is to quantify any performance gain attributable specifically to adding unique target prompts.

Background

The experimental setup uses 100 semantically varied prompts about the legality of money laundering as the positive class, mixed with diverse negative prompts drawn from the Quora Question Pairs dataset. The reported results show that attack performance improves as the total volume of training data grows, which suggests that richer target-topic coverage could yield further gains.

The authors explicitly note that they have not tested whether enlarging the set of unique target prompts (as opposed to collecting more samples of the existing 100) further improves attack effectiveness, leaving this as an unresolved question.
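One way to separate the effect of unique-prompt count from the effect of total data volume is to hold the number of training samples fixed while varying how many distinct prompts they come from. The sketch below illustrates that idea only; it is an assumption about how such an ablation could be set up, not the authors' method. The function names, the `(prompt_id, trace)` data layout, and the synthetic traces are all hypothetical.

```python
import random
from collections import defaultdict


def split_prompts(prompt_ids, test_fraction, seed=0):
    """Hold out a fraction of unique prompts so evaluation uses unseen prompts."""
    rng = random.Random(seed)
    ids = sorted(set(prompt_ids))
    rng.shuffle(ids)
    n_test = int(len(ids) * test_fraction)
    return set(ids[n_test:]), set(ids[:n_test])  # (train_ids, test_ids)


def fixed_budget_subset(samples, n_prompts, budget, seed=0):
    """Draw `budget` samples spread as evenly as possible over `n_prompts` prompts.

    `samples` is a list of (prompt_id, trace) pairs. A trace is assumed here to
    be a list of (packet_size, inter_arrival_time) tuples (hypothetical layout).
    """
    rng = random.Random(seed)
    by_prompt = defaultdict(list)
    for prompt_id, trace in samples:
        by_prompt[prompt_id].append(trace)
    if n_prompts > len(by_prompt):
        raise ValueError("not enough unique prompts")

    chosen = rng.sample(sorted(by_prompt), n_prompts)
    per_prompt, remainder = divmod(budget, n_prompts)
    subset = []
    for i, prompt_id in enumerate(chosen):
        # The first `remainder` prompts get one extra sample each.
        want = per_prompt + (1 if i < remainder else 0)
        traces = by_prompt[prompt_id]
        if want > len(traces):
            raise ValueError(f"prompt {prompt_id!r} has too few samples")
        subset.extend((prompt_id, t) for t in rng.sample(traces, want))
    return subset


# Synthetic stand-in data: 100 prompts with 20 one-packet traces each.
demo_samples = [(p, [(100 + p, 0.01 * k)]) for p in range(100) for k in range(20)]
for n in (25, 50, 100):
    subset = fixed_budget_subset(demo_samples, n_prompts=n, budget=500, seed=1)
    print(n, len(subset), len({pid for pid, _ in subset}))
```

Training one classifier per subset and scoring each on the same held-out prompts would then show whether prompt diversity matters at a constant sample budget. Extending the comparison beyond 100 prompts would require collecting new target questions, which is the unexplored part.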

References

Investigating the further benefit of expanding the 100 target questions into a larger set of related topic questions has not been explored yet, and may provide further opportunity to improve results.

— Whisper Leak: a side-channel attack on Large Language Models (2511.03675 - McDonald et al., 5 Nov 2025) in Section 4.4 (Ablation study: Data volume)
