Large-scale Online Deanonymization with LLMs

This presentation examines how frontier large language models fundamentally undermine online pseudonymity by enabling fully automated, scalable deanonymization attacks on unstructured text. The authors introduce the ESRC framework—a modular pipeline combining extraction, search, reasoning, and calibration—and demonstrate its effectiveness across multiple real-world datasets including Hacker News, Reddit, and anonymous interview transcripts. The work reveals striking performance gains over classical methods, achieving up to 68% recall at 90% precision, and shows that LLM-based attacks remain effective even at internet scale, necessitating urgent reconsideration of privacy assumptions and defenses.
Script
Imagine your anonymous online persona, carefully maintained across years of posts and comments, identified in minutes by an automated system. This paper reveals how large language models have transformed deanonymization from a labor-intensive manual process into a scalable, automated threat that fundamentally undermines online pseudonymity.
The authors begin by demonstrating what autonomous language model agents can accomplish in real-world deanonymization scenarios.
Building on this threat demonstration, the researchers show that agentic language models can autonomously extract identity signals, search candidate databases, and verify matches with minimal human oversight. In one striking example, they successfully identified 9 out of 33 anonymized scientist interviewees by analyzing publicly available interview transcripts, achieving 82% precision through automated reasoning alone.
To systematically measure and understand this threat, the authors developed a modular pipeline for scalable deanonymization.
The ESRC framework separates deanonymization into four distinct stages: extraction of identity-relevant features using language models, semantic embedding search to narrow candidate pools, extended reasoning to verify matches, and calibrated confidence scoring to control precision-recall tradeoffs. This modular design allows the researchers to systematically ablate each component and quantify the contribution of language model capabilities at every stage, moving beyond black-box agentic approaches.
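The four stages described above can be sketched as a toy pipeline. Everything below is an illustrative stand-in, not the paper's implementation: keyword extraction instead of LLM feature extraction, bag-of-words cosine instead of semantic embeddings, token overlap instead of extended reasoning, and a fixed score threshold instead of learned calibration. The candidate profiles and threshold value are hypothetical.

```python
from collections import Counter
import math

def extract(text):
    """Stage 1 (toy): pull identity-relevant tokens; real pipeline uses an LLM."""
    return [w for w in text.lower().split() if len(w) > 3]

def embed(tokens):
    """Bag-of-words vector as a stand-in for a semantic embedding."""
    return Counter(tokens)

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec, candidates, k=3):
    """Stage 2: narrow the candidate pool to the top-k nearest profiles."""
    ranked = sorted(candidates.items(),
                    key=lambda kv: cosine(query_vec, embed(extract(kv[1]))),
                    reverse=True)
    return ranked[:k]

def reason(anon_text, candidate_text):
    """Stage 3 (toy): verification score via Jaccard overlap of features."""
    a, b = set(extract(anon_text)), set(extract(candidate_text))
    return len(a & b) / len(a | b) if a | b else 0.0

def calibrate(score, threshold=0.3):
    """Stage 4: emit a match only if confidence clears a precision-controlling threshold."""
    return score if score >= threshold else None

def deanonymize(anon_text, candidates, threshold=0.3):
    """Run extract -> search -> reason -> calibrate; return (name, score) or None."""
    query = embed(extract(anon_text))
    best = None
    for name, text in search(query, candidates):
        s = calibrate(reason(anon_text, text), threshold)
        if s is not None and (best is None or s > best[1]):
            best = (name, s)
    return best
```

The key design point this sketch preserves is modularity: each stage can be swapped out or ablated independently, which is what lets the authors quantify the contribution of LLM capabilities stage by stage.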
The performance gap between classical methods and the language model pipeline is dramatic. Where traditional rarity-weighted baselines achieve essentially zero recall at high precision, the ESRC framework reaches 68% recall at 90% precision when matching Hacker News profiles to LinkedIn accounts from a pool of 89,000 candidates. This superiority stems from the language model's ability to extract nuanced identity signals from unstructured text rather than relying solely on explicit structured micro-data.
The researchers validated their framework across three distinct real-world scenarios, demonstrating robustness across platforms and matching contexts. Performance scales with available micro-data: for Reddit movie community matching, recall climbs to 48% when users share 10 or more cited films, while maintaining 99% precision throughout.
Perhaps most concerning is the framework's robustness at scale: as the candidate pool grows from 1,000 to 100 million profiles, recall decays log-linearly rather than collapsing. The researchers project that 27% recall remains achievable at internet scale, whereas non-language-model baselines fall to zero.
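The log-linear decay can be made concrete with a small extrapolation sketch. The 68% recall figure at 89,000 candidates and the 27% internet-scale projection come from the text above; the slope and the assumption that "internet scale" means roughly 3 billion profiles are illustrative choices, not values reported by the paper.

```python
import math

def loglinear_recall(n, n0=89_000, r0=0.68, slope=0.09):
    """Recall as a log-linear function of candidate-pool size n.

    n0, r0: anchor point from the Hacker News / LinkedIn experiment.
    slope: hypothetical decay rate per decade of pool size, chosen so the
    extrapolation to ~3e9 profiles lands near the paper's 27% projection.
    """
    return max(0.0, r0 - slope * math.log10(n / n0))

# Extrapolate from the measured pool to an assumed internet-scale pool.
internet_scale = 3_000_000_000
projected = loglinear_recall(internet_scale)
```

The contrast with classical baselines is that a method with zero recall at 89,000 candidates has nothing left to decay: log-linear shrinkage only matters when the starting recall is high.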
The implications demand urgent attention from platforms, policymakers, and users. The practical obscurity that once protected pseudonymous participation has been nullified, exposing vulnerable populations to surveillance, targeted harassment, and manipulation at scale.
The authors argue that traditional anonymization frameworks are fundamentally inadequate for this threat model. Guardrails in language model APIs may also fail, because the ESRC pipeline's components resemble legitimate analytic uses, making detection challenging without new defensive paradigms designed specifically for unstructured-text deanonymization.
This research demonstrates that large language models have fundamentally redefined what pseudonymity means online, transforming deanonymization from a theoretical risk into a practical, scalable reality. To explore the full technical details and consider the implications for your own privacy assumptions, visit EmergentMind.com to learn more.