Tracking Conversations: Measuring Content and Identity Exposure on AI Chatbots

Published 30 Apr 2026 in cs.CR and cs.CY | (2604.27438v1)

Abstract: AI chatbots are becoming a primary interface for seeking information. As their popularity grows, chatbot providers are starting to deploy advertising and analytics. Despite this, tracking on AI chatbots has not been systematically studied. We present a systematic measurement of web tracking on 20 popular AI chatbots. Under controlled settings using a sensitive prompt, we capture and compare network traffic in normal chats and, where supported, private chats. We search for exposure of two categories of information: content, including prompts, prompt-derived titles, chat URLs, and chat identifiers; and identity, including names, emails, account identifiers, first-party cookies, and explicit IP/User-Agent fields in payloads. We find that 17 of 20 chatbots share information with at least one third party. Three chatbots share plaintext conversation text, including both prompt and response snippets, with Microsoft Clarity through session replay. Fifteen chatbots share conversation URLs or chat identifiers with third-party advertising, analytics, or social endpoints. Several chatbots expose user identity through support widgets, analytics, advertising, and session replay tags; in some cases, hashed emails are shared.

Abstract PDF Upgrade to Chat

Authors (4)

Summary

The paper presents the first systematic audit of AI chatbots, revealing that 17 of 20 platforms transmit sensitive conversational and identity data to third parties.
It employs controlled sessions and network traffic analysis to detect key exposure vectors such as session replay, embedded widgets, and metadata harvesting.
The findings underscore significant privacy risks, prompting calls for improved design strategies and clearer regulatory frameworks to prevent unauthorized data sharing.

Tracking Content and Identity Exposure in AI Chatbots: An Empirical Audit

Introduction and Motivation

"Tracking Conversations: Measuring Content and Identity Exposure on AI Chatbots" (2604.27438) presents a comprehensive measurement study of third-party tracking and data exposure in web-based AI chatbot platforms. As LLM-driven chatbots consolidate their role as primary information interfaces, privacy threats intensify due to direct personal disclosures and persistent authentication, shifting tracking from traditional browser-level artifacts (cookies, fingerprints, IP) to account-level identifiers (emails, names). This paper establishes the first systematic audit of these risks across 20 top-ranked chatbot providers, revealing detailed structures of both content and identity exposure.

Methodology

The audit targets the network layer, capturing traffic during controlled sessions with a highly sensitive prompt ("pregnancy test near me") across both normal and private/incognito chat modes. For each chatbot, new accounts are created, and traffic is recorded in Chrome, focusing on transmission of:

Content: prompt text, keywords, prompt-derived titles, chat URLs, chat identifiers, assistant responses.
Identity: names, emails, internal IDs, account cookies, explicit IP/User-Agent in payloads.

Preprocessing uncompresses/decodes payloads and extracts string/hashing variants to maximize detection of exposed artifacts, including explicit and masked forms. Requests are categorized by party affiliation: first-party, platform-party, and third-party; third-party domains are further classified into analytics, advertising, and other.

Widespread Tracking: Distribution of Data Flows

A dominant finding is that 17/20 chatbots share information with at least one third-party during a single session. Platform-party routing (e.g., Gemini's Google Analytics, Copilot's Microsoft Clarity) is distinguished from bona fide third-party flows. Notably, three chatbots do not embed external third parties, instead limiting flows to platform-owned domains.

Figure 1: Sankey diagram showing the data flows from chatbots to advertising and analytics third-parties in normal chats, highlighting widespread cross-party transmission.

Advertising domains, though less ubiquitous than analytics, comprise the highest volume of unique flows. Some chatbots (e.g., SeaArt, Genspark) contact more than a dozen distinct advertisers in one interaction, emphasizing the high granularity of cross-site identifier linkage possible.

Content Exposure Mechanisms

Three vectors drive content exposure:

Session Replay: Microsoft Clarity is embedded in four chatbots; three transmit conversation text, including prompt and response, in plaintext. This leakage is sufficient for full conversational reconstruction, allowing third parties to infer highly sensitive user intent and context.
Embedded Widgets: Genspark transmits prompt text directly to Google Maps via query parameter, associating health/location queries with persistent Google identity if logged in.
Metadata Harvesting: Chat URLs and identifiers are widely exposed; 15 chatbots transmit conversation URLs/IDs to analytics/advertising tags. Titles (including user prompt or keywords) are embedded and also reach external parties.
Figure 2: Sankey diagram of data flow from chatbots to all parties in normal chats, with clear delineation between third-party and platform-party domains.

These mechanisms amplify risk: session replay is especially problematic, as rendered DOM often includes both sensitive content and user identity, allowing for deep cross-context correlation.

Identity Exposure Vectors

Identity artifacts reach third parties in numerous ways:

Support Widgets: Intercom widget transmits user email, name, internal ID, and hash, even if user never interacts with support interface.
Analytics/Error Monitoring: Sentry and Statsig tags receive explicit labels and user configuration data (email, IP, User-Agent) in payloads.
Advertising Tags: Hashed emails are widely shared (Perplexity to Singular, SeaArt to TikTok Pixel) supporting persistent cross-site recognition; first-party cookies such as _fbp, _uetsid, _uetvid are consistently transmitted, even across chatbot platforms.
Session Replay (Clarity): Account identity is captured as readable DOM text, even when not explicitly labeled, further exacerbating correlation risks.

Private (Incognito) Chat Evaluation

Private chat modes demonstrate marked reduction in tracking. Among chatbots supporting such modes, exposure to third parties plummets, and neither content nor identity information is transmitted externally. However, privacy policy disclosures rarely clarify extent of tracking mitigation, focusing instead on retention and model training opt-outs.

Figure 3: Sankey diagram of data flow from chatbots to all parties in temporary chats, demonstrating substantial attenuation of third-party sharing.

Mitigation Strategies and Policy Gaps

The paper proposes design/configuration mitigations that do not require removal of analytics/ads:

Exclude prompt/chat identifiers/titles from page URLs and document.title metadata.
Avoid passing raw prompts to embedded widgets; server-side resolution for integrations (e.g., location) should sanitize queries.
Configure session-replay tools to mask DOM content or restrict deployment to unauthenticated pages.
Minimize inclusion of account identifiers in analytics/error monitoring payloads.

Privacy policy alignment is inconsistent. Eight out of twenty policies fail to name third parties observed during measurement; some omit significant recipients (e.g., Microsoft Clarity). Only Claude and OpenRouter provide comprehensive enumeration and data-type mapping. Duck.ai is an exceptional case, deliberately stripping user IPs and avoiding tracking.

Implications and Future Directions

This study exposes an industry-wide gap between stated privacy policies and actual data transmission. The implications are severe: content/identity exposure across analytics, advertising, widgets, and session replay enables high-fidelity behavioral and contextual inference, especially given the sensitive and authenticated nature of chatbot sessions.

From a theoretical perspective, the prevalence of session replay on conversational interfaces expands the traditional attack surface. Existing privacy risk models—focused on search and browsing—are insufficient when LLMs drive deeper, context-rich communication with persistent identifiers.

Practical implications span regulatory compliance, cross-site identity linkage, and user trust erosion. Providers must adopt granular tracking controls, explicitly enumerate third-party recipients, and develop masking/configuration strategies for embedded third-party tools. Furthermore, private modes should be clearly described as tracking controls and not solely as retention/model-training mitigations.

Anticipated future developments include:

Standardization of privacy controls for chatbots, especially around session replay and widget integration.
Regulatory guidance specific to conversational context exposure, given the increasing use of chatbots for health, finance, and sensitive domains.
Privacy-enhancing architectures that decouple conversational context from page metadata and network payloads, possibly leveraging client-side LLM inference.

Conclusion

The paper provides a critical empirical foundation for understanding and regulating tracking in AI chatbots (2604.27438). Through deep, systematic audit across content and identity vectors, it documents widespread third-party exposure, identifies mechanisms underlying transmission, and evaluates privacy modes. Its insights motivate urgent redesign of chatbot interface architecture, transparent disclosure of data flows, and precision-regulated deployment of third-party tools, marking a significant step toward safeguarding user privacy in LLM-driven conversational interfaces.

Markdown Report Issue