Interpersonal Deception Theory (IDT)
- Interpersonal Deception Theory (IDT) is a framework that conceptualizes deception as a dynamic process involving falsification, concealment, and equivocation during communication.
- Recent studies applying IDT in multi-agent simulations have shown that agents increasingly use equivocation under high social pressure to maintain plausible deniability.
- Empirical analyses using statistical models validate IDT's predictions and call for further integration of multimodal cues and human–AI interactions in deception research.
Interpersonal Deception Theory (IDT) articulates deception as a dynamic interactional process characterized by the adaptation and negotiation of truthful and deceptive cues in ongoing communication exchanges. Originally formulated by Buller & Burgoon (1996), IDT distinguishes between types of deception—namely, falsification (asserting falsehoods), concealment (withholding truths), and equivocation (employing vague or hedged statements)—and emphasizes the roles of both deceiver and detector within contexts marked by potential suspicion and strategic ambiguity. Recent empirical investigations, including large-scale studies of LLM agents in social deduction games, operationalize IDT to quantify and interpret deceptive behaviors in autonomous, multi-agent communication scenarios (Milkowski et al., 27 Mar 2026).
1. Theoretical Foundations and Key Constructs
Interpersonal Deception Theory reconceptualizes deception as a process, shifting from static characterizations of lying to an interactive framework grounded in ongoing communicative dynamics. The theory delineates three main deception strategies:
- Falsification: Production of statements that are knowingly false (counter-factual assertions).
- Concealment: Suppression or omission of relevant, often incriminating, information.
- Equivocation: Use of strategically ambiguous, hedged, or non-committal phrasing, allowing the deceiver to mislead without explicit lying.
IDT identifies deception cues, including verbal markers such as hesitations, hedges, and qualifiers as well as nonverbal signals, that indicate a speaker's level of commitment to truthfulness. Strategic ambiguity is a central construct: under conditions of heightened suspicion or social pressure, deceivers may prefer under-specified language to maintain plausible deniability. In the context of social-deduction games, the theory predicts a shift by deceivers (e.g., impostors) away from purely task-oriented (directive) speech toward more representative acts, such as denials or justifications, with a marked increase in equivocation under elevated threat (Milkowski et al., 27 Mar 2026).
2. Operationalization in Autonomous Multi-Agent Communication
Milkowski and Weninger (2026) implemented IDT in a large-scale Among Us simulation, modeling and measuring deception in LLM-driven agents. Key operationalizations include:
- Utterance-level coding: Every meeting-phase utterance is labeled by speech-act type (directive, representative, commissive, expressive) and deception type (falsification, concealment, equivocation, or missing) using a Gemini probe.
- Deception metrics: equivocation rate and overall deception rate.
- Social pressure: Quantified via the cumulative number of ejections during the game; higher ejection counts signal increasing suspicion.
- Statistical testing: χ² tests examine associations between roles and speech acts or deception types; logistic regression models predict win/loss outcomes; and Spearman ρ correlations link deception rates to social pressure and to the proportion of speech acts (Milkowski et al., 27 Mar 2026).
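The two deception metrics listed above can be illustrated with a minimal sketch. It assumes that each meeting-phase utterance carries either a deception-type label or `None`, and that both rates use the total utterance count as the denominator; the source does not state the exact formulas, so the denominator, the function name `deception_rates`, and the sample labels are all hypothetical.

```python
from collections import Counter

# The three IDT deception types used as labels in this sketch.
DECEPTION_TYPES = ("falsification", "concealment", "equivocation")


def deception_rates(labels):
    """Return (equivocation_rate, overall_deception_rate) for one game.

    `labels` holds one entry per utterance: a deception type, or None when
    the utterance was not coded as deceptive.
    Assumption: both rates are divided by the total number of utterances.
    """
    total = len(labels)
    if total == 0:
        return 0.0, 0.0
    counts = Counter(labels)
    deceptive = sum(counts[t] for t in DECEPTION_TYPES)
    return counts["equivocation"] / total, deceptive / total


# Hypothetical labels for a single game (not data from the study).
game = ["equivocation", None, None, "falsification",
        "equivocation", None, None, None]
equiv_rate, overall_rate = deception_rates(game)
print(equiv_rate, overall_rate)
```

Computing the rates per game keeps them directly comparable with a per-game pressure measure such as the cumulative ejection count.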
3. Empirical Findings and Quantitative Analysis
The analysis produced robust empirical support for IDT's predictions regarding the distribution and function of deceptive strategies in multi-agent communication.
(a) Speech-act distributions by role:
- Directive utterances dominate across roles (≈98% of all utterances).
- Impostor agents exhibit elevated use of representative speech acts (1.7%) compared to crewmates (0.5%).
- A significant role × speech-act association was found (χ²(3) = 103.85, p < .001), with an odds ratio reported for representative acts.
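As a rough illustration of the role × speech-act test, the sketch below computes a Pearson χ² statistic for a 2 × 4 contingency table and an odds ratio for representative acts. The counts are hypothetical placeholders, not the study's data, and the helper names `chi_square` and `odds_ratio` are introduced here for illustration only.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2 x 2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


# Hypothetical counts. Rows: impostor, crewmate.
# Columns: directive, representative, commissive, expressive.
counts = [
    [960, 20, 10, 10],
    [2940, 15, 25, 20],
]
stat, dof = chi_square(counts)

# Representative vs. all other acts, impostor vs. crewmate.
imp_rep, imp_other = counts[0][1], sum(counts[0]) - counts[0][1]
crew_rep, crew_other = counts[1][1], sum(counts[1]) - counts[1][1]
print(stat, dof, odds_ratio(imp_rep, imp_other, crew_rep, crew_other))
```

Two roles and four speech-act categories give (2 − 1)(4 − 1) = 3 degrees of freedom, matching the χ²(3) reported above. In practice, a library routine such as `scipy.stats.chi2_contingency` would also return the p-value.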
(b) Deception-type proportions:
- Equivocation: 91.2% of deceptive utterances
- Falsification: 2.2%
- Concealment: 0.7%
- Missing labels: 6.0%
- No significant difference in deception mix between winning and losing games (χ²(3) = 4.42, p = .22).
(c) Effects of social pressure:
- Positive Spearman correlation (ρ = 0.56, p<.001) between equivocation rate and number of ejections, indicating that agents equivocate more under suspicion.
- Falsification and concealment show weak correlations with social pressure (ρ ≈ .09–.11).
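The rank correlation between social pressure and equivocation can be sketched in plain Python. This minimal version computes Spearman's ρ as the Pearson correlation of average ranks, which handles the ties that are likely when pressure is an integer ejection count. The per-game values are hypothetical, and the helper names are introduced here for illustration.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-game values (not data from the study).
ejections = [0, 1, 1, 2, 3]
equivocation_rate = [0.10, 0.15, 0.12, 0.30, 0.28]
print(spearman_rho(ejections, equivocation_rate))
```

A rank-based measure suits this setting because ejection counts are ordinal and the relationship need not be linear. `scipy.stats.spearmanr` offers the same statistic together with a p-value.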
(d) Relation to performance:
- Logistic regression: Neither overall deception rate nor any single deception type significantly predicted impostor victory.
- Equivocation is consistently observed as a low-risk, defensive maneuver that does not reliably improve win rates (Milkowski et al., 27 Mar 2026).
4. Classification, Speech Act Coding, and Measurement Reliability
The implementation involved multi-level utterance annotation via both human coders and the Gemini probe, producing the following outcomes:
- Speech-act taxonomy included directives (“Let’s check Electrical”), representatives (“I saw Red near Medbay”), commissives (“I’ll finish tasks later”), expressives (“Sorry, I didn’t notice that”), and declarations (unused).
- Gold-standard agreement: Cohen’s κ ≈ .83 between human coders and Gemini for deception type labeling; exact speech-act agreement at 72%.
- Speech act–deception mapping: IDT predicts greater strategic use of representatives (especially denials and deflections) and equivocation by impostors. These behavioral signatures were empirically validated at scale.
| Speech Act Type | Example | Predicted Role Function |
|---|---|---|
| Directive | “Let’s check Electrical” | Task focus (both roles) |
| Representative | “I saw Red near Medbay” | Denials/justifications (impostors) |
| Commissive | “I’ll finish tasks later” | Task commitments |
| Expressive | “Sorry, I didn’t notice that” | Affect, social maintenance |
This categorization reflects the speech act taxonomy adopted in the experiment (Milkowski et al., 27 Mar 2026).
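The agreement statistic reported for deception-type labeling, Cohen's κ, corrects raw agreement for the agreement expected by chance. The sketch below shows the standard two-rater computation; the label sequences are hypothetical and the function name `cohens_kappa` is chosen here for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both raters used a single identical label throughout.
    return (observed - expected) / (1 - expected)


# Hypothetical labels from a human coder and an automated probe.
human = ["equivocation", "equivocation", "falsification", "concealment",
         "equivocation", "equivocation"]
probe = ["equivocation", "equivocation", "falsification", "equivocation",
         "equivocation", "equivocation"]
print(cohens_kappa(human, probe))
```

The chance correction matters when one label dominates, as equivocation does here: raw percent agreement can be high even when agreement on the rarer categories is poor. `sklearn.metrics.cohen_kappa_score` provides an equivalent library routine.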
5. Illustrative Examples and Strategic Implications
Empirically observed utterances from LLM agents in the study reflect canonical IDT strategies:
- Falsification: “I was in Medbay the whole time.” (Direct, counter-factual; explicit lie.)
- Concealment: “I finished my tasks quickly.” (Omission of incriminating detail.)
- Equivocation: “I was near Storage earlier, but I didn’t really see what happened.” (Hedged, ambiguous; avoids direct lie.)
These patterns substantiate IDT’s central claim: most agents favor hedges and vagueness—equivocation—over outright falsehoods, especially under threat. This is consistent with strategic ambiguity as a default deception tactic when communicators face detection risk.
6. Limitations and Prospects for Future Work
Limitations of the operationalization include modality restriction (text-only, absence of nonverbal or prosodic cues), single model architecture (Llama 3.2-based agents), and potential smoothing of behavioral nuance inherent to automated classification by Gemini. Notably, LLM agents, constrained by RLHF safety mechanisms, systematically favor low-risk equivocation rather than overt lying.
Identified priorities for future research are:
- Mixed human–AI role assignments to probe generalizability of IDT-based deception detection and enactment.
- Deployment across alternative model families and training regimens to isolate model-dependent patterns.
- Integration of richer, multimodal IDT cue sets, incorporating hesitations, timing, and affective markers (Milkowski et al., 27 Mar 2026).
7. Synthesis and Significance
By systematically mapping IDT’s three-way deception taxonomy—falsification, concealment, equivocation—onto role-conditioned agent interactions in a controlled environment, recent work provides large-scale, empirical confirmation that LLM agents primarily enact deception via strategic ambiguity and equivocation. These findings validate core IDT predictions and highlight the underlying trade-off between the truth-oriented objectives enforced by pretraining/finetuning (including RLHF) and utility-driven deception incentives intrinsic to competitive, multi-agent coordination tasks. A plausible implication is that future autonomous systems will need to explicitly balance communicative truthfulness against the contextual demands of strategic interaction, necessitating more nuanced, multimodal approaches to both deception enactment and detection (Milkowski et al., 27 Mar 2026).