More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play (2406.04643v1)

Published 7 Jun 2024 in cs.CL

Abstract: The boardgame Diplomacy is a challenging setting for communicative and cooperative artificial intelligence. The most prominent communicative Diplomacy AI, Cicero, has excellent strategic abilities, exceeding human players. However, the best Diplomacy players master communication, not just tactics, which is why the game has received attention as an AI challenge. This work seeks to understand the degree to which Cicero succeeds at communication. First, we annotate in-game communication with abstract meaning representation to separate in-game tactics from general language. Second, we run two dozen games with humans and Cicero, totaling over 200 human-player hours of competition. While AI can consistently outplay human players, AI-Human communication is still limited because of AI's difficulty with deception and persuasion. This shows that Cicero relies on strategy and has not yet reached the full promise of communicative and cooperative AI.

Summary

The paper introduces a novel AMR-based methodology to evaluate Cicero’s dual performance in strategic play and communicative negotiation.
The study finds that while Cicero wins 84% of games, its persuasion and deception tactics are easily detected by human opponents.
The research reveals a gap between Cicero’s superior strategy and its subpar human-like communication, guiding future AI enhancements in diplomacy.

Assessing Cicero’s Diplomacy Capabilities in a Communicative Context

This paper evaluates Cicero, an AI system that plays the strategy board game Diplomacy. It assesses Cicero's proficiency in communication as well as in strategic gameplay, and it challenges popular claims, made in prior studies and media coverage, that Cicero negotiates and deceives like a human within the game.

Strategic Versus Communicative Skills

Previous evaluations of Cicero concentrated predominantly on its strategic success, namely its ability to win games. However, Diplomacy, as a test bed, requires effective communication, which encompasses the dual facets of persuasion and deception. Mastery of these skills is considered integral for an AI to genuinely compete at a human level within the game. This paper introduces novel methodologies to rigorously test these communicative aspects.

The methodology annotates in-game messages with Abstract Meaning Representation (AMR) to separate expressed intents from the actions players actually take. The authors built a parser that achieves a Smatch score of 66.6 after adapting it to game-specific language, and they acknowledge that parsing accuracy leaves room for improvement. Separating intent from action lets the researchers flag deception (broken commitments) and persuasion attempts in the AMR-coded messages.
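The idea of flagging a broken commitment can be sketched as a comparison between the orders a player promised in messages and the orders that player actually submitted. The following is a minimal illustration only, not the authors' pipeline: the `Order` record, its fields, and the `broken_commitments` helper are hypothetical names, and the sketch assumes the promised orders have already been extracted from the AMR annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Hypothetical record for one unit order (not the paper's schema)."""
    power: str   # e.g. "FRANCE"
    unit: str    # province where the unit stands, e.g. "PAR"
    action: str  # e.g. "move", "hold", "support"
    target: str  # destination or supported province


def broken_commitments(promised, submitted):
    """Return the promised orders that were not among the submitted orders."""
    submitted_set = set(submitted)
    return [order for order in promised if order not in submitted_set]


# Toy example: France promised two moves but changed one of them.
promised = [
    Order("FRANCE", "PAR", "move", "BUR"),
    Order("FRANCE", "BRE", "move", "MAO"),
]
submitted = [
    Order("FRANCE", "PAR", "move", "PIC"),
    Order("FRANCE", "BRE", "move", "MAO"),
]

broken = broken_commitments(promised, submitted)
for order in broken:
    print(f"{order.power} broke a commitment: {order.unit} {order.action} {order.target}")
```

Exact matching is the simplest possible criterion. A real analysis would likely need to handle partial or conditional promises, which this sketch ignores.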

Key Findings

The authors ran 24 games with human players and Cicero, which yielded the following empirical findings:

Strategic Dominance: Cicero continues to demonstrate superior strategic capabilities relative to human players, securing victories in 84% of games.
Communication Attempts: Despite its frequent success in gameplay, Cicero's communication is readily identified by experienced human players, which suggests that it doesn’t convincingly simulate human-like interactions.
Deception and Persuasion Limitations: Humans perceived Cicero’s communications to be deceptive at a higher rate than utterances from other human players. However, when analyzed through detected broken commitments, humans themselves were found to break commitments more frequently than Cicero.
Effectiveness in Persuasion: Although humans and Cicero initiated persuasion attempts at similar rates, humans succeeded more often than Cicero, which points to a gap in the AI's ability to influence other players through dialogue.

These results challenge the notion of Cicero as a "master" of human-like communicative interaction in Diplomacy.
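A comparison of persuasion success between humans and Cicero reduces to a per-group success rate over annotated attempts. The sketch below shows one way to compute it, under the assumption that each attempt has been labeled with the speaker type and whether it succeeded. The function name and the toy records are illustrative placeholders and are not data from the paper.

```python
from collections import defaultdict


def persuasion_success_rates(attempts):
    """Compute success rate per speaker type.

    `attempts` is an iterable of (speaker_type, succeeded) pairs,
    a hypothetical format chosen for this sketch.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for speaker, succeeded in attempts:
        totals[speaker] += 1
        if succeeded:
            successes[speaker] += 1
    return {speaker: successes[speaker] / totals[speaker] for speaker in totals}


# Made-up placeholder records, NOT results from the study.
toy_attempts = [
    ("human", True),
    ("human", False),
    ("cicero", False),
    ("cicero", False),
    ("human", True),
    ("cicero", True),
]

rates = persuasion_success_rates(toy_attempts)
for speaker, value in sorted(rates.items()):
    print(f"{speaker}: {value:.2f}")
```

The same aggregation applies to broken-commitment rates by swapping the boolean label for "commitment was broken".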

Implications and Future Outlook

The paper highlights Cicero's reliance on strategic execution rather than communicative finesse: it excels tactically but reaches human-level competence in neither deception nor persuasion through dialogue. Progress in both areas remains necessary before AI can match human strategic communication in competitive settings.

The authors suggest possible future directions to refine AI communicative methodologies in Diplomacy-like scenarios. They advocate for more nuanced analysis and improved models that not only understand tactical and strategic needs but also the subtleties of human collaboration, misdirection, and persuasion. This exploration into AI behavior in games like Diplomacy serves as a stepping-stone toward developing intelligent systems that integrate strategic decision-making with complex human-like communication and interaction models. Overall, the research opens a dialogue for advancing AI's communicative competencies beyond tactical automation.