
More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play (2406.04643v1)

Published 7 Jun 2024 in cs.CL

Abstract: The boardgame Diplomacy is a challenging setting for communicative and cooperative artificial intelligence. The most prominent communicative Diplomacy AI, Cicero, has excellent strategic abilities, exceeding human players. However, the best Diplomacy players master communication, not just tactics, which is why the game has received attention as an AI challenge. This work seeks to understand the degree to which Cicero succeeds at communication. First, we annotate in-game communication with abstract meaning representation to separate in-game tactics from general language. Second, we run two dozen games with humans and Cicero, totaling over 200 human-player hours of competition. While AI can consistently outplay human players, AI-Human communication is still limited because of AI's difficulty with deception and persuasion. This shows that Cicero relies on strategy and has not yet reached the full promise of communicative and cooperative AI.

Citations (1)

Summary

  • The paper introduces a novel AMR-based methodology to evaluate Cicero’s dual performance in strategic play and communicative negotiation.
  • The study finds that while Cicero wins 84% of games, its persuasion and deception tactics are easily detected by human opponents.
  • The research reveals a gap between Cicero’s superior strategy and its subpar human-like communication, guiding future AI enhancements in diplomacy.

Assessing Cicero’s Diplomacy Capabilities in a Communicative Context

This paper evaluates Cicero, an AI system that plays the strategy board game Diplomacy. It assesses Cicero's proficiency not only in strategic gameplay but also in communication. The paper challenges popular claims, made in prior studies and media coverage, that Cicero negotiates and deceives like a human within the game.

Strategic Versus Communicative Skills

Previous evaluations of Cicero concentrated predominantly on its strategic success, namely its ability to win games. However, Diplomacy, as a test bed, requires effective communication, which encompasses the dual facets of persuasion and deception. Mastery of these skills is considered integral for an AI to genuinely compete at a human level within the game. This paper introduces novel methodologies to rigorously test these communicative aspects.

The methodology involves annotating in-game communications with Abstract Meaning Representation (AMR) to separate the intents players express from the moves they actually make. The authors built a parser that achieves a Smatch score of 66.6 after adapting it to game-specific language, and they acknowledge that parsing accuracy still has room to improve. Distinguishing intent from action lets the researchers flag deception (broken commitments) and persuasion attempts in the AMR-coded interactions.
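The broken-commitment check described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the AMR parse of each message has already been reduced to a list of promised moves, and the `Move` type, the `broken_commitments` helper, and the sample orders are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """Hypothetical simplified order: a unit and the province it moves to."""
    unit: str    # e.g. "A PAR" (army in Paris)
    target: str  # e.g. "BUR" (Burgundy)


def broken_commitments(promised, executed):
    """Return the promised moves that are absent from the executed orders.

    Assumption: a commitment counts as broken when the exact promised
    move was not submitted that turn. The paper's criterion may differ.
    """
    executed_set = set(executed)
    return [move for move in promised if move not in executed_set]


# Toy data for illustration only, not taken from the study.
promised = [Move("A PAR", "BUR"), Move("F BRE", "MAO")]
executed = [Move("A PAR", "PIC"), Move("F BRE", "MAO")]

broken = broken_commitments(promised, executed)
print(broken)
```

Here the promise to move the Paris army to Burgundy is flagged because the army was ordered elsewhere, while the fleet order matches its promise and is not flagged.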

Key Findings

The authors ran 24 games involving human players and Cicero and report the following findings:

  • Strategic Dominance: Cicero continues to demonstrate superior strategic capabilities relative to human players, securing victories in 84% of games.
  • Communication Attempts: Despite its frequent success in gameplay, Cicero's messages are readily identified as AI-generated by experienced human players, which suggests that it does not convincingly simulate human-like interaction.
  • Deception and Persuasion Limitations: Humans perceived Cicero’s communications to be deceptive at a higher rate than utterances from other human players. However, when analyzed through detected broken commitments, humans themselves were found to break commitments more frequently than Cicero.
  • Effectiveness in Persuasion: Although humans and Cicero attempt persuasion at similar rates, humans succeed more often than Cicero, showing a gap in the AI's ability to influence other players through dialogue.

These results challenge the notion of Cicero as a "master" of human-like communicative interaction in Diplomacy.
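The persuasion comparison above rests on a simple ratio: successful attempts divided by total attempts, computed separately for each kind of sender. The sketch below shows that arithmetic under stated assumptions. The `success_rate` helper is hypothetical, an attempt is assumed to succeed when the recipient carries out the requested move, and the sample outcomes are invented for illustration, not figures from the paper.

```python
def success_rate(outcomes):
    """Fraction of persuasion attempts that succeeded.

    `outcomes` is a list of booleans, one per attempt, where True means
    the recipient did what was asked (an assumed success criterion).
    Returns 0.0 when there were no attempts.
    """
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


# Invented outcomes for illustration only.
human_outcomes = [True, False, True, True]
cicero_outcomes = [True, False, False, False]

human_rate = success_rate(human_outcomes)
cicero_rate = success_rate(cicero_outcomes)
print(f"human: {human_rate:.2f}, cicero: {cicero_rate:.2f}")
```

Comparing rates rather than raw counts matters here because the two groups send different numbers of messages over the course of a game.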

Implications and Future Outlook

The paper highlights Cicero's reliance on strategic execution rather than communicative finesse to win games, demonstrating that while it excels tactically, it achieves human-level competence neither in deception nor in persuading its opponents through dialogue. Advancements in these areas remain crucial for integrating AI that effectively mirrors human strategic communication in competitive settings.

The authors suggest possible future directions to refine AI communicative methodologies in Diplomacy-like scenarios. They advocate for more nuanced analysis and improved models that not only understand tactical and strategic needs but also the subtleties of human collaboration, misdirection, and persuasion. This exploration into AI behavior in games like Diplomacy serves as a stepping-stone toward developing intelligent systems that integrate strategic decision-making with complex human-like communication and interaction models. Overall, the research opens a dialogue for advancing AI's communicative competencies beyond tactical automation.
