
More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play

Published 7 Jun 2024 in cs.CL | (2406.04643v1)

Abstract: The boardgame Diplomacy is a challenging setting for communicative and cooperative artificial intelligence. The most prominent communicative Diplomacy AI, Cicero, has excellent strategic abilities, exceeding human players. However, the best Diplomacy players master communication, not just tactics, which is why the game has received attention as an AI challenge. This work seeks to understand the degree to which Cicero succeeds at communication. First, we annotate in-game communication with abstract meaning representation to separate in-game tactics from general language. Second, we run two dozen games with humans and Cicero, totaling over 200 human-player hours of competition. While AI can consistently outplay human players, AI-Human communication is still limited because of AI's difficulty with deception and persuasion. This shows that Cicero relies on strategy and has not yet reached the full promise of communicative and cooperative AI.

Summary

  • The paper introduces a novel AMR-based methodology to evaluate Cicero’s dual performance in strategic play and communicative negotiation.
  • The study finds that while Cicero wins 84% of games, human players readily recognize its messages, judge them deceptive more often than messages from other humans, and are persuaded by it less often.
  • The research reveals a gap between Cicero’s superior strategy and its weaker communication, pointing to where future Diplomacy-playing AI needs to improve.

Assessing Cicero’s Diplomacy Capabilities in a Communicative Context

This paper evaluates Cicero, an AI system that plays the strategy board game Diplomacy. It assesses Cicero’s proficiency not only in strategic gameplay but also in communication. In doing so, the study tests claims, made in prior studies and in media coverage, that Cicero negotiates and deceives in a human-like way within the game.

Strategic Versus Communicative Skills

Previous evaluations of Cicero concentrated predominantly on its strategic success, namely its ability to win games. However, Diplomacy, as a test bed, requires effective communication, which encompasses the dual facets of persuasion and deception. Mastery of these skills is considered integral for an AI to genuinely compete at a human level within the game. This paper introduces novel methodologies to rigorously test these communicative aspects.

The methodology involved annotating in-game communications with Abstract Meaning Representation (AMR) to separate the intents that players express from the moves they actually make. The authors built a parser that achieves a Smatch score of 66.6 after adjusting for game-specific nuances, and they acknowledge that parsing accuracy still has room for improvement. Distinguishing intent from action let the researchers flag deception (broken commitments) and persuasion attempts in the AMR-coded interactions.
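
The idea of flagging a broken commitment by comparing stated intent with executed action can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: it assumes that the AMR parse of each message has already been reduced to a list of promised moves, and the `Move` class, the `broken_commitments` function, and the province codes are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """Hypothetical flattened form of a move extracted from a message or an order."""
    unit: str          # "A" for army, "F" for fleet
    origin: str        # province the unit starts in
    destination: str   # province the unit is sent to


def broken_commitments(promised, executed):
    """Return the promised moves that are absent from the executed orders."""
    executed_set = set(executed)
    return [move for move in promised if move not in executed_set]


# Toy data invented for illustration: two promises, one of which is kept.
promised = [Move("A", "VIE", "TYR"), Move("F", "TRI", "ADR")]
executed = [Move("A", "VIE", "GAL"), Move("F", "TRI", "ADR")]

broken = broken_commitments(promised, executed)
print(broken)
```

Exact matching keeps the sketch simple. A real system would also have to handle conditional promises, support and convoy orders, and parser errors, none of which are modeled here.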

Key Findings

The study ran 24 games between human players and Cicero, totaling over 200 human-player hours, with the following main results:

  • Strategic Dominance: Cicero continues to demonstrate superior strategic capabilities relative to human players, securing victories in 84% of games.
  • Communication Attempts: Despite its frequent success in gameplay, Cicero's messages are readily recognized as its own by experienced human players, which suggests that it does not convincingly simulate human-like interaction.
  • Deception and Persuasion Limitations: Humans perceived Cicero’s communications to be deceptive at a higher rate than utterances from other human players. However, when analyzed through detected broken commitments, humans themselves were found to break commitments more frequently than Cicero.
  • Effectiveness in Persuasion: Although humans and Cicero initiated persuasion attempts at similar rates, humans succeeded more often than Cicero, showing a gap in the AI’s ability to influence other players through dialogue.

These results challenge the notion of Cicero as a "master" of human-like communicative interaction in Diplomacy.
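
The comparisons behind these findings come down to per-speaker rates over annotated messages. The sketch below shows one way such rates could be computed. It is an assumption-laden illustration: the record fields (`speaker`, `perceived_lie`, `broken_commitment`), the `rate_by_speaker` function, and every value in `records` are invented for the example and are not data from the paper.

```python
from collections import defaultdict


def rate_by_speaker(records, flag):
    """Return, for each speaker type, the fraction of records where `flag` is true."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for record in records:
        totals[record["speaker"]] += 1
        if record[flag]:
            hits[record["speaker"]] += 1
    return {speaker: hits[speaker] / totals[speaker] for speaker in totals}


# Invented toy records, NOT the paper's data.
records = [
    {"speaker": "cicero", "perceived_lie": True, "broken_commitment": False},
    {"speaker": "cicero", "perceived_lie": True, "broken_commitment": False},
    {"speaker": "cicero", "perceived_lie": False, "broken_commitment": True},
    {"speaker": "cicero", "perceived_lie": False, "broken_commitment": False},
    {"speaker": "human", "perceived_lie": True, "broken_commitment": True},
    {"speaker": "human", "perceived_lie": False, "broken_commitment": True},
    {"speaker": "human", "perceived_lie": False, "broken_commitment": False},
    {"speaker": "human", "perceived_lie": False, "broken_commitment": False},
]

perceived = rate_by_speaker(records, "perceived_lie")
actual = rate_by_speaker(records, "broken_commitment")
print(perceived, actual)
```

Keeping perceived deception and detected broken commitments as separate flags makes it possible to see the two measures diverge, which is the pattern the study reports.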

Implications and Future Outlook

The study highlights that Cicero wins through strategic execution rather than communicative finesse: it excels tactically but reaches human-level competence in neither deception nor persuasion. Progress in both areas remains necessary before an AI can match human strategic communication in competitive settings.

The authors suggest future directions for improving AI communication in Diplomacy-like scenarios. They advocate more nuanced analysis and improved models that understand not only tactical and strategic needs but also the subtleties of human collaboration, misdirection, and persuasion. This study of AI behavior in Diplomacy serves as a stepping stone toward systems that combine strategic decision-making with complex, human-like communication. Overall, the research invites further work on advancing AI's communicative competence beyond tactical automation.
