- The paper demonstrates that in 62% of cases, LLMs selected intervention types judged as good as or better than those chosen by human mediators.
- The paper shows that 84% of LLM-generated messages were rated as good as or better than human-written ones, indicating high clarity and empathy.
- The paper employs the LLMediator framework to simulate dispute scenarios, underscoring the potential for scalable online dispute resolution.
Evaluating LLMs in Dispute Resolution
Mediation plays a crucial role in dispute resolution: a neutral mediator helps the parties work toward an agreement. The paper "Robots in the Middle: Evaluating LLMs in Dispute Resolution" explores the potential of LLMs as mediators, using a dataset of 50 disputes to assess their ability to analyze disputes, select intervention types, and generate appropriate intervention messages.
Framework and Methodology
The research uses the LLMediator framework to simulate mediation scenarios, covering fully automated LLM mediation, human-assisted LLM mediation, and human-only mediation. The paper is structured around three research questions, assessing the LLM's ability to select intervention types, to craft intervention messages, and to keep those messages safe.
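The paper does not reproduce LLMediator's internals here, but the automated mode amounts to a two-step prompt pipeline: select an intervention type, then draft the corresponding message. Below is a minimal sketch of that flow; `call_llm`, the intervention list, and the prompts are all illustrative placeholders, not the paper's actual implementation.

```python
# Minimal sketch of one automated mediation step (hypothetical; the paper's
# LLMediator implementation may differ).

INTERVENTION_TYPES = [  # illustrative list, not the paper's exact taxonomy
    "no_intervention",
    "clarifying_question",
    "reframing",
    "de_escalation",
    "settlement_proposal",
]

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to any chat-completion client, return text."""
    raise NotImplementedError("wire up your LLM client here")

def mediate(dispute_transcript: str) -> dict:
    """Step 1: pick an intervention type. Step 2: draft the message."""
    selection_prompt = (
        "You are a neutral mediator in an online dispute.\n"
        f"Dispute so far:\n{dispute_transcript}\n\n"
        f"Choose exactly one intervention type from {INTERVENTION_TYPES} "
        "and reply with that label only."
    )
    intervention = call_llm(selection_prompt).strip()

    drafting_prompt = (
        f"As a neutral mediator, write a short, empathetic '{intervention}' "
        f"message for this dispute:\n{dispute_transcript}"
    )
    return {"intervention": intervention, "message": call_llm(drafting_prompt)}
```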
Figure 1: A screenshot from the LLMediator, showing a dispute prior to the mediator's intervention.
The methodology includes a blind evaluation comparing LLM performance with human mediators across five metrics. Human and LLM interventions are tested on 50 dispute scenarios, each crafted with diverse characteristics such as emotional intensity, complexity, confusion, and evidential challenges.
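The headline numbers reported later are "better or equivalent" rates from these blind comparisons. A minimal sketch of how such a rate could be tallied (toy data and hypothetical labels, not the paper's evaluation code):

```python
from collections import Counter

# Each blind judgment records whether the (unlabeled) LLM intervention was
# better than, equivalent to, or worse than the human one for the same dispute.
judgments = ["better", "equivalent", "worse", "better", "equivalent"]  # toy data

def better_or_equal_rate(judgments: list[str]) -> float:
    """Fraction of comparisons where the LLM was at least as good as the human."""
    counts = Counter(judgments)
    return (counts["better"] + counts["equivalent"]) / len(judgments)

print(f"{better_or_equal_rate(judgments):.0%}")  # prints 80% for the toy data
```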
Experimental Results
Initial findings suggest that LLMs exhibit strong mediation capabilities, matching or outperforming human mediators on several dimensions. In 62% of cases, the LLM selected an intervention type rated as good as or better than the human's choice, and in 84% of cases its drafted messages were rated as good as or better than the human-written ones.
Figure 2: Frequency of Intervention Types Chosen by LLM and Human
Figure 3: Bar chart showing the distribution of responses evaluating LLM performance relative to humans across the five evaluation metrics.
The paper examined both the chosen intervention types and the drafted messages, finding that LLMs consistently performed well, producing fluent, clear, and empathetic responses. Additionally, no harmful or hallucinated content was identified in the LLM-generated messages.
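Message safety is the paper's third research question, and the finding above reflects the screening of drafts before they count as usable interventions. The sketch below shows one hedged way such a check could be implemented; the prompt, the SAFE/UNSAFE protocol, and the `call_llm` placeholder (as in the earlier sketch) are assumptions, not the paper's method.

```python
def call_llm(prompt: str) -> str:
    """Placeholder chat-completion call, as in the earlier sketch."""
    raise NotImplementedError("wire up your LLM client here")

def screen_message(message: str, dispute_transcript: str) -> bool:
    """Hypothetical safety screen: flag harmful or unsupported (hallucinated)
    claims in a draft mediator message. Returns True only if the draft passes."""
    review_prompt = (
        "Review the draft mediator message below for (a) harmful or biased "
        "language and (b) factual claims not supported by the dispute record. "
        "Reply with exactly SAFE or UNSAFE.\n\n"
        f"Dispute record:\n{dispute_transcript}\n\n"
        f"Draft message:\n{message}"
    )
    return call_llm(review_prompt).strip().upper() == "SAFE"
```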
Limitations and Future Considerations
Despite these promising results, the paper acknowledges several limitations. Because the human annotators were not trained mediators, the comparison may be skewed. Furthermore, real-world mediation often involves ongoing, nuanced interactions that pre-set scenarios and a fixed intervention list cannot fully capture. Validating these findings with expert evaluations and incorporating more real-world dynamics remain essential next steps.
Conclusion
The research reveals significant potential for LLMs in dispute mediation, indicating that they can provide scalable, resource-effective solutions for Online Dispute Resolution (ODR) platforms. The models demonstrate a high capacity for understanding complex scenarios, drafting contextually appropriate messages, and acting impartially. Future research should integrate multimodal data, move to real-world testing, and probe AI's role in complex human interactions more deeply. These advances could make ODR more accessible and efficient, giving more people access to effective resolution methods and contributing to the justice system's evolution.