- The paper introduces a novel framework that leverages politeness and rhetorical markers to forecast conversational failure.
- It combines machine learning with crowdsourced annotations from Wikipedia talk pages to analyze early linguistic cues.
- Combining these linguistic features with a toxicity classifier raises prediction accuracy from 61.6% (pragmatic features alone) to 64.9%, offering actionable signals for online moderation.
Insights into Predicting Conversational Derailment in Online Settings
The paper, "Conversations Gone Awry: Detecting Early Signs of Conversational Failure," explores the challenging and significant problem of predicting conversational derailment in online platforms. Authored by a collaboration between Cornell University and Jigsaw researchers, this work introduces a novel task of detecting early indicators of potential antisocial behavior within online conversations, such as on Wikipedia talk pages. Unlike traditional approaches, which aim to identify toxic behavior after its occurrence, this research proposes a proactive approach aimed at salvaging conversations that may eventually turn awry.
Computational Framework for Early Detection
The paper develops a framework that leverages pragmatic and rhetorical devices present at the beginning of conversations to predict their future trajectory. Specifically, the paper focuses on politeness strategies and rhetorical prompts that form the linguistic scaffold of a conversation's initial exchanges. These pragmatic markers are analyzed to distinguish conversations likely to remain civil from those that degenerate into negative interactions, such as personal attacks.
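To make the feature idea concrete, the following minimal sketch approximates a few such pragmatic cues with hand-picked regular expressions. The cue lexicons and feature names here are illustrative assumptions; the paper itself draws on an established politeness-strategy framework and on prompt types learned from data rather than fixed keyword lists.

```python
import re

# Illustrative cue lexicons (assumptions, not the paper's feature set).
GRATITUDE = re.compile(r"\b(thanks?|thank you|appreciate)\b", re.I)
GREETING = re.compile(r"\b(hi|hello|hey)\b", re.I)
HEDGES = re.compile(r"\b(perhaps|maybe|possibly|i think|it seems)\b", re.I)
SECOND_PERSON = re.compile(r"\b(you|your)\b", re.I)

def extract_pragmatic_features(comment: str) -> dict:
    """Binary indicators of a few pragmatic cues in one opening comment."""
    sentences = re.split(r"(?<=[.!?])\s+", comment.strip())
    return {
        "gratitude": int(bool(GRATITUDE.search(comment))),
        "greeting": int(bool(GREETING.search(comment))),
        "hedge": int(bool(HEDGES.search(comment))),
        "direct_question": int(any(s.endswith("?") for s in sentences)),
        "second_person_start": int(bool(SECOND_PERSON.match(comment.strip()))),
    }

print(extract_pragmatic_features(
    "Why did you revert my edit? You clearly didn't read it."))
```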
Data Annotation and Controlled Setting
To explore this phenomenon, the researchers compiled a substantial dataset from Wikipedia's talk page discussions. This dataset was curated using machine learning to pre-select candidates and crowdsourced human annotation to confirm the presence of personal attacks or sustained civility. Importantly, a controlled setting was established to mitigate topical confounds, allowing the linguistic cues themselves, rather than the topic under discussion, to be studied more precisely.
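The matching logic behind this controlled setting can be illustrated with a short sketch that pairs each derailed conversation with a civil one drawn from the same talk page. The record format and field names below are hypothetical, not the paper's actual data schema.

```python
import random
from collections import defaultdict

# Hypothetical records: each conversation carries the talk page it came from
# and a crowdsourced label for whether it later contains a personal attack.
conversations = [
    {"id": "c1", "page": "Talk:Chess", "attack": True},
    {"id": "c2", "page": "Talk:Chess", "attack": False},
    {"id": "c3", "page": "Talk:Jazz", "attack": True},
    {"id": "c4", "page": "Talk:Jazz", "attack": False},
]

def build_matched_pairs(convos, seed=0):
    """Pair each derailed conversation with a civil one from the same page,
    approximating a control for topical confounds."""
    rng = random.Random(seed)
    by_page = defaultdict(lambda: {"attack": [], "civil": []})
    for c in convos:
        by_page[c["page"]]["attack" if c["attack"] else "civil"].append(c)
    pairs = []
    for groups in by_page.values():
        civil_pool = groups["civil"][:]
        rng.shuffle(civil_pool)
        pairs.extend(zip(groups["attack"], civil_pool))
    return pairs

print(build_matched_pairs(conversations))
```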
Analysis of Pragmatic Devices
Through the use of log-odds ratios, the paper identifies several pragmatic devices as indicative of potential conversational failure. Markers of linguistic directness, such as direct questions and sentences that open with second-person pronouns, correlate with derailment, suggesting potential underlying hostility. In contrast, politeness markers such as gratitude, greetings, and hedges are more commonly associated with conversations that remain on track. This distinction provides insight into the nuanced role of language features in predicting the outcome of conversations.
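For readers unfamiliar with the statistic, a smoothed log-odds ratio simply compares how often a cue appears in derailed versus on-track openings. The estimator and the example counts below are illustrative assumptions, not figures from the paper.

```python
import math

def log_odds_ratio(count_a, total_a, count_b, total_b, alpha=0.5):
    """Smoothed log-odds ratio of a cue appearing in group A vs. group B.
    Positive values mean the cue is more associated with group A."""
    odds_a = (count_a + alpha) / (total_a - count_a + alpha)
    odds_b = (count_b + alpha) / (total_b - count_b + alpha)
    return math.log(odds_a) - math.log(odds_b)

# Illustrative counts (not from the paper): a cue appears in 120 of 600
# derailed openings versus 70 of 600 on-track openings.
print(round(log_odds_ratio(120, 600, 70, 600), 3))
```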
Predictive Performance and Implications
The paper introduces a balanced prediction task, demonstrating the feasibility of using linguistic features from the initial exchange to predict conversational outcomes. The combination of prompt types and politeness strategies achieves an accuracy of 61.6%, indicating that these features capture some of the intuition humans rely on when judging a conversation's trajectory. Interestingly, when these pragmatic features are combined with a trained toxicity classifier, accuracy reaches 64.9%, providing a benchmark for future work in the area.
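A minimal sketch of such a prediction setup is shown below, fitting a logistic regression over cue-indicator features with cross-validation. The stand-in data, feature dimensionality, and classifier settings are assumptions rather than the paper's exact experimental configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Assume X holds pragmatic feature vectors extracted from the opening
# exchange of each conversation and y marks whether it later derailed.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 10)).astype(float)  # stand-in features
y = rng.integers(0, 2, size=200)                       # stand-in labels

clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"mean CV accuracy: {scores.mean():.3f}")
```

On the real matched dataset, accuracy on random stand-in labels like these would of course be near chance; the point of the sketch is only the shape of the pipeline, not the reported numbers.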
Practical and Theoretical Implications
The research presented in this paper has significant implications for both practical applications and theoretical advancements. Practically, automated systems developed through this framework could serve as interventions in online platforms, facilitating early detection and mitigation of conversational derailment to foster healthier online communities. Theoretically, the findings contribute to a deeper understanding of the pragmatics involved in conversational dynamics, highlighting the predictive power of linguistic cues in social interactions.
Future Directions
While the paper offers promising results, it also opens avenues for further exploration, particularly concerning the causal mechanisms underlying conversational derailment and the design of automated interventions. Expanding this framework to online contexts beyond Wikipedia and incorporating richer models that account for entire conversation dynamics could significantly enhance the understanding and prevention of toxic behavior in digital communities.
In summary, this paper presents a thoughtful and systematic approach to understanding and predicting conversational derailment, leveraging linguistic analysis to offer early warning signals that can be used to maintain constructive dialogue in online environments.