MITRE at SemEval-2016 Task 6: Transfer Learning for Stance Detection (1606.03784v1)

Published 13 Jun 2016 in cs.AI and cs.CL

Abstract: We describe MITRE's submission to the SemEval-2016 Task 6, Detecting Stance in Tweets. This effort achieved the top score in Task A on supervised stance detection, producing an average F1 score of 67.8 when assessing whether a tweet author was in favor or against a topic. We employed a recurrent neural network initialized with features learned via distant supervision on two large unlabeled datasets. We trained embeddings of words and phrases with the word2vec skip-gram method, then used those features to learn sentence representations via a hashtag prediction auxiliary task. These sentence vectors were then fine-tuned for stance detection on several hundred labeled examples. The result was a high performing system that used transfer learning to maximize the value of the available training data.

Citations (190)

Summary

  • The paper introduces a transfer learning methodology using a four-layer RNN that achieved a macro-average F1 score of 67.8.
  • The study demonstrates that leveraging domain-specific pre-training with hashtag prediction significantly enhances stance detection accuracy.
  • The research underscores the practical benefits of transfer learning for social media analysis, offering insights for advancing NLP techniques.

Transfer Learning for Stance Detection in Social Media: An In-Depth Analysis

The paper "MITRE at SemEval-2016 Task 6: Transfer Learning for Stance Detection" introduces a cutting-edge system for automatic stance detection in social media messages, specifically focusing on tweets. The authors, Guido Zarrella and Amy Marsh, detail their submission for SemEval-2016 Task 6 (Subtask A), noting the effectiveness of their approach through a remarkable average F1 score of 67.8, securing the highest rank in the competition. The paper highlights a methodology based on transfer learning, utilizing a recurrent neural network (RNN) initialized with features pre-trained on expansive, unannotated text corpora.

Methodological Overview

The authors underscore the distinction between stance detection and sentiment analysis, emphasizing that the former captures an author's position on a topic rather than their emotional state. The paper's principal innovation lies in its application of transfer learning to leverage vast datasets, addressing challenges inherent in stance detection, such as figurative language, informal syntax, and scarcity of labeled data.

Their system architecture is a four-layer RNN (sketched in code after the list below), consisting of:

  • A projection layer with 256-dimensional word embeddings trained using the word2vec skip-gram model,
  • A recurrent layer of 128 Long Short-Term Memory (LSTM) units,
  • A densely connected layer with 128 Rectified Linear Units (ReLU),
  • A final softmax layer for classification into FAVOR, AGAINST, and NONE categories.
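
The following is a minimal sketch of this four-layer stack, written with tf.keras (the paper does not prescribe this framework); the vocabulary size and sequence length are illustrative placeholders, and in the actual system the embedding and LSTM weights are initialized from the pre-training described next rather than at random.

```python
# Sketch only: layer sizes follow the paper's description; everything else is illustrative.
from tensorflow.keras import layers, models

VOCAB_SIZE = 50_000   # placeholder; the paper's vocabulary is far larger
MAX_LEN = 30          # placeholder tweet length in tokens

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    # Projection layer: 256-dimensional word embeddings
    # (initialized from word2vec skip-gram vectors in the paper).
    layers.Embedding(VOCAB_SIZE, 256),
    # Recurrent layer: 128 LSTM units.
    layers.LSTM(128),
    # Densely connected layer: 128 ReLU units.
    layers.Dense(128, activation="relu"),
    # Softmax output over the three stance labels: FAVOR, AGAINST, NONE.
    layers.Dense(3, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```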

The network's efficacy stems from extensive pre-training: embeddings were learned from 218 million tweets, and the LSTM weights were refined through a hashtag prediction task, involving 197 task-relevant hashtags.
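
A rough sketch of the embedding pre-training step, assuming gensim's word2vec implementation; a toy corpus stands in for the 218 million tweets, and hyperparameters other than the 256-dimensional skip-gram setting are illustrative.

```python
# Sketch of skip-gram embedding pre-training, assuming gensim.
from gensim.models import Word2Vec

# Toy stand-in for the unlabeled tweet corpus (the paper used ~218M tweets).
tokenized_tweets = [
    ["climate", "change", "is", "real", "#semst"],
    ["praying", "for", "everyone", "today", "#blessed"],
    ["the", "abortion", "debate", "is", "on", "tonight"],
]

w2v = Word2Vec(
    sentences=tokenized_tweets,
    vector_size=256,   # matches the 256-dimensional projection layer
    sg=1,              # skip-gram, as in the paper
    window=5,
    min_count=1,       # raised in practice for a real corpus
    workers=4,
)

embedding_matrix = w2v.wv.vectors  # used to initialize the projection layer
```

In the paper, these vectors initialize the projection layer of an auxiliary network trained to predict which of the 197 task-relevant hashtags a tweet contains; the recurrent weights learned on that task are then transferred to the stance classifier before fine-tuning on the labeled examples.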

Evaluation and Results

The system was evaluated on SemEval-2016 Task 6, where it was tested on tweets from five topics with varying degrees of class imbalance. Using the official metric, a macro-averaged F1 over the FAVOR and AGAINST classes, the system achieved the best score among 19 competing entries; its cross-validation F1 of 71.1 on the training data was reasonably close to the 67.8 test score, indicating it did not severely overfit.
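
For concreteness, the official metric, macro-averaged F1 computed over only the FAVOR and AGAINST labels (NONE is excluded from the average), can be reproduced as below, assuming scikit-learn; the labels shown are toy examples.

```python
# Macro-F1 over FAVOR and AGAINST only, assuming scikit-learn.
from sklearn.metrics import f1_score

gold = ["FAVOR", "AGAINST", "NONE", "AGAINST", "FAVOR"]    # toy gold labels
pred = ["FAVOR", "AGAINST", "AGAINST", "AGAINST", "NONE"]  # toy predictions

score = f1_score(gold, pred, labels=["FAVOR", "AGAINST"], average="macro")
print(f"macro-F1 (FAVOR/AGAINST): {score:.3f}")
```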

The analysis sheds light on the conditions needed for successful transfer learning in stance detection. Notably, the choice of task-relevant hashtags for pre-training was crucial: substituting a broader set of merely popular hashtags significantly reduced classification performance. The need for in-domain labeled data also persisted: performance on each topic's majority stance class consistently exceeded performance on its minority class, underscoring the value of more balanced and diverse training data.

Implications and Future Directions

This research offers theoretical and practical contributions to NLP, particularly by demonstrating the viability of transfer learning for improving model performance in data-scarce domains. The model's dependence on careful, task-relevant hashtag selection highlights a key consideration for reusing learned features across NLP tasks.

Future exploration could delve into refining feature transfer mechanisms and selecting auxiliary tasks that better capture the diversity of stances. Moreover, the findings suggest evaluating alternative strategies for enriching low-prevalence stance representations in auxiliary datasets, thereby better equipping the system to handle a variety of opinion expressions.

In conclusion, this paper underscores the potential of transfer learning to strengthen stance detection systems, presenting a robust framework that adapts to the idiosyncrasies of social media language. Continued work in this direction is likely to yield increasingly nuanced analyses of public discourse across platforms.