
Word Affect Intensities (1704.08798v2)

Published 28 Apr 2017 in cs.CL

Abstract: Words often convey affect -- emotions, feelings, and attitudes. Further, different words can convey affect to various degrees (intensities). However, existing manually created lexicons for basic emotions (such as anger and fear) indicate only coarse categories of affect association (for example, associated with anger or not associated with anger). Automatic lexicons of affect provide fine degrees of association, but they tend not to be as accurate as human-created lexicons. Here, for the first time, we present a manually created affect intensity lexicon with real-valued scores of intensity for four basic emotions: anger, fear, joy, and sadness. (We will subsequently add entries for more emotions such as disgust, anticipation, trust, and surprise.) We refer to this dataset as the NRC Affect Intensity Lexicon, or AIL for short. AIL has entries for close to 6,000 English words. We used a technique called best-worst scaling (BWS) to create the lexicon. BWS improves annotation consistency and obtains reliable fine-grained scores (split-half reliability > 0.91). We also compare the entries in AIL with the entries in the NRC VAD Lexicon, which has valence, arousal, and dominance (VAD) scores for 20K English words. We find that anger, fear, and sadness words, on average, have very similar VAD scores. However, sadness words tend to have slightly lower dominance scores than fear and anger words. The Affect Intensity Lexicon has applications in automatic emotion analysis in a number of domains such as commerce, education, intelligence, and public health. AIL is also useful in the building of natural language generation systems.

Citations (234)

Summary

  • The paper introduces a novel emotion lexicon using Best-Worst Scaling for nuanced intensity scores of anger, fear, joy, and sadness.
  • It demonstrates a split-half reliability over 0.91, underscoring the robustness of its annotation methodology against traditional rating biases.
  • Applications span sentiment analysis and NLG systems by capturing fine-grained emotional nuances across social media and formal texts.

Overview of the NRC Affect Intensity Lexicon

The paper presents the development of the NRC Affect Intensity Lexicon (AIL), a manually annotated resource that quantifies the intensity of four basic emotions—anger, fear, joy, and sadness—associated with nearly 6,000 English words. This lexicon stands out for utilizing Best-Worst Scaling (BWS), a method that generates fine-grained and reliable emotion intensity scores, thereby overcoming the challenges presented by coarse affect category data and annotation inconsistencies encountered in previous lexicons. The fusion of manual annotation techniques and the systematic approach of BWS marks a distinctive stride in the pursuit of capturing nuanced emotional intensity in language.

Methodology and Features

The lexicon was built with BWS: annotators are shown four words at a time (a 4-tuple) and asked to pick the word that conveys the most intensity of the given emotion and the word that conveys the least. This comparative setup yields reliable scores efficiently while reducing problems common to conventional rating scales, such as scale-region bias. The paper reports a split-half reliability above 0.91, meaning that scores computed independently from two halves of the annotations correlate strongly with each other.
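To make the annotation procedure concrete, here is a minimal sketch of how BWS judgments can be turned into real-valued scores and how a split-half check can be run. It assumes the commonly used counting procedure (share of times a word is picked as most intense minus share of times it is picked as least intense), which the summary above does not spell out. The function names, the annotation format, and the toy judgments are all hypothetical and are not taken from the lexicon.

```python
import random
from collections import defaultdict
from math import sqrt


def bws_scores(annotations):
    """Score each word as (#times best - #times worst) / #times shown.

    `annotations` is an iterable of (words, best, worst), where `words`
    is a tuple of the four items shown to the annotator (assumed format).
    Scores fall in [-1, 1].
    """
    best, worst, seen = defaultdict(int), defaultdict(int), defaultdict(int)
    for words, b, w in annotations:
        for word in words:
            seen[word] += 1
        best[b] += 1
        worst[w] += 1
    return {word: (best[word] - worst[word]) / seen[word] for word in seen}


def rescale(scores):
    """Linearly map scores from [-1, 1] to [0, 1]."""
    return {word: (s + 1) / 2 for word, s in scores.items()}


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def split_half_reliability(annotations, trials=100, seed=0):
    """Average correlation between scores from two random halves.

    For every 4-tuple, its annotations are shuffled and split in two;
    each half is scored separately. Assumes every tuple has at least
    two annotations so that both halves cover every word.
    """
    rng = random.Random(seed)
    by_tuple = defaultdict(list)
    for ann in annotations:
        by_tuple[ann[0]].append(ann)
    correlations = []
    for _ in range(trials):
        half_a, half_b = [], []
        for anns in by_tuple.values():
            shuffled = anns[:]
            rng.shuffle(shuffled)
            mid = len(shuffled) // 2
            half_a.extend(shuffled[:mid])
            half_b.extend(shuffled[mid:])
        sa, sb = bws_scores(half_a), bws_scores(half_b)
        common = sorted(set(sa) & set(sb))
        correlations.append(
            pearson([sa[w] for w in common], [sb[w] for w in common])
        )
    return sum(correlations) / len(correlations)


# Toy judgments for one anger 4-tuple (invented for illustration only).
anger_tuple = ("furious", "annoyed", "irritated", "calm")
toy = [
    (anger_tuple, "furious", "calm"),
    (anger_tuple, "furious", "calm"),
    (anger_tuple, "annoyed", "calm"),
]
scores = bws_scores(toy)
unit_scores = rescale(scores)
```

In this toy run, "calm" is always picked as least intense and so ends up at the bottom of the scale, while "furious" is picked as most intense in two of three judgments. A real split-half check would run over thousands of tuples, each judged by several annotators.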

In terms of dataset specifics, the lexicon includes terms prevalent in both standard English and social media texts, especially Twitter. By extending its scope to digital language forms, the AIL becomes applicable to varied commercial and research domains like sentiment analysis, opinion mining, and natural language generation (NLG) systems. This emphasis on including social media vernacular helps address the growing interest in analyzing public sentiment from online discourse.

Comparison with Existing Lexicons

The paper compares AIL with the NRC VAD Lexicon, which provides valence, arousal, and dominance scores for about 20,000 English words. The analysis indicates that words associated with anger, fear, and sadness have, on average, very similar VAD scores. Sadness words, however, tend to have slightly lower dominance scores than anger and fear words. These findings suggest avenues for further exploration of how intensity and the VAD dimensions interact.
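A comparison of this kind amounts to averaging VAD scores over the words associated with each emotion. The sketch below illustrates the idea under stated assumptions: the function name, the input layout, and the placeholder words and values are hypothetical and do not come from either lexicon.

```python
from statistics import mean


def mean_vad_by_emotion(emotion_words, vad):
    """Average (valence, arousal, dominance) over each emotion's words.

    `emotion_words` maps an emotion name to a list of words;
    `vad` maps a word to a (valence, arousal, dominance) triple.
    Words missing from `vad` are skipped.
    """
    result = {}
    for emotion, words in emotion_words.items():
        rows = [vad[w] for w in words if w in vad]
        if rows:
            # zip(*rows) regroups the triples into three columns.
            result[emotion] = tuple(mean(col) for col in zip(*rows))
    return result


# Placeholder entries with made-up values, for illustration only.
vad = {
    "w1": (0.2, 0.8, 0.6),
    "w2": (0.1, 0.9, 0.5),
    "w3": (0.2, 0.4, 0.2),
}
emotion_words = {"anger": ["w1", "w2"], "sadness": ["w3", "not_in_vad"]}
averages = mean_vad_by_emotion(emotion_words, vad)
```

With real data, the per-emotion averages could then be compared dimension by dimension, for example to check whether the dominance average for sadness words falls below that for anger and fear words.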

Implications and Applications

The lexicon has both practical and theoretical implications. Practically, it supports automatic emotion analysis in domains such as commerce, education, intelligence, and public health, where coarse "associated or not" labels are too blunt. It is also useful for building NLG systems that need fine-grained control over emotional tone.

From a theoretical standpoint, AIL contributes to the understanding of emotional expression at a granular level, laying groundwork for future research into how word-level affect relates to sentence composition and the overall emotion of a text. The reliability and discriminative power of BWS also make it a promising method for further lexicon development: the authors plan to add entries for more emotions such as disgust, anticipation, trust, and surprise.

Future Directions

Future research inspired by AIL could include extensive studies across different linguistic contexts and cultures, helping to determine universal patterns and language-specific variations in emotion expression. Additionally, the correlation between orthographic elements and emotional perception presents an intriguing line of inquiry, potentially enhancing linguistic and artificial intelligence systems' capabilities in emotion understanding.

In summary, the NRC Affect Intensity Lexicon shows that a comparative annotation method such as BWS can produce manually created emotion intensity scores that are both fine-grained and reliable. The resource connects empirical annotation work with practical applications in understanding human emotion through language.