NRC VAD Lexicon v2
- NRC VAD Lexicon v2 is an extensive resource that provides continuous VAD scores in the range [–1, 1] for over 55,000 English words and multi-word expressions (MWEs).
- It employs rigorous annotation methods with 9 independent ratings per term and high internal reliability (e.g., Spearman ρ up to 0.98).
- The lexicon is widely used in NLP, psychology, and digital humanities, enabling detailed sentiment analysis and affective modeling in diverse domains.
The NRC VAD Lexicon v2 is an extensive resource comprising human ratings for Valence (V), Arousal (A), and Dominance (D)—the principal affective dimensions uncovered by factor analysis in semantics. It contains over 55,000 English words and multi-word expressions (MWEs), with each term mapped to continuous VAD scores in the interval [–1, 1]. The lexicon is designed for broad usage in psychology, computational linguistics, public health, digital humanities, and related disciplines, offering fine-grained quantitative signals at the lexical level. The v2 release notably extends prior coverage with thousands of new unigrams and, for the first time, a systematically annotated set of MWEs (Mohammad, 30 Mar 2025, Mohammad, 25 Nov 2025).
1. Valence-Arousal-Dominance Framework
Valence (V), Arousal (A), and Dominance (D) constitute a tripartite framework for affective word meaning.
- Valence (V) quantifies pleasure-displeasure (positiveness-negativeness).
- Arousal (A) measures activation or passivity (excitation–calmness).
- Dominance (D) operationalizes power or agency (dominance–submission), also termed Competence (C) in social cognition contexts.
These dimensions are widely validated as the primary orthogonal axes underlying the semantic evaluation of words and phrases, supporting research into emotion regulation, social competence, and behavioral outcomes.
2. Lexical Coverage and Structure
The NRC VAD Lexicon v2 aggregates three datasets:
- All ≈ 20,000 unigrams from v1.0 sources, including ANEW, Warriner et al., General Inquirer, and NRC Emotion Lexicon.
- ≈ 25,000 additional unigrams from the Prevalence norms (words known by ≥ 70% of respondents, per Brysbaert et al.).
- ≈ 10,000 MWEs selected for high frequency and coverage from Muraki et al.'s concreteness ratings dataset.
The final resource comprises 44,928 unigrams and 10,205 MWEs for a total of 55,133 term–VAD triples (Mohammad, 25 Nov 2025). Each entry is represented by:
- term (string)
- valence, arousal, dominance (floats, each in [–1, 1])
Lexicon Example Format
| term | valence | arousal | dominance |
|---|---|---|---|
| banquet | 0.72 | 0.10 | 0.45 |
| funeral | –0.62 | 0.25 | –0.12 |
| fight | –0.34 | 0.81 | 0.67 |
| delicate | 0.20 | 0.05 | –0.53 |
| breath of fresh air | 0.80 | 0.40 | 0.30 |
| rock bottom | –0.75 | 0.60 | –0.50 |
Column values above are illustrative.
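A minimal sketch of reading entries in this layout into typed records follows. It assumes a tab-separated file with a header row whose column names match the table above; the `VadEntry` type, the `parse_lexicon` helper, and the sample rows are illustrative and not part of the lexicon's distribution.

```python
import csv
import io
from typing import NamedTuple


class VadEntry(NamedTuple):
    """One lexicon row: a term and its three scores in [-1, 1]."""
    term: str
    valence: float
    arousal: float
    dominance: float


def parse_lexicon(handle):
    """Read tab-separated rows (with a header) into a dict keyed by term."""
    lexicon = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        entry = VadEntry(row["term"], float(row["valence"]),
                         float(row["arousal"]), float(row["dominance"]))
        # Reject rows whose scores fall outside the documented range.
        if not all(-1.0 <= score <= 1.0 for score in entry[1:]):
            raise ValueError(f"score out of range for term {entry.term!r}")
        lexicon[entry.term] = entry
    return lexicon


# Illustrative rows only (same caveat as the table above).
SAMPLE = (
    "term\tvalence\tarousal\tdominance\n"
    "banquet\t0.72\t0.10\t0.45\n"
    "rock bottom\t-0.75\t0.60\t-0.50\n"
)
sample_lexicon = parse_lexicon(io.StringIO(SAMPLE))
```

Keying the dictionary by the full term string lets unigrams and MWEs share one lookup table.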
3. Annotation and Quality Control Methodologies
Term selection involved multi-source aggregation and strict prevalence criteria. For annotation:
- Each term received 9 independent ratings per dimension using a seven-point Likert-style scale, integer-valued from –3 to +3 with dedicated anchors (e.g., –3 = highly negative/inactive/submissive; +3 = highly positive/active/dominant).
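- Final scores are the item-wise mean of annotator responses, rescaled linearly to [–1, 1]. For a –3 to +3 scale this corresponds to dividing the mean rating by 3: $s_d(t) = \frac{1}{3n}\sum_{i=1}^{n} r_{i,d}(t)$, where $r_{i,d}(t)$ is annotator $i$'s rating of term $t$ on dimension $d$ and $n$ is the number of valid ratings.
- Annotators were recruited through Mechanical Turk and were native English speakers based in the U.S., UK, Canada, and India (average age ≈ 34, ≈ 53% female). Each term retained an average of 7.8–8.1 valid responses per dimension.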
- Quality control used ≈2% “gold” items interspersed with the rating sequences; annotators below 80% gold accuracy were excluded and their annotations discarded. Instructions emphasized predominant sense disambiguation.
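The gold-item filter and the rescaling step can be sketched as below. This is an illustration under stated assumptions, not the authors' pipeline: the source does not say how a gold response is judged correct, so the set of acceptable ratings per gold item is hypothetical, as are the function names and the toy data. The rescaling assumes the linear map from [–3, 3] to [–1, 1] is division by 3.

```python
from statistics import mean

SCALE_MAX = 3  # ratings are integers from -3 to +3


def passes_gold_check(responses, gold, threshold=0.80):
    """Return True if the annotator meets the gold-accuracy threshold.

    responses: {item: rating} for one annotator.
    gold: {item: set of acceptable ratings} (hypothetical structure).
    """
    seen = [item for item in responses if item in gold]
    if not seen:
        return True  # assumption: no gold items seen, nothing to judge
    correct = sum(responses[item] in gold[item] for item in seen)
    return correct / len(seen) >= threshold


def rescaled_scores(annotations, gold):
    """Average the ratings of retained annotators and rescale to [-1, 1]."""
    ratings = {}
    for responses in annotations.values():
        if not passes_gold_check(responses, gold):
            continue  # discard every annotation from this annotator
        for item, rating in responses.items():
            if item not in gold:
                ratings.setdefault(item, []).append(rating)
    return {item: mean(vals) / SCALE_MAX for item, vals in ratings.items()}


# Toy data, not taken from the lexicon.
gold = {"gold_item": {2, 3}}
annotations = {
    "ann1": {"gold_item": 3, "banquet": 2},
    "ann2": {"gold_item": 2, "banquet": 3},
    "ann3": {"gold_item": -3, "banquet": -3},  # fails the gold check
}
scores = rescaled_scores(annotations, gold)
```

In this toy run the third annotator is dropped, so the score for "banquet" is the mean of 2 and 3 divided by 3.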
4. Reliability Assessment
Internal reliability was quantified via repeated split-half correlation analyses (1,000 trials per dimension):
- Valence: Spearman ρ = 0.98, Pearson r = 0.99
- Arousal: Spearman ρ = 0.97, Pearson r = 0.98
- Dominance: Spearman ρ = 0.96, Pearson r = 0.96
These coefficients indicate excellent reproducibility and very high inter-rater agreement. Cronbach’s α is inferred to be similarly high (exceeding 0.95) (Mohammad, 30 Mar 2025, Mohammad, 25 Nov 2025).
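A generic split-half procedure of this kind can be sketched as follows. It is an illustration rather than the authors' exact protocol: the sketch randomly splits each term's ratings into two halves, averages each half, and correlates the two resulting score lists across terms, repeating over many trials. Whether the reported figures include a Spearman-Brown correction is not stated in the source, and none is applied here. All function names are hypothetical.

```python
import random
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def split_half_reliability(ratings_by_term, trials=1000, seed=0):
    """Mean (Spearman, Pearson) over random half-splits of each term's ratings.

    ratings_by_term: one list of ratings per term, each with >= 2 ratings.
    """
    rng = random.Random(seed)
    rhos, rs = [], []
    for _ in range(trials):
        half_a, half_b = [], []
        for ratings in ratings_by_term:
            shuffled = list(ratings)
            rng.shuffle(shuffled)
            mid = len(shuffled) // 2
            half_a.append(mean(shuffled[:mid]))
            half_b.append(mean(shuffled[mid:]))
        rhos.append(spearman(half_a, half_b))
        rs.append(pearson(half_a, half_b))
    return mean(rhos), mean(rs)
```

Calling `split_half_reliability` on the per-term rating lists for one dimension returns the two averaged coefficients; the fixed seed only makes the random splits repeatable.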
5. Emotional Compositionality in Multiword Expressions
NRC VAD Lexicon v2 supports the quantitative study of emotional compositionality in MWEs, particularly bigrams:
- For 8,330 bigrams, constituent words are binned by their raw V/A/D ratings.
- MWEs are grouped by the pair of bins of their constituents, and an aggregate MWE score is computed for each group.
- High constituent scores predict higher MWE VAD.
- A fraction of MWEs display noncompositional emotional meaning: 4.79% of low-valence bigrams originate from neutral–neutral constituents.
This supports both the compositional and idiomatic transmission of affective content in phrasal constructions (Mohammad, 25 Nov 2025).
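The binning-and-grouping analysis can be illustrated with a small sketch. The bin boundaries used in the study are not given here, so the three-way split at ±1/3 is an assumption, as are the function names and the toy scores (which are not lexicon values).

```python
from collections import defaultdict
from statistics import mean


def bin_score(score, low=-1 / 3, high=1 / 3):
    """Map a score in [-1, 1] to a coarse bin (thresholds are hypothetical)."""
    if score <= low:
        return "low"
    if score >= high:
        return "high"
    return "neutral"


def aggregate_by_constituent_bins(unigram_scores, bigram_scores):
    """Mean bigram score for each (first-word bin, second-word bin) pair."""
    groups = defaultdict(list)
    for bigram, score in bigram_scores.items():
        words = bigram.split()
        # Skip anything that is not a bigram with both constituents rated.
        if len(words) != 2 or any(w not in unigram_scores for w in words):
            continue
        key = (bin_score(unigram_scores[words[0]]),
               bin_score(unigram_scores[words[1]]))
        groups[key].append(score)
    return {key: mean(vals) for key, vals in groups.items()}


# Toy valence scores for illustration only.
unigram_valence = {"rock": 0.1, "bottom": -0.1, "fresh": 0.7,
                   "start": 0.5, "dead": -0.9, "end": -0.4}
bigram_valence = {"rock bottom": -0.75, "fresh start": 0.8, "dead end": -0.7}
by_bins = aggregate_by_constituent_bins(unigram_valence, bigram_valence)
```

In the toy data, "rock bottom" lands in the neutral–neutral group despite its low valence, which is the kind of noncompositional case described above.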
6. Lexicon Utilization and Computational Integration
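Direct application involves token-level or expression-level mapping of text segments into VAD vectors. For a fragment $T$ containing $n$ matched terms $t_1, \dots, t_n$, the core summary scores are the per-dimension means:

$$\bar{V}(T) = \frac{1}{n}\sum_{i=1}^{n} V(t_i), \qquad \bar{A}(T) = \frac{1}{n}\sum_{i=1}^{n} A(t_i), \qquad \bar{D}(T) = \frac{1}{n}\sum_{i=1}^{n} D(t_i)$$

Other metrics, such as spread or proportions of high/low VAD, can be derived analogously. Python code snippets (see Mohammad, 30 Mar 2025) illustrate lexicon loading, text tokenization, and batch scoring.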
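```python
import re

import pandas as pd

# Load the lexicon indexed by term. keep_default_na=False stops pandas from
# reading terms such as "null" or "nan" as missing values.
lex = pd.read_csv('NRC-VAD-v2.tsv', sep='\t', index_col='term',
                  keep_default_na=False)


def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping apostrophes."""
    return re.findall(r"\b\w+(?:'\w+)?\b", text.lower())


def text_vad(text, lexicon_df):
    """Return the mean V, A, and D over all tokens found in the lexicon."""
    tokens = tokenize(text)
    scores = [lexicon_df.loc[t] for t in tokens if t in lexicon_df.index]
    if not scores:
        # No token matched the lexicon.
        return {'V': None, 'A': None, 'D': None}
    df = pd.DataFrame(scores)
    return {'V': df['valence'].mean(),
            'A': df['arousal'].mean(),
            'D': df['dominance'].mean()}
```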
Best practices include longest-match scanning for MWEs, weighting by token frequency, explicit treatment of negation, and use of VAD as continuous features in downstream modeling.
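Longest-match scanning can be sketched as a greedy left-to-right search that prefers the longest lexicon entry starting at each token. The sketch below is one possible approach, not a prescribed implementation: `match_terms` and the `max_len` cap of four tokens are assumptions, and the lexicon is represented simply as a set of term strings. Negation handling and frequency weighting are not covered.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+(?:'\w+)?", text.lower())


def match_terms(tokens, lexicon, max_len=4):
    """Greedy longest-match scan over tokens.

    lexicon: any container supporting `in` for space-joined terms.
    max_len: longest MWE length tried, in tokens (hypothetical cap).
    """
    matched = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon:
                matched.append(candidate)
                i += n  # consume the whole match so parts are not recounted
                break
        else:
            i += 1  # no entry starts at this token
    return matched


terms = {"breath of fresh air", "fresh", "air", "funeral"}
found = match_terms(tokenize("The funeral was no breath of fresh air."), terms)
```

Consuming the full match keeps "fresh" and "air" from being scored separately when they occur inside "breath of fresh air".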
7. Disciplinary Applications and Impact
NRC VAD Lexicon v2 is employed in:
- NLP and Computational Linguistics: feature extraction for sentiment/emotion analysis, VAD prediction benchmarks, enhancement of affect-aware embeddings, and lexicon-driven study of compositionality.
- Psychology & Social Cognition: stereotype content research, developmental studies, mind–body affective mapping, and emotion regulation.
- Public Health & Social Sciences: tracking anxiety, mood, stress in population-level communication; blending with domain-specific lexicons.
- Digital Humanities & Literary Studies: tracing narrative affect arcs; quantifying emotional techniques across genres.
- Political Science & Discourse Analysis: evaluating power dynamics, persuasion, and framing in political/corporate text.
- Cross‐cultural Studies: leveraging VAD translations for comparative work in over 100 languages.
A plausible implication is that the resource’s scale and reliability enable novel studies into MWE semantics, the lexical encoding of emotion and competence, and the fine-grained modeling of collective affective states (Mohammad, 30 Mar 2025, Mohammad, 25 Nov 2025). Terms of use require that the lexicon not be re-distributed in raw form to prevent uncurated scraping. The dataset is available for research at http://saifmohammad.com/WebPages/nrc-vad.html.
8. Summary and Availability
NRC VAD Lexicon v2 delivers high-coverage, high-reliability VAD ratings for English words and MWEs as a ready-to-integrate UTF-8 encoded text file (.csv or .tsv) with explicit scoring methodology and robust quality control steps. Its broad domain relevance and methodological rigor make it an indispensable tool for affective-semantic analysis across psychology, computational linguistics, social science, and digital humanities (Mohammad, 30 Mar 2025, Mohammad, 25 Nov 2025).