Words of Warmth Lexicon

Updated 13 November 2025

The Words of Warmth Lexicon is a comprehensive suite capturing association norms for trust, sociability, and warmth across over 26,000 English words.
It employs rigorous annotation methods and robust reliability metrics, including split-half correlations, to ensure accurate social perception ratings.
The lexicon supports quantitative research on linguistic bias, stereotype analysis, and language development with actionable insights into social cognition.

The Words of Warmth Lexicon is a large-scale suite of association norms capturing perceived trust, sociability, and warmth for over 26,000 common English words. Based on social psychological theory, the lexicon facilitates quantitative analysis of the dimensions of interpersonal perception, enables developmental and applied investigations, and supports nuanced studies of linguistic bias and stereotypes. Trust (T) and Sociability (S) ratings are derived directly from human annotators; Warmth (W) is defined, for each word, as whichever of the two scores has the larger absolute value.

1. Theoretical Foundations and Dimensions

Competence (C) and Warmth (W) constitute the primary dimensions for social cognition, as formulated by the Stereotype Content Model (Fiske et al. 2002). Warmth—a measure of perceived intent, encompassing friendliness and hostility—is further decomposed by recent research (Abele et al. 2016; Koch et al. 2024) into two components:

Trust (T): Morality, honesty, integrity, sincerity, fairness.
Sociability (S): Friendliness, gregariousness, conviviality.

Formally, each word $i$ in the lexicon is indexed with three real-valued scores $T_i$, $S_i$, $W_i$ on $[-3, +3]$. Trust and Sociability are empirically established through annotation; Warmth is operationalized as: $W_i = \begin{cases} T_i & \text{if } |T_i| \ge |S_i| \\ S_i & \text{otherwise} \end{cases}$
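The piecewise definition of Warmth can be written as a one-line selection rule. The following is a minimal sketch; the function name `warmth` is illustrative and not part of any released tooling.

```python
def warmth(trust: float, sociability: float) -> float:
    """Return whichever score has the larger magnitude.

    Ties (|T| == |S|) resolve to Trust, matching the >= in the definition.
    """
    return trust if abs(trust) >= abs(sociability) else sociability


print(warmth(2.5, 1.0))    # 2.5  (Trust dominates)
print(warmth(-0.5, -3.0))  # -3.0 (Sociability dominates)
print(warmth(1.0, -1.0))   # 1.0  (tie goes to Trust)
```

Note that the sign of the selected component is preserved, so a strongly negative Sociability score yields a strongly negative Warmth score even when Trust is mildly positive.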

The evolutionary and developmental literature indicates that warmth-based judgments emerge early in childhood, and that sociability precedes trust in early language acquisition.

2. Lexicon Construction and Reliability

2.1 Term Selection

The source vocabulary comprises approximately 44,000 unigrams from the NRC VAD Lexicon v2, filtered to exclude terms with near-neutral valence ( $-0.2 < \text{Valence} < +0.2$ ), yielding 26,188 emotionally salient unigrams.
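The valence filter can be illustrated with a short sketch. The word list and valence values below are hypothetical placeholders, and treating the boundary value 0.2 as retained follows from the strict inequalities in the exclusion range above.

```python
from typing import Dict, List


def select_salient_terms(valence_by_word: Dict[str, float],
                         threshold: float = 0.2) -> List[str]:
    """Keep words whose valence lies outside the open interval (-threshold, +threshold)."""
    return sorted(
        word for word, valence in valence_by_word.items()
        if abs(valence) >= threshold
    )


# Hypothetical valence values on a [-1, +1] scale, for illustration only.
toy_valence = {"joy": 0.9, "table": 0.05, "grief": -0.8, "edge": -0.2}
print(select_salient_terms(toy_valence))  # ['edge', 'grief', 'joy']
```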

2.2 Annotation Procedure

Ratings were crowdsourced via Amazon Mechanical Turk, restricting participation to native English speakers (69% USA, rest UK, Canada, India). Annotator demographics: mean age 39.2 years, 48% female and 52% male (self-reported). Each target was rated on 7-point bipolar scales for Trust and Sociability ( $-3$ = "very untrustworthy/unsociable", $+3$ = "very trustworthy/sociable", $0$ = "neither"). Task instructions explained the meaning of each dimension, gave examples, and prompted annotators to consult dictionaries for ambiguous items.

2.3 Quality Control and Aggregation

“Gold” control items (~2%) were used for real-time and silent accuracy feedback; annotators with sub-80% gold accuracy had their contributions excluded. Lexicon scores per word are aggregated as follows: $T_i = \frac{1}{n}\sum_{j=1}^n t_{ij}, \quad S_i = \frac{1}{n}\sum_{j=1}^n s_{ij}$, where $n$ is the number of retained annotations for word $i$. Warmth $W_i$ is assigned according to the component with greater absolute value.
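The exclusion and averaging steps can be sketched as follows. This is an illustration under stated assumptions: the text does not specify how gold accuracy is computed, so the sketch takes each annotator's accuracy as a given input, and the function name, annotator IDs, and ratings are all hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def aggregate_scores(ratings: Iterable[Tuple[str, str, float]],
                     gold_accuracy: Dict[str, float],
                     min_accuracy: float = 0.8) -> Dict[str, float]:
    """Average ratings per word, dropping annotators below the gold threshold.

    ratings: (annotator_id, word, score) triples on the [-3, +3] scale.
    gold_accuracy: annotator_id -> fraction of gold items answered acceptably
                   (assumed to be precomputed; unknown annotators are dropped).
    """
    by_word = defaultdict(list)
    for annotator, word, score in ratings:
        if gold_accuracy.get(annotator, 0.0) >= min_accuracy:
            by_word[word].append(score)
    return {word: mean(scores) for word, scores in by_word.items()}


# Hypothetical Trust ratings; annotator "a3" falls below 80% and is excluded.
toy_ratings = [
    ("a1", "friend", 3), ("a2", "friend", 2), ("a3", "friend", -3),
    ("a1", "liar", -3), ("a2", "liar", -2),
]
toy_accuracy = {"a1": 0.95, "a2": 0.85, "a3": 0.60}
print(aggregate_scores(toy_ratings, toy_accuracy))
```

The same function would be applied once to Trust ratings and once to Sociability ratings, after which Warmth follows from the larger-magnitude rule.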

2.4 Reliability Metrics

Split-half reliability (SHR) was assessed over 1,000 random splits with the following results:

Dimension	Mean Annots/Word	Spearman $\rho$	Pearson $r$
Sociability (S)	7.9	0.965	0.969
Trust (T)	11.4	0.943	0.957
Warmth (W)	8.8	0.965	0.974

Each correlation is reported with an uncertainty of $\pm 0.002$ .
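A split-half computation of this kind can be sketched in plain Python. The sketch assumes one common formulation: for every split, each word's annotations are randomly divided into two halves, the half-means are correlated across words, and the correlations are averaged over splits. Whether the published figures use exactly this procedure (or apply a correction such as Spearman-Brown) is not stated here, and the toy ratings are hypothetical.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation (assumes non-constant inputs)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation, computed as Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))


def split_half_reliability(ratings: Dict[str, List[float]],
                           n_splits: int = 1000,
                           seed: int = 0) -> Tuple[float, float]:
    """Return (mean Spearman rho, mean Pearson r) over random half-splits."""
    rng = random.Random(seed)
    words = [w for w, r in ratings.items() if len(r) >= 2]
    rhos, rs = [], []
    for _ in range(n_splits):
        first, second = [], []
        for w in words:
            shuffled = ratings[w][:]
            rng.shuffle(shuffled)
            half = len(shuffled) // 2
            first.append(mean(shuffled[:half]))
            second.append(mean(shuffled[half:]))
        rhos.append(spearman(first, second))
        rs.append(pearson(first, second))
    return mean(rhos), mean(rs)


# Hypothetical per-word annotation lists, for illustration only.
toy = {
    "friend": [3, 3, 2, 3], "ally": [2, 2, 1, 2],
    "liar": [-3, -2, -3, -3], "thief": [-2, -3, -2, -2],
}
rho, r = split_half_reliability(toy, n_splits=50)
print(round(rho, 3), round(r, 3))
```

With only a handful of annotations per word, each half-mean is noisy, which is why reliability estimates of this kind are averaged over many random splits.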

3. Lexicon Statistics and Distributions

3.1 Categorical Labeling

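Each word is assigned a categorical label on a 7-class scale, which for Warmth runs: Very Warm, Moderately Warm, Slightly Warm, Neutral, Slightly Cold, Moderately Cold, Very Cold. The table below uses the generic High/Low wording for all three dimensions. The class proportion breakdown is: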

Dimension	Very High	Moderately High	Slightly High	Neutral	Slightly Low	Moderately Low	Very Low
Trust (T)	2.8 %	13.8 %	12.3 %	38.6 %	13.3 %	14.6 %	4.5 %
Sociability (S)	11.2 %	12.0 %	12.7 %	16.4 %	13.4 %	27.0 %	7.4 %
Warmth (W)	12.3 %	17.0 %	12.7 %	10.5 %	11.7 %	26.9 %	8.8 %

3.2 Empirical Distributions

The distributions for T, S, and W are approximately zero-centered: $\bar{T} \approx 0.00, \quad \bar{S} \approx 0.00, \quad \bar{W} \approx 0.00$. Standard deviations for each scale are $\sim$ 1.2–1.3.

3.3 Inter-Dimension Correlations

Empirical inter-correlations for real-valued scores across 26k words are moderate to strong: $r_{T,S} \approx 0.68,\quad r_{W,T} \approx 0.92, \quad r_{W,S} \approx 0.87\quad(\text{all }p<0.001)$

3.4 Illustrative Word Examples

Dimension	Top 3 (score)	Bottom 3 (score)
Trust (T)	consoler (2.00), cohesiveness (2.18), ethicist (2.50)	narcissism (–3.00), horrible (–2.78), denigration (–2.44)
Sociability (S)	consoler (3.00), cohesiveness (3.00), wedding (2.88)	stalker (–3.00), gentrify (–1.75), outcast (–1.80)
Warmth (W)	consoler (3.00), cohesiveness (3.00), wedding (2.88)	stalker (–3.00), narcissism (–3.00), horrible (–2.78)

4. Developmental and Applied Insights

4.1 Age-of-Acquisition

Integrating W/T/S norms with age-of-acquisition ratings (Kuperman et al. 2012) and binning words into High/Neutral/Low at $\pm$ 1.5, developmental analyses show:

Children disproportionately acquire high-W and high-S words at early ages; the proportion of low-W/S words rises from age 3 to 17, with $\sim$ 50% of acquired W/S words being polar at each age.
High-T word acquisition remains stable until age 10, declining thereafter as low-T acquisition rises.
High-C word acquisition peaks near age 10, later decreasing; low-C word acquisition is highest in early years.

These patterns empirically support the primacy of valence and indicate that sociability is acquired before trust during language development.
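The binning step behind these analyses can be sketched as follows. The $\pm 1.5$ cutoff comes from the text above; treating the cutoff itself as inclusive, grouping by whole-year age, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

BINS = ("High", "Neutral", "Low")


def polarity_bin(score: float, cutoff: float = 1.5) -> str:
    """Map a [-3, +3] score to High / Neutral / Low (cutoff assumed inclusive)."""
    if score >= cutoff:
        return "High"
    if score <= -cutoff:
        return "Low"
    return "Neutral"


def bin_shares_by_age(entries: Iterable[Tuple[float, float]],
                      cutoff: float = 1.5) -> Dict[int, Dict[str, float]]:
    """entries: (age_of_acquisition, score) pairs -> {age: {bin: share}}."""
    counts = defaultdict(Counter)
    for age, score in entries:
        counts[math.floor(age)][polarity_bin(score, cutoff)] += 1
    shares = {}
    for age in sorted(counts):
        total = sum(counts[age].values())
        shares[age] = {b: counts[age][b] / total for b in BINS}
    return shares


# Hypothetical (age-of-acquisition, Warmth) pairs, for illustration only.
toy = [(3.2, 2.5), (3.9, 0.4), (3.0, 1.8), (3.5, -2.0), (10.1, -2.2), (10.7, 0.1)]
for age, share in bin_shares_by_age(toy).items():
    print(age, share)
```

Comparing the High and Low shares across ages is what supports statements such as the rising proportion of low-W words in later acquisition.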

4.2 Bias and Stereotype Research

Utilizing both direct lookup and co-occurrence ("co-term") methodologies with large Twitter datasets (Vishnubhotla & Mohammad 2022; Wahle et al. 2025), lexicon analysis reveals established stereotype and bias patterns:

Social Groups: muslim, jew, immigrant exhibit low direct W; elderly score high on W but low on C; criminal scores very low on W.
Gender Terms: direct scores show high W for all gender terms; father/mother have high C, grandmother low C. Co-term analysis of tweets: references to "you" use more high-C language, "we" more high-W.
In-group / Out-group: bilateral analysis of Canadians and Americans finds self-references use higher W/C co-terms, consistent with in-group favoritism.
Professions: direct scores—engineers, doctors, teachers high C; nurses and teachers high W; jobless very low. Co-term results: "CEO" higher C context than "engineer"; "doctor" co-terms display more low-C language than "nurse," evidencing context sensitivity.

A plausible implication is that the lexicon, when paired with co-term methods, provides a robust foundation for quantitative bias and stereotype investigations in digital discourse.
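A simplified version of the co-term idea can be sketched as follows: collect the words that appear alongside a target term, look up their lexicon scores, and average them. The whitespace tokenization, the function name, the example posts, and the scores are all hypothetical; the cited studies may define co-terms and aggregation differently.

```python
from statistics import mean
from typing import Dict, Iterable, Optional


def co_term_mean(posts: Iterable[str],
                 target: str,
                 scores: Dict[str, float]) -> Optional[float]:
    """Mean lexicon score of words co-occurring with `target` in the same post.

    Returns None when no scored co-term is found.
    """
    values = []
    for post in posts:
        tokens = post.lower().split()
        if target in tokens:
            values.extend(scores[t] for t in tokens if t != target and t in scores)
    return mean(values) if values else None


# Hypothetical posts and Warmth scores, for illustration only.
toy_posts = [
    "the nurse was kind and caring",
    "a rude nurse",
    "the engineer was brilliant",
]
toy_scores = {"kind": 2.6, "caring": 2.8, "rude": -2.4, "brilliant": 0.5}
print(co_term_mean(toy_posts, "nurse", toy_scores))
print(co_term_mean(toy_posts, "doctor", toy_scores))
```

Because the score of the target word itself is excluded, this measures the language used around a term, which can diverge from the term's direct lexicon score.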

5. Practical Integration, Limitations, and Ethical Considerations

5.1 Usage Guidance

Text scoring: For any text, scores can be assigned to each token for T, S, W, enabling calculation of mean, sum, or "polar" differential aggregates.
Comparative analysis: Researchers may examine relative differences (e.g., percent increases in high-W words) across temporal or group splits.
Bias/stereotype investigation: Co-term pairing (Turney 2002; Teodorescu & Mohammad 2023) facilitates measurement of W/C usage around target entities.
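The text-scoring guidance can be sketched with a direct lookup. The regex tokenizer, the function name, and the reading of the "polar" aggregate as the share of high-scoring words minus the share of low-scoring words (using the $\pm 1.5$ cutoff from Section 4.1) are assumptions; the three Warmth scores are taken from the examples in Section 3.4.

```python
import re
from typing import Dict, Optional


def score_text(text: str,
               lexicon: Dict[str, float],
               cutoff: float = 1.5) -> Optional[Dict[str, float]]:
    """Aggregate lexicon scores over the tokens of `text` found in `lexicon`.

    Returns None if no token matches. Negation and word sense are ignored.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    if not hits:
        return None
    high = sum(1 for s in hits if s >= cutoff)
    low = sum(1 for s in hits if s <= -cutoff)
    return {
        "n_matched": len(hits),
        "sum": sum(hits),
        "mean": sum(hits) / len(hits),
        "polar_diff": (high - low) / len(hits),
    }


# Warmth scores as listed in Section 3.4.
warmth_lexicon = {"wedding": 2.88, "stalker": -3.00, "horrible": -2.78}
print(score_text("The wedding was lovely, not horrible.", warmth_lexicon))
```

In the example, "not horrible" still contributes a strongly negative score, which illustrates why single-utterance scores are unreliable and aggregate, comparative use is preferable.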

5.2 Limitations and Considerations

The lexicon covers 26k unigrams, favoring U.S.-centric corpora.
Scores reflect predominant word senses; specialized or ambiguous terms may require re-annotation.
Annotator pool is skewed toward U.S., Canada, UK, India—demographic biases are possible.
Lexicon scores reflect common perceptions (association norms), not objective reality.
Not suitable for assessing single utterances; reliability requires aggregate analysis over multiple items.
Scores are context-sensitive; comparative framing is recommended.
Essentializing speakers should be avoided; focus should be on the use of warmth-related language in context.

All resources are released under terms prohibiting direct redistribution in large training corpora. The lexicon supports interdisciplinary research spanning social cognition, computational bias analysis, digital humanities, and sentiment modeling, and is intended to enrich the quantitative study of linguistic social perception.
