Human-Judgment–Informed Embeddings

Updated 7 April 2026

Human-judgment–informed embeddings are representational spaces structured using empirical similarity judgments to capture global semantic relationships.
They combine linear alignment, projection, and supervised pruning techniques to improve tasks like few-shot learning, anomaly detection, and cross-domain generalization.
Applications span vision, language, and multimodal domains, where efficient hybrid strategies bring computational models into closer alignment with human cognition.

Human-judgment–informed embeddings are representational spaces whose structure is shaped or constrained directly by empirical human assessments of similarity, feature ratings, or other semantic organization, with the goal of making computational models more aligned with human cognition and perception. Methods span vision, language, and multimodal domains; they range from hybrid training objectives to post-hoc projection, pruning, or retrofitting, and address a spectrum of tasks from few-shot learning to interpretability, cross-linguistic modeling, and scientific discovery.

1. Foundations and Motivations

Classical neural network embeddings are organized by training objectives that locally cluster similar inputs but do not explicitly shape the global geometry to match human conceptual structure. Human similarity judgments, elicited via triplets, pairwise ratings, or feature norms, often reveal organization (taxonomic, contextual, or functional) that is only partially recovered by unsupervised or supervised representations. The central motivation for human-judgment–informed embeddings is to inject global structure reflecting high-level human inductive biases, thus improving alignment for few-shot learning, anomaly detection, semantic interpretability, and cognitive plausibility (Muttenthaler et al., 2023, Roads et al., 2020, Grand et al., 2018).

This paradigm addresses three key issues:

Global structure vs. local pairing: Standard objectives (e.g., contrastive loss) shape only local neighborhoods; human judgment constraints enforce semantically coherent arrangement at the global level (Muttenthaler et al., 2023, Attarian et al., 2020).
Supervision scarcity and context: Collecting O(N²) or O(N³) human pairwise/triplet data is expensive. Hybrid methods and projection techniques minimize human effort, while context-specific or language-informed supervision handles domain shifts and cross-linguistic variation (Leviant et al., 2015, Iordan et al., 2019, Marjieh et al., 2022).
Interpretability and downstream utility: Beyond alignment, understanding which dimensions or features map to specific human-endorsed attributes is necessary for transparency, fairness, and scientific insight (Manrique et al., 2023, Erk et al., 2024).

2. Core Methodologies

2.1 Linear and Nonlinear Alignment

Several frameworks optimize linear (and, less commonly, nonlinear) transformations to align neural embeddings with human judgments:

Global-Local Transform (gLocal): A linear transformation $M \in \mathbb{R}^{p \times p}$ is learned such that transformed embeddings $\tilde{x}_i = M x_i$ maximize the likelihood of triplet judgments,

$\mathcal{L}_{\text{global}}(M) = -\frac{1}{n}\sum_{s=1}^n\log\frac{\exp S_{a_sb_s}^\dagger(M)}{\exp S_{i_s j_s}^\dagger(M) + \exp S_{i_s k_s}^\dagger(M) + \exp S_{j_s k_s}^\dagger(M)}$

while preserving original local neighborhoods via a cross-entropy “local” loss

$\mathcal{L}_{\text{local}}(M) = -\frac{1}{m(m-1)}\sum_{i=1}^m\sum_{j\neq i}\sigma(S^\ast,i,j;\tau)\log\sigma(S^\dagger(M),i,j;\tau)$

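where $\sigma(\cdot)$ parametrizes a local contrastive distribution over each item's neighbors with temperature $\tau$. Here $S^\ast$ denotes similarities computed from the original embeddings, $S^\dagger(M)$ those computed from the transformed embeddings, and $(a_s, b_s)$ the pair that humans chose as most similar in triplet $\{i_s, j_s, k_s\}$. Regularization enforces that $M$ stays close to a scaled identity. The global-local objective significantly improves few-shot classification and anomaly detection (see Section 3) (Muttenthaler et al., 2023).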
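The two loss terms above can be evaluated numerically as in the following minimal sketch. It is not the authors' implementation. Several details are assumptions made for illustration: dot-product similarity, the triplet ordering convention (the first two indices form the chosen pair), the weights `alpha` and `lam`, and the squared-Frobenius form of the penalty toward a scaled identity.

```python
import numpy as np

def transformed_similarity(X, M):
    """Dot-product similarities between transformed embeddings M x_i (assumed similarity)."""
    Z = X @ M.T
    return Z @ Z.T

def global_loss(S, triplets):
    """Mean negative log-likelihood of the human-chosen pair in each triplet.

    Assumed convention: each row (i, j, k) is ordered so that (i, j) is the
    pair judged most similar, i.e. (a_s, b_s) = (i_s, j_s).
    """
    i, j, k = triplets.T
    logits = np.stack([S[i, j], S[i, k], S[j, k]], axis=1)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_lik = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_lik.mean())

def neighbor_log_softmax(S, tau):
    """Row-wise log-softmax over j != i with temperature tau."""
    logits = S / tau                      # new array, safe to modify in place
    np.fill_diagonal(logits, -np.inf)     # exclude self-similarity
    logits = logits - logits.max(axis=1, keepdims=True)
    return logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

def local_loss(S_orig, S_new, tau=1.0):
    """Cross-entropy between original and transformed neighbor distributions."""
    m = S_orig.shape[0]
    off = ~np.eye(m, dtype=bool)          # off-diagonal entries only
    log_p = neighbor_log_softmax(S_orig, tau)
    log_q = neighbor_log_softmax(S_new, tau)
    return float(-(np.exp(log_p[off]) * log_q[off]).sum() / (m * (m - 1)))

def glocal_objective(M, X, triplets, alpha=0.5, lam=0.1, tau=1.0):
    """Hypothetical combination: (1 - alpha) * global + alpha * local + lam * penalty."""
    S_orig = X @ X.T
    S_new = transformed_similarity(X, M)
    p = M.shape[0]
    scale = np.trace(M) / p               # closest scaled identity in Frobenius norm
    penalty = np.sum((M - scale * np.eye(p)) ** 2)
    return ((1 - alpha) * global_loss(S_new, triplets)
            + alpha * local_loss(S_orig, S_new, tau)
            + lam * penalty)

# Toy data: 6 random embeddings in 4 dimensions and two made-up triplets.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
triplets = np.array([[0, 1, 2], [3, 4, 5]])
loss_at_identity = glocal_objective(np.eye(4), X, triplets)
```

In practice the objective would be minimized over $M$ with a gradient-based optimizer; the sketch only evaluates it for a given $M$.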

Expressive Linear Transforms: Extending beyond diagonal scaling (“dilation”), full (possibly asymmetric) linear maps are learned on high-dimensional pretrained embeddings (e.g., VGG16 activations) to maximize log-likelihood on held-out human triplet choices, yielding accuracy gains from 72% (baseline) up to 89% (full asymmetric) (Attarian et al., 2020). Asymmetric maps capture the empirically observed asymmetry of human similarity (i.e., similarity(a, b)≠similarity(b, a)).
Hybrid Human–Machine Embedding (SNaCK): Given an initial similarity kernel $K$ (e.g., CNN or word-vector), triplet constraints $T$ from human judgments, and trade-off λ, minimize

$L(Y) = (1-\lambda) L_{\text{machine}}(Y;K) + \lambda L_{\text{human}}(Y;T)$

where $L_{\text{machine}}$ matches the structure of $K$ (via t-SNE) and $L_{\text{human}}$ attempts to satisfy the triplet inequalities in $T$ (via t-STE) (Wilber et al., 2015). This enables the embedding to preserve machine-discovered fine structure while carving out high-level concepts with sparse human input.
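As a rough illustration of how the two SNaCK terms combine, the sketch below evaluates a t-SNE-style KL term and a t-STE-style triplet term for a candidate embedding $Y$. It simplifies the original method in ways that should be read as assumptions: the kernel $K$ is normalized directly into a joint distribution (no perplexity calibration), the triplet term is averaged rather than summed, and each triplet `(i, j, k)` is taken to mean "i is closer to j than to k".

```python
import numpy as np

def sq_dists(Y):
    """Pairwise squared Euclidean distances between rows of Y."""
    sq = np.sum(Y ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (Y @ Y.T)
    return np.maximum(D, 0.0)             # clip tiny negatives from round-off

def tsne_kl(Y, K, eps=1e-12):
    """KL(P || Q): P from the machine kernel K, Q from a Student-t kernel on Y."""
    n = K.shape[0]
    off = ~np.eye(n, dtype=bool)
    P = np.where(off, K, 0.0)
    P = P / P.sum()                       # assumes K is nonnegative with positive mass
    W = np.where(off, 1.0 / (1.0 + sq_dists(Y)), 0.0)
    Q = W / W.sum()
    return float(np.sum(P[off] * np.log((P[off] + eps) / (Q[off] + eps))))

def tste_loss(Y, triplets, dof=1.0):
    """Mean negative log-probability that each triplet (i, j, k) is satisfied."""
    D = sq_dists(Y)
    i, j, k = triplets.T
    power = -(dof + 1.0) / 2.0
    t_ij = (1.0 + D[i, j] / dof) ** power
    t_ik = (1.0 + D[i, k] / dof) ** power
    return float(-np.mean(np.log(t_ij / (t_ij + t_ik))))

def snack_loss(Y, K, triplets, lam=0.5):
    """L(Y) = (1 - lam) * L_machine(Y; K) + lam * L_human(Y; T)."""
    return (1.0 - lam) * tsne_kl(Y, K) + lam * tste_loss(Y, triplets)

# Toy example: three points in 2-D, one made-up triplet, kernel built from Y itself.
Y = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
K = 1.0 / (1.0 + sq_dists(Y))
T = np.array([[0, 1, 2]])
value = snack_loss(Y, K, T, lam=0.5)
```

Setting `lam=0` reduces the objective to the machine term alone and `lam=1` to the human triplet term alone, which matches the role of λ as a trade-off.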

2.2 Projection and Feature-Axis Modeling

Semantic Projection: Define a feature axis in embedding space via antipodal anchor words (e.g., “big” vs “small”), calculate the dimension vector as the mean difference of multiple synonym pairs, and project object embeddings onto this axis; the resulting scalar correlates strongly with human feature ratings (median Pearson r=0.47, adjusted r=0.61, order-consistency 90%) (Grand et al., 2018). This method is linear, interpretable, and generalizes to arbitrary features as long as anchors are available.
Interpretable Dimension Adjustment: Fit embedding directions using both seed-based difference vectors and human ratings, optimizing a loss that matches projections of word vectors to z-scored human judgments (with scale and bias). Hybrid objectives incorporate both explicit seed anchors and direct dimension alignment via cosine similarity, yielding state-of-the-art rank order accuracy across a diverse set of semantic properties (FIT+S, r⁺-acc=0.80, MSE=0.7 for object properties) (Erk et al., 2024).
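Semantic projection as described above can be sketched in a few lines. The embeddings here are made-up three-dimensional toy vectors, not real word vectors, and pairing every positive anchor with every negative anchor when averaging differences is an assumption about the details.

```python
import numpy as np

# Hypothetical toy embeddings; real use would load pretrained word vectors.
emb = {
    "big":      np.array([ 1.0,  0.0, 0.2]),
    "large":    np.array([ 0.9,  0.1, 0.0]),
    "small":    np.array([-1.0,  0.0, 0.1]),
    "tiny":     np.array([-0.9, -0.1, 0.0]),
    "elephant": np.array([ 0.8,  0.5, 0.3]),
    "dog":      np.array([ 0.1,  0.6, 0.1]),
    "mouse":    np.array([-0.7,  0.4, 0.2]),
}

def feature_axis(emb, pos_words, neg_words):
    """Average the difference vectors (positive anchor - negative anchor)."""
    diffs = [emb[p] - emb[n] for p in pos_words for n in neg_words]
    return np.mean(diffs, axis=0)

def project(emb, words, axis):
    """Scalar projection of each word vector onto the unit-length feature axis."""
    unit = axis / np.linalg.norm(axis)
    return {w: float(emb[w] @ unit) for w in words}

size_axis = feature_axis(emb, ["big", "large"], ["small", "tiny"])
scores = project(emb, ["elephant", "dog", "mouse"], size_axis)
ranking = sorted(scores, key=scores.get, reverse=True)
```

The resulting scalars would then be correlated with human feature ratings for the same items to assess how well the axis recovers the feature.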

2.3 Sparse Supervised Feature Selection

Feature Pruning via Supervision: In domain-specific settings, supervised learning is used to select a minimal subset of embedding dimensions (typically 20–40% of the original space) that maximizes the Spearman correlation between model and human pairwise similarity matrices. The resulting pruned subspace enhances both predictive power and interpretability, with axes that can be “named” via PMI or probed for downstream factors (Manrique et al., 2023).
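One simple way to realize such a selection is greedy forward search, sketched below. This is an illustrative stand-in rather than the cited procedure: the forward-selection strategy, the use of cosine similarity, the stopping rule, and the function names are all assumptions.

```python
import numpy as np

def average_ranks(a):
    """Ranks with ties replaced by their average rank."""
    a = np.asarray(a, dtype=float)
    order = np.argsort(a, kind="stable")
    ranks = np.empty(a.size)
    ranks[order] = np.arange(a.size)
    _, inv, counts = np.unique(a, return_inverse=True, return_counts=True)
    return (np.bincount(inv, weights=ranks) / counts)[inv]

def spearman(a, b):
    """Spearman correlation as the Pearson correlation of tie-aware ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    if ra.std() == 0 or rb.std() == 0:
        return 0.0                         # undefined for constant input; treat as 0
    return float(np.corrcoef(ra, rb)[0, 1])

def cosine_upper(X):
    """Upper-triangle entries of the cosine similarity matrix of the rows of X."""
    norms = np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    Z = X / norms
    return (Z @ Z.T)[np.triu_indices(X.shape[0], k=1)]

def greedy_prune(X, human_sim, max_dims):
    """Add one dimension at a time while the Spearman correlation keeps improving."""
    target = np.asarray(human_sim, dtype=float)[np.triu_indices(X.shape[0], k=1)]
    selected, best = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_dims:
        scores = [spearman(cosine_upper(X[:, selected + [d]]), target)
                  for d in remaining]
        top = int(np.argmax(scores))
        if scores[top] <= best:            # no candidate improves alignment
            break
        best = scores[top]
        selected.append(remaining.pop(top))
    return selected, best

# Toy example: human similarity depends only on the sign of dimension 0.
X = np.array([[1, 1, 1], [2, -1, -1], [-1, 2, -1], [-3, -1, 2]], dtype=float)
s = np.array([1.0, 1.0, -1.0, -1.0])
H = np.outer(s, s)
dims, rho = greedy_prune(X, H, max_dims=1)
```

Because the search is greedy, it can miss subsets whose dimensions are only useful jointly; the retained fraction (`max_dims`) would normally be chosen by cross-validation against held-out human judgments.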

3. Empirical Evaluations and Quantitative Impact

Human-judgment–informed embeddings have demonstrated robust improvements across a variety of benchmarks and modalities:

Domain/Task	Baseline	HJ-Informed Embedding	Metric / Gain	Source
Vision (5-shot Entity-13 CLIP)	65.3%	71.9%	+6.6 points accuracy	(Muttenthaler et al., 2023)
Vision (Anomaly, CIFAR-100)	91.41%	97.19%	+5.8 points AUROC	(Muttenthaler et al., 2023)
Vision (Triplet accuracy)	74.4% (VGG16)	80.7% (HSJ embedding)	+6.3 points	(Roads et al., 2020)
Word-feature reconstruction	r=0.47	r=0.94 (best cases)	37%–100% of explainable variance	(Grand et al., 2018)
Interpretable dimension prediction (GloVe, FIT+S)	r⁺-acc=0.64 (SEED)	r⁺-acc=0.80	+16 points, MSE collapse from >100 to 0.7	(Erk et al., 2024)
Pruned subspace similarity (sports domain)	ρ=0.40	ρ=0.52	+0.12 absolute correlation	(Manrique et al., 2023)
SQuID psychometric structure	R²=0 (random)	R²=0.55	55% variance in value dimension similarities	(Pellert et al., 2025)

These empirical gains indicate that incorporating human similarity data not only boosts alignment to human labels but also enhances performance on generalization-demanding tasks.

4. Domains and Modalities of Application

Few-shot learning and anomaly detection: Incorporation of human-inspired global structure into embeddings leads to pronounced improvements on tasks with limited or imbalanced data, as global alignment creates semantically meaningful clusters and more uniform anomaly baselines (Muttenthaler et al., 2023).
Semantic feature extraction and interpretability: Projection methods, explicit dimension fitting, and supervised pruning yield subspaces that recover human-endorsed features (e.g., size, danger, gustation), facilitating interpretable axes and cross-domain semantic generalizations (Grand et al., 2018, Manrique et al., 2023, Erk et al., 2024).
Cross-linguistic and cross-cultural modeling: Empirical studies of the “judgment language” (JL) effect, in which human similarity ratings depend on the language in which the judgments are made, demonstrate that aligning vector spaces to JL-specific human judgments, or combining monolingual embeddings via linear interpolation or CCA, improves cross-JL generalization and better reflects the fluidity of human semantic intuition (Leviant et al., 2015).
Human-in-the-loop refinement: Interactive systems allow iterative, fine-grained refitting of embeddings based on live human feedback, supporting debiasing, domain adaptation, or local correction while maintaining global integrity of the distributional structure (Powell et al., 2021).
Large-scale and zero-shot evaluations: Embedding pipelines built on LLM text encodings of semantically rich annotations (e.g., for genomic variants) or of psychometric survey items recover human-judged structure with high fidelity and scale to large item sets, without retraining or domain-specific fine-tuning (Niu et al., 2025, Pellert et al., 2025).

5. Best Practices, Limitations, and Open Challenges

Best Practices

Hybrid loss and regularization: Combine global human alignment with local/neighborhood preservation, using cross-validation and appropriate regularization (e.g., shrinkage toward identity, tuning the weight α that balances the global and local terms) (Muttenthaler et al., 2023).
Projection and seed-based grounding: Use multiple synonym-antonym pairs for robust feature axes; complement with direct human ratings where seeds are ambiguous or insufficient (Grand et al., 2018, Erk et al., 2024).
Active sampling and ensemble modeling: To efficiently scale human judgment collection, use active-learning trial selection and ensemble variational inference for confidence estimation in embedding inference (Roads et al., 2020).
Evaluation protocol: Always compare to inter-rater reliability to estimate ceiling performance; use both predictive (accuracy, AUROC) and structural (correlation, order-consistency) metrics (Muttenthaler et al., 2023, Grand et al., 2018, Manrique et al., 2023).
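The ceiling comparison in the evaluation protocol can be made concrete as follows. This is one common convention, assumed here for illustration: inter-rater reliability is estimated by correlating each rater with the mean of the remaining raters, and model performance is reported as a fraction of that value. The ratings are invented.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

def leave_one_out_reliability(ratings):
    """Mean correlation of each rater with the average of all other raters.

    ratings: array of shape (n_raters, n_items).
    """
    n_raters = ratings.shape[0]
    rs = []
    for r in range(n_raters):
        others = np.delete(ratings, r, axis=0).mean(axis=0)
        rs.append(pearson(ratings[r], others))
    return float(np.mean(rs))

def fraction_of_ceiling(model_scores, ratings):
    """Model-vs-mean-rating correlation, the ceiling, and their ratio."""
    ceiling = leave_one_out_reliability(ratings)
    model_r = pearson(model_scores, ratings.mean(axis=0))
    return model_r, ceiling, model_r / ceiling

# Made-up ratings from three raters on five items, plus hypothetical model scores.
ratings = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.5, 1.8, 3.2, 4.1, 4.7],
    [0.8, 2.5, 2.9, 3.6, 5.2],
])
model_scores = np.array([0.1, 0.3, 0.2, 0.6, 0.9])
model_r, ceiling, fraction = fraction_of_ceiling(model_scores, ratings)
```

Reporting the ratio alongside the raw correlation makes results comparable across datasets whose human agreement differs.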

Limitations

Cost and scalability of human supervision: Direct pairwise/triplet assessments are O(N²)–O(N³) in data volume; many approaches rely on hybridization with language-based proxies or efficient sampling (Marjieh et al., 2022, Roads et al., 2020).
Dimensionality and rank reduction: Excessively constraining the mapping (e.g., to diagonal/dilation transforms or low-rank spaces) may underfit human structure, especially when axis semantics are arbitrary (Attarian et al., 2020).
Bias and domain dependence: Embeddings inherit statistical and ideological biases present in language or label corpora, and alignment to one JL or culture may reduce transferability (Leviant et al., 2015, Manrique et al., 2023).
Interpretability trade-offs: While supervised pruning and projection enhance interpretability, there remain open questions about the nonlinearity and compositionality of higher-order human concepts (Manrique et al., 2023, Erk et al., 2024).

Open Challenges

Extending to multimodal and dynamic settings: Combining visual, linguistic, and auditory similarity judgments in a unified space; handling polysemy and dynamic (temporal or evolving) semantic relations (Marjieh et al., 2022).
Automatic feature discovery and nonlinear alignment: Learning multidimensional, possibly nonlinear semantic subspaces corresponding to complex properties without reliance on predefined anchors (Grand et al., 2018, Erk et al., 2024).
Deeper cognitive modeling: Understanding the psychological mechanisms that underlie projection, axis interpretation, and asymmetry in judgments, and mirroring these in machine representations (Attarian et al., 2020, Iordan et al., 2019).

6. Algorithmic and Implementation Highlights

Method	Core Objective	Data/Setting	Notable Gains/Properties
gLocal (Muttenthaler et al., 2023)	Minimize combined global-local loss to align with human triplets, preserve neighbor structure	CLIP, ImageNet CNNs, triplet/odd-one-out	~+6 points 5-shot acc, ~+6 points AUROC; no harm to local geometry
Semantic Projection (Grand et al., 2018)	Project embeddings onto antonym-difference axes	GloVe, human feature ratings	r=0.47–0.94; domain-agnostic
Supervised Pruning (Manrique et al., 2023)	Greedy feature subset maximizing RSM alignment	GloVe, domain HSJs	Shrink to 20–40%, +0.07–0.15 ρ
Interactive Refitting (Powell et al., 2021)	Quadratic post-processing with live human constraints	word2vec, Google News	Targeted debiasing/correction
Contextual Constraint (Iordan et al., 2019)	Build embedding from domain-matched subcorpora	Word2Vec, BERT, GloVe	90–92% of inter-rater r on matching domain
SNaCK (Wilber et al., 2015)	Joint t-SNE/t-STE loss w/ human triplets + machine kernel	Vision (MNIST, birds, food), crowdsourcing	Outperforms t-SNE/t-STE alone, less human effort
HSJ Psychological Embedding (Roads et al., 2020)	Variational Bayesian embedding fit to HSJs, active sampling	ImageNet 50k, AMT human	80%+ triplet accuracy, >5% over deep nets
Dimension-fitting (FIT+S) (Erk et al., 2024)	Fit interpretable dimension to human ratings plus seed, w/cosine alignment	GloVe, BERT, object & style datasets	r⁺-acc=0.80 (objects), MSE <2
SQuID (Pellert et al., 2025)	Mean-centering, dimension-wise aggregation to recover psychometric structure	LLM sentence embeddings, PVQ-RR	R²=0.55 vs. human, factor congruence φ=0.88

These algorithms are distinguished by their use of explicit global structure supervision, careful balancing of local and global constraints, and the use of optimization or selection procedures tuned to human data.

7. Broader Implications and Future Directions

Human-judgment–informed embeddings bridge black-box statistical models trained on massive, possibly uncurated data and the nuanced semantic organization revealed by human judgment. They enable high-performance generalization in low-data regimes, genuine interpretability for scientific and social science applications, and a firmer foundation for cross-disciplinary transfer. Open directions include scalable active supervision for large, dynamic domains, more sophisticated grounding across cultures and languages, and deeper integration with cognitive models of similarity and categorization.

As datasets and tools for human-judgment–informed embedding construction proliferate, this paradigm is likely to shape the future of interpretable, robust, and cognitively-aligned representation learning.

References

Improving neural network representations using human similarity judgments (Muttenthaler et al., 2023)
Enriching ImageNet with Human Similarity Judgments and Psychological Embeddings (Roads et al., 2020)
Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings (Grand et al., 2018)
Adjusting Interpretable Dimensions in Embedding Space with Human Judgments (Erk et al., 2024)
Enhancing Interpretability using Human Similarity Judgements to Prune Word Embeddings (Manrique et al., 2023)
Words are all you need? Language as an approximation for human similarity judgments (Marjieh et al., 2022)
Neural network embeddings recover value dimensions from psychometric survey items on par with human data (Pellert et al., 2025)
Transforming Neural Network Visual Representations to Predict Human Judgments of Similarity (Attarian et al., 2020)
Learning Concept Embeddings with Combined Human-Machine Expertise (Wilber et al., 2015)
Context Matters: Recovering Human Semantic Structure from Machine Learning Analysis of Large-Scale Text Corpora (Iordan et al., 2019)
Separated by an Un-common Language: Towards Judgment Language Informed Vector Space Modeling (Leviant et al., 2015)
Human-in-the-Loop Refinement of Word Embeddings (Powell et al., 2021)
Incorporation of Human Knowledge into Data Embeddings to Improve Pattern Significance and Interpretability (Li et al., 2022)
Incorporating LLM Embeddings for Variation Across the Human Genome (Niu et al., 2025)