
Identifying and interpreting non-aligned human conceptual representations using language modeling (2403.06204v1)

Published 10 Mar 2024 in cs.CL

Abstract: The question of whether people's experience in the world shapes conceptual representation and lexical semantics is longstanding. Word-association, feature-listing and similarity rating tasks aim to address this question but require a subjective interpretation of the latent dimensions identified. In this study, we introduce a supervised representational-alignment method that (i) determines whether two groups of individuals share the same basis of a certain category, and (ii) explains in what respects they differ. In applying this method, we show that congenital blindness induces conceptual reorganization in both amodal and sensory-related verbal domains, and we identify the associated semantic shifts. We first apply supervised feature-pruning to a language model (GloVe) to optimize prediction accuracy of human similarity judgments from word embeddings. Pruning identifies one subset of retained GloVe features that optimizes prediction of judgments made by sighted individuals and another subset that optimizes prediction of judgments made by blind individuals. A linear probing analysis then interprets the latent semantics of these feature subsets by learning a mapping from the retained GloVe features to 65 interpretable semantic dimensions. We applied this approach to seven semantic domains, including verbs related to motion, sight, touch, and amodal verbs related to knowledge acquisition. We find that blind individuals more strongly associate social and cognitive meanings with verbs related to motion or those communicating non-speech vocal utterances (e.g., whimper, moan). Conversely, for amodal verbs, they demonstrate much sparser information. Finally, for some verbs, the representations of blind and sighted individuals are highly similar. The study presents a formal approach for studying interindividual differences in word meaning, and the first demonstration of how blindness impacts conceptual representation of everyday verbs.
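
The two-stage pipeline described in the abstract (supervised feature pruning, then linear probing) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the pruning algorithm, so greedy backward elimination scored by Pearson correlation is swapped in, the probe is a closed-form ridge regression, and the data are synthetic stand-ins for GloVe vectors, human similarity judgments, and semantic-dimension ratings. The function names (`pairwise_cosine`, `fit_score`, `prune_features`, `linear_probe`) are hypothetical, and the fit is scored in-sample rather than with held-out data.

```python
import numpy as np

def pairwise_cosine(X):
    """Cosine similarity for every unordered pair of rows in X (upper triangle)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)
    sims = Xn @ Xn.T
    return sims[np.triu_indices(X.shape[0], k=1)]

def fit_score(X, human_sims, keep):
    """Pearson correlation between embedding similarities and human judgments."""
    model_sims = pairwise_cosine(X[:, keep])
    if model_sims.std() == 0:
        return -1.0
    return float(np.corrcoef(model_sims, human_sims)[0, 1])

def prune_features(X, human_sims, min_features=2):
    """Greedy backward elimination (an assumed stand-in for the paper's pruning).

    Repeatedly drops the single feature whose removal most improves the fit,
    and stops when no removal improves it.
    """
    keep = list(range(X.shape[1]))
    best = fit_score(X, human_sims, keep)
    while len(keep) > min_features:
        trials = [(fit_score(X, human_sims, [k for k in keep if k != f]), f)
                  for f in keep]
        score, feature = max(trials)
        if score <= best:
            break
        best = score
        keep.remove(feature)
    return keep, best

def linear_probe(X_kept, Y, ridge=1e-2):
    """Ridge regression from retained features to interpretable dimensions."""
    d = X_kept.shape[1]
    return np.linalg.solve(X_kept.T @ X_kept + ridge * np.eye(d), X_kept.T @ Y)

# Synthetic demo: 12 "verbs", 10 embedding features, 3 semantic dimensions.
rng = np.random.default_rng(0)
n_words, n_feats, n_dims = 12, 10, 3
X = rng.normal(size=(n_words, n_feats))

# Pretend one group's judgments depend only on the first four features.
human_sims = pairwise_cosine(X[:, [0, 1, 2, 3]])

full_score = fit_score(X, human_sims, list(range(n_feats)))
keep, pruned_score = prune_features(X, human_sims)

# Stand-in ratings on interpretable dimensions (the paper uses 65 of them).
Y = rng.normal(size=(n_words, n_dims))
W = linear_probe(X[:, keep], Y)

print("retained features:", keep)
print("fit before / after pruning:", round(full_score, 3), round(pruned_score, 3))
print("probe weight matrix shape:", W.shape)
```

Running the pruning step once per participant group (for example, once on judgments from sighted participants and once on judgments from blind participants) would yield two retained-feature subsets, and comparing the probe weights for each subset is one way to read off how the groups' representations differ. A faithful replication would score the pruning on held-out judgments rather than in-sample.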

