Laying Anchors: Semantically Priming Numerals in Language Modeling (2404.01536v2)
Abstract: Off-the-shelf pre-trained LLMs have become the de facto standard in NLP pipelines for a multitude of downstream tasks. However, the inability of these models to properly encode numerals limits their performance on tasks requiring numeric comprehension. We introduce strategies to semantically prime numerals in any corpus by generating anchors governed by the distribution of numerals in that corpus, thereby enabling mathematically grounded representations of these numeral tokens. We establish the superiority of our proposed techniques through evaluation on a range of numeracy tasks for both in-domain (seen) and out-of-domain (unseen) numerals. Further, we expand our empirical evaluations to numerals ranging from 1 to 10 billion, a significantly broader range than in previous studies of this nature, and we demonstrate significant improvements in the mathematical grounding of our learned embeddings.
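The abstract's core idea is to derive anchor values from the distribution of numerals in a corpus. The paper's exact anchoring procedure is not reproduced here; as a hedged stand-in, the sketch below extracts numerals from a toy corpus, fits a minimal one-dimensional Gaussian mixture (via EM) to their log10 values, and maps the component means back to the linear scale as candidate anchors. All function names (`extract_numerals`, `fit_gmm_1d`) and the choice of a log-scale mixture are illustrative assumptions, not the authors' implementation.

```python
import math
import random
import re

def extract_numerals(corpus):
    """Collect all positive numeric tokens from a list of sentences."""
    nums = []
    for sent in corpus:
        nums += [float(t) for t in re.findall(r"\d+(?:\.\d+)?", sent)]
    return [n for n in nums if n > 0]

def fit_gmm_1d(xs, k=2, iters=50, seed=0):
    """Minimal EM for a 1-D Gaussian mixture; returns sorted component means."""
    rng = random.Random(seed)
    mu = rng.sample(xs, k)          # initialize means from the data
    var = [1.0] * k
    pi = [1.0 / k] * k
    for _ in range(iters):
        # E-step: per-point responsibilities under each component
        resp = []
        for x in xs:
            w = [pi[j] * math.exp(-((x - mu[j]) ** 2) / (2 * var[j]))
                 / math.sqrt(2 * math.pi * var[j]) for j in range(k)]
            s = sum(w) or 1e-12
            resp.append([wj / s for wj in w])
        # M-step: re-estimate means, variances, and mixing weights
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            mu[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var[j] = max(sum(r[j] * (x - mu[j]) ** 2
                             for r, x in zip(resp, xs)) / nj, 1e-6)
            pi[j] = nj / len(xs)
    return sorted(mu)

# Toy corpus: numerals cluster into "small counts" and "large magnitudes"
corpus = ["the company sold 120 units", "revenue hit 4500000 dollars",
          "a team of 8 engineers", "about 13 percent growth",
          "over 2000000 users joined"]
logs = [math.log10(n) for n in extract_numerals(corpus)]
anchors = [round(10 ** m) for m in fit_gmm_1d(logs, k=2)]
```

The log-scale choice echoes the logarithmic mental number line discussed in the paper's cognitive-science references; in this sketch the resulting anchors track the two magnitude clusters present in the toy corpus.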
Authors: Mandar Sharma, Rutuja Murlidhar Taware, Pravesh Koirala, Nikhil Muralidhar, Naren Ramakrishnan