A Comprehensive Empirical Evaluation of Existing Word Embedding Approaches (2303.07196v2)

Published 13 Mar 2023 in cs.CL and cs.NE

Abstract: Vector-based word representations help countless NLP tasks capture a language's semantic and syntactic regularities. In this paper, we present the characteristics of existing word embedding approaches and analyze them across a range of classification tasks. We categorize the methods into two main groups: traditional approaches, which mostly use matrix factorization to produce word representations and struggle to capture the semantic and syntactic regularities of language, and neural-network-based approaches, which can capture sophisticated regularities and preserve word relationships in the generated representations. We report experimental results on multiple classification tasks and highlight the scenarios where one approach outperforms the rest.
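As a concrete illustration of the two families the abstract contrasts, the sketch below builds toy word vectors both ways: a count-based embedding from a truncated SVD of a word co-occurrence matrix (in the spirit of LSA/HAL), and a prediction-based skip-gram embedding via gensim's Word2Vec. The corpus, window size, and dimensionality here are illustrative assumptions, not the paper's experimental settings.

```python
# A minimal sketch of both embedding families, assuming a toy corpus;
# real evaluations use large corpora and tuned hyperparameters.
import numpy as np

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs are animals".split(),
]

# --- Traditional (count-based): factorize a co-occurrence matrix ---
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
window = 2
cooc = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                cooc[idx[w], idx[sent[j]]] += 1.0

# Truncated SVD yields low-dimensional word vectors, LSA/HAL style.
U, S, _ = np.linalg.svd(cooc)
dim = 2
svd_vectors = U[:, :dim] * S[:dim]
print("count-based vector for 'cat':", svd_vectors[idx["cat"]])

# --- Neural (prediction-based): skip-gram via gensim (illustrative params) ---
from gensim.models import Word2Vec

w2v = Word2Vec(sentences=corpus, vector_size=dim, window=window,
               min_count=1, sg=1, epochs=50, seed=0)
print("skip-gram vector for 'cat':", w2v.wv["cat"])
```

On a corpus this small the two methods produce similar-quality vectors; the prediction-based family is the one the abstract credits with capturing finer-grained regularities at realistic corpus sizes.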
