Augmenting semantic lexicons using word embeddings and transfer learning (2109.09010v2)

Published 18 Sep 2021 in cs.CL, cs.LG, cs.SI, and physics.soc-ph

Abstract: Sentiment-aware intelligent systems are essential to a wide array of applications. These systems are driven by language models which broadly fall into two paradigms: lexicon-based and contextual. Although recent contextual models are increasingly dominant, we still see demand for lexicon-based models because of their interpretability and ease of use. For example, lexicon-based models allow researchers to readily determine which words and phrases contribute most to a change in measured sentiment. A challenge for any lexicon-based approach is that the lexicon needs to be routinely expanded with new words and expressions. Here, we propose two models for automatic lexicon expansion. Our first model establishes a baseline employing a simple and shallow neural network initialized with pre-trained word embeddings using a non-contextual approach. Our second model improves upon our baseline, featuring a deep Transformer-based network that brings to bear word definitions to estimate their lexical polarity. Our evaluation shows that both models are able to score new words with a similar accuracy to reviewers from Amazon Mechanical Turk, but at a fraction of the cost.
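The baseline described in the abstract — a shallow network that maps a frozen, pre-trained word embedding to a polarity score — can be sketched roughly as follows. This is a minimal illustration, not the authors' architecture: the toy vocabulary, embedding dimension, and layer sizes are all hypothetical stand-ins, and the weights here are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny pre-trained embedding table; in practice this would be
# word2vec/fastText-style vectors, kept frozen during training.
vocab = {"good": 0, "bad": 1, "great": 2, "awful": 3}
dim = 8
embeddings = rng.normal(size=(len(vocab), dim))

# Shallow regressor: a single hidden layer mapping an embedding to a
# scalar lexical-polarity estimate (weights untrained in this sketch).
W1 = rng.normal(size=(dim, 4)) * 0.1
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)) * 0.1
b2 = np.zeros(1)

def score(word: str) -> float:
    """Estimate lexical polarity for a known word from its frozen embedding."""
    v = embeddings[vocab[word]]
    h = np.tanh(v @ W1 + b1)   # hidden representation
    return float(h @ W2 + b2)  # scalar polarity score

for w in vocab:
    print(w, score(w))
```

Training such a regressor against an existing scored lexicon would then let it extrapolate polarity scores to out-of-lexicon words that have embeddings.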

Authors (6)
  1. Thayer Alshaabi (18 papers)
  2. Colin M. Van Oort (7 papers)
  3. Mikaela Irene Fudolig (5 papers)
  4. Michael V. Arnold (14 papers)
  5. Christopher M. Danforth (83 papers)
  6. Peter Sheridan Dodds (80 papers)
Citations (4)
