
Low-dimensional Semantic Space: from Text to Word Embedding (1911.00845v1)

Published 3 Nov 2019 in cs.CL and cs.LG

Abstract: This article focuses on the study of Word Embedding, a feature-learning technique in Natural Language Processing that maps words or phrases to low-dimensional vectors. Beginning with the linguistic theories concerning contextual similarities - "Distributional Hypothesis" and "Context of Situation", this article introduces two ways of numerical representation of text: One-hot and Distributed Representation. In addition, this article presents statistical language models (such as Co-occurrence Matrix and Singular Value Decomposition) as well as Neural Network Language Models (NNLM, such as Continuous Bag-of-Words and Skip-Gram). This article also analyzes how Word Embedding can be applied to the study of word-sense disambiguation and diachronic linguistics.
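
The abstract's count-based route to word embeddings (a co-occurrence matrix reduced with SVD) can be illustrated with a short sketch. The snippet below is not from the paper; the toy corpus, window size, and target dimensionality are illustrative assumptions, and it uses only NumPy.

# Minimal sketch of the count-based pipeline the abstract describes:
# build a word-word co-occurrence matrix from a toy corpus, then apply
# truncated SVD to obtain low-dimensional word vectors.
import numpy as np
from collections import defaultdict

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat and a dog played",
]
window = 2   # symmetric context window (assumed)
dim = 3      # target embedding dimensionality (assumed)

# Build the vocabulary and index it
tokens = [s.split() for s in corpus]
vocab = sorted({w for sent in tokens for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Count co-occurrences within the window
counts = defaultdict(float)
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                counts[(idx[w], idx[sent[j]])] += 1.0

M = np.zeros((len(vocab), len(vocab)))
for (r, c), v in counts.items():
    M[r, c] = v

# Truncated SVD: keep the top `dim` singular directions as word vectors
U, S, Vt = np.linalg.svd(M, full_matrices=False)
embeddings = U[:, :dim] * S[:dim]   # each row is a word embedding

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

print("cat ~ dog:", cosine(embeddings[idx["cat"]], embeddings[idx["dog"]]))
print("cat ~ mat:", cosine(embeddings[idx["cat"]], embeddings[idx["mat"]]))

On such a small corpus the similarities are noisy, but the same matrix-plus-SVD recipe is what the statistical language models in the abstract refer to; the neural alternatives (CBOW and Skip-Gram) learn comparable low-dimensional vectors by prediction rather than counting.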




Authors (2)