Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
102 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Sentence Modeling via Multiple Word Embeddings and Multi-level Comparison for Semantic Textual Similarity (1805.07882v1)

Published 21 May 2018 in cs.CL

Abstract: Different word embedding models capture different aspects of linguistic properties. This inspired us to propose a model (M-MaxLSTM-CNN) for employing multiple sets of word embeddings for evaluating sentence similarity/relation. Representing each word by multiple word embeddings, the MaxLSTM-CNN encoder generates a novel sentence embedding. We then learn the similarity/relation between our sentence embeddings via Multi-level comparison. Our method M-MaxLSTM-CNN consistently shows strong performances in several tasks (i.e., measure textual similarity, identify paraphrase, recognize textual entailment). According to the experimental results on STS Benchmark dataset and SICK dataset from SemEval, M-MaxLSTM-CNN outperforms the state-of-the-art methods for textual similarity tasks. Our model does not use hand-crafted features (e.g., alignment features, Ngram overlaps, dependency features) as well as does not require pre-trained word embeddings to have the same dimension.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (4)
  1. Huy Nguyen Tien (1 paper)
  2. Minh Nguyen Le (2 papers)
  3. Yamasaki Tomohiro (1 paper)
  4. Izuha Tatsuya (1 paper)
Citations (64)