
LSTMs Exploit Linguistic Attributes of Data (1805.11653v2)

Published 29 May 2018 in cs.CL

Abstract: While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data. We investigate how the properties of natural language data affect an LSTM's ability to learn a nonlinguistic task: recalling elements from its input. We find that models trained on natural language data are able to recall tokens from much longer sequences than models trained on non-language sequential data. Furthermore, we show that the LSTM learns to solve the memorization task by explicitly using a subset of its neurons to count timesteps in the input. We hypothesize that the patterns and structure in natural language data enable LSTMs to learn by providing approximate ways of reducing loss, but understanding the effect of different training data on the learnability of LSTMs remains an open question.
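
The memorization task and the counting-neuron finding described in the abstract lend themselves to a small illustration. Below is a minimal sketch, assuming PyTorch; the model sizes, the middle-token recall target, and the correlation probe are illustrative assumptions, not the paper's exact experimental configuration.

    # Sketch of the kind of memorization task the abstract describes:
    # an LSTM reads a token sequence and must recall one of its input
    # tokens at the final timestep. Sizes and the middle-token target
    # are assumptions for illustration.
    import torch
    import torch.nn as nn

    VOCAB_SIZE = 100   # assumed toy vocabulary
    SEQ_LEN = 20       # assumed input length
    EMBED_DIM = 32
    HIDDEN_DIM = 64

    class RecallLSTM(nn.Module):
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
            self.lstm = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)
            self.out = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)

        def forward(self, tokens):
            # tokens: (batch, SEQ_LEN) integer ids
            h, _ = self.lstm(self.embed(tokens))
            # Classify from the final timestep's hidden state.
            return self.out(h[:, -1, :])

    model = RecallLSTM()
    loss_fn = nn.CrossEntropyLoss()
    opt = torch.optim.Adam(model.parameters())

    for step in range(200):
        # Uniform random sequences stand in for the paper's
        # non-language condition; the paper contrasts these with
        # sequences drawn from natural-language corpora.
        x = torch.randint(0, VOCAB_SIZE, (32, SEQ_LEN))
        target = x[:, SEQ_LEN // 2]  # recall the middle token (assumed)
        loss = loss_fn(model(x), target)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Probe for "counting" units: correlate each hidden unit's
    # activation with the timestep index, in the spirit of the
    # abstract's claim that some neurons count timesteps.
    with torch.no_grad():
        x = torch.randint(0, VOCAB_SIZE, (32, SEQ_LEN))
        h, _ = model.lstm(model.embed(x))
    acts = h.mean(0)                                   # (SEQ_LEN, HIDDEN_DIM)
    t = torch.arange(SEQ_LEN, dtype=torch.float)
    t_c = t - t.mean()
    a_c = acts - acts.mean(0)
    corr = (t_c @ a_c) / (t_c.norm() * a_c.norm(dim=0) + 1e-8)
    print("most position-correlated unit:", corr.abs().argmax().item())

The paper's central comparison is between training such a model on natural-language input versus random sequences like those above; the probe at the end merely illustrates one way to look for hidden units whose activations track position.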

Authors (5)
  1. Nelson F. Liu (19 papers)
  2. Omer Levy (70 papers)
  3. Roy Schwartz (74 papers)
  4. Chenhao Tan (89 papers)
  5. Noah A. Smith (224 papers)
Citations (15)
