Alternative structures for character-level RNNs (1511.06303v2)

Published 19 Nov 2015 in cs.LG and cs.CL

Abstract: Recurrent neural networks are convenient and efficient models for language modeling. However, when applied on the level of characters instead of words, they suffer from several problems. In order to successfully model long-term dependencies, the hidden representation needs to be large. This in turn implies higher computational costs, which can become prohibitive in practice. We propose two alternative structural modifications to the classical RNN model. The first consists of conditioning the character-level representation on the previous word representation. The other uses the character history to condition the output probability. We evaluate the performance of the two proposed modifications on challenging, multilingual, real-world data.
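To make the first modification concrete, the sketch below shows one plausible way to condition a character-level RNN on the previous word's representation: the embedding of the most recently completed word is concatenated with each character embedding before the recurrent update. This is a minimal illustrative reconstruction, not the paper's exact architecture; all layer sizes, the module name, and the use of a GRU are assumptions for the example.

```python
import torch
import torch.nn as nn

class WordConditionedCharRNN(nn.Module):
    """Hypothetical sketch of a character-level RNN whose hidden state
    is additionally conditioned on an embedding of the previous word
    (in the spirit of the paper's first modification)."""

    def __init__(self, n_chars, n_words, char_dim=64, word_dim=64, hidden_dim=256):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.word_emb = nn.Embedding(n_words, word_dim)
        # The recurrent input concatenates the current character embedding
        # with the embedding of the last fully observed word.
        self.rnn = nn.GRU(char_dim + word_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_chars)

    def forward(self, chars, prev_words, h0=None):
        # chars:      (batch, seq_len) character indices
        # prev_words: (batch, seq_len) index of the most recently
        #             completed word at each character position
        x = torch.cat([self.char_emb(chars), self.word_emb(prev_words)], dim=-1)
        h, hn = self.rnn(x, h0)
        return self.out(h), hn  # logits over the next character

# Example usage with random data (sizes are arbitrary):
model = WordConditionedCharRNN(n_chars=100, n_words=10000)
chars = torch.randint(0, 100, (8, 32))
prev_words = torch.randint(0, 10000, (8, 32))
logits, _ = model(chars, prev_words)  # shape: (8, 32, 100)
```

The intuition is that the word-level context carries long-range information cheaply, so the character-level hidden state can stay small. The second modification, conditioning the output probability on the character history, would instead feed recent characters into the output layer rather than the recurrent input.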

Authors (3)
  1. Piotr Bojanowski (50 papers)
  2. Armand Joulin (81 papers)
  3. Tomas Mikolov (43 papers)
Citations (49)
