Depth-Gated LSTM (1508.03790v4)

Published 16 Aug 2015 in cs.NE and cs.CL

Abstract: In this short note, we present an extension of long short-term memory (LSTM) neural networks that uses a depth gate to connect memory cells of adjacent layers. Doing so introduces a linear dependence between lower and upper layer recurrent units. Importantly, the linear dependence is gated through a gating function, which we call the depth gate. This gate is a function of the lower layer's memory cell, the input to this layer, and this layer's past memory cell. We conducted experiments and verified that this new LSTM architecture improves machine translation and language modeling performance.
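The mechanism described in the abstract is compact enough to sketch directly. Below is a minimal PyTorch sketch of one depth-gated LSTM layer, assuming standard LSTM gates plus a depth gate computed from this layer's input, this layer's previous memory cell, and the lower layer's current memory cell. The class and parameter names (DepthGatedLSTMCell, W_xd, w_cd, w_ld) are illustrative assumptions, not taken from the paper's code.

```python
import torch
import torch.nn as nn

class DepthGatedLSTMCell(nn.Module):
    """One layer of a depth-gated LSTM (sketch).

    Assumed update, following the abstract:
        d_t = sigmoid(W_xd x_t + w_cd * c_{t-1} + w_ld * c_t_lower)
        c_t = f_t * c_{t-1} + i_t * g_t + d_t * c_t_lower
    """
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # Standard LSTM gate projections: input, forget, cell, output.
        self.ih = nn.Linear(input_size, 4 * hidden_size)
        self.hh = nn.Linear(hidden_size, 4 * hidden_size, bias=False)
        # Depth gate: a linear term on the input plus elementwise
        # ("peephole") weights on this layer's past memory cell and
        # the lower layer's current memory cell.
        self.xd = nn.Linear(input_size, hidden_size)
        self.w_cd = nn.Parameter(torch.zeros(hidden_size))
        self.w_ld = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x, h_prev, c_prev, c_lower):
        gates = self.ih(x) + self.hh(h_prev)
        i, f, g, o = gates.chunk(4, dim=-1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        # Depth gate, as described: a function of the input, this layer's
        # past memory cell, and the lower layer's memory cell.
        d = torch.sigmoid(self.xd(x) + self.w_cd * c_prev + self.w_ld * c_lower)
        # Usual LSTM cell update plus the gated linear dependence on the
        # lower layer's memory cell.
        c = f * c_prev + i * g + d * c_lower
        h = o * torch.tanh(c)
        return h, c

# Example: two stacked layers for one time step (batch of 3, hidden size 8).
# For the bottom layer the "lower cell" is taken to be the input itself,
# which requires input_size == hidden_size in this sketch.
x = torch.randn(3, 8)
l1, l2 = DepthGatedLSTMCell(8, 8), DepthGatedLSTMCell(8, 8)
h0 = c0 = torch.zeros(3, 8)
h1, c1 = l1(x, h0, c0, c_lower=x)
h2, c2 = l2(h1, h0, c0, c_lower=c1)  # upper layer sees the lower layer's cell
```

The gated skip connection d_t * c_t_lower is what distinguishes this from a plain stacked LSTM: when the depth gate saturates at zero the model reduces to the standard architecture, so the extension adds capacity without removing the baseline behavior.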

Authors (5)
  1. Kaisheng Yao
  2. Trevor Cohn
  3. Katerina Vylomova
  4. Kevin Duh
  5. Chris Dyer
Citations (73)
