Analyzing and Exploiting NARX Recurrent Neural Networks for Long-Term Dependencies (1702.07805v4)

Published 24 Feb 2017 in cs.NE

Abstract: Recurrent neural networks (RNNs) have achieved state-of-the-art performance on many diverse tasks, from machine translation to surgical activity recognition, yet training RNNs to capture long-term dependencies remains difficult. To date, the vast majority of successful RNN architectures alleviate this problem using nearly-additive connections between states, as introduced by long short-term memory (LSTM). We take an orthogonal approach and introduce MIST RNNs, a NARX RNN architecture that allows direct connections from the very distant past. We show that MIST RNNs 1) exhibit superior vanishing-gradient properties in comparison to LSTM and previously-proposed NARX RNNs; 2) are far more efficient than previously-proposed NARX RNN architectures, requiring even fewer computations than LSTM; and 3) improve performance substantially over LSTM and Clockwork RNNs on tasks requiring very long-term dependencies.

Authors

Robert DiPietro
Christian Rupprecht
Nassir Navab
Gregory D. Hager

Citations (26)
