
IGLOO: Slicing the Features Space to Represent Sequences

Published 9 Jul 2018 in cs.LG and stat.ML (arXiv:1807.03402v3)

Abstract: Historically, recurrent neural networks (RNNs) and their variants such as LSTM and GRU, and more recently Transformers, have been the standard go-to components when processing sequential data with neural networks. One notable issue is the relative difficulty of dealing with long sequences (i.e. more than 20,000 steps). We introduce IGLOO, a new neural network architecture which aims to be efficient for short sequences while also being able to deal with long sequences. IGLOO's core idea is to use the relationships between non-local patches sliced out of the feature maps of successively applied convolutions to build a representation for the sequence. We show that the model can deal with dependencies of more than 20,000 steps in a reasonable time frame. We stress test IGLOO on the copy-memory and addition tasks, as well as permuted MNIST (98.4%). For a larger task we apply this new structure to the Wikitext-2 dataset (Merity et al., 2017b) and achieve a perplexity in line with baseline Transformers but lower than baseline AWD-LSTM. We also present how IGLOO is already used in production for bioinformatics tasks.
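The core idea in the abstract (gathering non-local patches from a convolutional feature map and combining them into a sequence representation) can be illustrated with a toy sketch. This is not the paper's implementation. The single convolution layer, the ReLU activations, the random choice of patch positions, and every name and size below (`conv1d_same`, `sample_patches`, `patch_representation`, `N_PATCHES`, `PATCH_SIZE`) are assumptions made purely for illustration, and the weights are random rather than learned.

```python
import random


def conv1d_same(seq, kernels):
    """Zero-padded 1D convolution plus ReLU; returns a T x K feature map."""
    width = len(kernels[0])          # assumed odd so the output keeps length T
    pad = width // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    fmap = []
    for t in range(len(seq)):
        window = padded[t:t + width]
        fmap.append([max(0.0, sum(w * x for w, x in zip(k, window)))
                     for k in kernels])
    return fmap


def sample_patches(seq_len, n_patches, patch_size, rng):
    """For each patch, pick patch_size distinct time steps anywhere in the sequence."""
    return [sorted(rng.sample(range(seq_len), patch_size))
            for _ in range(n_patches)]


def patch_representation(fmap, patches, weights):
    """One non-negative scalar per patch: weighted sum over the gathered slices."""
    rep = []
    for idx, w in zip(patches, weights):
        total = 0.0
        for row, t in enumerate(idx):
            for k, value in enumerate(fmap[t]):
                total += w[row][k] * value
        rep.append(max(0.0, total))
    return rep


# Hypothetical sizes, chosen only to keep the example small.
T, K, WIDTH, N_PATCHES, PATCH_SIZE = 200, 4, 3, 16, 4
rng = random.Random(0)

seq = [rng.uniform(-1, 1) for _ in range(T)]
kernels = [[rng.gauss(0, 1) for _ in range(WIDTH)] for _ in range(K)]
fmap = conv1d_same(seq, kernels)

patches = sample_patches(T, N_PATCHES, PATCH_SIZE, rng)
weights = [[[rng.gauss(0, 1) for _ in range(K)] for _ in range(PATCH_SIZE)]
           for _ in range(N_PATCHES)]
representation = patch_representation(fmap, patches, weights)
```

In this sketch the representation has `N_PATCHES` entries whatever the sequence length `T` is, and each entry can mix time steps that are far apart, which is one way to read "non-local" in the abstract.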

Citations (5)


Authors (1)
