Learning Architectures from an Extended Search Space for Language Modeling (2005.02593v2)

Published 6 May 2020 in cs.LG, cs.CL, and stat.ML

Abstract: Neural architecture search (NAS) has advanced significantly in recent years but most NAS systems restrict search to learning architectures of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. In particular, we present a general approach to learn both intra-cell and inter-cell architectures (call it ESS). For a better search result, we design a joint learning method to perform intra-cell and inter-cell NAS simultaneously. We implement our model in a differentiable architecture search system. For recurrent neural LLMing, it outperforms a strong baseline significantly on the PTB and WikiText data, with a new state-of-the-art on PTB. Moreover, the learned architectures show good transferability to other systems. E.g., they improve state-of-the-art systems on the CoNLL and WNUT named entity recognition (NER) tasks and CoNLL chunking task, indicating a promising line of research on large-scale pre-learned architectures.

Citations (10)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Learning Architectures from an Extended Search Space for Language Modeling (2005.02593v2)

Summary

Related Papers