Improving Sentence Representations with Consensus Maximisation

Published 2 Oct 2018 in cs.CL, cs.LG, and cs.NE | arXiv:1810.01064v4

Abstract: Consensus maximisation learning can provide self-supervision when different views of the same data are available. The distributional hypothesis provides another form of useful self-supervision from adjacent sentences, which are plentiful in large unlabelled corpora. Motivated by the observation that different learning architectures tend to emphasise different aspects of sentence meaning, we present a new self-supervised learning framework for learning sentence representations that minimises the disagreement between two views of the same sentence, where one view encodes the sentence with a recurrent neural network (RNN) and the other encodes the same sentence with a simple linear model. After learning, the individual views (networks) produce higher-quality sentence representations than their single-view counterparts (learnt using only the distributional hypothesis), as judged by performance on standard downstream tasks. An ensemble of both views provides even better generalisation on both supervised and unsupervised downstream tasks. Importantly, an ensemble of views trained with consensus maximisation between the two different architectures performs better on downstream tasks than an analogous ensemble made from the single-view trained counterparts.
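The core idea lends itself to a compact illustration. Below is a minimal sketch, assuming PyTorch, of the two-view setup the abstract describes: a GRU-based recurrent view, a simple linear (averaged word embedding) view, and a consensus loss that penalises disagreement between their sentence representations. The encoder sizes, the cosine form of the disagreement, and the random-token batch are illustrative assumptions, not the paper's exact configuration.

# Sketch of two-view consensus maximisation; hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID = 10_000, 128, 256  # assumed sizes, not from the paper

class RNNView(nn.Module):
    """View 1: encode a sentence with a GRU; the last hidden state is the representation."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)

    def forward(self, tokens):           # tokens: (batch, seq_len) of word ids
        _, h = self.rnn(self.emb(tokens))
        return h.squeeze(0)              # (batch, HID)

class LinearView(nn.Module):
    """View 2: a simple linear model -- project the mean of the word embeddings."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.proj = nn.Linear(EMB, HID)

    def forward(self, tokens):
        return self.proj(self.emb(tokens).mean(dim=1))  # (batch, HID)

def consensus_loss(z1, z2):
    """Disagreement between views: one minus cosine similarity, averaged over the batch."""
    return (1 - F.cosine_similarity(z1, z2, dim=-1)).mean()

# Toy training step on random token ids standing in for real sentences.
view_a, view_b = RNNView(), LinearView()
opt = torch.optim.Adam(list(view_a.parameters()) + list(view_b.parameters()), lr=1e-3)
tokens = torch.randint(0, VOCAB, (32, 20))  # batch of 32 "sentences", length 20
loss = consensus_loss(view_a(tokens), view_b(tokens))
opt.zero_grad(); loss.backward(); opt.step()

Note that in the paper each view is additionally trained with a distributional-hypothesis signal from adjacent sentences; the sketch above isolates only the consensus term between the two architectures.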
