Semi-Supervised QA with Generative Domain-Adaptive Nets (1702.02206v2)
Published 7 Feb 2017 in cs.CL and cs.LG
Abstract: We study the problem of semi-supervised question answering: utilizing unlabeled text to boost the performance of question answering models. We propose a novel training framework, the Generative Domain-Adaptive Nets. In this framework, we train a generative model to generate questions based on the unlabeled text, and combine model-generated questions with human-generated questions for training question answering models. We develop novel domain adaptation algorithms, based on reinforcement learning, to alleviate the discrepancy between the model-generated data distribution and the human-generated data distribution. Experiments show that our proposed framework obtains substantial improvement from unlabeled text.
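The data-mixing step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `QAExample`, `generate_question`, and `build_training_set` are hypothetical, the question generator is a fixed template standing in for a trained generative model, and the `domain` tag on each example is an illustrative assumption about how the two data sources might be kept distinguishable. The sketch also assumes candidate answer spans for the unlabeled text are already available, and it does not cover the reinforcement-learning-based domain adaptation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class QAExample:
    paragraph: str
    question: str
    answer: str
    domain: str  # "human" or "model"; a hypothetical tag, not taken from the abstract


def generate_question(paragraph: str, answer: str) -> str:
    # Placeholder for a trained generative model: a fixed template (assumption).
    return f"What does the text say about {answer}?"


def build_training_set(labeled, unlabeled, seed=0):
    """Combine human-written and model-generated questions into one training set.

    labeled:   iterable of (paragraph, question, answer) triples
    unlabeled: iterable of (paragraph, answer) pairs with no human question
    """
    human = [QAExample(p, q, a, "human") for p, q, a in labeled]
    model = [
        QAExample(p, generate_question(p, a), a, "model") for p, a in unlabeled
    ]
    mixed = human + model
    random.Random(seed).shuffle(mixed)  # deterministic shuffle for reproducibility
    return mixed


labeled = [("Paris is the capital of France.", "What is the capital of France?", "Paris")]
unlabeled = [
    ("The Nile flows through Egypt.", "Egypt"),
    ("Water boils at 100 degrees Celsius.", "100 degrees Celsius"),
]
train = build_training_set(labeled, unlabeled)
```

Keeping the source tag on every example means a downstream question answering model could, in principle, treat the two distributions differently, which is the discrepancy the abstract's domain adaptation algorithms are meant to address.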
- Zhilin Yang
- Junjie Hu
- Ruslan Salakhutdinov
- William W. Cohen