An Automatic Approach for Document-level Topic Model Evaluation (1706.05140v1)
Published 16 Jun 2017 in cs.CL
Abstract: Topic models jointly learn topics and document-level topic distribution. Extrinsic evaluation of topic models tends to focus exclusively on topic-level evaluation, e.g. by assessing the coherence of topics. We demonstrate that there can be large discrepancies between topic- and document-level model quality, and that basing model evaluation on topic-level analysis can be highly misleading. We propose a method for automatically predicting topic model quality based on analysis of document-level topic allocations, and provide empirical evidence for its robustness.
- Shraey Bhatia
- Jey Han Lau
- Timothy Baldwin