S2vNTM: Semi-supervised vMF Neural Topic Modeling (2307.04804v2)
Abstract: LLM-based methods are powerful techniques for text classification, but they have several shortcomings: (1) it is difficult to integrate human knowledge such as keywords; (2) they need substantial resources to train; and (3) they rely on large text corpora for pretraining. In this paper, we propose Semi-Supervised vMF Neural Topic Modeling (S2vNTM) to overcome these difficulties. S2vNTM takes a few seed keywords per topic as input. It leverages the patterns of these keywords to identify potential topics and to optimize the quality of each topic's keyword set. Across a variety of datasets, S2vNTM outperforms existing semi-supervised topic modeling methods in classification accuracy when only limited keywords are provided. S2vNTM is also at least twice as fast as the baselines.
- Weijie Xu (28 papers)
- Jay Desai (11 papers)
- Srinivasan Sengamedu (4 papers)
- Xiaoyu Jiang (17 papers)
- Francis Iannacci (5 papers)