
Neural Topic Modeling with Continual Lifelong Learning (2006.10909v2)

Published 19 Jun 2020 in cs.CL, cs.IR, cs.LG, and cs.NE

Abstract: Lifelong learning has recently attracted attention for building machine learning systems that continually accumulate and transfer knowledge to help future learning. Unsupervised topic modeling is widely used to discover topics from document collections. However, applying topic modeling is challenging under data sparsity, e.g., in a small collection of (short) documents, which leads to incoherent topics and sub-optimal document representations. To address this problem, we propose a lifelong learning framework for neural topic modeling that can continuously process streams of document collections, accumulate topics, and guide future topic modeling tasks through knowledge transfer from several sources to better deal with sparse data. In the lifelong process, we jointly investigate: (1) sharing generative homologies (latent topics) over the lifetime to transfer prior knowledge, and (2) minimizing catastrophic forgetting to retain past learning via novel selective data augmentation, co-training and topic regularization approaches. Given a stream of document collections, we apply the proposed Lifelong Neural Topic Modeling (LNTM) framework to model three sparse document collections as future tasks and demonstrate improved performance quantified by perplexity, topic coherence and an information retrieval task.
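
As a rough illustration of the topic-regularization idea mentioned in the abstract, the sketch below penalizes drift of the current task's topic matrix away from topic matrices retained from earlier tasks, one common way to limit catastrophic forgetting. The function and parameter names (`topic_regularization`, `past_topic_matrices`, `lambda_reg`) are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a topic-regularization penalty for lifelong topic modeling.
# Assumption: each task exposes a topic matrix (topics x vocabulary); past-task
# matrices are stored and act as fixed anchors for the current task.
import torch


def topic_regularization(current_topics: torch.Tensor,
                         past_topic_matrices: list,
                         lambda_reg: float = 0.1) -> torch.Tensor:
    """Squared-distance penalty between current topics and each stored past topic matrix."""
    penalty = torch.zeros((), device=current_topics.device)
    for past in past_topic_matrices:
        # detach() keeps previous-task topics fixed; only current topics receive gradients
        penalty = penalty + torch.sum((current_topics - past.detach()) ** 2)
    return lambda_reg * penalty


# Usage sketch: add the penalty to the topic model's reconstruction loss
# loss = nll_current + topic_regularization(model.topic_matrix, stored_topic_matrices)
```

In this reading, the regularizer trades plasticity on the new (sparse) collection against retention of previously accumulated topics, with `lambda_reg` controlling that balance.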

Authors (4)
  1. Pankaj Gupta (33 papers)
  2. Yatin Chaudhary (10 papers)
  3. Thomas Runkler (34 papers)
  4. Hinrich Schütze (250 papers)
Citations (42)
