Explainable and Discourse Topic-aware Neural Language Understanding (2006.10632v3)

Published 18 Jun 2020 in cs.CL, cs.AI, and cs.LG

Abstract: Marrying topic models and language models exposes language understanding to a broader source of document-level context beyond sentences via topics. While introducing topical semantics in language models, existing approaches incorporate latent document topic proportions and ignore topical discourse in sentences of the document. This work extends the line of research by additionally introducing an explainable topic representation in language understanding, obtained from a set of key terms for each latent topic of the proportion. Moreover, we retain sentence-topic associations along with document-topic associations by modeling topical discourse for every sentence in the document. We present a novel neural composite language model that exploits both the latent and explainable topics along with topical discourse at the sentence level in a joint learning framework of topic and language models. Experiments over a range of tasks such as language modeling, word sense disambiguation, document classification, retrieval, and text generation demonstrate the ability of the proposed model to improve language understanding.

Authors (3)
  1. Yatin Chaudhary
  2. Hinrich Schütze
  3. Pankaj Gupta