
Topic-to-essay generation with knowledge-based content selection (2402.16248v1)

Published 26 Feb 2024 in cs.CL and cs.AI

Abstract: The topic-to-essay generation (TEG) task is a challenging natural language generation task that aims to generate paragraph-level text with high semantic coherence from a given set of topic words. Previous work has focused on introducing external knowledge while overlooking the insufficient diversity of the generated text. To improve generation diversity, we propose a novel copy-mechanism model with a content selection module that integrates rich semantic knowledge from the LLM into the decoder. Furthermore, we introduce an improved prefix-tuning method to train the model, enabling it to adapt to varying input complexities. In addition, we contribute a new Chinese dataset for the TEG task. Experimental results demonstrate that the proposed model improves the diversity of the generated text by 35% to 59% compared to the state-of-the-art method, while maintaining a high level of topic consistency.
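The abstract describes a decoder that couples a content selection module over external knowledge with a copy mechanism over the input topic words. The sketch below is a minimal, hypothetical PyTorch rendering of that idea, not the authors' implementation: the module names, tensor shapes, and the sigmoid gating form are assumptions chosen purely for illustration.

```python
# Hypothetical sketch: one decoding step that gates external knowledge
# (content selection) and mixes a generate distribution with a copy
# distribution over the topic words. Assumes PyTorch; all names and
# dimensions are illustrative, not the paper's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CopyDecoderStep(nn.Module):
    def __init__(self, vocab_size, hidden_size, knowledge_size):
        super().__init__()
        self.cell = nn.GRUCell(hidden_size, hidden_size)
        # Content selection: decide how much each knowledge vector is used.
        self.select_gate = nn.Linear(hidden_size + knowledge_size, 1)
        self.knowledge_proj = nn.Linear(knowledge_size, hidden_size)
        self.attn = nn.Linear(hidden_size, hidden_size)          # attention over topic words
        self.vocab_out = nn.Linear(2 * hidden_size, vocab_size)  # generation distribution
        self.p_gen = nn.Linear(2 * hidden_size, 1)               # generate-vs-copy switch

    def forward(self, prev_emb, state, topic_enc, topic_ids, knowledge):
        # prev_emb:  (B, H)    embedding of the previously generated token
        # state:     (B, H)    previous decoder hidden state
        # topic_enc: (B, T, H) encoded topic words (the copy source)
        # topic_ids: (B, T)    vocabulary ids of the topic words
        # knowledge: (B, K, D) external knowledge vectors
        state = self.cell(prev_emb, state)

        # Content selection: sigmoid gate per knowledge vector, then pool.
        expanded = state.unsqueeze(1).expand(-1, knowledge.size(1), -1)
        gate = torch.sigmoid(self.select_gate(torch.cat([expanded, knowledge], dim=-1)))
        selected = (gate * self.knowledge_proj(knowledge)).mean(dim=1)    # (B, H)

        # Attention over topic words, informed by the selected knowledge.
        scores = torch.bmm(topic_enc, self.attn(state + selected).unsqueeze(2)).squeeze(2)
        attn = F.softmax(scores, dim=-1)                                  # (B, T)
        context = torch.bmm(attn.unsqueeze(1), topic_enc).squeeze(1)      # (B, H)

        # Copy mechanism: interpolate vocabulary and copy distributions.
        features = torch.cat([state, context], dim=-1)
        vocab_dist = F.softmax(self.vocab_out(features), dim=-1)
        p_gen = torch.sigmoid(self.p_gen(features))                       # (B, 1)
        final = p_gen * vocab_dist
        final = final.scatter_add(1, topic_ids, (1 - p_gen) * attn)       # add copy mass
        return final, state
```

In this reading, the p_gen switch lets the decoder place topic words directly into the essay, which would help preserve topic consistency, while the selection gate filters the knowledge vectors before they shape the attention context; how the paper's improved prefix tuning is applied on top of such a decoder is not shown here.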

