TrICy: Trigger-guided Data-to-text Generation with Intent aware Attention-Copy (2402.01714v1)

Published 25 Jan 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Data-to-text (D2T) generation is a crucial task in many natural language understanding (NLU) applications and forms the foundation of task-oriented dialog systems. In the context of conversational AI solutions that can work directly with local data on the user's device, architectures utilizing large pre-trained language models (PLMs) are impractical for on-device deployment due to a high memory footprint. To this end, we propose TrICy, a novel lightweight framework for an enhanced D2T task that generates text sequences based on the intent in context and may further be guided by user-provided triggers. We leverage an attention-copy mechanism to predict out-of-vocabulary (OOV) words accurately. Performance analyses on the E2E NLG dataset (BLEU: 66.43%, ROUGE-L: 70.14%), the WebNLG dataset (BLEU: Seen 64.08%, Unseen 52.35%), and our custom dataset related to text-messaging applications showcase our architecture's effectiveness. Moreover, we show that by leveraging an optional trigger input, data-to-text generation quality increases significantly and achieves a new SOTA score of 69.29% BLEU for E2E NLG. Furthermore, our analyses show that TrICy achieves at least 24% and 3% improvement in BLEU and METEOR, respectively, over LLMs like GPT-3, ChatGPT, and Llama 2. We also demonstrate that in some scenarios, performance improvement due to triggers is observed even when they are absent in training.
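
The abstract highlights an attention-copy mechanism for predicting out-of-vocabulary (OOV) words. As a rough illustration only, and not the paper's actual architecture, the sketch below shows one decoding step of a generic pointer-generator-style copy mechanism: attention weights over the source tokens are reused as a copy distribution and mixed with the generation distribution via a learned gate, so OOV source tokens can still receive probability mass. All names, shapes, and the dot-product scoring are assumptions made for the example.

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_copy_step(dec_state, enc_states, src_ids, vocab_size, W_out, w_gate):
    # Illustrative pointer-generator-style copy step; not TrICy's actual model.
    # Attention over encoder states (dot-product scoring, for brevity).
    attn = softmax(enc_states @ dec_state)                            # (T,)
    context = attn @ enc_states                                       # (d,)

    # Generation distribution over the fixed output vocabulary.
    features = np.concatenate([dec_state, context])                   # (2d,)
    p_vocab = softmax(W_out @ features)                               # (V,)

    # Soft gate: probability of copying a source token instead of generating.
    p_copy = 1.0 / (1.0 + np.exp(-w_gate @ features))

    # Mix the two distributions over an extended vocabulary, so OOV source
    # tokens (ids >= vocab_size) can still receive probability via copying.
    extended = max(vocab_size, int(src_ids.max()) + 1)
    p_final = np.zeros(extended)
    p_final[:vocab_size] = (1.0 - p_copy) * p_vocab
    np.add.at(p_final, src_ids, p_copy * attn)
    return p_final

# Toy usage with random tensors (d = 8 hidden size, T = 5 source tokens, V = 50).
rng = np.random.default_rng(0)
d, T, V = 8, 5, 50
dist = attention_copy_step(
    dec_state=rng.normal(size=d),
    enc_states=rng.normal(size=(T, d)),
    src_ids=np.array([3, 17, 52, 52, 9]),   # id 52 stands in for an OOV word
    vocab_size=V,
    W_out=rng.normal(size=(V, 2 * d)),
    w_gate=rng.normal(size=2 * d),
)
assert abs(dist.sum() - 1.0) < 1e-6

In such a setup, the copy gate lets the decoder fall back on source tokens (e.g., entity names from the input record) whenever generating from the fixed vocabulary is unreliable, which is the general idea the abstract attributes to its attention-copy component.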

Authors (5)
  1. Vibhav Agarwal (8 papers)
  2. Sourav Ghosh (28 papers)
  3. Harichandana BSS (2 papers)
  4. Himanshu Arora (12 papers)
  5. Barath Raj Kandur Raja (7 papers)
Citations (1)

