Self-training from Self-memory in Data-to-text Generation (2401.10567v1)
Abstract: This paper introduces a novel training method, self-training from self-memory (STSM), for data-to-text generation (DTG), which allows the model to self-train on subsets that include self-memory, i.e., outputs inferred directly from the trained models, and/or new data. The quality of self-memory is validated by two models, data-to-text (D2T) and text-to-data (T2D), under two pre-defined conditions: (1) all source values appear in the D2T model's outputs, and (2) the T2D model can convert those outputs back into the source data. We use a greedy algorithm to generate shorter D2T outputs that still contain all source values. Subsequently, we use the T2D model to confirm that these outputs capture the input relationships by demonstrating that they can be converted back into data. With 30% of the dataset, we can train the D2T model to a performance competitive with full training in the same setup. We experiment with our model on two datasets, E2E NLG and DART. STSM gives the D2T model a generalization capability from its subset memory while reducing the training data volume. Ultimately, we anticipate that this paper will contribute to continual learning solutions that adapt to new training data by incorporating it as a form of self-memory in DTG tasks. The curated dataset is publicly available at: https://github.com/hoangthangta/STSM.
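Because the abstract compresses the whole pipeline into a few sentences, a small sketch may help make the two validation conditions and the greedy shortening step concrete. The following Python code is only an illustrative reading of the abstract, not the authors' implementation: `d2t_generate` and `t2d_parse` are hypothetical callables standing in for the trained D2T and T2D models, and source records are assumed to be flat slot-value dictionaries (in the style of E2E NLG meaning representations).

```python
# A minimal, illustrative sketch (not the authors' code) of the self-memory
# validation pipeline described in the abstract. `d2t_generate` and `t2d_parse`
# are hypothetical stand-ins for the trained D2T and T2D models; source records
# are assumed to be flat slot-value dicts. The authors' actual greedy procedure
# may differ in detail.

from typing import Callable, Dict, Optional


def contains_all_values(text: str, source: Dict[str, str]) -> bool:
    """Condition (1): every source value must appear in the generated text."""
    lowered = text.lower()
    return all(str(value).lower() in lowered for value in source.values())


def greedy_shorten(text: str, source: Dict[str, str]) -> str:
    """Greedily drop sentences as long as all source values remain covered,
    yielding a shorter output that still satisfies condition (1)."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    i = 0
    while i < len(sentences):
        candidate = sentences[:i] + sentences[i + 1:]
        joined = ". ".join(candidate) + "." if candidate else ""
        if candidate and contains_all_values(joined, source):
            sentences = candidate  # sentence i was redundant; drop it
        else:
            i += 1                 # sentence i is needed; keep it
    return ". ".join(sentences) + "."


def accept_as_self_memory(
    source: Dict[str, str],
    d2t_generate: Callable[[Dict[str, str]], str],
    t2d_parse: Callable[[str], Dict[str, str]],
) -> Optional[str]:
    """Return a validated (and shortened) output to reuse as self-memory,
    or None if either condition fails."""
    text = d2t_generate(source)
    if not contains_all_values(text, source):            # condition (1)
        return None
    text = greedy_shorten(text, source)
    return text if t2d_parse(text) == source else None   # condition (2)
```

In this reading, only outputs that pass both checks are kept as self-memory for the next round of self-training; anything that drops a source value or fails to round-trip through the T2D model is discarded.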