Recent Advances in Neural Text Generation: A Task-Agnostic Survey (2203.03047v3)
Abstract: In recent years, considerable research has been dedicated to the application of neural models in the field of natural language generation (NLG). The primary objective is to generate text that is both linguistically natural and human-like, while also exerting control over the generation process. This paper offers a comprehensive and task-agnostic survey of the recent advancements in neural text generation. These advancements have been facilitated through a multitude of developments, which we categorize into four key areas: data construction, neural frameworks, training and inference strategies, and evaluation metrics. By examining these different aspects, we aim to provide a holistic overview of the progress made in the field. Furthermore, we explore the future directions for the advancement of neural text generation, which encompass the utilization of neural pipelines and the incorporation of background knowledge. These avenues present promising opportunities to further enhance the capabilities of NLG systems. Overall, this survey serves to consolidate the current state of the art in neural text generation and highlights potential avenues for future research and development in this dynamic field.
- Knowledge graph based synthetic corpus generation for knowledge-enhanced language model pre-training. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
- Unified pre-training for program understanding and generation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
- STORIUM: A Dataset and Evaluation Platform for Machine-in-the-Loop Story Generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Arwa I Alhussain and Aqil M Azmi. 2021. Automatic story generation: A survey of approaches. ACM Computing Surveys (CSUR), 54(5):1–38.
- Jointly measuring diversity and quality in text generation models. In Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation.
- Agreement is overrated: A plea for correlation to assess human evaluation reliability. In Proceedings of the 12th International Conference on Natural Language Generation.
- The use of rating and Likert scales in natural language generation human evaluation tasks: A review and some recommendations. In Proceedings of the 12th International Conference on Natural Language Generation.
- AraGPT2: Pre-trained transformer for Arabic language generation. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, pages 196–207, Kyiv, Ukraine (Virtual). Association for Computational Linguistics.
- Pens: A dataset and generic framework for personalized news headline generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 82–92.
- A brief survey of deep reinforcement learning. IEEE Signal Processing Magazine, 34.
- Variational attention for sequence-to-sequence models. In Proceedings of the 27th International Conference on Computational Linguistics.
- Stochastic Wasserstein autoencoder for probabilistic sentence generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
- Recent advances in adversarial training for adversarial robustness. arXiv preprint arXiv:2102.01356.
- Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pages 65–72, Ann Arbor, Michigan. Association for Computational Linguistics.
- Yonatan Belinkov and James Glass. 2019. Analysis methods in neural language processing: A survey. Transactions of the Association for Computational Linguistics, 7.
- Anja Belz and Eric Kow. 2010. Extracting parallel fragments from comparable corpora for data-to-text generation. In Proceedings of the 6th International Natural Language Generation Conference.
- What takes the brain so long: Object recognition at the level of minimal images develops for up to seconds of presentation time.
- Generating sentences from a continuous space. In Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning.
- Faeze Brahman and Snigdha Chaturvedi. 2020a. Modeling protagonist emotions for emotion-aware storytelling. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Faeze Brahman and Snigdha Chaturvedi. 2020b. Modeling protagonist emotions for emotion-aware storytelling. arXiv preprint arXiv:2010.06822.
- Skeleton-to-response: Dialogue generation guided by retrieval memory. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
- CRF autoencoder for unsupervised dependency parsing. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.
- TAG : Type auxiliary guiding for code comment generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Retrieve, rerank and rewrite: Soft template based neural summarization. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
- Neural data-to-text generation: A comparison between pipeline and end-to-end architectures. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- Evaluation of text generation: A survey. arXiv preprint arXiv:2006.14799.
- Neural keyphrase generation via reinforcement learning with adaptive rewards. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.
- Modeling personalization in continuous space for response generation via augmented Wasserstein autoencoders. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- “my way of telling a story”: Persona based grounded story generation. In Proceedings of the Second Workshop on Storytelling, pages 11–21.
- Khyathi Raghavi Chandu and Alan W Black. 2020. Positioning yourself in the maze of neural text generation: A task-agnostic survey. arXiv preprint arXiv:2010.07279.
- Neural data-to-text generation with LM-based text augmentation. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Online.
- KGPT: Knowledge-grounded pre-training for data-to-text generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Kgpt: Knowledge-grounded pre-training for data-to-text generation. arXiv preprint arXiv:2010.02307.
- A working memory model for task-oriented dialog response generation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2687–2693.
- Distilling knowledge learned in BERT for text generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Cross-modal memory networks for radiology report generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 5904–5914.
- Align-refine: Non-autoregressive speech recognition via iterative realignment. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
- Jen-Tzung Chien and Chun-Wei Wang. 2019. Variational and hierarchical recurrent autoencoder. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 3202–3206. IEEE.
- Empirical evaluation of gated recurrent neural networks on sequence modeling. In NIPS 2014 Workshop on Deep Learning, December 2014.
- Sentence mover’s similarity: Automatic evaluation for multi-sentence texts. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.
- Elizabeth Clark and Noah A Smith. 2021. Choose your own adventure: Paired suggestions in collaborative writing for evaluating story generation models. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3566–3575.
- Freenet: A distributed anonymous information storage and retrieval system. In Designing privacy enhancing technologies, pages 46–66. Springer.
- DAL: Dual adversarial learning for dialogue generation. In Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation.
- Focus-constrained attention mechanism for CVAE-based response generation. In Findings of the Association for Computational Linguistics: EMNLP 2020, Online.
- A survey of multilingual neural machine translation. ACM Computing Surveys (CSUR), 53(5):1–38.
- Marco Damonte and Shay B. Cohen. 2019. Structural neural encoders for AMR-to-text generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
- Ernest Davis and Gary Marcus. 2015. Commonsense reasoning and commonsense knowledge in artificial intelligence. Communications of the ACM, 58(9):92–103.
- Spot the bot: A robust and efficient framework for the evaluation of conversational dialogue systems. arXiv preprint arXiv:2010.02140.
- BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
- A survey of natural language generation. arXiv preprint arXiv:2112.11739.
- Position information in transformers: An overview. arXiv preprint arXiv:2102.11090.
- Automatic text summarization: A comprehensive survey. Expert Systems with Applications, 165:113679.
- Template-based question generation from retrieved sentences for improved unsupervised question answering. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Variational recurrent auto-encoders. CoRR, abs/1412.6581.
- Hierarchical neural story generation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
- Maskgan: better text generation via filling in the_. ICLR.
- Jessica Ficler and Yoav Goldberg. 2017. Controlling linguistic style aspects in neural language generation. In Proceedings of the Workshop on Stylistic Variation.
- Yao Fu and Yansong Feng. 2018. Natural answer generation with heterogeneous memory. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers).
- Partially-aligned data-to-text generation with distant supervision. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- End-to-end conversation modeling : Moving beyond chitchat dstc 7 task 2 description ( v 1 . 0 ). In DSTC7 workshop.
- Rashmi Gangadharaiah and Balakrishnan Narayanaswamy. 2020. Recursive template-based frame generation for task oriented dialog. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Judge the judges: A large-scale evaluation study of neural language models for online review generation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- The webnlg challenge: Generating text from rdf data. In Proceedings of the 10th International Conference on Natural Language Generation, pages 124–133.
- Albert Gatt and Emiel Krahmer. 2018. Survey of the state of the art in natural language generation: Core tasks, applications and evaluation. Journal of Artificial Intelligence Research, 61:65–170.
- The GEM benchmark: Natural language generation, its evaluation and metrics. In Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021).
- Improving human text comprehension through semi-Markov CRF-based neural section title generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
- Plot-guided adversarial example construction for evaluating open-domain story generation. arXiv preprint arXiv:2104.05801.
- Better automatic evaluation of open-domain dialogue systems with contextualized embeddings. In Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation.
- Yoav Goldberg and Omer Levy. 2014. word2vec explained: deriving mikolov et al.’s negative-sampling word-embedding method. arXiv preprint arXiv:1402.3722.
- Content planning for neural story generation with aristotelian rescoring. arXiv preprint arXiv:2009.09870.
- Table-to-text generation with effective hierarchical encoder on three dimensions (row, column and time). In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- Enhanced transformer model for data-to-text generation. In Proceedings of the 3rd Workshop on Neural Generation and Translation, Hong Kong.
- TeaForN: Teacher-forcing with n-grams. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Knowledge distillation: A survey. ArXiv, abs/2006.05525.
- Recurrent independent mechanisms. arXiv preprint arXiv:1909.10893.
- Professor forcing: A new algorithm for training recurrent networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems.
- The R-U-a-robot dataset: Helping avoid chatbot deception by detecting user questions about human or non-human identity. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- PRAL: A tailored pre-training model for task-oriented dialog generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers).
- A knowledge-enhanced pretraining model for commonsense story generation. Transactions of the Association for Computational Linguistics, 8.
- Jian Guan and Minlie Huang. 2020a. Union: An unreferenced metric for evaluating open-ended story generation. arXiv preprint arXiv:2009.07602.
- Jian Guan and Minlie Huang. 2020b. UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Long text generation by modeling sentence-level and discourse-level coherence. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- OpenMEVA: A benchmark for evaluating open-ended story generation metrics. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- Frank Guerin. 2021. Projection: A mechanism for human-like reasoning in artificial intelligence. arXiv preprint arXiv:2103.13512.
- Hongyu Guo. 2015. Generating text with deep reinforcement learning. arXiv preprint arXiv:1510.09202.
- Non-autoregressive neural machine translation with enhanced decoder input. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 3723–3730.
- Controlling dialogue generation with semantic exemplars. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online.
- The FLORES evaluation datasets for low-resource machine translation: Nepali–English and Sinhala–English. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- Md Akmal Haidar and Mehdi Rezagholizadeh. 2019. Textkd-gan: Text generation using knowledge distillation and generative adversarial networks. In Canadian conference on artificial intelligence, pages 107–118. Springer.
- Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine translation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Unifying human and statistical evaluation for natural language generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
- Quantifying exposure bias for neural language generation. ArXiv, abs/1905.10617.
- The curious case of neural text degeneration. In International Conference on Learning Representations.
- A survey of deep learning applied to story generation. In International Conference on Smart Computing and Communication, pages 1–10. Springer.
- Visual story post-editing. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.
- Toward controlled generation of text. In International Conference on Machine Learning, pages 1587–1596. PMLR.
- Argument generation with retrieval, planning, and realization. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.
- Automatic dialogue generation with expressed emotions. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers).
- Supervised word mover’s distance. Advances in neural information processing systems, 29:4862–4870.
- Improving Chinese story generation via awareness of syntactic dependencies and semantics. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 178–185, Online only. Association for Computational Linguistics.
- Knowledge graph-augmented abstractive summarization with semantic-driven cloze reward. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Visual storytelling. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1233–1239.
- Tatsuya Ide and Daisuke Kawahara. 2021. Multi-task learning of generation and classification for emotion-aware dialogue response generation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop.
- Cobot in lambdamoo: A social statistics agent. In AAAI/IAAI.
- Mnnfast: A fast and scalable system architecture for memory-augmented neural networks. In Proceedings of the 46th International Symposium on Computer Architecture, pages 250–263.
- Language generation with multi-hop reasoning on commonsense knowledge graph. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- An information retrieval approach to short text conversation. ArXiv, abs/1408.6988.
- Robust pre-training by adversarial contrastive learning. In NeurIPS.
- Recent advances of neural text generation: Core tasks, datasets, models and challenges. Science China Technological Sciences, pages 1–21.
- Anjuli Kannan and Oriol Vinyals. 2017. Adversarial evaluation of dialogue models. arXiv preprint arXiv:1701.08198.
- Chris Kedzie and Kathleen McKeown. 2019. A good sample is hard to find: Noise injection sampling and self-training for neural language generation models. In Proceedings of the 12th International Conference on Natural Language Generation.
- Diederik P. Kingma and Max Welling. 2014. Auto-Encoding Variational Bayes. In 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings.
- Svetlana Kiritchenko and Saif M Mohammad. 2017. Best-worst scaling more reliable than rating scales: A case study on sentiment intensity annotation. arXiv preprint arXiv:1712.01765.
- Linguistically-informed specificity and semantic plausibility for dialogue generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
- Vaibhav Kumar and Alan W Black. 2020. ClarQ: A large-scale and diverse dataset for clarification question generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Generalizing image captions for image-text parallel corpus. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 790–796.
- Implicit unlikelihood training: Improving neural text generation with reinforcement learning. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume.
- Building machines that learn and think like people. CoRR, abs/1604.00289.
- Dbpedia–a large-scale, multilingual knowledge base extracted from wikipedia. Semantic web, 6(2):167–195.
- BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Deep reinforcement learning for dialogue generation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.
- Improving encoder by auxiliary supervision tasks for table-to-text generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 5979–5989.
- Don’t say that! making inconsistent dialogue unlikely with unlikelihood training. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Acute-eval: Improved dialogue evaluation with optimized questions and multi-turn comparisons. arXiv preprint arXiv:1909.03087.
- Why attention? analyze bilstm deficiency and its remedies in the case of ner. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 8236–8244.
- Conversations are not flat: Modeling the dynamic information flow across dialogue utterances. arXiv preprint arXiv:2106.02227.
- XGLUE: A new benchmark datasetfor cross-lingual pre-training, understanding and generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Xglue: A new benchmark datasetfor cross-lingual pre-training, understanding and generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6008–6018.
- Probabilistically masked language model capable of autoregressive generation in arbitrary word order. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Shellcode_IA32: A dataset for automatic shellcode generation. In Proceedings of the 1st Workshop on Natural Language Processing for Programming (NLP4Prog 2021).
- CommonGen: A constrained text generation challenge for generative commonsense reasoning. In Findings of the Association for Computational Linguistics: EMNLP 2020, Online.
- Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81, Barcelona, Spain. Association for Computational Linguistics.
- Hugo Liu and Push Singh. 2004. Conceptnet—a practical commonsense reasoning tool-kit. BT technology journal, 22(4):211–226.
- Data boost: Text data augmentation through reinforcement learning guided conditional generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Towards comprehensive description generation from factual attribute-value tables. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.
- Yixin Liu and Pengfei Liu. 2021. SimCLS: A simple framework for contrastive learning of abstractive summarization. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers).
- A data-centric framework for composable NLP workflows. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations.
- Chi-kiu Lo. 2017. MEANT 2.0: Accurate semantic MT evaluation for any output language. In Proceedings of the Second Conference on Machine Translation.
- Phonetically-grounded language generation: The case of tongue twisters. arXiv preprint arXiv:2306.03457.
- Neural text generation: Past, present and beyond. arXiv preprint arXiv:1803.07133.
- Make templates smarter: A template based Data2Text system powered by text stitch model. In Findings of the Association for Computational Linguistics: EMNLP 2020.
- FlowSeq: Non-autoregressive conditional sequence generation with generative flow. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- Emma Manning. 2019. A partially rule-based approach to AMR generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop.
- Gary Marcus. 2020. The next decade in AI: four steps towards robust artificial intelligence. CoRR, abs/2002.06177.
- Shikib Mehri and Maxine Eskenazi. 2020. USR: An unsupervised and reference free evaluation metric for dialog generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Generation-distillation for efficient natural language understanding in low-data settings. In Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019).
- Step-by-step: Separating planning from realization in neural data-to-text generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
- A corpus and cloze evaluation for deeper understanding of commonsense stories. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 839–849.
- Recent advances in deep learning based dialogue systems: A systematic survey. arXiv preprint arXiv:2105.04387.
- Operation-guided neural networks for high fidelity data-to-text generation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.
- I like fish, especially dolphins: Addressing contradictions in dialogue modeling. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- Reinforcement learning with imbalanced dataset for data-to-text medical report generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, pages 2223–2236.
- Reinforcement learning with imbalanced dataset for data-to-text medical report generation. In Findings of the Association for Computational Linguistics: EMNLP 2020.
- Why we need new evaluation metrics for NLG. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.
- The E2E dataset: New challenges for end-to-end generation. In Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue.
- RankME: Reliable human ratings for natural language generation. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers).
- IRT-based aggregation model of crowdsourced pairwise comparison for evaluating machine translations. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.
- Contrastive learning for many-to-many multilingual neural machine translation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- Towards holistic and automatic evaluation of open-domain dialogue generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, Philadelphia, Pennsylvania, USA. Association for Computational Linguistics.
- ToTTo: A controlled table-to-text generation dataset. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- DORB: Dynamically optimizing multiple rewards with bandits. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Text generation with exemplar-based adaptive decoding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
- Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers).
- Using meta-knowledge mined from identifiers to improve intent recognition in conversational systems. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- ProphetNet-X: Large-scale pre-training models for English, Chinese, multi-lingual, dialog, and code generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations.
- ProphetNet: Predicting future n-gram for sequence-to-SequencePre-training. In Findings of the Association for Computational Linguistics: EMNLP 2020.
- Glancing transformer for non-autoregressive neural machine translation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- Pre-trained models for natural language processing: A survey. Science China Technological Sciences, pages 1–26.
- Language models are unsupervised multitask learners. OpenAI blog, 1(8):9.
- PlotMachines: Outline-conditioned generation with dynamic plot state tracking. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Ehud Reiter. 2007. An architecture for data-to-text systems. In Proceedings of the Eleventh European Workshop on Natural Language Generation (ENLG 07), pages 97–104.
- Ehud Reiter. 2018. A structured review of the validity of BLEU. Computational Linguistics, 44(3).
- Ehud Reiter and Robert Dale. 1997. Building applied natural language generation systems. Natural Language Engineering, 3(1):57–87.
- Ehud Reiter and Robert Dale. 2000. Building Natural Language Generation Systems. Studies in Natural Language Processing. Cambridge University Press.
- Scalable and accurate dialogue state tracking via hierarchical sequence generation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- The Code2Text challenge: Text generation in source libraries. In Proceedings of the 10th International Conference on Natural Language Generation.
- A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 8.
- Object hallucination in image captioning. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.
- Gustavo H de Rosa and João P Papa. 2021. A survey on text generation using generative adversarial networks. Pattern Recognition, page 108098.
- SemEval-2017 task 4: Sentiment analysis in Twitter. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 502–518.
- Keisuke Sakaguchi and Benjamin Van Durme. 2018. Efficient online scalar annotation with bounded support. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
- Sashank Santhanam and Samira Shaikh. 2019. A survey of natural language generation techniques with a focus on dialogue systems-past, present and future directions. arXiv preprint arXiv:1906.00500.
- Florian Schmidt and Thomas Hofmann. 2018. Deep state space models for unconditional word generation. In NeurIPS.
- Autoregressive text generation beyond feedback loops. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- BLEURT: Learning robust metrics for text generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- On accurate evaluation of GANs for language generation. arXiv preprint arXiv:1806.04936.
- Building end-to-end dialogue systems using generative hierarchical neural network models. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30.
- A hierarchical latent variable encoder-decoder model for generating dialogues. In Proceedings of the AAAI Conference on Artificial Intelligence.
- Fei Sha and Fernando Pereira. 2003. Shallow parsing with conditional random fields. In Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, pages 213–220.
- Long and diverse text generation with planning-based hierarchical variational model. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- Conceptual captions: A cleaned, hypernymed, image alt-text dataset for automatic image captioning. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2556–2565.
- Self-attention with relative position representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers).
- Image-chat: Engaging grounded conversations. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- MASS: Masked sequence to sequence pre-training for language generation. In Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 5926–5936.
- An ensemble of retrieval-based and generation-based human-computer conversation systems. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18.
- A hierarchical recurrent encoder-decoder for generative context-aware query suggestion. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management, pages 553–562.
- ConceptNet 5.5: An open multilingual graph of general knowledge. In Thirty-First AAAI Conference on Artificial Intelligence.
- Does knowledge distillation really work? arXiv preprint arXiv:2106.05945.
- MovieChats: Chat like humans in a closed domain. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6605–6619.
- Non-autoregressive text generation with pre-trained language models. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume.
- End-to-end memory networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2, NIPS’15.
- Dima Suleiman and Arafat Awajan. 2020. Deep learning based abstractive text summarization: Approaches, datasets, evaluation measures, and challenges. Mathematical Problems in Engineering, 2020.
- Adding chit-chat to enhance task-oriented dialogues. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
- LSTM neural networks for language modeling. In Thirteenth Annual Conference of the International Speech Communication Association.
- EtriCA: Event-triggered context-aware story generation augmented by cross attention. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 5504–5518, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Terminology-aware medical dialogue generation. arXiv preprint arXiv:2210.15551.
- NGEP: A graph-based event planning framework for story generation. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 186–193, Online only. Association for Computational Linguistics.
- Natural language generation for effective knowledge distillation. In Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019).
- RUBER: An unsupervised method for automatic evaluation of open-domain dialog systems. In Thirty-Second AAAI Conference on Artificial Intelligence.
- Craig Thomson and Ehud Reiter. 2021. Generation challenges: Results of the accuracy evaluation shared task. arXiv preprint arXiv:2108.05644.
- Learning to abstract for memory-augmented conversational response generation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3816–3825.
- CHARM: Inferring personal attributes from conversations. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5391–5404.
- Wasserstein auto-encoders. In 6th International Conference on Learning Representations (ICLR 2018). OpenReview.net.
- Ruben S van Bergen and Nikolaus Kriegeskorte. 2020. Going in circles is the way forward: the role of recurrence in visual inference. Current Opinion in Neurobiology, 65:176–193. Whole-brain interactions between neural circuits.
- Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008.
- Denny Vrandečić and Markus Krötzsch. 2014. Wikidata: a free collaborative knowledgebase. Communications of the ACM, 57(10):78–85.
- GLUE: A multi-task benchmark and analysis platform for natural language understanding. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP.
- T3: Tree-autoencoder constrained adversarial text generation for targeted attack. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- A template-guided hybrid pointer network for knowledge-based task-oriented dialogue systems. In Proceedings of the 1st Workshop on Document-grounded Dialogue and Conversational Question Answering (DialDoc 2021).
- WikiGraphs: A Wikipedia text-knowledge graph paired dataset. In Proceedings of the Fifteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-15), pages 67–82.
- Yicheng Wang and Mohit Bansal. 2018. Robust machine comprehension models via adversarial training. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers).
- Pete Warden. 2018. Speech commands: A dataset for limited-vocabulary speech recognition. arXiv preprint arXiv:1804.03209.
- Hierarchical quantized representations for script generation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.
- Neural text generation with unlikelihood training. arXiv preprint arXiv:1908.04319.
- Memory networks. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
- Guiding variational response generator to exploit persona. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Personalized response generation via generative split memory network. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
- KM-BART: Knowledge enhanced multimodal BART for visual commonsense generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- XLPT-AMR: Cross-lingual pre-training via multi-task learning for zero-shot AMR parsing and text generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- Incorporating external knowledge through pre-training for natural language to code generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Discovering dialog structure graph for coherent dialog generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- MEGATRON-CNTRL: Controllable story generation with external knowledge using large-scale language models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Personal information leakage detection in conversations. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6567–6580.
- AggGen: Ordering and aggregating while generating. arXiv preprint arXiv:2106.05580.
- Reinforcement learning for abstractive question summarization with question-aware semantic rewards. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers).
- A survey of deep learning techniques for neural machine translation. arXiv preprint arXiv:2002.07526.
- Semi-supervised QA with generative domain-adaptive nets. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
- Plan-and-write: Towards better automatic storytelling. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 7378–7385.
- Keep CALM and explore: Language models for action generation in text-based games. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Denis Yarats and Mike Lewis. 2018. Hierarchical text generation and planning for strategic dialogue. In International Conference on Machine Learning, pages 5591–5599. PMLR.
- Variational hierarchical dialog autoencoder for dialog state tracking data augmentation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- STAIR captions: Constructing a large-scale Japanese image caption dataset. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers).
- SeqGAN: Sequence generative adversarial nets with policy gradient. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.
- Dirichlet latent variable hierarchical recurrent encoder-decoder in dialogue generation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1267–1272.
- A hybrid model for globally coherent story generation. In Proceedings of the Second Workshop on Storytelling.
- Fast interleaved bidirectional sequence generation. In Proceedings of the Fifth Conference on Machine Translation.
- DynaEval: Unifying turn and dialogue level evaluation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- Probabilistic verb selection for data-to-text generation. Transactions of the Association for Computational Linguistics, 6.
- Self-attention generative adversarial networks. In International Conference on Machine Learning, pages 7354–7363. PMLR.
- Flexible and creative Chinese poetry generation using neural memory. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
- Quanshi Zhang and Song-Chun Zhu. 2018. Visual interpretability for deep learning: a survey. arXiv preprint arXiv:1802.00614.
- BERTScore: Evaluating text generation with BERT. In ICLR.
- Bridging the gap between training and inference for neural machine translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.
- Unsupervised concept representation learning for length-varying text similarity. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
- DIALOGPT: Large-scale generative pre-training for conversational response generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations.
- A survey on neural network interpretability. IEEE Transactions on Emerging Topics in Computational Intelligence.
- Improve diverse text generation by self labeling conditional variational auto encoder. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2767–2771. IEEE.
- CADGE: Context-aware dialogue generation enhanced with graph-structured knowledge aggregation. arXiv preprint arXiv:2305.06294.
- Bridging the structural gap between encoding and decoding for data-to-text generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online.
- MoverScore: Text generation evaluating with contextualized embeddings and earth mover distance. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- Learning a simple and effective model for multi-turn response generation with auxiliary tasks. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Unsupervised context rewriting for open domain conversation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- Learning from perturbations: Diverse and informative dialogue generation with inverse adversarial training. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
- Multi-task learning with language modeling for question generation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
- Texygen: A benchmarking platform for text generation models. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval.
- A reinforced generation of adversarial examples for neural machine translation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
Authors: Chen Tang, Frank Guerin, Chenghua Lin