Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges (2405.15604v3)
Abstract: Text generation has become more accessible than ever, and growing interest in these systems, especially those using LLMs, has spurred an increasing number of related publications. We provide a systematic literature review comprising 244 selected papers published between 2017 and 2024. This review categorizes works in text generation into five main tasks: open-ended text generation, summarization, translation, paraphrasing, and question answering. For each task, we review its relevant characteristics, sub-tasks, and specific challenges (e.g., missing datasets for multi-document summarization, coherence in story generation, and complex reasoning for question answering). Additionally, we assess current approaches for evaluating text generation systems and identify problems with current metrics. Our investigation shows nine prominent challenges common to all tasks and sub-tasks in recent text generation publications: bias, reasoning, hallucinations, misuse, privacy, interpretability, transparency, datasets, and computing. We provide a detailed analysis of these challenges, their potential solutions, and the gaps that still require further engagement from the community. This systematic literature review targets two main audiences: early career researchers in natural language processing looking for an overview of the field and promising research directions, and experienced researchers seeking a detailed view of tasks, evaluation methodologies, open challenges, and recent mitigation strategies.
- Ehud Reiter and Robert Dale. 1997. Building applied natural language generation systems. Natural Language Engineering 3, 1 (1997), 57–87. https://doi.org/10.1017/S1351324997001502
- Automatic Generation of Technical Documentation. Applied Artificial Intelligence 9, 3 (1995), 259–287. https://doi.org/10.1080/08839519508945476
- Cross-Domain Detection of GPT-2-Generated Technical Text. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Seattle, United States, 1213–1233. https://doi.org/10.18653/v1/2022.naacl-main.88
- QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension. Comput. Surveys 55, 10 (2023), 1–45. https://doi.org/10.1145/3560260
- Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP. Transactions of the Association for Computational Linguistics 9 (2021), 1408–1424. https://doi.org/10.1162/tacl_a_00434
- The Limitations of Stylometry for Detecting Machine-Generated Fake News. Computational Linguistics 46, 2 (2020), 499–510. https://doi.org/10.1162/coli_a_00380
- BLEURT: Learning Robust Metrics for Text Generation. In Proc. of ACL. Association for Computational Linguistics, Online, 7881–7892. https://doi.org/10.18653/v1/2020.acl-main.704
- Controllable Text Generation Using Semantic Control Grammar. IEEE Access 11 (2023), 26329–26343. https://doi.org/10.1109/access.2023.3252017
- Multilingual Instruction Tuning With Just a Pinch of Multilinguality.
- KATG: Keyword-Bias-Aware Adversarial Text Generation for Text Classification. In Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI 2022, Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence, IAAI 2022, The Twelveth Symposium on Educational Advances in Artificial Intelligence, EAAI 2022 Virtual Event, February 22 - March 1, 2022. AAAI Press, 11294–11302.
- On the Evaluation Metrics for Paraphrase Generation. In Proc. of EMNLP. Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 3178–3190.
- The Woman Worked as a Babysitter: On Biases in Language Generation. In Proc. of EMNLP. Association for Computational Linguistics, Hong Kong, China, 3407–3412. https://doi.org/10.18653/v1/D19-1339
- Natural Language to Code Translation with Execution. In Proc. of EMNLP. Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 3533–3546.
- Rethinking Interpretability in the Era of LLMs.
- Gabriel Skantze. 2021. Turn-taking in Conversational Systems and Human-Robot Interaction: A Review. Computer Speech & Language 67 (2021), 101178. https://doi.org/10.1016/j.csl.2020.101178
- Congzheng Song and Vitaly Shmatikov. 2019. Auditing Data Provenance in Text-Generation Models. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4-8, 2019, Ankur Teredesai, Vipin Kumar, Ying Li, Rómer Rosales, Evimaria Terzi, and George Karypis (Eds.). ACM, 196–206. https://doi.org/10.1145/3292500.3330885
- Daniel Sonntag. 2004. Assessing the quality of natural language text data. Gesellschaft für Informatik e.V., 259–263.
- A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.acl-long.32
- Mitigating Gender Bias in Natural Language Processing: Literature Review. In Proc. of ACL. Association for Computational Linguistics, Florence, Italy, 1630–1640. https://doi.org/10.18653/v1/P19-1159
- Can ChatGPT Replace Traditional KBQA Models? An In-depth Analysis of the Question Answering Performance of the GPT LLM Family.
- XWikiGen: Cross-lingual Summarization for Encyclopedic Text Generation in Low Resource Languages. In Proceedings of the ACM Web Conference 2023. ACM. https://doi.org/10.1145/3543507.3583405
- Efficient Transformers: A Survey. ArXiv preprint abs/2009.06732 (2020).
- Gemini: A Family of Highly Capable Multimodal Models.
- The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models. In Proc. of EMNLP. Association for Computational Linguistics, Online, 107–118. https://doi.org/10.18653/v1/2020.emnlp-demos.15
- Automatic Detection of Bot-generated Tweets. In Proceedings of the 1st International Workshop on Multimedia AI against Disinformation. ACM. https://doi.org/10.1145/3512732.3533584
- LLaMA: Open and Efficient Foundation Language Models.
- TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation. In Findings of the Association for Computational Linguistics: EMNLP 2021. Association for Computational Linguistics, Punta Cana, Dominican Republic, 2001–2016. https://doi.org/10.18653/v1/2021.findings-emnlp.172
- Difficulty-Controllable Neural Question Generation for Reading Comprehension using Item Response Theory. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023). Association for Computational Linguistics, Toronto, Canada, 119–129. https://doi.org/10.18653/v1/2023.bea-1.10
- Improving Robustness of Machine Translation with Synthetic Noise. In Proc. of NAACL-HLT. Association for Computational Linguistics, Minneapolis, Minnesota, 1916–1920. https://doi.org/10.18653/v1/N19-1190
- Identifying Meaningful Citations.
- Attention is All you Need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). 5998–6008.
- CIDEr: Consensus-based image description evaluation. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7-12, 2015. IEEE Computer Society, 4566–4575. https://doi.org/10.1109/CVPR.2015.7299087
- Corpus annotation with paraphrase types: new annotation scheme and inter-annotator agreement measures. Language Resources and Evaluation 49, 1 (2015), 77–105. https://doi.org/10.1007/s10579-014-9272-5
- Is This a Paraphrase? What Kind? Paraphrase Boundaries and Typology. Open Journal of Modern Linguistics 04, 01 (2014), 205–218. https://doi.org/10.4236/ojml.2014.41016
- Prompting PaLM for Translation: Assessing Strategies and Performance.
- Paraphrase Types for Generation and Detection.
- We are Who We Cite: Bridges of Influence Between Natural Language Processing and Other Academic Fields. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Singapore, 12896–12913. https://doi.org/10.18653/v1/2023.emnlp-main.797
- Identifying Machine-Paraphrased Plagiarism. In Information for a Better World: Shaping the Global Future: 17th International Conference, iConference 2022, Virtual Event, February 28 – March 4, 2022, Proceedings, Part I. Springer-Verlag, Berlin, Heidelberg, 393–413. https://doi.org/10.1007/978-3-030-96957-8_34
- How Large Language Models are Transforming Machine-Paraphrase Plagiarism. In Proc. of EMNLP. Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 952–963.
- Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection. ArXiv preprint abs/2103.12450.
- AI Usage Cards: Responsibly Reporting AI-generated Content.
- Document-Level Machine Translation with Large Language Models.
- A Reinforced Topic-Aware Convolutional Sequence-to-Sequence Model for Abstractive Text Summarization. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden, Jérôme Lang (Ed.). ijcai.org, 4453–4460. https://doi.org/10.24963/ijcai.2018/619
- Bilateral Multi-Perspective Matching for Natural Language Sentences. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, August 19-25, 2017, Carles Sierra (Ed.). ijcai.org, 4144–4150. https://doi.org/10.24963/ijcai.2017/579
- Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration.
- Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints. In Proc. of ACL. Association for Computational Linguistics, Online, 1072–1086. https://doi.org/10.18653/v1/2020.acl-main.101
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.
- On Decoding Strategies for Neural Text Generators. Transactions of the Association for Computational Linguistics 10 (2022), 997–1012. https://doi.org/10.1162/tacl_a_00502
- A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference. In Proc. of NAACL-HLT. Association for Computational Linguistics, New Orleans, Louisiana, 1112–1122. https://doi.org/10.18653/v1/N18-1101
- Terry Winograd. 1972. The Automatic Generation of Natural Language Texts. Artificial Intelligence 3, 3-4 (1972), 185–231.
- Wen Xiao and Giuseppe Carenini. 2019. Extractive Summarization of Long Documents by Combining Global and Local Context. In Proc. of EMNLP. Association for Computational Linguistics, Hong Kong, China, 3011–3021. https://doi.org/10.18653/v1/D19-1298
- A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond.
- SQL-to-Text Generation with Graph-to-Sequence Model. In Proc. of EMNLP. Association for Computational Linguistics, Brussels, Belgium, 931–936. https://doi.org/10.18653/v1/D18-1112
- Neural Response Generation via GAN with an Approximate Embedding Layer. In Proc. of EMNLP. Association for Computational Linguistics, Copenhagen, Denmark, 617–626. https://doi.org/10.18653/v1/D17-1065
- Learning Structural Information for Syntax-Controlled Paraphrase Generation. In Findings of the Association for Computational Linguistics: NAACL 2022. Association for Computational Linguistics, Seattle, United States, 2079–2090. https://doi.org/10.18653/v1/2022.findings-naacl.160
- Kevin Yang and Dan Klein. 2021. FUDGE: Controlled Text Generation With Future Discriminators. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Online, 3511–3535. https://doi.org/10.18653/v1/2021.naacl-main.276
- Tailor: A Soft-Prompt-Based Approach to Attribute-Based Controlled Text Generation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.acl-long.25
- Re3: Generating Longer Stories With Recursive Reprompting and Revision. In Proc. of EMNLP. Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 4393–4479.
- XLNet: Generalized Autoregressive Pretraining for Language Understanding. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alché-Buc, Emily B. Fox, and Roman Garnett (Eds.). 5754–5764.
- Exchange-of-Thought: Enhancing Large Language Model Capabilities through Cross-Model Communication.
- A Survey of Knowledge-enhanced Text Generation. Comput. Surveys 54, 11s (2022), 1–38. https://doi.org/10.1145/3512467
- Wordcraft: Story Writing With Large Language Models. In 27th International Conference on Intelligent User Interfaces. ACM. https://doi.org/10.1145/3490099.3511105
- BARTScore: Evaluating Generated Text as Text Generation. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, Marc’Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, and Jennifer Wortman Vaughan (Eds.). 27263–27277.
- Machine Comprehension by Text-to-Text Neural Question Generation. In Proceedings of the 2nd Workshop on Representation Learning for NLP. Association for Computational Linguistics, Vancouver, Canada, 15–25. https://doi.org/10.18653/v1/W17-2603
- Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.acl-long.74
- HellaSwag: Can a Machine Really Finish Your Sentence? In Proc. of ACL. Association for Computational Linguistics, Florence, Italy, 4791–4800. https://doi.org/10.18653/v1/P19-1472
- Defending Against Neural Fake News. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alché-Buc, Emily B. Fox, and Roman Garnett (Eds.). 9051–9062.
- Prompting Large Language Model for Machine Translation: A Case Study.
- Pretraining-Based Natural Language Generation for Text Summarization. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL). Association for Computational Linguistics, Hong Kong, China, 789–797. https://doi.org/10.18653/v1/K19-1074
- PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization. In Proc. of ICML (Proceedings of Machine Learning Research, Vol. 119). PMLR, 11328–11339.
- JiaJun Zhang and ChengQing Zong. 2020. Neural machine translation: Challenges, progress and future. Science China Technological Sciences 63, 10 (2020), 2028–2050. https://doi.org/10.1007/s11431-020-1632-x
- BERTScore: Evaluating Text Generation with BERT. In Proc. of ICLR. OpenReview.net.
- Aspect Sentiment Quad Prediction as Paraphrase Generation. In Proc. of EMNLP. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 9209–9219. https://doi.org/10.18653/v1/2021.emnlp-main.726
- DIALOGPT : Large-Scale Generative Pre-training for Conversational Response Generation. In Proc. of ACL. Association for Computational Linguistics, Online, 270–278. https://doi.org/10.18653/v1/2020.acl-demos.30
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training. In Proc. of EMNLP. Association for Computational Linguistics, Online, 8649–8670. https://doi.org/10.18653/v1/2020.emnlp-main.698
- Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods. In Proc. of NAACL-HLT. Association for Computational Linguistics, New Orleans, Louisiana, 15–20. https://doi.org/10.18653/v1/N18-2003
- MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance. In Proc. of EMNLP. Association for Computational Linguistics, Hong Kong, China, 563–578. https://doi.org/10.18653/v1/D19-1053
- A Survey of Large Language Models.
- An Invariant Learning Characterization of Controlled Text Generation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.acl-long.179
- Multi-Reference Training with Pseudo-References for Neural Translation and Text Generation. In Proc. of EMNLP. Association for Computational Linguistics, Brussels, Belgium, 3188–3197. https://doi.org/10.18653/v1/D18-1357
- Neural Deepfake Detection with Factual Structure of Text. In Proc. of EMNLP. Association for Computational Linguistics, Online, 2461–2470. https://doi.org/10.18653/v1/2020.emnlp-main.193
- Detecting Hallucinated Content in Conditional Neural Sequence Generation. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Association for Computational Linguistics, Online, 1393–1404. https://doi.org/10.18653/v1/2021.findings-acl.120
- A Hierarchical Network for Abstractive Meeting Summarization with Cross-Domain Pretraining. In Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, Online, 194–203. https://doi.org/10.18653/v1/2020.findings-emnlp.19
- Texygen: A Benchmarking Platform for Text Generation Models. In Proc. of SIGIR, Kevyn Collins-Thompson, Qiaozhu Mei, Brian D. Davison, Yiqun Liu, and Emine Yilmaz (Eds.). ACM, 1097–1100. https://doi.org/10.1145/3209978.3210080
- ToolQA: A Dataset for LLM Question Answering with External Tools.
- Jonas Becker
- Jan Philip Wahle
- Bela Gipp
- Terry Ruas