ChatGPT vs Human-authored Text: Insights into Controllable Text Summarization and Sentence Style Transfer (2306.07799v2)
Abstract: Large-scale language models, like ChatGPT, have garnered significant media attention and stunned the public with their remarkable capacity for generating coherent text from short natural language prompts. In this paper, we conduct a systematic inspection of ChatGPT's performance in two controllable generation tasks, focusing on its ability to adapt its output to different target audiences (expert vs. layman) and writing styles (formal vs. informal). Additionally, we evaluate the faithfulness of the generated text and compare the model's performance with human-authored texts. Our findings indicate that the stylistic variations produced by humans are considerably larger than those demonstrated by ChatGPT, and that the generated texts diverge from human samples in several characteristics, such as the distribution of word types. Moreover, we observe that ChatGPT sometimes incorporates factual errors or hallucinations when adapting the text to suit a specific style.
- Vera Demberg
- Dongqi Liu