Transformers Go for the LOLs: Generating (Humourous) Titles from Scientific Abstracts End-to-End (2212.10522v2)
Abstract: We consider the end-to-end abstract-to-title generation problem, exploring seven recent transformer-based models (including ChatGPT) fine-tuned on more than 30k abstract-title pairs from NLP and ML venues. As an extension, we also consider the harder problem of generating humorous paper titles. For the latter, we compile the first large-scale humor-annotated dataset for scientific papers in the NLP/ML domains, comprising almost 2.6k titles. We evaluate all models using human and automatic metrics. Our human evaluation suggests that our best end-to-end system performs similarly to human authors (but arguably slightly worse). Generating funny titles is more difficult, however, and our automatic systems clearly underperform relative to humans and often learn dataset artefacts of humor. Finally, ChatGPT, without any fine-tuning, performs on the level of our best fine-tuned system.