Surveying the Landscape of Text Summarization with Deep Learning: A Comprehensive Review (2310.09411v1)
Abstract: In recent years, deep learning has revolutionized NLP by enabling models that learn complex representations of language data, leading to significant performance improvements across a wide range of NLP tasks. Deep learning models for NLP typically train deep neural networks on large amounts of data, allowing them to learn the patterns and relationships in language. This contrasts with traditional NLP approaches, which rely on hand-engineered features and rules. The ability of deep neural networks to learn hierarchical representations of language data, handle variable-length input sequences, and perform well on large datasets makes them well suited to NLP applications. Driven by the exponential growth of textual data and the increasing demand for condensed, coherent, and informative summaries, text summarization has become a critical research area in NLP. Applying deep learning to text summarization means using deep neural networks to perform summarization tasks. In this survey, we begin with a review of text summarization tasks that have been prominent in recent years, including extractive, abstractive, and multi-document summarization, among others. Next, we discuss most deep learning-based models and their experimental results on these tasks. The paper also covers datasets and data representation for summarization tasks. Finally, we delve into the opportunities and challenges associated with summarization tasks and their corresponding methodologies, aiming to inspire future research that advances the field further. A goal of our survey is to explain how these methods differ in their requirements, since understanding these differences is essential for choosing a technique suited to a specific setting.
- Multi-document summarization using cluster-based link analysis. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 299–306, 2008.
- Heterogeneous graph neural networks for extractive document summarization. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6209–6219, Online, July 2020. Association for Computational Linguistics.
- Salience allocation as guidance for abstractive summarization. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 6094–6106, Abu Dhabi, United Arab Emirates, December 2022. Association for Computational Linguistics.
- A survey on cross-lingual summarization. Transactions of the Association for Computational Linguistics, 10:1304–1323, 2022.
- Lu Wang and Claire Cardie. Domain-independent abstract generation for focused meeting summarization. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1395–1405, Sofia, Bulgaria, August 2013. Association for Computational Linguistics.
- Mark Wasson. Using leading text for news summaries: Evaluation results and implications for commercial summarization applications. In COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics, 1998.
- Tokenization as the initial phase in nlp. In COLING 1992 volume 4: The 14th international conference on computational linguistics, 1992.
- Ronald J Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Reinforcement learning, pages 5–32, 1992.
- Pay less attention with lightweight and dynamic convolutions. arXiv preprint arXiv:1901.10430, 2019.
- Zero-shot learning—a comprehensive evaluation of the good, the bad and the ugly. IEEE transactions on pattern analysis and machine intelligence, 41(9):2251–2265, 2018.
- Extractive summarization of long documents by combining global and local context. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3011–3021, Hong Kong, China, November 2019. Association for Computational Linguistics.
- Sequence level contrastive learning for text summarization. Proceedings of the AAAI Conference on Artificial Intelligence, 36(10):11556–11565, Jun. 2022.
- Automatic text summarization methods: A comprehensive review. arXiv preprint arXiv:2204.01849, 2022.
- Phrase-based compressive cross-language summarization. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 118–127, Lisbon, Portugal, September 2015. Association for Computational Linguistics.
- AdaptSum: Towards low-resource domain adaptation for abstractive summarization. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5892–5904, Online, June 2021. Association for Computational Linguistics.
- Few-shot query-focused summarization with prefix-merging. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 3704–3714, Abu Dhabi, United Arab Emirates, December 2022. Association for Computational Linguistics.
- Big bird: Transformers for longer sequences. Advances in neural information processing systems, 33:17283–17297, 2020.
- Abstractive cross-language summarization via translation model enhanced predicate argument structure fusing. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(10):1842–1853, 2016.
- Pegasus: Pre-training with extracted gap-sentences for abstractive summarization. In International Conference on Machine Learning, pages 11328–11339. PMLR, 2020.
- EmailSum: Abstractive email thread summarization. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 6895–6909, Online, August 2021. Association for Computational Linguistics.
- Bertscore: Evaluating text generation with bert. arXiv preprint arXiv:1904.09675, 2019.
- Generating character descriptions for automatic summarization of fiction. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01):7476–7483, Jul. 2019.
- HIBERT: Document level pre-training of hierarchical bidirectional transformers for document summarization. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5059–5069, Florence, Italy, July 2019. Association for Computational Linguistics.
- Dsgpt: Domain-specific generative pre-training of transformers for text generation in e-commerce title and review summarization. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2146–2150, 2021.
- MoverScore: Text generation evaluating with contextualized embeddings and earth mover distance. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 563–578, Hong Kong, China, November 2019. Association for Computational Linguistics.
- Query-focused summarization based on genetic algorithm. In 2010 International Conference on Measuring Technology and Mechatronics Automation, volume 2, pages 968–971. IEEE, 2010.
- Long-document cross-lingual summarization. In Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, pages 1084–1092, 2023.
- Subtopic-driven multi-document summarization. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3153–3162, Hong Kong, China, November 2019. Association for Computational Linguistics.
- Headline summarization at isi. In Proceedings of the HLT-NAACL 2003 text summarization workshop and document understanding conference (DUC 2003), pages 174–178. Citeseer, 2003.
- Template-filtered headline summarization. In Text Summarization Branches Out, pages 56–60, Barcelona, Spain, July 2004. Association for Computational Linguistics.
- Movie review mining and summarization. In Proceedings of the 15th ACM international conference on Information and knowledge management, pages 43–50, 2006.
- Guanghua Wang
- Weili Wu