RST-LoRA: A Discourse-Aware Low-Rank Adaptation for Long Document Abstractive Summarization (2405.00657v2)
Abstract: For long document summarization, discourse structure is important for discerning the key content of a text and the differences in importance between its sentences. Unfortunately, the integration of Rhetorical Structure Theory (RST) into parameter-efficient fine-tuning strategies for long document summarization remains unexplored. This paper therefore introduces RST-LoRA and proposes four RST-aware variants that explicitly incorporate RST into the LoRA model. Our empirical evaluation demonstrates that incorporating the type and uncertainty of rhetorical relations complementarily enhances the performance of LoRA on summarization tasks. Furthermore, the best-performing variant we introduce outperforms both vanilla LoRA and full-parameter fine-tuning, as confirmed by multiple automatic and human evaluations, and even surpasses previous state-of-the-art methods.
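The abstract does not spell out how the RST signal enters the LoRA computation, so the sketch below is only a hypothetical illustration of the general idea: a per-token discourse-importance weight, derived from relation type and parser uncertainty, scales the low-rank update while the pretrained weight stays frozen. The class name `RSTWeightedLoRALinear`, the `rst_weight` input, and this particular weighting scheme are assumptions for illustration, not the authors' implementation or any of the four proposed variants.

```python
# Hypothetical sketch of a discourse-weighted LoRA layer (assumed formulation,
# not the paper's exact method): the low-rank update is modulated per token by
# an RST-derived importance score.
import torch
import torch.nn as nn


class RSTWeightedLoRALinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)       # stands in for the frozen pretrained weight W0
        self.base.weight.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)   # low-rank factor A
        self.B = nn.Parameter(torch.zeros(d_out, r))          # low-rank factor B, zero-initialized
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor, rst_weight: torch.Tensor) -> torch.Tensor:
        # x:          (batch, seq_len, d_in) token representations
        # rst_weight: (batch, seq_len) discourse-importance scores in [0, 1],
        #             e.g. combining relation type and parser uncertainty (assumed input)
        delta = (x @ self.A.T) @ self.B.T * self.scaling      # vanilla LoRA update: (alpha / r) * x A^T B^T
        delta = delta * rst_weight.unsqueeze(-1)              # scale the update by the RST signal
        return self.base(x) + delta


# Toy usage with random stand-ins for parser-derived scores.
layer = RSTWeightedLoRALinear(d_in=768, d_out=768)
x = torch.randn(2, 16, 768)
rst_w = torch.rand(2, 16)
out = layer(x, rst_w)   # shape: (2, 16, 768)
```

Under this reading, tokens the discourse parser judges more central (e.g. nuclei of salient relations) contribute a larger share of the adapted update, while the frozen base model handles the rest.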
Authors: Dongqi Liu, Vera Demberg