QASE Enhanced PLMs: Improved Control in Text Generation for MRC (2403.04771v1)
Published 26 Feb 2024 in cs.CL
Abstract: To address out-of-control generation in generative models for machine reading comprehension (MRC), we introduce the Question-Attended Span Extraction (QASE) module. Integrated during the fine-tuning of pre-trained generative language models (PLMs), QASE enables these models to match state-of-the-art (SOTA) extractive methods and outperform leading LLMs such as GPT-4 on MRC tasks, without a significant increase in computational cost.
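The abstract describes QASE as an auxiliary span-extraction module attached to a generative PLM during fine-tuning. Below is a minimal, hypothetical PyTorch sketch of how such a question-attended span head might be wired up: context token states attend over question states, a per-token classifier tags answer-span tokens, and the resulting span loss is added to the usual generation loss. The cross-attention layout, the binary in/out tagging scheme, the loss weight `alpha`, and all class and function names here are illustrative assumptions, not the authors' exact implementation.

```python
# Hypothetical sketch of a QASE-style auxiliary head (assumptions noted above),
# not the paper's verified architecture.
import torch
import torch.nn as nn


class QuestionAttendedSpanHead(nn.Module):
    """Span-tagging head in which context tokens attend to the question."""

    def __init__(self, hidden_size: int, num_heads: int = 8):
        super().__init__()
        # Context states act as queries; question states are keys/values.
        self.cross_attn = nn.MultiheadAttention(
            hidden_size, num_heads, batch_first=True
        )
        # Per-token tagger: 0 = outside an answer span, 1 = inside one.
        self.tagger = nn.Linear(hidden_size, 2)

    def forward(self, context_states: torch.Tensor,
                question_states: torch.Tensor) -> torch.Tensor:
        # context_states: (batch, ctx_len, hidden)
        # question_states: (batch, q_len, hidden)
        attended, _ = self.cross_attn(
            context_states, question_states, question_states
        )
        return self.tagger(attended)  # (batch, ctx_len, 2)


def joint_loss(gen_loss: torch.Tensor, span_logits: torch.Tensor,
               span_labels: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Multi-task objective: generation loss plus weighted span-tagging loss.

    alpha is an assumed hyperparameter; -100 marks padded/ignored positions.
    """
    span_loss = nn.functional.cross_entropy(
        span_logits.reshape(-1, 2), span_labels.reshape(-1), ignore_index=-100
    )
    return gen_loss + alpha * span_loss


# Example shapes (hypothetical): encoder states for a passage and a question.
head = QuestionAttendedSpanHead(hidden_size=768)
ctx = torch.randn(2, 128, 768)   # (batch, ctx_len, hidden)
q = torch.randn(2, 16, 768)      # (batch, q_len, hidden)
logits = head(ctx, q)            # (2, 128, 2)
```

In a setup like this, the span head is used only during fine-tuning to ground the generator in passage evidence; at inference time the PLM generates answers as usual, which is consistent with the abstract's claim of no significant added computational cost.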
- Lin Ai
- Zheng Hui
- Zizhou Liu
- Julia Hirschberg