Active Learning for NLP with Large Language Models (2401.07367v1)
Abstract: Human annotation of training samples is expensive, laborious, and sometimes challenging, especially for NLP tasks. To reduce labeling cost and improve sample efficiency, Active Learning (AL) can be used to label as few samples as possible while still reaching reasonable or comparable results. Given the significant advances of LLMs, they are also promising candidates for annotating samples at even lower cost. This work investigates the accuracy and cost of using LLMs (GPT-3.5 and GPT-4) to label samples on three different datasets. A consistency-based strategy, which we call the mixed annotation strategy, is proposed to select samples that are potentially labeled incorrectly so that human annotations can be used for those samples in AL settings. We then test AL performance under two settings: (1) using human annotations only, and (2) using the proposed mixed annotation strategy. The accuracy of AL models under three AL query strategies is reported on three text classification datasets: AG's News, TREC-6, and Rotten Tomatoes. On AG's News and Rotten Tomatoes, the models trained with the mixed annotation strategy achieve results similar to or better than those trained with human annotations. The method reveals the great potential of LLMs as annotators in terms of accuracy and cost efficiency in active learning settings.
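The consistency-based selection can be read as a simple routing rule: for each sample chosen by the AL query strategy, the LLM is asked for a label several times, and samples on which its answers disagree are treated as potentially mislabeled and routed to a human annotator instead. Below is a minimal Python sketch of that idea, not the paper's exact procedure; the callables `llm_label` and `human_label` and the unanimity threshold are illustrative assumptions.

```python
from collections import Counter

def mixed_annotation(samples, llm_label, human_label, n_queries=3):
    """Sketch of a consistency-based mixed annotation strategy (assumed details):
    query the LLM n_queries times per sample; if its answers disagree, fall back
    to a human annotator; otherwise keep the (unanimous) LLM label."""
    labels = []
    for x in samples:
        votes = [llm_label(x) for _ in range(n_queries)]   # repeated LLM annotations
        label, count = Counter(votes).most_common(1)[0]    # majority vote and its size
        if count < n_queries:                              # inconsistent -> possibly wrong
            labels.append(human_label(x))                  # route to human annotation
        else:
            labels.append(label)                           # consistent -> trust the LLM
    return labels
```

In an AL loop, this function would replace the pure human-labeling step: only the inconsistent subset incurs human annotation cost, while consistent samples are labeled at LLM prices.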
- Xuesong Wang