Cheap Learning: Maximising Performance of Language Models for Social Data Science Using Minimal Data (2401.12295v1)

Published 22 Jan 2024 in cs.CL

Abstract: The field of machine learning has recently made significant progress in reducing the requirements for labelled training data when building new models. These 'cheaper' learning techniques hold significant potential for the social sciences, where development of large labelled training datasets is often a significant practical impediment to the use of machine learning for analytical tasks. In this article we review three 'cheap' techniques that have developed in recent years: weak supervision, transfer learning and prompt engineering. For the latter, we also review the particular case of zero-shot prompting of LLMs. For each technique we provide a guide to how it works and demonstrate its application across six different realistic social science applications (two different tasks paired with three different dataset makeups). We show good performance for all techniques, and in particular we demonstrate how prompting of LLMs can achieve high accuracy at very low cost. Our results are accompanied by a code repository to make it easy for others to duplicate our work and use it in their own research. Overall, our article is intended to stimulate further uptake of these techniques in the social sciences.
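
The zero-shot prompting approach the abstract highlights needs no labelled training data at all: the task is posed directly to a pretrained LLM as a natural-language instruction. Below is a minimal sketch of that idea, assuming the `openai` Python client (v1+) and an API key in the OPENAI_API_KEY environment variable; the model name, label set, and prompt wording are illustrative assumptions, not the prompts from the paper's own repository.

```python
# Minimal sketch of zero-shot text classification by prompting an LLM.
# Assumptions (not from the paper): the `openai` package (v1+ client),
# an API key in OPENAI_API_KEY, and a hypothetical binary coding task.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LABELS = ["abusive", "not abusive"]  # illustrative label set

def zero_shot_classify(text: str) -> str:
    """Ask the model for a label directly, with no labelled examples."""
    prompt = (
        "Classify the following comment as one of: "
        + ", ".join(LABELS)
        + ". Reply with the label only.\n\nComment: "
        + text
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # any chat-capable model would do
        messages=[{"role": "user", "content": prompt}],
        temperature=0,          # deterministic output suits annotation
    )
    return response.choices[0].message.content.strip().lower()

print(zero_shot_classify("You are a disgrace and everyone knows it."))
```

Because no training set is constructed and no model is fine-tuned, the marginal cost of annotating each document is just the per-token API charge, which is the "high accuracy at very low cost" property the abstract emphasises.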
