Apprentices to Research Assistants: Advancing Research with Large Language Models (2404.06404v1)
Abstract: Large language models (LLMs) have emerged as powerful tools across research domains. This article examines their potential through a literature review and firsthand experimentation. While LLMs offer benefits such as cost-effectiveness and efficiency, challenges including prompt tuning, biases, and subjectivity must be addressed. The study presents insights from experiments using LLMs for qualitative analysis, highlighting both successes and limitations. It also discusses strategies for mitigating these challenges, such as prompt optimization techniques and leveraging human expertise. This study aligns with the 'LLMs as Research Tools' workshop's focus on integrating LLMs into HCI data work critically and ethically. By addressing both opportunities and challenges, our work contributes to the ongoing dialogue on the responsible application of LLMs in research.
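To make the LLM-assisted qualitative analysis mentioned in the abstract concrete, the sketch below shows one way a single annotation step might look in code. It is a minimal illustration, assuming the OpenAI Python SDK (v1+) and an API key in the environment; the codebook, excerpt, and prompt wording are hypothetical placeholders, not the protocol used in the paper, and any output would still need review by a human coder.

```python
# Minimal sketch of LLM-assisted qualitative coding (illustrative only).
# Assumes the OpenAI Python SDK >= 1.0 and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

# Hypothetical codebook: each label is paired with a short definition so the
# model and human coders apply the same criteria.
CODEBOOK = {
    "privacy_concern": "The participant worries about how their data is used.",
    "trust": "The participant expresses trust or distrust in the system.",
    "usability": "The participant comments on ease or difficulty of use.",
}


def code_excerpt(excerpt: str, model: str = "gpt-3.5-turbo") -> str:
    """Ask the model to assign exactly one codebook label to one excerpt."""
    definitions = "\n".join(f"- {label}: {desc}" for label, desc in CODEBOOK.items())
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # reduce run-to-run variation for more consistent labels
        messages=[
            {
                "role": "system",
                "content": (
                    "You are assisting with qualitative coding. "
                    "Choose exactly one label from the codebook and reply "
                    "with the label name only.\n" + definitions
                ),
            },
            {"role": "user", "content": f"Excerpt: {excerpt}"},
        ],
    )
    return response.choices[0].message.content.strip()


if __name__ == "__main__":
    label = code_excerpt("I never know who can see the things I post.")
    print(label)  # e.g., "privacy_concern" -- to be verified by a human coder
```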
Authors: M. Namvarpour and A. Razi