Supporting Human-AI Collaboration in Auditing LLMs with LLMs (2304.09991v3)

Published 19 Apr 2023 in cs.HC, cs.AI, and cs.CL

Abstract: LLMs are becoming increasingly pervasive in society through deployment in sociotechnical systems. Yet these models, whether used for classification or generation, have been shown to be biased and to behave irresponsibly, causing harm to people at scale. It is crucial to audit these LLMs rigorously. Existing auditing tools leverage humans, AI, or both to find failures. In this work, we draw upon literature in human-AI collaboration and sensemaking, and conduct interviews with research experts in safe and fair AI, to build upon the auditing tool AdaTest (Ribeiro and Lundberg, 2022), which is powered by a generative LLM. Through the design process we highlight the importance of sensemaking and human-AI communication in leveraging the complementary strengths of humans and generative models in collaborative auditing. To evaluate the effectiveness of the augmented tool, AdaTest++, we conduct user studies in which participants audit two commercial LLMs: OpenAI's GPT-3 and Azure's sentiment analysis model. Qualitative analysis shows that AdaTest++ effectively leverages human strengths such as schematization, hypothesis formation, and testing. Further, with our tool, participants identified a variety of failure modes, covering 26 different topics over 2 tasks, including failures that have been shown before in formal audits as well as ones previously under-reported.

Authors (5)
  1. Charvi Rastogi (18 papers)
  2. Marco Tulio Ribeiro (21 papers)
  3. Nicholas King (4 papers)
  4. Harsha Nori (24 papers)
  5. Saleema Amershi (12 papers)
Citations (56)