Supporting Human-AI Collaboration in Auditing LLMs with LLMs (2304.09991v3)

Published 19 Apr 2023 in cs.HC, cs.AI, and cs.CL

Abstract: LLMs are becoming increasingly pervasive in society through deployment in sociotechnical systems. Yet these models, whether used for classification or generation, have been shown to be biased and to behave irresponsibly, causing harm to people at scale. It is crucial to audit these LLMs rigorously. Existing auditing tools leverage humans, AI, or both to find failures. In this work, we draw upon literature in human-AI collaboration and sensemaking, and conduct interviews with research experts in safe and fair AI, to build upon the auditing tool AdaTest (Ribeiro and Lundberg, 2022), which is powered by a generative LLM. Through the design process we highlight the importance of sensemaking and human-AI communication in leveraging the complementary strengths of humans and generative models in collaborative auditing. To evaluate the effectiveness of the augmented tool, AdaTest++, we conduct user studies in which participants audit two commercial LLMs: OpenAI's GPT-3 and Azure's sentiment analysis model. Qualitative analysis shows that AdaTest++ effectively leverages human strengths such as schematization, hypothesis formation, and testing. Further, with our tool, participants identified a variety of failure modes, covering 26 different topics over 2 tasks, including failures that have been shown before in formal audits as well as ones previously under-reported.

Authors (5)
  1. Charvi Rastogi (18 papers)
  2. Marco Tulio Ribeiro (21 papers)
  3. Nicholas King (4 papers)
  4. Harsha Nori (24 papers)
  5. Saleema Amershi (12 papers)
Citations (56)