GrounDial: Human-norm Grounded Safe Dialog Response Generation (2402.08968v1)

Published 14 Feb 2024 in cs.AI

Abstract: Current conversational AI systems based on LLMs are known to generate unsafe responses, either agreeing with offensive user input or including toxic content. Previous research has aimed to reduce toxicity by fine-tuning LLMs on manually annotated safe dialogue histories. However, this dependency on additional tuning incurs substantial costs. To remove the dependency, we propose GrounDial, which achieves response safety by grounding responses in commonsense social rules without requiring fine-tuning. GrounDial's hybrid approach of in-context learning and human-norm-guided decoding makes responses quantitatively and qualitatively safer, even without additional data or tuning.

Authors (6)

Siwon Kim
Shuyang Dai
Mohammad Kachuee
Shayan Ray
Tara Taghavi
Sungroh Yoon
