Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms (2310.10418v2)

Published 16 Oct 2023 in cs.LG and cs.AI

Abstract: Commonsense norms are defeasible by context: reading books is usually great, but not when driving a car. While contexts can be explicitly described in language, in embodied scenarios, contexts are often provided visually. This type of visually grounded reasoning about defeasible commonsense norms is generally easy for humans, but (as we show) poses a challenge for machines, as it necessitates both visual understanding and reasoning about commonsense norms. We construct a new multimodal benchmark for studying visual-grounded commonsense norms: NORMLENS. NORMLENS consists of 10K human judgments accompanied by free-form explanations covering 2K multimodal situations, and serves as a probe to address two questions: (1) to what extent can models align with average human judgment? and (2) how well can models explain their predicted judgments? We find that state-of-the-art model judgments and explanations are not well-aligned with human annotation. Additionally, we present a new approach to better align models with humans by distilling social commonsense knowledge from LLMs. The data and code are released at https://seungjuhan.me/normlens.

Authors (8)
  1. Seungju Han (33 papers)
  2. Junhyeok Kim (9 papers)
  3. Jack Hessel (50 papers)
  4. Liwei Jiang (53 papers)
  5. Jiwan Chung (22 papers)
  6. Yejin Son (2 papers)
  7. Yejin Choi (287 papers)
  8. Youngjae Yu (72 papers)
Citations (3)