CKBP v2: Better Annotation and Reasoning for Commonsense Knowledge Base Population (2304.10392v2)
Abstract: Commonsense Knowledge Base (CSKB) Population, which aims at automatically expanding the knowledge in CSKBs with external resources, is an important yet hard task in NLP. Fang et al. (2021a) proposed a CSKB Population (CKBP) framework with an evaluation set, CKBP v1. However, CKBP v1 relies on crowdsourced annotations that suffer from a considerable number of mislabeled answers, and its evaluation set lacks alignment with the external knowledge source due to random sampling. In this paper, we introduce CKBP v2, a new high-quality CSKB Population evaluation set that addresses these two issues by employing domain experts as annotators and incorporating diversified adversarial samples to make the evaluation data more representative. We show that CKBP v2 serves as a challenging and representative evaluation dataset for the CSKB Population task, while its development set aids in selecting a population model that leads to improved knowledge acquisition for downstream commonsense reasoning. A better population model can also help acquire more informative commonsense knowledge as additional supervision signals for both generative commonsense inference and zero-shot commonsense question answering. Notably, a question-answering model based on DeBERTa-v3-large (He et al., 2023b) outperforms powerful large language models, including ChatGPT and GPT-3.5, in a zero-shot setting.
- Tianqing Fang (43 papers)
- Quyet V. Do (7 papers)
- Sehyun Choi (7 papers)
- Weiqi Wang (58 papers)
- Yangqiu Song (196 papers)
- Zihao Zheng (20 papers)
- Zhaowei Wang (36 papers)